TENDER IDENTIFICATION CODE (CIG): 7408062E24 UNIQUE PROJECT CODE (CUP): B52F16004600006
Market survey
for the award of a contract for the supply of the service “Design and Development of the SCHEME cloud service” for the E-RIHS (European Research Infrastructure for Heritage Science) research infrastructure, pursuant to art. 36, paragraph 2, letter b) of Legislative Decree 50/2016, through an open Request for Offer (RDO) on MEPA
Notice is hereby given that the Istituto Nazionale di Ottica of the CNR (hereinafter CNR-INO), following the decision to contract (determina a contrattare) prot. Cnr-Ino 2190 of 02/03/2018, and in compliance with the principles set out in arts. 29 and 30 of Legislative Decree 50/2016, intends to carry out a market survey to identify the economic operators to be invited to the subsequent award procedure, a negotiated procedure under art. 36 of Legislative Decree 50/2016, for the supply of a “Design and development of the SCHEME cloud service” service for the E-RIHS PP infrastructure, which must fulfil two main functions: SCRE (Semantic Content Retrieval Engine) and THH (The Heritage Hub).
The supply, to be carried out within the project “The European Research Infrastructure for Heritage Science Preparatory Phase” (E-RIHS PP), Grant Agreement no. 739503, must meet the following requirements:
Motivation and main goals
E-RIHS is a Research Infrastructure (RI) focused on the preservation of the world’s heritage by providing cross-cutting edge researchers, knowledge, capabilities and infrastructures.
E-RIHS is a publicly funded RI and it is therefore crucial to create a digital hub for research and cooperation in Heritage Science, acting as an access point for information, services, good practices, training and virtual meetings where stakeholders can discuss topics of interest. This activity is part of the preparatory phase (see the G.A., WP10, task 10.3).
SCHEME is a cloud service and the acronym means “Semantic Content retrieval engine for the Heritage hub EmpowErment”. It will serve a double purpose for E-RIHS: on the one hand it must allow the automatic retrieval, curation and reuse (e.g. on the Infrastructure site) of web content fetched from multiple sources and identified as “interesting” through semantic and NLP algorithms; on the other it will be used as a tool to empower the collective work of social communities through the automatic retrieval and sharing of valuable content, gathered from the web and related to topics of interest, and the provision of collaboration and networking tools.
SCHEME must be implemented around two main functionalities:
- a Semantic Content Retrieval Engine (SCRE) and
- a collaboration platform, named The Heritage Hub (THH).
These two services are strongly related, since the automatic retrieval, selection and curation of content of interest will be the main asset through which collaboration is intended to function for the communities involved in the Infrastructure. Herein we describe these two services in detail, together with how they must be integrated to provide the best possible collaboration experience for E-RIHS users.
All SCHEME components (SCRE and THH) must be designed as cloud services, completely accessible, both for their front-end and back-end interfaces, with a modern and standard web browser.
All software should preferably be developed using a full stack of open source components and libraries: the use of open source software will be considered added value and will be one of the items (“Extra conditions”) evaluated in the technical proposal. The use of specific commercial cloud services (e.g. for semantic analysis) is permitted but must be clearly justified.
SCHEME will be installed, using a high availability strategy, on servers provided by E-RIHS.
Functionalities
The provision of a powerful and engaging collaboration platform (The Heritage Hub/THH) is the core of the present project. It is fundamental to point out that collaboration in SCHEME is centered around topics of interest: in order to better engage users and to encourage them to visit THH frequently, it is strategic to provide valuable content, linked to the broad nature of E-RIHS activities. By searching the Internet, one can find a continuous flux of articles of interest published on the websites of research centres, scientific magazines, online newspapers, etc.; this, together with the ever increasing amount of User Generated Content in Social Networks and personal blogs, represents a goldmine of valuable content that can be republished in THH to feed discussions amongst researchers.
While this strategy is certainly interesting and might represent a practical solution to encourage recurring visits to THH, it is also significantly expensive and slow to implement if all research and content curation is done manually. Paramount to the success of this vision is the ability to automate and make configurable the whole process of scanning the web in search of content, selecting it through a semantic engine based on its main topics, and cleaning its HTML code in order to ease its reuse/publication on THH and on other websites. Manual intervention should therefore be limited to a minimum: once configuration has been applied, the engine must work seamlessly by searching, selecting and returning valuable articles and posts from the Internet.
The Semantic Content Retrieval Engine (SCRE) is the enabling component in SCHEME that must allow performing automatic and semi-automatic curation of web content, fetched from multiple sources and identified as items of interest through semantic and NLP algorithms.
SCRE must not be implemented as a content management system but should act as a “smart mediator” for the destination platforms and websites (THH to start with), where content will be published. Fetched content must be provided to the destination platforms via APIs, e.g. standard syndication protocols like RSS or ATOM or through REST web services.
SCRE must be highly configurable through a web dashboard, in order to easily allow both the addition of new interesting sources to be monitored and the definition of semantic rules that identify whether or not a piece of content can be of interest for the destination platforms.
When a piece of content is fetched and considered “valuable”, it must also be pre-processed in a way that allows its seamless integration on the destination sites. All decorative elements from the source page (e.g. header, footer, navigation bars and menus, breadcrumbs, advertisement, related articles, etc.) must be automatically removed, in order to return HTML code that is as clean as possible: this way, publishing on the destination platforms does not disrupt their presentation and layout.
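The clean-up step described above could be sketched as follows. This is only an illustration using the Python standard library: the sets of tags to drop and to keep (DROP, KEEP) and the function name clean_html are assumptions made for the example, and a real engine would likely need per-source heuristics to detect breadcrumbs, advertisements and related-article boxes.

```python
import html
from html.parser import HTMLParser

# Assumption: these tags are treated as "decorative" and removed with their content.
DROP = {"header", "footer", "nav", "aside", "script", "style"}
# Assumption: only these structural tags are preserved in the cleaned output.
KEEP = {"p", "h1", "h2", "h3", "ul", "ol", "li", "a", "em", "strong", "blockquote"}


class Cleaner(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in KEEP:
            href = dict(attrs).get("href") if tag == "a" else None
            if href:
                self.out.append(f'<a href="{html.escape(href, quote=True)}">')
            else:
                self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in KEEP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so the output remains valid HTML.
            self.out.append(html.escape(data, quote=False))


def clean_html(page: str) -> str:
    """Return the page reduced to its main content markup."""
    parser = Cleaner()
    parser.feed(page)
    parser.close()
    return "".join(parser.out).strip()
```

All attributes other than link targets are discarded, so the destination platform applies its own styling to the returned fragment.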
SCRE must also be designed as a multi-tenant cloud service, in order to allow its integration on different destination platforms. For each one of them, it must be possible to configure individual and independent rules for fetching, filtering and reusing content.
THH, on the other hand, must be designed as an easy to use but powerful collaboration platform. By mimicking the interactions of people in social networks, it must make it easy to build communities of researchers sharing specific and multiform interests and freely exchanging information in an agile but effective way.
As said, in THH collaboration will revolve around content of interest that will be mostly acquired automatically through the SCRE service: its semantic features will allow presenting only information related to the main themes of the project. The vision is to stimulate discussions and exchange of ideas around topics of interest. THH must therefore provide services that allow partners to communicate and collaborate, including web forums, document management services, personal messaging, etc. These tools will also be used to promote and collect feedback, to ensure the project tracks the evolving needs of the communities and the point of view of all stakeholders.
Technical Requirements for SCRE
SCRE must be seen as a smart mediation engine, that allows gathering contents of interest from the web: it should not provide any front-end interface since publishing must be completely based on “output channels”, that is an API layer, based on standard syndication protocols (RSS, ATOM) or JSON/REST web services.
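As an illustration of an “output channel” exposed through a standard syndication protocol, the sketch below serialises the items of a channel as an RSS 2.0 document with the Python standard library. The function name channel_to_rss and the item keys (title, url, html) are assumptions made for the example; they are not prescribed by this specification.

```python
import xml.etree.ElementTree as ET


def channel_to_rss(channel_id: str, title: str, link: str, items: list[dict]) -> str:
    """Serialise the selected items of one output channel as RSS 2.0."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel element.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"SCRE output channel {channel_id}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        # The cleaned HTML travels as escaped text inside <description>.
        ET.SubElement(node, "description").text = item["html"]
    return ET.tostring(rss, encoding="unicode")
```

A JSON/REST endpoint could expose the same items with an equivalent structure.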
From a functional point of view, the system must support the following features:
Configurable publishing rules
SCRE must provide a configurable logic for the whole publishing workflow. In particular, through the configuration dashboard, operators can:
- define the “sources” of content;
- define the “output channels”, each one with a unique ID, that provide selected content for destination sites;
- associate sources to output channels;
- define for each output channel a set of semantic rules (see below) to specify whether content should be considered for publishing or discarded.
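The configuration entities listed above could be modelled, for example, as in the following sketch. The class and field names (Source, OutputChannel, attach) are hypothetical and only illustrate the relationship between sources, output channels and rules.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Source:
    kind: str      # e.g. "website", "feed", "twitter", "facebook"
    params: dict   # kind-specific settings, e.g. {"url": "...", "depth": 2}


@dataclass
class OutputChannel:
    name: str
    # Each channel gets a unique ID, generated here as a random UUID (an assumption).
    channel_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    sources: list = field(default_factory=list)
    rules: list = field(default_factory=list)  # semantic rules, left opaque here

    def attach(self, source: Source) -> None:
        """Associate a source with this channel, ignoring duplicates."""
        if source not in self.sources:
            self.sources.append(source)
```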
In particular SCRE must support the retrieval of content from at least the following sources:
- Websites through crawling/spidering;
- Syndication feed, via standard protocols (RSS, ATOM);
- Twitter, with all the possible options that its APIs provide (e.g. search of content through keywords, mentions or hashtags, retrieval of posts from single accounts, geographical search, etc);
- Facebook through all the possible options that its APIs provide;
Publishing must be based on API keys in order to ensure that SCRE can only be used by authorized parties.
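One possible way to check API keys is sketched below, under the assumption that only SHA-256 digests of the issued keys are stored on the server. The function names issue_key and is_authorized are illustrative.

```python
import hashlib
import hmac
import secrets


def issue_key() -> tuple[str, str]:
    """Create a new API key; return (key for the client, digest to store)."""
    key = secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode()).hexdigest()


def is_authorized(presented_key: str, stored_digests: set[str]) -> bool:
    """Accept a request only if the presented key matches a stored digest."""
    digest = hashlib.sha256(presented_key.encode()).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return any(hmac.compare_digest(digest, known) for known in stored_digests)
```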
Semantic rules and classification mechanism
Semantic Rules allow specifying whether or not a piece of content, retrieved from a source, must be maintained and associated to an output channel, or discarded. Their definition can be based at least on the following selection mechanisms:
- Textual keywords, including hashtags, that the fetched text must contain;
- Associated concepts, taken from a widely used and all-encompassing controlled dictionary (e.g. an “upper ontology” or a general purpose collection like Wikipedia). Each concept can have an associated weight: in this case the relevancy of the retrieved piece of content is measured, by using semantic and NLP algorithms, by how closely the concepts in its text are semantically similar/near to those defined in the rule. It does not necessarily work on a direct correspondence: a post can satisfy a rule even if it does not directly contain the concepts defined in it but contains some other entity that is more or less directly associated with them. The weight mechanism should allow defining stricter or looser associations between the concepts specified in the rule and those identified in the text;
- Language filtering rules, to allow the selection of posts based on the languages in which they are written. English, French and Italian must be supported, while filters for German, Spanish, and other languages are optional.
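The three selection mechanisms above could be combined as in the following sketch. It is not a real semantic engine: the RELATED table is a hypothetical stand-in for the semantic/NLP service that would measure how close a concept found in the text is to a concept named in the rule, and the language and the concepts found in the text are assumed to be supplied by earlier analysis steps.

```python
from dataclasses import dataclass, field

# Hypothetical similarity data: concept found in text -> {rule concept: similarity}.
RELATED = {
    "fresco": {"wall painting": 0.9, "pigment": 0.6},
    "x-ray fluorescence": {"spectroscopy": 0.8},
}


@dataclass
class SemanticRule:
    languages: set = field(default_factory=set)    # empty set = any language
    keywords: set = field(default_factory=set)     # all must appear in the text
    concepts: dict = field(default_factory=dict)   # concept -> weight
    threshold: float = 0.5                         # higher = stricter association


def concept_score(rule: SemanticRule, found: set) -> float:
    """Weighted share of rule concepts matched directly or by similarity."""
    total = sum(rule.concepts.values())
    if total == 0:
        return 1.0  # a rule without concepts does not filter on concepts
    score = 0.0
    for concept, weight in rule.concepts.items():
        best = max(
            (1.0 if f == concept else RELATED.get(f, {}).get(concept, 0.0) for f in found),
            default=0.0,
        )
        score += weight * best
    return score / total


def accepts(rule: SemanticRule, text: str, language: str, found: set) -> bool:
    if rule.languages and language not in rule.languages:
        return False
    lowered = text.lower()
    if not all(k.lower() in lowered for k in rule.keywords):
        return False
    return concept_score(rule, found) >= rule.threshold
```

In this sketch a post mentioning only “fresco” can satisfy a rule on “wall painting”, which mirrors the indirect association described above; raising the threshold makes the association stricter.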
SCRE must also provide classification mechanisms, based on configurable taxonomies. In particular, the service must allow the automatic classification of the retrieved content based on the entries of these ad hoc taxonomies.
SCRE taxonomies must also be shared with THH.
Multitenancy
SCRE must support independent working groups, each of which can define its own publishing rules without interfering with the configurations of the others. This must be achieved through specific user management functionalities that include role-based access control mechanisms.
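A minimal sketch of tenant isolation with role-based access control is given below. The role names and permissions in ROLE_PERMISSIONS, as well as the Tenant class and the set_rules function, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the operations each one may perform.
ROLE_PERMISSIONS = {
    "admin": {"read", "edit_rules", "manage_users"},
    "editor": {"read", "edit_rules"},
    "viewer": {"read"},
}


class AccessDenied(Exception):
    pass


@dataclass
class Tenant:
    name: str
    members: dict = field(default_factory=dict)  # user -> role within this tenant
    rules: dict = field(default_factory=dict)    # channel_id -> publishing rules


def set_rules(tenant: Tenant, user: str, channel_id: str, rules: list) -> None:
    """Update rules only if the user belongs to the tenant and may edit them."""
    role = tenant.members.get(user)
    if role is None or "edit_rules" not in ROLE_PERMISSIONS[role]:
        raise AccessDenied(f"{user} cannot edit rules of {tenant.name}")
    tenant.rules[channel_id] = rules
```

Because every configuration lives inside a Tenant object, a change made by one working group cannot affect the rules of another.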
Description of the configuration workflow on SCRE
SCRE must be a highly configurable platform: while powerful in features, its back-end interfaces must be designed with usability in mind. In particular, it is important to provide direct feedback to the operator for every operation performed on the platform.
The steps for configuring SCRE can be described as envisioned in the sequence that follows:
- An operator creates an output channel for a specific initiative (e.g. a THH Virtual Folder: see below): the back-end interface provides the details of the endpoint for retrieving the channel content.
- The operator first selects the possible web sources for the content, then defines the type of each source (e.g. a web site) and all the relevant parameters to configure the content retrieval process (e.g. the URL of the site, the depth of the crawling process, the frequency of the acquisition, etc.). A “sample” of the retrieved content can be visualized immediately: if it is acceptable, the source is saved and associated to the channel. This process is repeated for every source of interest (e.g. a Twitter search). As the operator adds sources, immediate feedback on the composition of the channel is visible.
- The operator defines the semantic rules for the output channel. This can be done by:
  - defining language filters, which allow fetching only content in a specified language;
  - specifying keywords that the text must include;
  - defining a semantic rule with a list of topics that the content must include, directly or indirectly (e.g. with similar concepts), in its text.
  Again, once the rules have been specified, the operator should get immediate feedback on how they apply to the content of the current channel.
- The operator builds a taxonomy as a hierarchy of categories and defines for each one of them a semantic rule based on a list of concepts and an associated weight. If the retrieved content satisfies the rule, it will be automatically classified with the corresponding category.
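The last step, automatic classification against a hierarchy of categories, could look like the sketch below. The Category class, the scoring formula (weighted share of the rule concepts found in the text) and the choice to evaluate every category independently of its parent are assumptions made for the example; concept extraction itself is assumed to happen elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    concepts: dict               # concept -> weight
    threshold: float = 0.5       # minimum score for the category to apply
    children: list = field(default_factory=list)


def score(category: Category, found: set) -> float:
    """Weighted share of the category's concepts present in the content."""
    total = sum(category.concepts.values())
    if not total:
        return 0.0
    return sum(w for c, w in category.concepts.items() if c in found) / total


def classify(category: Category, found: set, path: tuple = ()) -> list:
    """Return the paths of all categories whose rule the content satisfies."""
    here = path + (category.name,)
    labels = []
    if score(category, found) >= category.threshold:
        labels.append("/".join(here))
    for child in category.children:
        labels.extend(classify(child, found, here))
    return labels
```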
Technical Requirements for THH
THH must be designed as a user collaboration platform, allowing a community to share specific and multiform interests, to fully and successfully communicate and interact.
THH must allow users to share interests, information and content. The creation of relationships revolves around areas of common interests: all THH functionalities must be designed to ease communication and to allow the exchange of information between users in an agile and friendly way.
The functionalities that the platform must provide are:
- Users must be able to join the platform by registering on it directly or by importing their profiles from the most common Social Networks (e.g. Google+ or Facebook). Single sign-on with the aforementioned Social Networks must also be ensured.
- Virtual folders: to ease the grouping of related activities, a customizable and general classification mechanism must be provided. Every content item and service of THH can be associated to a Virtual Folder (VF).
- SCRE integration: streams from content channels exposed by SCRE must be imported automatically. They can also be automatically associated to a VF. The main taxonomies used in THH must be synchronised with those defined in the SCRE service.
- General Content Dashboard (GCD), where articles taken from the web through SCRE (but also content edited directly in THH) will be published. Articles can be classified thematically, allowing THH operators to add extra categories to those automatically proposed by SCRE. They can also be associated to a VF.
- Personalised Content Dashboard. It allows users to create a personal collection of the articles published in the GCD: the association can either be performed manually (that is, the user explicitly selects the articles of interest) or be based on simplified semantic rules. The term “simplified” here implies that the process of defining the rule must be intuitive and easy to perform for the final user. Finally, the user must have a personal and private classification mechanism, based on simple mechanisms like tags, to better arrange the content of the collection.
- VF Stream: it provides a quick up-to-date view of all content and activities classified for the specific VF.
- Web Forums to allow users to be engaged in online discussions through sequences of messages and related answers: it should be possible to comment or “like” each post. Web Forums can be associated to a specific VF.
- Cloud-based file-system service, where documents (Office files, text and PDF documents, multimedia content like images) can be uploaded and shared with other users. It should also be possible to create and share a Wiki Page that can be created and edited directly via a web browser.
- Search: all content published on THH can be searched by using both a simple and an advanced interface (including text-based search and faceted filtering). It must also be possible to find users, both by their personal information and by concepts, that is by specifying their interests, either defined explicitly in their profiles or "inferred" implicitly according to their activities on the hub platform.
- Messages: to allow direct and private communication between users.
- Advanced collaboration features on content, e.g. the possibility to take and share notes directly on web pages of the THH.
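The search functionality in the list above combines text-based search with faceted filtering. A very small in-memory sketch of that combination follows; the item structure (text, facets) and the function name search are assumptions, and a production system would normally rely on a dedicated search engine instead.

```python
from collections import Counter


def search(items, query="", filters=None):
    """Return (matching items, facet counts computed over the matches)."""
    filters = filters or {}
    terms = query.lower().split()
    hits = [
        item for item in items
        if all(term in item["text"].lower() for term in terms)
        and all(item["facets"].get(name) == value for name, value in filters.items())
    ]
    counts = {}
    for item in hits:
        for name, value in item["facets"].items():
            counts.setdefault(name, Counter())[value] += 1
    return hits, counts
```

The facet counts let the interface show, for instance, how many results belong to each Virtual Folder or content type before the user narrows the search.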
Note that the proposal must be written in English only; other languages are not accepted.
Contracting authority
Consiglio Nazionale delle Ricerche – Istituto Nazionale di Ottica – Sede di Xxxxxxx Xxxxx Xxxxxx Xxxxx 0
50125 Firenze
Type of procedure
Negotiated procedure under art. 36 of Legislative Decree 50/2016, with consultation of at least five economic operators through a “Request for Offer” (RDO) on the Mercato Elettronico della Pubblica Amministrazione (Me.P.A.), pursuant to art. 36, paragraph 2, letter b) of Legislative Decree 50/2016.
Base bid amount
EUR 80,000.00 (eighty thousand/00), not subject to VAT pursuant to art. 72 of Presidential Decree (DPR) 633/72.
Award criterion
The contract will be awarded on the basis of the most economically advantageous tender, pursuant to art. 95 of Legislative Decree 50/2016.
Participation requirements
Economic operators may take part in the procedure if they meet the general and professional suitability requirements set out in arts. 80 and 83 of Legislative Decree 50/2016 and are not subject to any of the grounds for exclusion provided for in art. 80 of the same Legislative Decree.
Economic operators must be registered on the Me.P.A. platform under the call “SERVIZI - SERVIZI PER L’INFORMATION AND COMMUNICATION TECHNOLOGY”.
Participation procedure
The expression of interest, signed by the company's legal representative, must be received by 23:59 on 28.03.2018 by certified electronic mail (PEC) at the following address: xxxxxxxxxx.xxx@xxx.xxx.xx, with the following subject line: “Design and development of the SCHEME cloud services di E-RIHS”, CIG 7408062E24.
No economic offer may be attached to the expression of interest.
Expressions of interest received after the mandatory deadline indicated above will be automatically excluded from the selection procedure.
Applications that are incomplete or signed in a manner that does not comply with the above will not be admitted.
Sole Procedure Manager (Responsabile Unico del Procedimento)
The Sole Procedure Manager, appointed pursuant to art. 31 of Legislative Decree 50/2016, is Xxxx. Xxxx Xxxxxxx, tel. 000 0000 000, e-mail: xxxx.xxxxxxx@xxx.xx
Selection of the contractor
Following the market survey referred to in this notice, CNR-INO, having ascertained the availability of the interested parties, will launch a negotiated procedure with them pursuant to art. 36 of Legislative Decree 50/2016, as amended, through a Request for Offer on Me.P.A.
Further information
The market survey referred to in this notice is intended to encourage the consultation and participation of economic operators by collecting their express interest in being invited to take part in the tender.
This notice is intended solely to carry out a market survey and therefore does not constitute a contractual proposal or a solicitation to submit offers, and it does not entail any pre-emption or preference rights, nor any commitments or obligations of any kind for CNR-INO.
CNR-INO reserves the right, at any time, to interrupt, revoke, suspend or modify this procedure and not to award the contract, should it deem this appropriate, giving notice to the competing companies, without the applicants being able to make any claim in relation to the procedure initiated.
Processing of personal data
Personal data acquired by CNR-INO (the data controller) will be used exclusively to carry out the activities required by law and to achieve the institutional purposes of the Institute. The provision of data is strictly functional to carrying out these activities, and the related processing will be carried out, including by means of IT tools, in the manner and within the limits necessary to pursue those purposes. Data subjects are guaranteed the exercise of the rights set out in art. 7 of Legislative Decree 196/03.
Publication of the notice
This notice is published on the institutional website xxx.xxx.xxx.xx, section “Gare e Appalti” - “Gare in corso”.
The Director of CNR-INO, Dott. Xxxxx Xx Xxxxxx