Data Summary. The main categories of data foreseen to be collected or generated by MULTI-STR3AM are:
• Underlying research data: This category encompasses the data, including associated metadata, forming the basis of results and conclusions presented in scientific articles and in any potential patents arising from the project. To remove any limitations to review and validation of results by the scientific community, green open access (self-archiving) will be the preferred model of publication for scientific articles. Additionally, the underlying data will be deposited in an open repository (independent of the project), which will be linked to in the resulting article.
• Operational data: This includes raw or curated data arising from the operation of equipment, for example associated with biomass cultivation, fractionation and purification of microalgae components, and routine analyses of the resultant products (e.g., compositional analyses). Data related to the production process will be used to produce guidelines for optimal performance, quality checks and confirmation checks, which will be of use in the project and in future planned production of algae. This category of data is likely to contain commercially sensitive data; careful consideration will be given to which information can be published openly (e.g., for dissemination purposes) and which should be considered non-open. Some of this data is also of value for scientific or other publications and presentations and will be treated accordingly.
• Impact monitoring data: Primarily in WP5, data will be gathered to assess the social, environmental and economic impact of MULTI-STR3AM and to track the performance of the project against the KPIs set out in the proposal. These data include biorefinery process modelling and data gathered on e.g., feedstock, raw materials, energy, waste and emissions to complete life cycle and social life cycle assessments.
Such assessments will be performed according to the methodology defined by ISO 14040/44, and the project impacts will be measured with the help of computer-based tools such as SimaPro v9 (with Ecoinvent v3.5 database, and others).
• Documentation relating to instruments and methods: This category covers documentation needed to implement the project and reproduce its results, including SOPs from each partner for their respective processes and details of tools, methods, instruments and software.
This section will describe the kinds of data that each work package will be handli...
Data Summary. The main purpose of data collection/generation in the CarE-Service project is to describe new circular economy business models in innovative hybrid and electric mobility through advanced reuse and remanufacturing technologies and services. The CarE-Service project will produce several datasets during the lifetime of the project. All the data collected will be relevant to the purposes of the project, such as the establishment of circular economy business models, the development of the Smart Mobile Modules, the creation of customer-driven products, the development and validation of technical solutions for reused, remanufactured and recycled components, and the evaluation of these business models through demonstration and life cycle assessment (LCA). All the collected or generated data will be analyzed and evaluated from a range of methodological perspectives for project development and for engineering and scientific purposes. A range of data will be created during the project. These will be available in a variety of easily accessible formats, including Documents (Word) (DOCX), Spreadsheets (Excel) (XLSX, CSV), Presentation files (PowerPoint) (PPT), PostScript (PDF, XPS), images, audio and video files (JPEG, PNG, GIF, TIFF, WAV, MPEG, AIFF, OGG, AVI, MP4), Technical CAD drawings (DWG), Origin (OPJ), compressed formats (TAR.GZ, MTZ), Program database (PDB, DBS, MDF, NDF), etc. (see Table 5-1). As no comparable data are available for secondary analysis at the moment, it is planned to make our dataset publicly available in a research data repository. Apart from the research team, the dataset will be useful for other research groups working on eco-innovative circular economy business models in large-scale demonstration projects. The following table contains all the datasets that will be generated during the project. The expected size of the datasets produced will be between 5 MB and 1 GB.
For every dataset generated for a task, the leading partner of the task will be the Master of Data. The Master of Data will be responsible for collecting the data from the other partners, for filing and sharing the data within the consortium, for creating the linked metadata files, and for the activities needed to publish the data, e.g. on the Zenodo platform.
Table 5-1: Potential Datasets
Potential datasets – Description | Format | Dissemination level | Master of Data
WP1 - Requirements for new business models, services, demonstrators an...
Data Summary. (Outbound Translation) Outbound translation will reuse models for MT and Quality estimation from other WPs of the project. Models for detecting problematic words on the source text will be trained using publicly available data for automatic MT post-editing and synthetic training data generated using already trained translation models. Detection of problematic words strongly depends on the underlying MT system used in Outbound Translation and so do the synthetic training data. Because of that, we do not consider this data re-usable for other purposes and thus will only publish the software that can be used to generate the synthetic data for a particular translation system. During user testing and experimental deployment of the Outbound Translation system, detailed logs will be collected. We believe it will be possible to use the logs to compile datasets which might be useful for quality estimation of automatic MT post-editing. In that case, the dataset will be anonymized and published in the LINDAT/CLARIN repository.
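As an illustration of the anonymisation step mentioned above, the following is a minimal sketch only. The log fields (`user`, `source_text`, `post_edit`), the key, and the e-mail masking rule are assumptions made for this example and are not taken from the project. Keyed hashing is strictly pseudonymisation, so free-text fields would likely need further review before publication.

```python
import hashlib
import hmac
import re

# Hypothetical private key; it must never be published with the dataset.
SECRET_KEY = b"replace-with-a-private-key"

# Simple pattern for e-mail addresses appearing in free text (illustrative only).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def pseudonymize_user(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Map a user identifier to a stable keyed hash so sessions stay linkable."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]


def anonymize_record(record: dict) -> dict:
    """Return a copy of a log record with the user replaced and e-mails masked."""
    cleaned = dict(record)  # leave the original record untouched
    cleaned["user"] = pseudonymize_user(record["user"])
    for field in ("source_text", "post_edit"):  # assumed free-text fields
        if field in cleaned:
            cleaned[field] = EMAIL_RE.sub("<EMAIL>", cleaned[field])
    return cleaned
```

Using a keyed hash rather than a plain hash means that identifiers cannot be recovered by hashing a list of candidate user names without the key, while the same user still maps to the same pseudonym across log entries.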
Data Summary. 2.1 Purpose of data collection and generation The purpose of data collection and generation in the DECIDER project is to characterise drug resistance mechanisms in high-grade serous ovarian cancer (HGSOC) and suggest effective means to overcome them. The project aims to develop a software tool (Oncodash), where all clinically relevant data from a patient can be viewed easily to aid in making treatment decisions. Research data collected and generated in the project will be made available to the wider research community and/or public to the extent that is possible considering ethical, personal data protection and IPR matters. Patient samples and associated clinical, molecular, and imaging data are collected/generated prospectively before surgery and chemotherapy, between treatments and after disease relapse, and from retrospective samples in order to 1) develop computational tools that pinpoint the drug resistance mechanisms, 2) identify patients who are likely to respond poorly to the current standard therapy as early as possible, and 3) suggest effective, personalised therapies to patients. The focus of DECIDER is on the analysis of samples (tissue, ascites, and blood) obtained from HGSOC patients who have given their consent to use their samples and data in the research conducted in DECIDER. This cohort is subsequently called “prospective” because patients are recruited and treated in parallel to the activities in DECIDER. All samples from the prospective cohort are collected during routine procedures conducted in the Turku University Central Hospital (part of the “Wellbeing Services County of Southwest Finland”, TYKS). The key measurement technologies to be used for the samples are various sequencing technologies, in particular whole-genome sequencing (WGS), exome/panel-sequencing (for blood and reference tissue samples only), shallow sequencing (for blood samples only), RNA-seq and DNA methylation sequencing. 
Another important data layer for prospective patients is digitalised histopathological images from histopathological samples collected during surgeries. A key aim of DECIDER is to produce at least four digitalised hematoxylin and eosin (H&E) stained histopathological images for all consented patients. The histopathological images are used in conjunction with sequencing data to identify drug resistance mechanisms as well as improve diagnosis and predict treatment response. For a subset of the patients, their tumour burden is measured with (18)F- flu...
Data Summary. 2.1 Purpose of data collection and generation The purpose of data collection and generation in the DECIDER project is to characterise drug resistance mechanisms in high-grade serous ovarian cancer (HGSOC) and suggest effective means to overcome them. The project aims to develop a software tool in which all clinically relevant data from a patient can be viewed easily to aid in making treatment decisions. Research data collected and generated in the project will be made available to the wider research community and/or public to the extent that is possible considering ethical, personal data protection and IPR matters. Patient samples and associated clinical, molecular and imaging data are collected/generated prospectively before surgery and chemotherapy, between treatments and after disease relapse, and from retrospective samples in order to 1) develop computational tools that pinpoint the drug resistance mechanisms, 2) identify patients who are likely to respond poorly to the current standard therapy as early as possible, and 3) suggest effective, personalised therapies to patients. The focus of DECIDER is on the analysis of samples (tissue, ascites and blood) obtained from HGSOC patients who have given their consent to the use of their samples and data in the research conducted in DECIDER. This cohort is subsequently called “prospective” because patients are recruited and treated in parallel to the activities in DECIDER. All samples from the prospective cohort are collected during routine procedures conducted in Turku University Central Hospital (TUCH). The key measurement technologies to be used for the samples are various sequencing technologies, in particular whole-genome sequencing (WGS), exome/panel-sequencing (for blood and reference tissue samples only), shallow sequencing (for blood samples only), RNA-seq and DNA methylation sequencing.
Another important data layer for prospective patients is digitalised histopathological images from histopathological samples collected during surgeries. A key aim of DECIDER is to obtain at least four digitalised hematoxylin and eosin (H&E) stained histopathological images for every consented patient. The histopathological images are used in conjunction with sequencing data to identify drug resistance mechanisms as well as improve diagnosis and predict treatment response. For a subset of the patients, their tumour burden is measured with (18)F-fluorodeoxyglucose Positron Emission Tomography - Computed Tomograp...
Data Summary. 2.1. State the purpose of the data collection/generation The data used in the RESOLVD project are intended to improve knowledge of how power flow behaves in the low voltage grid in the presence of distributed renewable generation and highly variable demand. The general purpose of the RESOLVD project is to act on (schedule and control) the low voltage grid in order to increase efficiency. To this end, the data will serve the following purposes:
Data Summary. 2.1. State the purpose of the data collection/generation The data used in the RESOLVD project are intended to improve knowledge of how power flow behaves in the low voltage grid, given a significant increase in distributed renewable generation and highly variable demand. The general purpose of the RESOLVD project is to act on (schedule and control) the low voltage grid in order to increase efficiency. To this end, the data have served, and will continue to serve, the following purposes:
• Enhance grid observability when monitoring: improve knowledge on demand/generation profiles, power flow computation, etc.
• Modelling demand and generation for forecasting purposes: training of machine learning algorithms to forecast demand and generation in specific points of the grid.
• Testing and performance evaluation of the technologies developed as part of the RESOLVD solution, and computation of KPIs during project validation: validation of the proposed solution and quantification of improvements based on indicators.
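The forecasting and KPI purposes listed above can be illustrated with a minimal sketch. The source does not specify which error metrics or indicators RESOLVD uses, so the mean absolute error and the relative improvement of a lower-is-better indicator shown here are assumptions chosen for illustration, as are the function names.

```python
from statistics import mean


def mean_absolute_error(actual, forecast):
    """Average absolute deviation between measured and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return mean(abs(a - f) for a, f in zip(actual, forecast))


def relative_improvement(baseline, with_solution):
    """Fractional reduction of a lower-is-better indicator (e.g. grid losses)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - with_solution) / baseline


# Hypothetical values: measured vs. forecast demand (kW) at one grid point,
# and an indicator measured before and after applying a control strategy.
demand_kw = [10.0, 12.0, 8.0]
forecast_kw = [9.0, 13.0, 8.0]
forecast_error = mean_absolute_error(demand_kw, forecast_kw)
improvement = relative_improvement(200.0, 150.0)
```

In such a setup, the forecast error would feed the evaluation of the trained models, while the relative improvement would quantify how far an indicator moved against its baseline during validation.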
Data Summary. To provide an overview of the different datasets produced over the HECARRUS project life cycle, Table 2 presents the data type, origin and format extension of each. Data types include numerical datasets, computer codes, text data, technical figures, contact lists, and survey and workshop data. Primary data correspond to the main output, which undergoes the confidentiality control already described before it is made publicly available. Table 2. Information on the data types that will be used within the project.
Data Summary. In summary, we see that the existence of patterns in which a single conjunct controls (some) agreement processes considerably complicates the array of possible strategies for syntactic agreement with (nominal) coordinate structures. In addition to (18-1) and (18-2) we must accommodate a number of further patterns.
Data Summary. Public documents within the consortium are processed and managed by the project coordinator. Other data generated throughout the project (reports, data and others) are managed and stored by the team responsible for generating them. The following table shows the data type, the origin of the data and the data control responsibility for the BUDGET-IT project. Mainly, personal data will be collected. Access to personal data will be restricted to necessary parties within the consortium. Personal Data: “any information relating to an identified or identifiable natural person. An identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural, or social identity of that natural person”.
Table 1: Data Summary
No. | Data type | Origin | Data control responsibility
1 | Partner contacts collection and stakeholders list | Open Data Source | KHAS Project Management
2 | Data from institutional sources (budgets, GEPs) and official statistics | Open Data Source | Content and handling will be the responsibility of each survey administrator (partner)
3 | GEAM Survey data results | Primary Data | Content and handling of result data will be the responsibility of each survey administrator (partner)
4 | Focus group data | Primary data | Content and handling of focus group data will be the responsibility of each partner
5 | Personal data of participants - workshops, training groups and materials | Primary data | UA/UBS/SSST/UBG
The following list highlights the types of data that will be produced, used and made openly available within the project:
• Partner contacts collection and stakeholders list: BUDGET-IT Project partner information and stakeholder list data (such as university contact, contact names and emails and pictures etc.)
will be integrated in the public deliverables, but only after consent has been secured.
• Digital security solutions collection: Data openly available.
• Online training groups data: The data from online training groups (recordings, protocols and transcriptions) will only be published after obtaining consent from all participants.
• Training program for inclusive GEPs data: The data from the training program for inclusive GEPs (recordings, protocols and transcriptions) will only be published after obtaining consent from all participants.