Our current working groups are listed below.
In an ever-changing world, field surveys, inventories and monitoring data are essential for predicting biodiversity responses to global drivers such as land use and climate change. This knowledge provides the basis for appropriate management. However, field biodiversity data collected across terrestrial, freshwater and marine realms are highly complex and heterogeneous. The successful integration and re-use of such data depends on how FAIR (Findable, Accessible, Interoperable, Reusable) they are.
ADVANCE aims at underpinning rich metadata generation with interoperable metadata standards using semantic artefacts. These are tools that allow humans and machines to locate, access and understand (meta)data, thus facilitating the integration and reuse of biodiversity monitoring data across terrestrial, freshwater and marine realms.
To this end, we revised, adapted and expanded existing metadata standards, thesauri and vocabularies. We focused on the most comprehensive database of biodiversity monitoring schemes in Europe (DaEuMon) as the basis for building a metadata schema that implements quality control and complies with the FAIR principles.
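As an illustration, a minimal sketch (in Python) of the kind of quality control such a schema enables is shown below; the field names and controlled terms are hypothetical and do not reproduce the actual ADVANCE/DaEuMon schema.

```python
# Illustrative only: field names and allowed terms are placeholders, not the ADVANCE schema.
REQUIRED_FIELDS = {"scheme_title", "realm", "taxonomic_scope", "spatial_coverage",
                   "temporal_coverage", "sampling_method", "licence"}

def validate_record(record: dict) -> list[str]:
    """Return a list of quality-control problems found in a monitoring-scheme record."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS - record.keys()]
    if record.get("realm") not in {"terrestrial", "freshwater", "marine"}:
        problems.append("realm must be a controlled-vocabulary term")
    return problems

example = {
    "scheme_title": "National butterfly monitoring",
    "realm": "terrestrial",
    "taxonomic_scope": "Lepidoptera",
    "spatial_coverage": "DE",
    "temporal_coverage": "1991/2023",
    "sampling_method": "transect counts",
    "licence": "CC-BY-4.0",
}
print(validate_record(example))  # -> [] when the record passes all checks
```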
In a further step, we will use biodiversity data to test, refine and illustrate the strength of the concept in real use cases. ADVANCE thus complements semantic artefacts of the Hub Earth & Environment and other initiatives for FAIR biodiversity research, enabling assessments of the relationships between biodiversity across realms and the associated environmental conditions. Moreover, it will facilitate future collaborations, joint projects and data-driven studies among biodiversity scientists of the Helmholtz Association and beyond.
A general photovoltaic device and materials database compliant with the FAIR principles is expected to greatly benefit research and development of solar cells. Because data are currently heterogeneous across labs working on a variety of materials and cell concepts, database development should be accompanied by ontology development. Based on a recently published literature database for perovskite solar cells, we have started developing an ontology for these devices and materials which could be extended to further photovoltaic applications. In order to facilitate data management at the lab scale and to allow easy upload of data and metadata to the database, electronic lab notebooks customized for perovskite solar research are being developed in cooperation with the NFDI-FAIRmat project.
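To give an idea of how ontology terms can annotate lab data, here is a hypothetical sketch of a device entry as an electronic lab notebook might record it; the ontology IRIs and field names are placeholders, not the actual terms of the perovskite solar cell ontology.

```python
# Hypothetical sketch: IRIs and field names are placeholders, not the project's ontology.
device_entry = {
    "device_id": "PSC-2023-0042",
    "cell_stack": ["ITO", "SnO2", "MAPbI3", "Spiro-OMeTAD", "Au"],
    "measurements": [
        {
            "quantity": "power conversion efficiency",
            "quantity_iri": "https://example.org/pv-ontology#PowerConversionEfficiency",
            "value": 18.7,
            "unit": "percent",
        }
    ],
}
```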
The seismological community has promoted standardisation of formats and services, as well as open data policies, for decades, making easy data exchange an asset for this community. Data is thus made highly Findable and Accessible as well as Interoperable and Reusable, with enhancements expected for the latter two. However, strict and technical domain-specific standardisation may complicate the sharing of more exotic data within the domain itself and hinder interoperability throughout the Earth science community. Within eFAIRs, leveraging the know-how of the major OBS park operators and seismological data curators within the Helmholtz Association, we aim to facilitate the integration of special datasets from the ocean floor, enhancing their interoperability and reusability.
To achieve this goal, in close collaboration with AWI and GEOMAR, supported by IPGP, the seismological data archive of the GFZ has created special workflows for OBS data curation. In particular, in close interaction with AWI, new datasets have been archived, defining a new workflow which is being translated into guidelines for the community. Domain-specific software has been modified to allow OBS data inclusion with specific additional metadata. These metadata include, for the first time, persistent identifiers of the instruments in use, taken from the AWI sensor information system. Next steps will enlarge the portfolio of keywords and standard vocabularies in use to facilitate data discovery by scientists from different domains. Finally, we plan to adopt the developed workflows for OBS data management.
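A minimal sketch of how an instrument PID could be attached to station metadata with ObsPy is shown below; the file names and the handle are placeholders, and the actual eFAIRs workflow may record identifiers differently.

```python
# Sketch only: file names and the PID below are placeholders.
from obspy import read_inventory
from obspy.core.inventory.util import Comment

inv = read_inventory("obs_network.xml")          # StationXML produced during curation
station = inv[0][0]                              # first network, first station
station.comments.append(
    Comment("Sensor PID: https://hdl.handle.net/example/awi-sensor-1234")
)
inv.write("obs_network_with_pid.xml", format="STATIONXML")
```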
“FAIR Workflows to establish IGSN for Samples in the Helmholtz Association” (FAIR WISH) is a joint project between the Helmholtz Centres GFZ, AWI and Hereon, funded within the HMC Project Cohort 2020 of the Helmholtz Metadata Collaboration Platform (HMC).
The International Generic Sample Number (IGSN) is a globally unique and persistent identifier (PID) for physical samples and collections with a discovery function on the internet. IGSNs make it possible to directly link data and publications with the samples they originate from and thus close one of the last gaps in the full provenance of research results.
FAIR WISH will (1) develop standardised and discipline-specific IGSN metadata schemes for different sample types within the research field Earth and Environment (EaE) that complement the core IGSN metadata schema; and (2) develop workflows to generate machine-readable IGSN metadata from different states of digitisation (from templates to databases) and to automatically register IGSNs. Our use cases were selected to include the large variety of sample types from different sub-disciplines across the project partners (e.g. terrestrial and marine environments; rock, soil, vegetation and water samples) and represent all states of digitisation: from individual scientists collecting sample descriptions in their field books to digital sample management systems fed by an app used in the field.
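As a rough illustration of the template-based end of this spectrum, the sketch below turns one spreadsheet row into a machine-readable sample description; the column and metadata field names are illustrative and do not reproduce the actual FAIR WISH templates or the IGSN schema.

```python
# Illustrative sketch: column and field names are assumptions, not the FAIR WISH template.
import csv

def row_to_sample_metadata(row: dict) -> dict:
    return {
        "igsn": row["igsn"],                       # assigned at registration time
        "name": row["sample_name"],
        "sample_type": row["sample_type"],         # e.g. rock, soil, water
        "collection_date": row["collection_date"],
        "latitude": float(row["lat"]),
        "longitude": float(row["lon"]),
        "collector": row["collector"],
    }

with open("field_book_template.csv", newline="") as fh:
    samples = [row_to_sample_metadata(row) for row in csv.DictReader(fh)]
```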
Imaging the environment is an essential component of spatial science, covering nearly everything from the exploration of the ocean floor to the investigation of planetary surfaces. In and between both domains, imaging is applied at various scales, from microscopy through ambient imaging to remote sensing, and provides rich information for science. Due to the increasing number of data acquisition technologies, advances in imaging capabilities, and the growing number of platforms that provide imagery and related research data, data volumes in the natural sciences, and thus also in ocean and planetary research, are increasing at an exponential rate. Although many datasets have already been collected and analyzed, the systematic, comparable, and transferable description of research data through metadata is still a big challenge in and for both fields. Yet these descriptive elements are crucial to enable efficient (re)use of valuable research data, to prepare the scientific domains for data-analytical tasks such as machine learning and big data analytics, and to improve interdisciplinary science by research groups not directly involved in the data collection.
In order to manage, interpret, reuse and publish imaging data more effectively and efficiently, we present a project to develop interoperable metadata recommendations in the form of FAIR digital objects (FDOs) for 5D (i.e. x, y, z, time, spatial reference) imagery of Earth and other planets. An FDO is a human- and machine-readable file format for an entire image set; it does not contain the actual image data, only references to it through persistent identifiers (FAIR marine images). In addition to these core metadata, further descriptive elements are required to describe and quantify the semantic content of imaging research data. Such semantic components are similarly domain-specific, but again synergies are expected between Earth and planetary research.
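The sketch below shows, in a simplified way, what such an FDO-style description of an image set could look like: metadata plus persistent references only, never the image data itself. The field names and values are assumptions and do not reproduce the iFDO specification.

```python
# Illustrative sketch: field names and values are placeholders, not the iFDO specification.
image_set_fdo = {
    "image-set-header": {
        "name": "Dive 042 seafloor survey",
        "uuid": "00000000-0000-0000-0000-000000000000",
        "acquisition": {"platform": "ROV", "altitude_m": 2.5},
    },
    "image-set-items": {
        "img_000123.jpg": {
            "pid": "https://hdl.handle.net/example/img-000123",
            "latitude": -23.451, "longitude": 14.902, "depth_m": 4123.0,
            "datetime": "2022-05-14T08:31:07Z",
        },
    },
}
```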
HELIPORT is a data management solution that aims at making the components and steps of the entire research experiment’s life cycle discoverable, accessible, interoperable and reusable according to the FAIR principles.
Among other information, HELIPORT integrates documentation, scientific workflows, and the final publication of the research results, all via already established solutions for proposal management, electronic lab notebooks, software development and DevOps tools, and other additional data sources. The integration is accomplished by presenting the researchers with a high-level overview to keep all aspects of the experiment in mind, and by automatically exchanging relevant metadata between the steps of the experiment's life cycle.
Computational agents can interact with HELIPORT via a REST API that allows access to all components, and via landing pages that allow digital objects to be exported in various standardized formats and schemas. An overall digital object graph combining the metadata harvested from all sources provides scientists with a visual representation of the interactions and relations between their digital objects, and makes their existence visible in the first place. Through the integrated computational workflow systems, HELIPORT can automate calculations using the collected metadata.
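A hypothetical sketch of how a computational agent could talk to such an API is given below; the base URL, endpoint path and token handling are assumptions, not the documented HELIPORT interface.

```python
# Hypothetical sketch: URL, endpoint and auth scheme are assumptions, not the HELIPORT API.
import requests

BASE = "https://heliport.example.org/api"
headers = {"Authorization": "Token <api-token>"}

# list the digital objects attached to one project and print their identifiers
objects = requests.get(f"{BASE}/projects/42/digital-objects", headers=headers).json()
for obj in objects:
    print(obj.get("identifier"), obj.get("type"))
```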
By visualising all aspects of large-scale research experiments, HELIPORT enables deeper insights into comprehensible data provenance and has the potential to raise awareness for data management.
HERMES is an acronym for “HElmholtz Rich MEtadata Software publication”.
To satisfy the principles of FAIR research software, software sustainability and software citation, research software must be formally published. Publication repositories make this possible and provide published software versions with unique and persistent identifiers. However, software publication is still a tedious, mostly manual process, which impedes promoting software to first-class research citizenship.
To streamline software publication, this project develops automated workflows to publish research software with rich metadata. Our tooling uses continuous integration solutions to retrieve, collate, and process existing metadata in source repositories, check it against existing metadata requirements, and publish it in publication repositories. To accompany the tooling and enable researchers to easily reuse it, the project also provides comprehensive documentation and templates for widely used CI solutions.
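As a simplified illustration of such a harvesting step, the sketch below reads citation metadata from a source repository and maps it onto a minimal deposition payload; the field mapping is an assumption for illustration, not the actual hermes implementation.

```python
# Simplified sketch: the mapping shown here is an assumption, not the hermes tooling.
import yaml

with open("CITATION.cff") as fh:
    cff = yaml.safe_load(fh)

deposition = {
    "title": cff["title"],
    "version": cff.get("version"),
    "creators": [
        {"name": f"{a.get('family-names', '')}, {a.get('given-names', '')}"}
        for a in cff.get("authors", [])
    ],
    "license": cff.get("license"),
}
```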
Many, if not most, publication repositories cannot be labeled “research software ready” today (2022). In addition to the deposition workflows, this project cooperates with the upstream InvenioRDM and Dataverse projects. We are working on the necessary bits to achieve full readiness for these and put a nice badge on ‘em.
This project brings together metadata from three centres (HMGU, UFZ, DLR) from three different domains (Health; Earth & Environment; and Aeronautics, Space & Transport). The environment plays an increasingly important role in human health, and efficient linkage with environmental and Earth observation data is crucial to quantify human exposures. However, there are currently no harmonized metadata standards available for automatic mapping. This project therefore aims to facilitate the linkage of data from the different research fields by generating and enriching interoperable and machine-readable metadata for exemplary data of our three domains and by mapping these metadata so that they can be jointly queried, searched and integrated into HMC. We finalized the conceptualization phase by developing a joint mapping strategy, which identified a joint standard (ISO 19115) for our cross-domain metadata and spatial and temporal coverage as the main mapping criteria. In the ongoing implementation phase, we set up a test instance of the selected platform GeoNetwork, a catalog application which includes metadata editing, search functions, filtering and an interactive web map viewer. We have already uploaded our use case metadata (HMGU: children's cohorts GINI and LISA; UFZ: drought monitor; DLR: land cover) after converting it to ISO 19115 and enriching it. We are currently testing the full functionality of the tool and uploading additional metadata. By the end of the project, we plan to release the platform to HMC and other researchers working in thematically related fields.
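A heavily abbreviated sketch of the mapping target is given below: the kind of fields the use cases are converged on before upload to GeoNetwork. The keys mirror ISO 19115 concepts, but this is not a schema-valid ISO 19115/19139 record, and all values are illustrative placeholders.

```python
# Illustrative sketch: keys mirror ISO 19115 concepts, values are placeholders.
record = {
    "title": "Drought monitor (UFZ use case)",
    "abstract": "Example abstract describing the dataset.",
    "extent": {
        "geographic": {"west": 5.9, "east": 15.0, "south": 47.3, "north": 55.1},
        "temporal": {"begin": "2014-01-01", "end": "2023-12-31"},
    },
    "keywords": ["drought", "soil moisture", "Earth observation"],
}
```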
Modern science is to a vast extent based on simulation research. With the advances in high-performance computing (HPC) technology, the underlying mathematical models and numerical workflows are steadily growing in complexity.
This complexity gain offers huge potential for science and society, but simultaneously constitutes a threat to the reproducibility of scientific results. A main challenge in this field is the acquisition and organization of the metadata describing the details of the numerical workflows, which are necessary to replicate numerical experiments and to explore and compare simulation results. In the recent past, various concepts and tools for metadata handling have been developed in specific scientific domains. It remains unclear to what extent these concepts are transferable to HPC-based simulation research, and how to ensure interoperability in the face of the diversity of simulation-based scientific applications.
This project aims at developing a generic, cross-domain metadata management framework to foster the reproducibility of HPC-based simulation science, and to provide workflows and tools for the efficient organization, exploration and visualization of simulation data.
Within the project, we have so far reviewed existing approaches from different fields. A plethora of tools around metadata handling and workflows has been developed in recent years, and we identified tools and formats, such as odML, that are useful for our work. The metadata management framework will address all components of simulation research and the corresponding metadata types, including model description, model implementation, data exploration, data analysis, and visualization. We have now developed a general concept to track, store and organize metadata. Next, the required tools within this concept will be developed such that they are applicable both in Computational Neuroscience and in Earth and Environmental Science.
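As a conceptual sketch, the tracked simulation metadata could be organised as nested sections and properties, in the spirit of odML; the section and property names below are placeholders, not a finalised schema of this project.

```python
# Conceptual sketch: names and values are placeholders, following odML's section/property idea.
simulation_metadata = {
    "Model": {
        "description": "reaction-diffusion system",
        "implementation": {"code": "mysolver", "version": "1.4.2", "commit": "abc1234"},
    },
    "Execution": {
        "hpc_system": "example-cluster", "nodes": 64, "walltime_h": 12,
        "parameters": {"dt": 1e-3, "grid": [1024, 1024]},
    },
    "Postprocessing": {
        "analysis_scripts": ["spectrum.py"], "visualisation": "ParaView",
    },
}
```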