OpenSILEX and SIDURI: ontology-driven information systems for FAIR data management in life sciences
sciencesconf.org:jdev26:726751
Isabelle Alic 1 , Agnes Barnabe 2, 3, 4 , Arnaud Charleroy 1 , Emilie Fernandez 2, 3, 4, @ , Thomas Lacroix 2 , Erwan Le Floch 2, 3, 4 , Valentin Loux 2, 3 , Jonathan Mineau-Cesari 5 , Yvan Roux 1 , Sophie Schbath 2, 3 , Anne Tireau 1
1 : Mathématiques, Informatique et STatistique pour l'Environnement et l'Agronomie
Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, Institut Agro Montpellier
2 : Université Paris-Saclay, INRAE, MaIAGE, 78350, Jouy-en-Josas, France
INRAE
3 : INRAE, BioinfOmics, MIGALE bioinformatics facility, 78350, Jouy-en-Josas, France
INRAE
4 : Ferments du Futur (US INRAE 1503), 91400, Orsay, France
INRAE
5 : Direction pour la Science Ouverte
Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement
The increasing volume, heterogeneity and complexity of scientific data call for robust, interoperable and FAIR-compliant information systems. In life sciences, and particularly in domains involving multi-scale and multi-source data, the need for integrated platforms supporting both data management and analysis has become critical.
Open-source solutions based on semantic technologies and standardized ontologies offer promising approaches to address these challenges. They enable better data structuring, interoperability and reuse, while fostering collaboration across disciplines and institutions.
In this context, OpenSILEX has been developed as an ontology-driven information system designed to manage, integrate and explore complex scientific datasets. Building upon this framework, domain-specific instances can be deployed to address particular scientific challenges. Siduri is one such instance, developed to support data-driven research in food fermentation within a large collaborative program.
OpenSILEX: combining knowledge graphs and NoSQL databases to build scientific information systems
Yvan ROUX5, Isabelle ALIC5, Arnaud CHARLEROY5, Anne TIREAU5
5 INRAE, MISTEA, 34000, Montpellier, France
Agriculture faces new challenges: improving food security, reducing environmental impacts and adapting to climate change. Agronomic research addresses these challenges through experiments that produce large amounts of diverse data. To analyze experimental results correctly and to share them, the data need to be well described and contextualized, for example with descriptions of genetic resources or of weather events that affected the experiment. These related datasets vary from one experiment to another and evolve over time.
Research communities need to combine all the data generated by experiments for analysis purposes. Information systems must manage this diversity of datasets, remain flexible and adaptable, and connect to other data management systems.
The OpenSILEX software has been developed to address those needs. It is an open-source software framework designed to build information systems dedicated to a specific research community. More specifically, OpenSILEX consists of a set of interoperable software modules for structuring data, identifying research objects (crops, insects, plots...), describing the data acquisition context, and visualizing the data. OpenSILEX encourages users to enrich the context of their data so that the results of experiments or data analysis processes can be reproduced. This allows datasets to be easily retrieved, shared within the research community and reused. OpenSILEX was built with FAIR data management principles in mind and combines two data management technologies. Large experimental datasets (sensor measurements and image datasets) and geospatial information are managed with NoSQL technology (MongoDB). Contextual data are managed with semantic web methods and technologies (RDF-S, OWL, SPARQL).
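The division of labour between the two technologies can be illustrated with a minimal sketch. This is not OpenSILEX code: an in-memory set of triples stands in for the RDF knowledge graph, a list of dictionaries stands in for a MongoDB collection, and the `HybridStore` class, the `ex:` identifiers and the measurement values are all hypothetical. It only shows the general idea that contextual statements and bulk measurements are stored separately and joined through the URI of the research object.

```python
from dataclasses import dataclass, field


@dataclass
class HybridStore:
    """Toy stand-in for a knowledge graph plus a document store (hypothetical)."""

    # (subject, predicate, object) triples: plays the role of the RDF graph.
    triples: set = field(default_factory=set)
    # Measurement documents: plays the role of a MongoDB collection.
    measurements: list = field(default_factory=list)

    def describe(self, subject, predicate, obj):
        """Add one contextual statement about a research object."""
        self.triples.add((subject, predicate, obj))

    def record(self, target, variable, value, timestamp):
        """Store one measurement that points at a research object by URI."""
        self.measurements.append(
            {"target": target, "variable": variable,
             "value": value, "timestamp": timestamp}
        )

    def objects_of_type(self, rdf_type):
        """Semantic side: find the research objects of a given type."""
        return sorted(
            s for s, p, o in self.triples if p == "rdf:type" and o == rdf_type
        )

    def values_for_type(self, rdf_type, variable):
        """Join both sides: values of `variable` measured on objects of `rdf_type`."""
        targets = set(self.objects_of_type(rdf_type))
        return [
            m["value"]
            for m in self.measurements
            if m["target"] in targets and m["variable"] == variable
        ]


# Illustrative data only: identifiers and values are made up.
store = HybridStore()
store.describe("ex:plot-1", "rdf:type", "ex:Plot")
store.describe("ex:plot-2", "rdf:type", "ex:Plot")
store.describe("ex:sensor-1", "rdf:type", "ex:Sensor")
store.record("ex:plot-1", "ex:air_temperature", 21.5, "2024-06-01T12:00:00Z")
store.record("ex:plot-2", "ex:air_temperature", 23.0, "2024-06-01T12:00:00Z")
store.record("ex:sensor-1", "ex:battery_level", 0.8, "2024-06-01T12:00:00Z")

plot_temperatures = store.values_for_type("ex:Plot", "ex:air_temperature")
print(plot_temperatures)
```

In a real deployment the type lookup would typically be a SPARQL query against a triplestore and the measurement filter a query on a MongoDB collection; the sketch keeps both in memory so that it runs without external services.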
The OpenSILEX software suite has been used to develop several information systems for specific scientific communities, such as PHIS (high-throughput plant phenotyping), ENVIBIS (decontamination) and SIDURI (fermented products in food systems). Each information system has been deployed across multiple experimental platforms, research units, or projects.
SIDURI: an integrated data and analysis portal supporting data-driven innovation in food fermentation
Emilie FERNANDEZ 1,2,3, Agnès BARNABE 1,2,3, Erwan LE FLOCH 1,2,3, Thomas LACROIX 1, Jonathan MINEAU-CESARI 4, Sophie SCHBATH 1,2 and Valentin LOUX 1,2
1 Université Paris-Saclay, INRAE, MaIAGE, 78350, Jouy-en-Josas, France 2 INRAE, BioinfOmics, MIGALE bioinformatics facility, 78350, Jouy-en-Josas, France 3 Ferments du Futur (US INRAE 1503), 91400, Orsay, France 4 INRAE, DipSO (Direction pour la Science Ouverte / Directorate for Open Science)
The French Grand Challenge "Ferments du Futur" (FdF) is a public-private partnership that has been running since 2022. It aims to shift the design of fermented foods from an empirical approach to a data- and knowledge-driven one. Projects funded within FdF generate heterogeneous datasets spanning microbial ecology, sensory and biochemical characterization, bioprocess engineering and host–microorganism interactions. Managing and analysing these data across academic and industrial partners requires robust digital infrastructures.
To address this challenge, the ontology-driven information system OpenSILEX [1] was selected to promote good digital practices in line with the FAIR principles [2]. An FdF instance, called Siduri, has been developed in close collaboration with the OpenSILEX core team. Several features implemented first for Siduri are contributed back to the shared open-source codebase. Current developments focus on the management of genetic resources such as microbial strains, which play a central role in fermentation research. Developed and hosted by the Migale Bioinformatics Core Facility, Siduri centralizes FdF project results and integrates relevant public resources. However, organizational solutions are also needed, so a comprehensive data stewardship agenda has been designed to support the FdF projects throughout the data life cycle.
The Siduri portal also provides bioinformatics analyses through a catalogue of tools and workflows built from metadata extracted from the bio.tools registry, supported by ELIXIR Europe, and from the EDAM ontology. This catalogue is shared through a dedicated Galaxy instance that ensures the execution and traceability of analyses. Within the Siduri portal, a ShinyProxy server deploys an interactive application that enables users to transfer datasets and their provenance between Siduri and Galaxy via their APIs.
This talk will illustrate how existing open-source software and resources can be extended and integrated to support collaborative, data-driven research in food fermentation.
1. Neveu P, Tireau A, Hilgert N, Nègre V, Mineau-Cesari J, Brichet N, et al. Dealing with multi-source and multi-scale information in plant phenomics: the ontology-driven Phenotyping Hybrid Information System. New Phytol. 2019;221(1):588–601.
2. Wilkinson M, Dumontier M, Aalbersberg I, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. https://doi.org/10.1038/sdata.2016.18
Type: Presentation
Topics: Web development
Keywords: Data management