Data Features


Databases

NCBI BioSample (INSDC), GSA BioSample, and viral GenBank.

Geographic Metadata

More usable geographic coordinates, extracted from sample metadata.

Host Metadata

Host metagenomes were harmonized from the metadata and associated with organism-based variables.

Environmental Metadata

ENVO terms, IUCN Ecosystem, and additional coordinate-based variables.

Data Description

Our dataset consists of multiple tables:

  • Virus-Host-Environment-Geography table
  • Coordinate-based table with variables related to pedosphere, atmosphere, hydrosphere, anthroposphere...
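
As a rough illustration of how the two tables could be combined, the sketch below attaches coordinate-based variables to virus records that share the same rounded latitude and longitude. Every column name (virus_id, host, lat, lon, soil_ph, mean_temp_c), every value, and the helper functions are invented placeholders for this example; they are not taken from the actual dataset, and the real join key may differ.

```python
# Toy rows with hypothetical column names and made-up values (not real data).
virus_rows = [
    {"virus_id": "V1", "host": "Homo sapiens", "lat": 45.76, "lon": 4.84},
    {"virus_id": "V2", "host": "Sus scrofa", "lat": 48.85, "lon": 2.35},
    {"virus_id": "V3", "host": "Gallus gallus", "lat": None, "lon": None},
]
coord_rows = [
    {"lat": 45.76, "lon": 4.84, "soil_ph": 6.8, "mean_temp_c": 12.5},
    {"lat": 48.85, "lon": 2.35, "soil_ph": 7.1, "mean_temp_c": 11.7},
]


def coord_key(row, ndigits=2):
    """Return a (lat, lon) key rounded to ndigits, or None if a coordinate is missing."""
    if row.get("lat") is None or row.get("lon") is None:
        return None
    return (round(row["lat"], ndigits), round(row["lon"], ndigits))


def left_join_on_coordinates(records, coord_table):
    """Left join: keep every record, adding coordinate variables when a match exists."""
    index = {
        coord_key(r): r for r in coord_table if coord_key(r) is not None
    }
    joined = []
    for rec in records:
        extra = index.get(coord_key(rec), {})
        merged = dict(rec)
        # Copy only the variables, not the duplicate lat/lon columns.
        merged.update({k: v for k, v in extra.items() if k not in ("lat", "lon")})
        joined.append(merged)
    return joined


joined = left_join_on_coordinates(virus_rows, coord_rows)
for row in joined:
    print(row)
```

Records without coordinates (V3 here) are kept but receive no coordinate-based variables, which is the usual behaviour of a left join.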

Data Lake Architecture

Our data lake architecture allows for flexible, evolving tables, without the strict rules of structured SQL-like databases. It relies on PRABI-owned infrastructure.
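
To make the idea of flexible, evolving tables more concrete, here is a minimal schema-on-read sketch using JSON Lines. It is only an assumption about how such flexibility can work in general, not a description of the PRABI infrastructure; the field names, the placeholder value, and the helpers write_jsonl and read_table are hypothetical.

```python
import json


def write_jsonl(records):
    """Serialize records as JSON Lines; each record may carry different fields."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def read_table(jsonl_text):
    """Build a table at read time: columns are the union of all fields seen."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    columns = sorted({key for row in rows for key in row})
    # Fields absent from an older record are filled with None.
    table = [{col: row.get(col) for col in columns} for row in rows]
    return columns, table


# An early batch, then a later batch that introduces a new variable.
# No migration of the earlier records is needed.
batch_v1 = [{"virus_id": "V1", "host": "Homo sapiens"}]
batch_v2 = [{"virus_id": "V2", "host": "Sus scrofa", "envo_term": "ENVO:placeholder"}]

stored = write_jsonl(batch_v1 + batch_v2)
columns, table = read_table(stored)
print(columns)
for row in table:
    print(row)
```

The design point is that new variables can be added to later records without altering earlier ones, whereas a relational table would typically need a schema change first.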

Contact Us

Questions or comments? Write to us!