IRIDA provides a fully featured system for the storage, management, sharing, and analysis of sequencing data and its associated metadata. Sequence data can be imported directly from Illumina MiSeq sequencers into IRIDA’s data storage and management system. The sequence data is organized into projects, and access to the data can be shared with project collaborators. Data can also be shared with other IRIDA instances across the internet. Data can be analyzed directly in IRIDA, or exported to a file or to the Galaxy workflow system. Read on for more detail on IRIDA’s sequence data management.
Sequencing Data Management
The core of the IRIDA platform is a file storage and management system for next-generation sequencing data. IRIDA gives public health institutions an accessible and sophisticated platform to collect, share, manage, and analyze sequencing data and metadata while shielding users from the difficulties of manual data storage and processing workflows. Data stored within the system is grouped into “Projects”. Each project contains a collection of sequencing data and metadata which can be independently shared, managed, and analyzed. This enables curators of surveillance and research data to isolate and manage their projects independently without being stuck in a “one size fits all” approach.
Within each project, IRIDA stores a collection of “Samples”. Each sample contains sequence files and the associated contextual information (or metadata). This project-sample-sequence file structure is inspired by the International Nucleotide Sequence Database Collaboration (INSDC) so that it remains compatible with other tools in the genomic epidemiology world.
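To make the project-sample-sequence file hierarchy concrete, here is a minimal sketch in Python. It is an illustration only and not IRIDA’s actual data model: the class names, field names, and example values are all assumptions chosen for readability.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SequenceFile:
    """A single sequence file attached to a sample (hypothetical model)."""
    filename: str


@dataclass
class Sample:
    """A sample holds sequence files plus its contextual metadata."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)
    sequence_files: list[SequenceFile] = field(default_factory=list)


@dataclass
class Project:
    """A project groups samples so they can be shared and managed together."""
    name: str
    samples: list[Sample] = field(default_factory=list)

    def sequence_file_count(self) -> int:
        # Count every sequence file across all samples in the project.
        return sum(len(sample.sequence_files) for sample in self.samples)


# Example values below are invented purely for illustration.
sample = Sample(
    name="sample-01",
    metadata={"collection_date": "2015-06-01"},
    sequence_files=[
        SequenceFile("sample-01_R1.fastq"),
        SequenceFile("sample-01_R2.fastq"),
    ],
)
project = Project(name="Example Surveillance Project", samples=[sample])
```

In this sketch the metadata lives on the sample next to its sequence files, mirroring the idea that contextual information travels with the data it describes.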
IRIDA provides a number of methods to get your data into the system. The primary method for uploading sequencing data is via our “MiSeq Uploader” tool. This uploader tool runs directly on a MiSeq sequencer and can automatically upload a completed sequencing run into the IRIDA system. The uploader tool and the IRIDA system will ensure data is categorized into the correct projects and shared with the appropriate users so they can immediately view and analyze new samples.
For smaller data sets, IRIDA allows users to directly upload data into the system via the web interface.
Contextual metadata in IRIDA is treated with the same importance as sequencing data. Storing the sequencing data alongside the contextual metadata allows IRIDA to provide more comprehensive analysis possibilities. Users can benefit from the latest whole genome sequencing analysis tools while also tying in the situational data that is required to get a complete picture.
IRIDA provides tools to view and manage contextual information in bulk across collections of samples.
In addition to IRIDA’s data management abilities, the platform provides numerous data security control features. Projects in IRIDA are individually administered to ensure data access is only provided to appropriate users. Different access levels ensure that data can only be modified by users with higher-level access, while lower-level users can still view data or run basic analysis tools.
This data security extends to anywhere data from IRIDA is used, be it in IRIDA’s web user interface, external tools communicating with IRIDA’s REST API, IRIDA’s analysis engine, or through remote synchronized projects.
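The per-project access levels described above can be sketched as a simple permission check. This is a hedged illustration under stated assumptions, not IRIDA’s implementation: the level names VIEWER and MANAGER, the ProjectAccess class, and the user names are all hypothetical.

```python
from __future__ import annotations

from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    VIEWER = 1   # may view data and run basic analyses
    MANAGER = 2  # may additionally modify project data


class ProjectAccess:
    """Tracks which users may access a single project (illustrative only)."""

    def __init__(self) -> None:
        self._members: dict[str, AccessLevel] = {}

    def grant(self, user: str, level: AccessLevel) -> None:
        # Add a user to the project, or change their existing level.
        self._members[user] = level

    def can_view(self, user: str) -> bool:
        # Any project member may view; non-members have no access at all.
        return user in self._members

    def can_modify(self, user: str) -> bool:
        # Only members at MANAGER level or above may modify data.
        level = self._members.get(user)
        return level is not None and level >= AccessLevel.MANAGER


# Invented users, purely for illustration.
access = ProjectAccess()
access.grant("alice", AccessLevel.MANAGER)
access.grant("bob", AccessLevel.VIEWER)
```

Because each project carries its own membership table in this sketch, granting a user access to one project says nothing about any other project, which matches the idea of individually administered projects.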