NIH/NCI 341: Development of Metabolomics Data Integration Methods and Software

Fast-Track proposals will be accepted.

Direct to Phase II will not be accepted.

Number of anticipated awards: 2 – 3

Budget (total costs, per award):

  Phase I: up to $225,000 for up to 9 months
  Phase II: up to $1,500,000 for up to 2 years

PROPOSALS THAT EXCEED THE BUDGET OR PROJECT DURATION LISTED ABOVE MAY NOT BE FUNDED.

Summary

Metabolomics is the study of small molecules participating in cellular metabolism. Advances in metabolic profiling technologies have made it possible to simultaneously assay hundreds of metabolites, providing insight into an organism’s metabolic status. Several studies suggest that metabolomics may identify novel biomarkers for a diverse range of diseases, including cancer. Furthermore, metabolites may play important regulatory roles in disease pathways and even serve as effectors of disease processes. Metabolomics has only recently been applied to epidemiologic studies, some of which are attempting to leverage existing metabolomics data by establishing consortia such as the COnsortium of METabolomics Studies (COMETS).

There is considerable field-wide interest in the development of algorithms and methods to integrate metabolite data across laboratory platforms and analytical technologies, as is currently done for genetic variation by genome-wide association studies and next-generation sequencing. Advances in this area will help lay the foundation to support the application of metabolomics to epidemiology cohorts and consortia by facilitating replication across cohorts, enabling pooled metabolomics analyses across multiple cohorts, and rapidly scaling up sample sizes for metabolomics studies. This topic will help researchers leverage existing resources to easily compare and combine datasets to detect more subtle and complex associations among variables, thereby promoting greater collaboration, efficiency, and return on investment. In turn, it will enhance our opportunities to identify novel cancer biomarkers.

There are several analytical technologies used in metabolomics, including different separation methods [e.g., gas chromatography (GC), liquid chromatography (LC), and capillary electrophoresis (CE)] and multiple detection methods [e.g., mass spectrometry (MS) and nuclear magnetic resonance (NMR)]. Although MS and NMR are the most widely used detection methods, other methods such as ion-mobility spectrometry and electrochemical detection have been used. These detection methods differ in specificity and sensitivity, resulting in the measurement of metabolites specific to the technology. Additionally, laboratories may use the same analytical technologies, but different sample preparation, which results in the measurement of metabolites specific to the sample preparation. Therefore, there can be distinctly different metabolites measured across laboratory platforms using the same analytical technology. Both the differing analytical technologies and laboratory platforms create a complex pool of data that is challenging to integrate/harmonize without valid and reliable methods that are accessible to the research community. This, in turn, limits the ability to pool and leverage existing data for biomarker discovery.
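As a rough illustration of the harmonization problem described above, the following Python sketch pools identified metabolite data from two platforms. It is a minimal example under stated assumptions, not a method prescribed by this topic: it assumes each platform reports raw intensities keyed by a shared metabolite name, that within-platform log2 transformation followed by z-scoring is an acceptable way to put values on a common scale, and that only metabolites measured on every platform are pooled. The function names (`standardize`, `combine`) and the platform and metabolite labels are hypothetical.

```python
import math
from statistics import mean, stdev


def standardize(platform_rows):
    """Log2-transform and z-score each metabolite within one platform.

    platform_rows maps metabolite name -> {sample ID: raw intensity}.
    Metabolites with fewer than two positive values, or with zero
    variance, are dropped because a z-score is undefined for them.
    """
    out = {}
    for metabolite, by_sample in platform_rows.items():
        logged = {s: math.log2(v) for s, v in by_sample.items() if v > 0}
        if len(logged) < 2:
            continue
        mu, sd = mean(logged.values()), stdev(logged.values())
        if sd == 0:
            continue
        out[metabolite] = {s: (x - mu) / sd for s, x in logged.items()}
    return out


def combine(platforms):
    """Pool standardized values for metabolites present on every platform.

    platforms maps platform name -> platform_rows (see standardize).
    Returns metabolite -> {(platform name, sample ID): z-score}.
    """
    standardized = {name: standardize(rows) for name, rows in platforms.items()}
    shared = set.intersection(*(set(p) for p in standardized.values()))
    return {
        metabolite: {
            (name, sample): z
            for name, p in standardized.items()
            for sample, z in p[metabolite].items()
        }
        for metabolite in sorted(shared)
    }


# Illustrative (made-up) inputs: one LC-MS lab and one NMR lab.
platforms = {
    "lcms_lab_a": {
        "glucose": {"s1": 100.0, "s2": 400.0},
        "lactate": {"s1": 10.0, "s2": 20.0},
    },
    "nmr_lab_b": {
        "glucose": {"t1": 2.0, "t2": 8.0},
    },
}
combined = combine(platforms)
```

Here `lactate` is excluded from the pooled result because only one platform measured it, which mirrors the point above that different technologies and sample preparations measure different sets of metabolites.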

This topic is intended to develop new and innovative bioinformatic methods to integrate metabolite data across laboratory platforms and analytical technologies and ultimately design scalable software tool(s) that apply these methods to automate the integration of metabolite data.

Project Goals

The purpose of this topic is to support the development of new and innovative methods to integrate metabolite data across analytical technologies and laboratory platforms, and in turn, design software tool(s) applying these methods for data integration.

In the short term, this topic aims to 1) develop bioinformatic methods to integrate metabolite data across various laboratory platforms and analytical technologies, including liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS), and NMR; and 2) develop scalable software tool(s) to automate these methods for use by the cancer and overall public health research communities. Valid and reliable harmonization of metabolomics data also builds a critical foundation for the longer-term goal of integrating metabolomics data with other -omics data (e.g., genomics, proteomics, transcriptomics, and epigenomics). The development of methods to integrate a wide range of -omics data will position the research community to better leverage existing data for the discovery of novel cancer biomarkers of etiology, diagnosis, and prognosis.

Responses to this topic are expected to address the development of efficient bioinformatic methods. Specifically, offerors are expected to:

  1. Demonstrate bioinformatic methods for the integration of metabolite data across different laboratory platforms and analytical technologies with high accuracy;
  2. Store metabolite data from the different data sources in databases that can be easily used for data integration and quality control protocols;
  3. Implement valid quality control (QC) checks; and
  4. Appropriately secure data at each stage of transfer and storage.
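The QC item in the list above could take many forms. The Python sketch below shows one possibility: it flags metabolites by the coefficient of variation (CV) across repeated QC samples and by the fraction of missing values in the study samples. The thresholds (`max_cv=0.2`, `max_missing=0.3`), the function names, and the input layout are assumptions chosen for illustration; this topic does not specify particular QC criteria.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (needs >= 2 values)."""
    m = mean(values)
    return stdev(values) / m if m else float("inf")


def qc_report(qc_replicates, study_values, max_cv=0.2, max_missing=0.3):
    """Summarize per-metabolite QC status.

    qc_replicates: metabolite -> list of intensities from repeated QC samples
    study_values:  metabolite -> list of study intensities (None = missing)
    A metabolite passes when its QC CV and its missing fraction are both
    at or below the (assumed) thresholds.
    """
    report = {}
    for metabolite, reps in qc_replicates.items():
        cv = coefficient_of_variation(reps)
        vals = study_values.get(metabolite, [])
        # Treat a metabolite with no study values as entirely missing.
        missing = sum(v is None for v in vals) / len(vals) if vals else 1.0
        report[metabolite] = {
            "cv": cv,
            "missing": missing,
            "passed": cv <= max_cv and missing <= max_missing,
        }
    return report


# Illustrative (made-up) inputs.
report = qc_report(
    qc_replicates={"glucose": [100, 102, 98], "lactate": [50, 100, 150]},
    study_values={"glucose": [1.0, 2.0, None, 4.0], "lactate": [1.0, 2.0, 3.0, 4.0]},
)
```

In this example `glucose` passes (CV of 0.02, one of four values missing) and `lactate` fails because its QC replicates vary too much (CV of 0.5).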

An essential task for each proposal is the development of bioinformatic tools in the form of scalable software that can be used by the research community at large to automate complex data integration tasks for metabolomics data sources.

Phase I activities should provide evidence that bioinformatic methods for integrating identified metabolite data have been effectively developed and can be implemented across data inputs from diverse laboratory platforms and at least two analytical technologies, and should demonstrate readiness to proceed to Phase II. Additionally, Phase I will be used to demonstrate the framework for scalable software tool(s) that apply the bioinformatic methods to automate the integration of metabolite data.

Phase I Activities and Deliverables

  • Establish a project team including proven expertise in metabolomics analytical technologies, epidemiology, biostatistics/bioinformatics, and computer technology. Additionally, a team including expertise in biochemistry/clinical chemistry is preferred.
  • Develop bioinformatic methods for metabolite data integration for identified metabolites across data inputs from diverse laboratory platforms and at least two analytical technologies (preferably liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS), and/or NMR).
  • Participate in the development of a collaboration agreement between the offeror, NCI, and NCI-identified third-party sources to access relevant input data types for the proposed project. NCI staff will work with established cohort studies and consortia to provide metabolomics data (identified metabolite data) to successful offerors.
  • Develop database formats that support the import and export of individual datasets and “combined” datasets, store structured data from different sources of metabolite data, and are readily used for data integration and QC protocols.
    • Finalize database formats and structure, data collection, transport and importation methods for targeted Phase I data inputs.
  • Provide wireframes and user workflows for the proposed Graphical User Interface (GUI) and software functions that:
    • Support the import and export of individual datasets and “combined” datasets;
    • Implement, script or automate all features and functions of the data integration tool(s);
    • Conduct QC of “combined” datasets.
  • Provide a report including a detailed description and/or technical documentation of the following:
    • Specific approach to metabolite data integration;
    • Specific approach to QC;
    • Data standards for transfer and importation of individual metabolite data and storage of individual and “combined” metabolite data;
    • Data visualization, feedback, and reporting systems for individual and “combined” metabolite data;
    • Technology compatibility matrix for Phase I and Phase II metabolomics data sources by laboratory platform, analytical technology, and identified metabolites (Phase I) / unidentified metabolite peaks (Phase II);
    • Software tool(s);
    • Transparent, documented, and non-proprietary bioinformatic methods;
    • Description of additional software and hardware required for use of the tool;
    • Finalized database formats and structure, data collection, transport, and importation methods for targeted data inputs; and
    • Funds in budget to present Phase I findings and demonstrate the wireframes and user workflows for the GUI and software functions to an NCI evaluation panel.
  • Develop functional prototype software that integrates data from planned Phase I technology compatibility matrix data sources using automated algorithms and methods.
  • Include funds in the Phase I budget to present project deliverables and the prototype software tools to an NCI panel for evaluation.
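One way to picture the database-format deliverable above (import and export of individual and “combined” datasets, with structured storage across sources) is a long-format relational layout. The sketch below uses Python's standard `sqlite3` module. The table names, column names, and helper functions are hypothetical and are offered only as an illustration; the topic does not mandate a particular schema or database engine.

```python
import sqlite3

# Hypothetical long-format schema: one row per (dataset, sample, metabolite).
SCHEMA = """
CREATE TABLE dataset (
    dataset_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    platform    TEXT NOT NULL,      -- laboratory platform label
    technology  TEXT NOT NULL,      -- e.g., LC-MS, GC-MS, NMR
    is_combined INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE measurement (
    dataset_id INTEGER NOT NULL REFERENCES dataset(dataset_id),
    sample_id  TEXT NOT NULL,
    metabolite TEXT NOT NULL,
    value      REAL,                -- NULL marks a missing measurement
    PRIMARY KEY (dataset_id, sample_id, metabolite)
);
"""


def open_store(path=":memory:"):
    """Create a new store with the schema above (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def import_dataset(conn, name, platform, technology, rows, is_combined=False):
    """Insert one dataset; rows is an iterable of (sample_id, metabolite, value)."""
    cur = conn.execute(
        "INSERT INTO dataset (name, platform, technology, is_combined) "
        "VALUES (?, ?, ?, ?)",
        (name, platform, technology, int(is_combined)),
    )
    dataset_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?)",
        [(dataset_id, s, m, v) for s, m, v in rows],
    )
    conn.commit()
    return dataset_id


def export_dataset(conn, name):
    """Return (sample_id, metabolite, value) rows for the named dataset."""
    return conn.execute(
        "SELECT m.sample_id, m.metabolite, m.value "
        "FROM measurement AS m JOIN dataset AS d ON m.dataset_id = d.dataset_id "
        "WHERE d.name = ? ORDER BY m.sample_id, m.metabolite",
        (name,),
    ).fetchall()


# Illustrative (made-up) round trip.
conn = open_store()
import_dataset(conn, "lab_a_lcms", "Lab A", "LC-MS",
               [("s1", "glucose", 5.1), ("s1", "lactate", None)])
exported = export_dataset(conn, "lab_a_lcms")
```

A “combined” dataset could be stored in the same tables with `is_combined` set, so the same import and export paths serve both individual and pooled data.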

Phase II Activities and Deliverables

  • Expand the bioinformatic methods to include unidentified metabolite peaks, in addition to identified metabolite data, and demonstrate metabolite data integration across data inputs from diverse laboratory platforms and at least two analytical technologies (preferably liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS), and/or NMR).
  • Participate in the development of a collaboration agreement between the offeror, NCI, and NCI-identified third-party sources to access relevant input data types for the proposed project. NCI staff will work with established cohort studies and consortia to provide metabolomics data (identified metabolites and unidentified peak data) to successful offerors that would serve to: 1) train and validate the expanded bioinformatic methods; and 2) demonstrate the application of these methods through scalable software to automate complex data integration tasks for metabolomics data sources.
  • Demonstrate usability of scalable software through the following:
    • Beta-test and finalize automated file transfer, database importation protocols, metabolite data integration applications, and reporting tools developed in Phase I;
    • Develop, beta-test, finalize, and demonstrate the GUI; and
    • Demonstrate the software system's ability to integrate data from planned Phase II technology compatibility matrix data sources using automated algorithms and analytic methods.
  • Conduct usability testing of the GUI elements of the metabolite data integration tool(s).
  • Develop systems documentation where applicable to support the software and bioinformatic methods.
  • In the first year of the contract, provide the program and contract officers with a letter(s) of commercial interest.
  • In the second year of the contract, provide the program and contract officers with a letter(s) of commercial commitment.
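For the Phase II expansion to unidentified metabolite peaks listed above, one commonly used starting point for mass spectrometry data is to pair peaks across datasets by mass-to-charge ratio (m/z) and retention time (RT). The Python sketch below is a simplified illustration under strong assumptions: both datasets come from comparable chromatographic methods with retention times already on a shared scale, the tolerances (`ppm_tol=10.0`, `rt_tol=0.2` minutes) are arbitrary, and a greedy one-to-one assignment is acceptable. It does not apply to NMR data, and the function names and peak labels are hypothetical.

```python
def ppm_difference(mz_a, mz_b):
    """Absolute m/z difference in parts per million, relative to mz_a."""
    return abs(mz_a - mz_b) / mz_a * 1e6


def match_peaks(peaks_a, peaks_b, ppm_tol=10.0, rt_tol=0.2):
    """Greedily pair unidentified peaks from two datasets.

    Each peak is (peak_id, mz, rt_minutes). Candidate pairs must fall
    within both tolerances; the closest candidates are assigned first,
    and each peak is used at most once.
    """
    candidates = []
    for id_a, mz_a, rt_a in peaks_a:
        for id_b, mz_b, rt_b in peaks_b:
            ppm = ppm_difference(mz_a, mz_b)
            drt = abs(rt_a - rt_b)
            if ppm <= ppm_tol and drt <= rt_tol:
                # Combined distance, each term scaled by its tolerance.
                candidates.append((ppm / ppm_tol + drt / rt_tol, id_a, id_b))
    candidates.sort()

    used_a, used_b, matches = set(), set(), []
    for _, id_a, id_b in candidates:
        if id_a not in used_a and id_b not in used_b:
            matches.append((id_a, id_b))
            used_a.add(id_a)
            used_b.add(id_b)
    return matches


# Illustrative (made-up) peak lists from two laboratories.
peaks_a = [("a1", 180.0634, 5.10), ("a2", 89.0244, 2.00)]
peaks_b = [("b1", 180.0640, 5.15), ("b2", 300.1000, 5.12)]
matches = match_peaks(peaks_a, peaks_b)
```

In this example only `a1` and `b1` are paired, since they differ by roughly 3 ppm and 0.05 minutes; the remaining peaks have no counterpart within tolerance.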

 

 

Updated: July 24, 2015