Fast-Track proposals will be accepted.
Direct-to-Phase II proposals will not be accepted.
Number of anticipated awards: 3-5
Budget (total costs, per award):
Phase I: up to $400,000 for up to 9 months
Phase II: up to $2,000,000 for up to 2 years
PROPOSALS THAT EXCEED THE BUDGET OR PROJECT DURATION LISTED ABOVE MAY NOT BE FUNDED.
Imaging data are a core component in the development of the National Cancer Data Ecosystem and are important in areas ranging from basic research to diagnostics and surveillance. Sharing any data collected from patients, however, requires that information that could connect the data to the individual from whom they were collected be removed, or anonymized to the extent possible. Removal of Protected Health Information (PHI) from imaging data files is a twofold problem: both the file header and the image field itself must be examined for information that could link the file to a specific individual. In headers, this information is often found in fields not intended to contain it. In the image field itself, PHI can appear in different forms: it may be inserted into the image by the imaging system, or it may be present as identifying jewelry visible in the image (in the case of radiological images).
The complexity of the de-identification problem means that a substantial amount of human curation is currently required to ensure proper and complete removal of PHI from images. This need for human participation in the de-identification process is a significant bottleneck; it impedes the generation of image collections suitable for public distribution and sharing, including deposition into components of the National Cancer Data Ecosystem like The Cancer Imaging Archive (TCIA) (https://cancerimagingarchive.net) and the proposed Imaging Data Commons of the Cancer Research Data Commons. For example, on a TCIA data curation team, one person manually reviews files for PHI. Improved tools would shift a large portion of the de-identification burden to software, improving data throughput and increasing data accessibility. Currently, no tools exist that properly remove PHI from proprietary file formats (e.g., digital pathology images) while retaining other data that may be useful to researchers.
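The header half of the problem described above, PHI appearing in fields not intended to hold it, can be illustrated with a minimal Python sketch. It assumes the header has already been parsed into a plain dictionary. The field names, the sample values, and the three patterns are hypothetical and chosen only for illustration; they are not drawn from TCIA or from any imaging standard, and a real tool would need a much broader, validated set of checks.

```python
import re

# Illustrative patterns only (assumptions, not a validated PHI rule set).
PHI_PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{5,}\b", re.IGNORECASE),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def flag_suspect_fields(header):
    """Return {field name: [names of patterns that matched]} for text fields."""
    flagged = {}
    for name, value in header.items():
        if not isinstance(value, str):
            continue  # only free-text values are scanned in this sketch
        hits = [label for label, pattern in PHI_PATTERNS.items()
                if pattern.search(value)]
        if hits:
            flagged[name] = hits
    return flagged


# Hypothetical header: a description field that was never meant to carry PHI.
header = {
    "Modality": "CT",
    "StudyDescription": "Chest w/o contrast, MRN: 1234567, seen 03/14/2019",
    "SliceThickness": 2.5,
}
print(flag_suspect_fields(header))
```

A scan of this kind could only narrow the set of fields that a human curator must review; it does not address PHI inside the image field itself.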
The goal of this contract solicitation is to support the development and sustainment of software tools and pipelines for image de-identification, particularly (but not exclusively) for CT patient data sets and for images produced by whole slide imagers (WSI) for digital pathology applications. These tools will selectively remove PHI while retaining other metadata fields that support interoperability with other image formats and other data types, such as genomic and proteomic data.
The following tasks/objectives should be met by the software tool:
Brute-force methods for de-identification (e.g., erasing all header information) are not acceptable. Retention of data and metadata necessary for downstream applications (population studies, segmentation training) is required. Solutions should not compromise the biomedical use of data files. To build upon previous work for field retention, removal, and alteration, the TCIA de-identification knowledge base (https://wiki.cancerimagingarchive.net/display/Public/De-identification+K...) may serve as a foundation for determining and prioritizing similar attributes in digital pathology images.
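The contrast between brute-force erasure and selective field retention, removal, and alteration can be sketched as a per-field rule table. The following Python example is a sketch under stated assumptions, not a prescribed design: the rule table, the field names, and the helper functions are hypothetical and are not taken from the TCIA knowledge base, and the salted hash is only one possible alteration strategy. Fields with no rule are withheld and listed for human review rather than silently kept or erased.

```python
import hashlib

# Hypothetical rule table: field name -> action. Not the TCIA knowledge base.
RULES = {
    "PatientName": "remove",
    "PatientID": "hash",       # altered, so records can still be linked
    "Modality": "keep",
    "SliceThickness": "keep",  # needed for downstream analysis
    "PixelSpacing": "keep",
}


def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 digest (illustrative choice)."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:16]


def deidentify(header, rules, salt):
    """Apply per-field rules; return (cleaned header, fields needing review)."""
    cleaned, needs_review = {}, []
    for name, value in header.items():
        action = rules.get(name)
        if action == "keep":
            cleaned[name] = value
        elif action == "hash":
            cleaned[name] = pseudonymize(value, salt)
        elif action == "remove":
            continue
        else:
            # No rule for this field: withhold it and flag it for a curator.
            needs_review.append(name)
    return cleaned, needs_review


header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "Modality": "CT",
    "SliceThickness": 2.5,
    "StationName": "CT-ROOM-3",
}
cleaned, needs_review = deidentify(header, RULES, salt="project-secret")
print(cleaned)
print(needs_review)
```

Because the same identifier and salt always yield the same pseudonym, altered identifiers could in principle still be matched across files or against other data types within a project, while the retained acquisition fields remain available for downstream use.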
Receipt date: October 23, 2019, 5:00 p.m. Eastern Daylight Time
Apply for this topic on the Contract Proposal Submission (eCPS) website.
For full PHS2020-1 Contract Solicitation, CLICK HERE.