Feil Family Brain & Mind Research Institute

Extensive sequencing of seven human genomes to characterize benchmark reference materials.

Title: Extensive sequencing of seven human genomes to characterize benchmark reference materials.
Publication Type: Journal Article
Year of Publication: 2016
Authors: Zook JM, Catoe D, McDaniel J, Vang L, Spies N, Sidow A, Weng Z, Liu Y, Mason CE, Alexander N, Hénaff E, McIntyre ABR, Chandramohan D, Chen F, Jaeger E, Moshrefi A, Pham K, Stedman W, Liang T, Saghbini M, Dzakula Z, Hastie A, Cao H, Deikus G, Schadt E, Sebra R, Bashir A, Truty RM, Chang CC, Gulbahce N, Zhao K, Ghosh S, Hyland F, Fu Y, Chaisson M, Xiao C, Trow J, Sherry ST, Zaranek AW, Ball M, Bobe J, Estep P, Church GM, Marks P, Kyriazopoulou-Panagiotopoulou S, X Y Zheng G, Schnall-Levin M, Ordonez HS, Mudivarti PA, Giorda K, Sheng Y, Rypdal KBjarnesdat, Salit M
Journal: Sci Data
Volume: 3
Pagination: 160025
Date Published: 2016 Jun 07
ISSN: 2052-4463
Keywords: Benchmarking; Exome; Genome, Human; Genomics; Humans; INDEL Mutation
Abstract

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.

DOI: 10.1038/sdata.2016.25
Alternate Journal: Sci Data
PubMed ID: 27271295
PubMed Central ID: PMC4896128
Grant List: R01 NS076465 / NS / NINDS NIH HHS / United States
R25 EB020393 / EB / NIBIB NIH HHS / United States