Whole-genome sequencing and intensive analysis of the undomesticated soybean (Glycine soja Sieb. and Zucc.) genome

Bibliographic Details
Published in Proceedings of the National Academy of Sciences - PNAS Vol. 107; no. 51; pp. 22032-22037
Main Authors Kim, Moon Young; Lee, Sunghoon; Van, Kyujung; Kim, Tae-Hyung; Jeong, Soon-Chun; Choi, Ik-Young; Kim, Dae-Soo; Lee, Yong-Seok; Park, Daeui; Ma, Jianxin; Kim, Woo-Yeon; Kim, Byoung-Chul; Park, Sungjin; Lee, Kyung-A; Kim, Dong Hyun; Kim, Kil Hyun; Shin, Jin Hee; Jang, Young Eun; Kim, Kyung Do; Liu, Wei Xian; Chaisan, Tanapon; Kang, Yang Jae; Lee, Yeong-Ho; Kim, Kook-Hyung; Moon, Jung-Kyung; Schmutz, Jeremy; Jackson, Scott A.; Bhak, Jong; Lee, Suk-Ha; Phillips, Ronald L.
Format Journal Article
Language English
Published United States National Academy of Sciences 21.12.2010
National Acad Sciences
Series From the Cover
Summary: The genome of soybean (Glycine max), a commercially important crop, has recently been sequenced and is one of six crop species to have been sequenced. Here we report the genome sequence of G. soja, the undomesticated ancestor of G. max (in particular, G. soja var. IT182932). The 48.8-Gb Illumina Genome Analyzer (Illumina-GA) short DNA reads were aligned to the G. max reference genome and a consensus was determined for G. soja. This consensus sequence spanned 915.4 Mb, representing a coverage of 97.65% of the G. max published genome sequence and an average mapping depth of 43-fold. The nucleotide sequence of the G. soja genome, which contains 2.5 Mb of substituted bases and 406 kb of small insertions/deletions relative to G. max, is ∼0.31% different from that of G. max. In addition to the mapped 915.4-Mb consensus sequence, 32.4 Mb of large deletions and 8.3 Mb of novel sequence contigs in the G. soja genome were also detected. Nucleotide variants of G. soja versus G. max confirmed by Roche Genome Sequencer FLX sequencing showed a 99.99% concordance in single-nucleotide polymorphism and a 98.82% agreement in insertion/deletion calls on Illumina-GA reads. Data presented in this study suggest that the G. soja/G. max complex may be at least 0.27 million y old, appearing before the relatively recent event of domestication (6,000 to 9,000 y ago). This suggests that soybean domestication is complicated and that more in-depth study of population genetics is needed. In any case, genome comparison of domesticated and undomesticated forms of soybean can facilitate its improvement.
Bibliography: USDOE Office of Science (SC), Biological and Environmental Research (BER)
Author contributions: M.Y.K., K.V., S.-C.J., J.M., J.B., and S.-H.L. designed research; I.-Y.C., D.-S.K., D.H.K., K.H.K., J.H.S., Y.E.J., K.D.K., W.X.L., T.C., and Y.-H.L. performed research; S.L., T.-H.K., D.-S.K., Y.-S.L., D.P., W.-Y.K., B.-C.K., S.P., K.-A.L., and Y.J.K. analyzed data; and M.Y.K., S.L., K.V., T.-H.K., S.-C.J., K.-H.K., J.-K.M., J.S., S.A.J., J.B., and S.-H.L. wrote the paper.
1 M.Y.K., S.L., K.V., and T.-H.K. contributed equally to this work.
Edited* by Ronald L. Phillips, University of Minnesota, St. Paul, MN, and approved October 29, 2010 (received for review July 12, 2010)
2 Present address: Personal Genomics Institute, Suwon, Gyeonggi 433-759, Korea.
ISSN: 0027-8424
1091-6490
DOI: 10.1073/pnas.1009526107