Identification of the sources of error in allele frequency estimations from pooled DNA indicates an optimal experimental design

Bibliographic Details
Published in Annals of human genetics Vol. 66; no. 5‐6; pp. 393 - 405
Main Authors BARRATT, B. J., PAYNE, F., RANCE, H. E., NUTLAND, S., TODD, J. A., CLAYTON, D. G.
Format Journal Article
Language English
Published Cambridge, UK Blackwell Science Ltd 01.11.2002
ISSN 0003-4800
EISSN 1469-1809
DOI 10.1046/j.1469-1809.2002.00125.x

Abstract Genotyping costs still preclude analysis of a comprehensive SNP map in thousands of individual subjects in the search for disease susceptibility loci. Allele frequency estimation in DNA pools from cases and controls offers a partial solution, but variance in these estimates will result in some loss of statistical power. However, there has been no systematic attempt to quantify the several sources of error in previous studies. We report an analysis of the magnitude of variance components of each experimental stage in DNA pooling studies, and find that a design based on the formation of numerous small pools of approximately 50 individuals is superior to the formation of fewer, larger pools and the replication of any of the experimental stages. We conclude that this approach may retain an effective sample size greater than 68% of the true sample size, whilst offering a 60‐fold reduction in DNA usage and a greater than 30‐fold saving in cost, compared to individual genotyping. The possibility of combining pooling with informed selection of haplotype tag SNPs is also considered. In this way further savings in efficiency may be possible by using pooled allele frequency estimates to infer haplotype frequencies and hence, allele frequencies at untyped markers.
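One way to read the design claim in the abstract is through a simple additive variance model. The sketch below is an illustration only, not the authors' analysis: the function name `effective_sample_fraction`, the numerical variance components, and the assumption that pool-construction and measurement errors are independent and average out over pools and replicates are all hypothetical.

```python
def effective_sample_fraction(n_individuals, pool_size, allele_freq,
                              pool_var, measurement_var, replicates=1):
    """Fraction of the true sample size retained by a pooled design.

    Illustrative model (an assumption, not taken from the paper):
    total variance = binomial sampling variance
                   + pool-construction variance averaged over pools
                   + measurement variance averaged over pools and replicates.
    """
    n_pools = n_individuals / pool_size
    # Sampling variance of an allele frequency from 2N chromosomes.
    sampling_var = allele_freq * (1 - allele_freq) / (2 * n_individuals)
    # Each pool contributes one independent construction error.
    pooling_var = pool_var / n_pools
    # Each pool is measured `replicates` times.
    assay_var = measurement_var / (n_pools * replicates)
    # Effective N solves p(1-p)/(2*N_eff) = total variance.
    return sampling_var / (sampling_var + pooling_var + assay_var)


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    settings = dict(n_individuals=1000, allele_freq=0.5,
                    pool_var=1e-4, measurement_var=1e-4)
    for pool_size, replicates in [(50, 1), (500, 1), (500, 4)]:
        frac = effective_sample_fraction(pool_size=pool_size,
                                         replicates=replicates, **settings)
        print(f"pool size {pool_size}, {replicates} replicate(s): {frac:.2f}")
```

Under these made-up values, splitting the same individuals into many small pools retains more of the sample size than using a few large pools, even when the large pools are measured repeatedly. The figures are not meant to reproduce the 68% reported in the abstract.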
Peer Reviewed true
License http://onlinelibrary.wiley.com/termsAndConditions#vor
Page Count 13
URI https://onlinelibrary.wiley.com/doi/abs/10.1046%2Fj.1469-1809.2002.00125.x