Inferring Human Phylogenies Using Three CODIS STR Markers (CSF1PO, TPOX and TH01)

Nuzhat A. Akram, Shakeel R. Farooqi

Abstract


Over the past several decades, polymorphic genetic loci have been discussed for their utility in human phylogenetic inference, and Short Tandem Repeat (STR) loci have shown promising results for this purpose. Unfortunately, allele frequency data for polymorphic loci are largely confined to a few populations, so the number of shared loci declines as the number of populations increases. We hypothesize that even a small number of STR loci can be used efficiently for phylogenetic purposes if an appropriate theoretical and statistical strategy is employed. This strategy provides a feasible and cost-effective method for choosing appropriate STR loci for phylogenetic studies. To test it, an empirical study was conducted using allele frequency data for three STR loci, CSF1PO, TPOX, and TH01, across 98 human populations taken from the literature (references are available at http://dnaa.bravehost.com/index.html and http://www.cstl.nist.gov/strbase/population/Omnipop). The markers were chosen for their polymorphism, high heterozygosity, low mutation rate, few artifacts, and independence between the loci. Three methods were used to measure genetic distances between the populations: Cavalli-Sforza's chord distance (DC), Nei's genetic distance (DA), and Nei's standard genetic distance (DST). For each distance method, the coefficient of variation (CV) was calculated across 100 datasets obtained by re-sampling the original dataset. The CV was in the order DST > DA > DC. Because DC showed the lowest CV, consensus trees based on DC were constructed using the Neighbour Joining (NJ), Unweighted Pair Group Method with Arithmetic Mean (UPGMA), and Maximum Likelihood (ML) methods. The NJ and UPGMA trees received stronger statistical support, that is, higher bootstrap values, than the ML tree (NJ > UPGMA > ML). A validation study was performed using (A) Principal Component Analysis, (B) comparison with trees reported for other molecular markers, and (C) STR genotyping of five Pakistani subpopulations.
The results strongly supported our hypothesis that the three STR markers CSF1PO, TPOX, and TH01 are successful in delineating ethnic, geographic, and linguistic differentiation between the populations.
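
The three distance measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly cited textbook forms of DC, DA, and DST computed from per-locus allele frequencies; the abstract does not state which software or exact formulation the authors used. The function names and the two example populations (`pop_a`, `pop_b`) are hypothetical, and the frequencies are made up for illustration only.

```python
import math

# Each population is a list of loci; each locus is a list of allele
# frequencies that sum to 1. Both populations must list the same alleles
# in the same order at every locus.

def chord_distance(p, q):
    """Cavalli-Sforza and Edwards chord distance (DC), averaged over loci."""
    total = 0.0
    for x, y in zip(p, q):
        s = sum(math.sqrt(a * b) for a, b in zip(x, y))
        # max() guards against tiny negative values from rounding.
        total += math.sqrt(max(0.0, 2.0 * (1.0 - s)))
    return 2.0 * total / (math.pi * len(p))

def nei_da(p, q):
    """Nei's DA distance: one minus the mean per-locus sum of sqrt(x*y)."""
    s = sum(
        sum(math.sqrt(a * b) for a, b in zip(x, y))
        for x, y in zip(p, q)
    )
    return 1.0 - s / len(p)

def nei_standard(p, q):
    """Nei's standard genetic distance (DST): -ln(Jxy / sqrt(Jx * Jy))."""
    r = len(p)
    jx = sum(sum(a * a for a in x) for x in p) / r
    jy = sum(sum(b * b for b in y) for y in q) / r
    jxy = sum(sum(a * b for a, b in zip(x, y)) for x, y in zip(p, q)) / r
    return -math.log(jxy / math.sqrt(jx * jy))

# Hypothetical allele frequencies for two populations at two loci.
pop_a = [[0.50, 0.25, 0.25], [0.50, 0.50]]
pop_b = [[0.25, 0.25, 0.50], [0.75, 0.25]]

for name, fn in [("DC", chord_distance), ("DA", nei_da), ("DST", nei_standard)]:
    print(name, round(fn(pop_a, pop_b), 4))
```

In a study like the one described, a matrix of such pairwise distances over all populations would feed the tree-building step, and the computation would be repeated on each re-sampled dataset to obtain the coefficient of variation per method.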


DOI: http://dx.doi.org/10.5539/ijb.v7n1p1

International Journal of Biology, ISSN 1916-9671 (Print), ISSN 1916-968X (Online)

Copyright © Canadian Center of Science and Education
