Characterizing the admixed African ancestry of African Americans
In this study, we characterize the African origins of African Americans by making use of the high-density genotype data generated for 94 HGDP indigenous Africans from differing geographic and linguistic groups, including 21 Mandenka from West Africa, 21 Yoruba from West Central Africa, 15 Bantu speakers from Southwestern and Eastern Africa, 20 Biaka Pygmy and 12 Mbuti Pygmy from Central Africa, and five San from Southern Africa [18]. These subjects are used to represent the potential African ancestors of 136 African Americans recently genotyped in a GWA study of early-onset coronary artery disease (ADVANCE) [19]. In addition, we include 38 U.S. Caucasian subjects from ADVANCE to represent the European ancestors of the African Americans.
These results were confirmed in the estimation of individual ancestry (IA) by using the program frappe (also in Figure 1). The amount of European ancestry shows considerable variation, with an average (± SD) of 21.9% ± 12.2%, and a range of 0 to 72% (Table 1).
The largest African ancestral contribution comes from the Yoruba, with an average of 47.1% ± 8.7% (range, 18% to 64%), followed by the Bantu at 14.8% ± 5.0% (range, 3% to 28%) and Mandenka at 13.8% ± 4.5% (range, 3% to 29%). The contributions from the other three African groups were quite modest, with an average of 1.7% from the Biaka, 0.5% from the Mbuti, and 0.3% from the San. In the bar plot of frappe estimates, individuals (vertical bars) are arranged in order (left to right) corresponding to their value on the first PC coordinate. Clearly, this order correlates nearly perfectly with a decreasing proportion of European ancestry (Figure S1 in Additional file 1). Thus, the most important source of genetic structure in African Americans is based on the degree of European admixture.
Table 1. Estimates of European ancestry and proportional African ancestries in African Americans by US region of birth
African components of ancestry in African Americans
ds).
As a validation of the accuracy of this partitioning procedure, we performed PCA on the combined set of U.S. Caucasians, Africans, and the African Americans with putative non-African-derived genotypes removed (that is, coded as missing). For comparison, we also examined the results of the same analysis, but including all of the genotype data of the African Americans.
For these analyses, we included only the three African population groups that, based on the first analysis, contributed significantly to the African Americans (the Mandenka, Yoruba, and Bantu). As shown previously, when all genotypes are included, the African Americans lie intermediate between the Africans and European Americans, at varying distances based on their degree of admixture (Figure 2a).
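The masking step described above can be illustrated with a minimal sketch. It assumes a hypothetical array `african_copies` that gives, for each individual and SNP, the number of allele copies (0, 1, or 2) inferred to be of African origin, and it keeps a genotype only when both copies are inferred African; that threshold and the mean-imputation used for the PCA are assumptions made for illustration, not a description of the saber output or of the PCA software used in the study.

```python
import numpy as np

def mask_non_african(genotypes, african_copies):
    """Code genotypes as missing (NaN) unless both allele copies are inferred African.

    genotypes: (n_individuals, n_snps) array of allele counts 0/1/2.
    african_copies: same shape; hypothetical per-locus count (0/1/2) of
        allele copies inferred to be of African origin.
    """
    masked = genotypes.astype(float).copy()
    masked[african_copies < 2] = np.nan
    return masked

def pca_with_missing(genotypes, n_components=2):
    """PCA on a genotype matrix, replacing missing entries by the per-SNP mean."""
    col_mean = np.nanmean(genotypes, axis=0)
    centered = genotypes - col_mean               # missing entries remain NaN
    centered = np.nan_to_num(centered, nan=0.0)   # mean imputation: 0 after centering
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]  # PC coordinates per individual
```

Running `pca_with_missing` once on the unmasked matrix and once on the masked matrix corresponds to the comparison between panels (a) and (b).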
(a) all genotypes, and (b) only the genotypes of African origin in the African Americans. Comparison of (a) and (b) demonstrates the effective elimination of the European ancestry component from African Americans by using saber.
We then characterized the African ancestry in African Americans by performing PCA and estimating IA with frappe by using the U.S. Caucasians, Africans, and African Americans, with non-African genotypes removed. To determine whether we could distinguish the African populations from one another, we first ran frappe including all 94 African individuals (setting K = 6). This unsupervised analysis unambiguously separated the San and Pygmy populations from the West Africans and, to a lesser degree, the three West African populations (Yoruba, Mandenka, and Bantu) from one another. To be confident in the groupings of the West African populations, we performed a series of leave-one-out frappe analyses that include 57 individuals from the three West African populations: in each frappe run, we fixed all individuals within their respective populations except for one, whose ancestry was allowed to be admixed and estimated (see Methods). Results are given in Figure S2 in Additional file 1. The close genetic relationship of these three groups is evidenced by the imperfect ancestry allocation to an individual's own population. However, in every case, frappe assigns the majority ancestry to an individual's own population, and in most cases, the large majority.
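The logic of the leave-one-out check can be sketched as follows. This is a simplified stand-in, not frappe itself: it assumes allele frequencies are computed from the remaining labeled individuals and then held fixed, and it estimates the held-out individual's ancestry proportions with a basic EM update. The function names, the clipping bounds, and the iteration count are illustrative assumptions.

```python
import numpy as np

def estimate_ancestry(genotype, pop_freqs, n_iter=200):
    """EM estimate of ancestry proportions for one individual.

    genotype: (n_snps,) allele counts 0/1/2 for the held-out individual.
    pop_freqs: (n_pops, n_snps) allele frequencies, strictly between 0 and 1,
        held fixed during estimation.
    """
    n_pops, n_snps = pop_freqs.shape
    q = np.full(n_pops, 1.0 / n_pops)
    for _ in range(n_iter):
        # Posterior probability that a counted (alt) or uncounted (ref)
        # allele copy derives from each population.
        w_alt = q[:, None] * pop_freqs
        w_ref = q[:, None] * (1.0 - pop_freqs)
        w_alt /= w_alt.sum(axis=0)
        w_ref /= w_ref.sum(axis=0)
        # Expected share of the 2 * n_snps allele copies from each population.
        q = (w_alt * genotype + w_ref * (2 - genotype)).sum(axis=1) / (2.0 * n_snps)
    return q

def leave_one_out(genotypes, labels, index):
    """Estimate ancestry for individual `index` from frequencies of all others."""
    keep = np.arange(len(labels)) != index
    pops = np.unique(labels)
    freqs = np.array(
        [genotypes[keep & (labels == p)].mean(axis=0) / 2.0 for p in pops]
    )
    freqs = np.clip(freqs, 0.01, 0.99)  # avoid zero likelihoods in small samples
    return pops, estimate_ancestry(genotypes[index], freqs)
```

Looping `leave_one_out` over every reference individual and checking whether the largest component matches the individual's own label mirrors the summary reported for Figure S2.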
The Bantu appear to have closest ancestry to the Yoruba. This is consistent with the Nigerian origins of the Yoruba and the presumed origins of the Bantu from the southwestern modern boundary of Nigeria and Cameroon [24], and the subsequent migration of the Bantu east and south [5,25].
Figure 3 displays the PCA results of the African Americans and the three closely related African populations (Yoruba, Mandenka, and Bantu). Several features are worth comment. First, despite their genetic similarity, PCA shows clear separation among the Yoruba, Mandenka, and Bantu populations, based on the first two PCs. Second, Figure 3 reveals that the African Americans are placed as a single cluster in the convex hull defined by the three African groups.
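One simple way to check the placement described above is to express a point in the plane of the first two PCs as a weighted combination of three reference points. The sketch below uses the triangle spanned by three population centroids as a stand-in for the convex hull of the three groups, which is a simplification; the function names are hypothetical, and the weights are a geometric summary only, not ancestry estimates from the study.

```python
import numpy as np

def barycentric_weights(point, vertices):
    """Weights (summing to 1) expressing a 2-D point as a mix of three vertices."""
    verts = np.asarray(vertices, dtype=float)   # shape (3, 2)
    a = np.vstack([verts.T, np.ones(3)])        # 3x3 system: coordinates plus sum-to-one row
    b = np.append(np.asarray(point, dtype=float), 1.0)
    return np.linalg.solve(a, b)

def inside_triangle(point, vertices, tol=1e-9):
    """True when all barycentric weights are non-negative (point lies in the triangle)."""
    return bool(np.all(barycentric_weights(point, vertices) >= -tol))
```

For example, with centroids at (0, 0), (1, 0), and (0, 1), the point (0.25, 0.25) has weights (0.5, 0.25, 0.25) and lies inside the triangle, whereas (1, 1) has a negative weight and lies outside.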
Figure 3. Principal components analysis of three West and Central West African populations (Mandenka, Yoruba, and Bantu) and African Americans by using only African-origin genotypes in the African Americans.
Figure 4. Individual ancestry estimates in African Americans by using only their African genotypes, from a supervised structure analysis with frappe, including all six African populations and U.S. Caucasians as fixed (K = 7). Color coding of populations is the same as that in Figure 1.
Table 1 provides the averages and standard deviations of IA derived from the frappe analysis described earlier (Figure 4) for the African components of African ancestry for the 128 African Americans.
Overall, we estimate within-Africa contributions of 64%, 19%, and 14% from Yoruba, Mandenka, and Bantu, respectively. The variances for the various African IA components are much smaller than those for the European IA and are roughly similar across groups (SD ranging from 0.038 to 0.049). These observations are consistent with visual inspection of the bar chart in Figure 4, which shows that African Americans generally derive substantial ancestry from all three West and Central West African population groups. We also note from Table 1 that no significant differences exist among African-American subgroups defined by U.S. region of birth, in terms of IA estimates for any African ancestral component, nor are any significant differences in IA found, based on gender (data not shown).
Discussion
Another study of mtDNA haplotypes in African Americans and different African populations found that more than 50% of the African-American mtDNAs exactly matched common haplotypes shared among multiple African ethnic groups, whereas 40% matched no sequences in the African database they referenced [26]. Fewer than 10% of African-American mtDNA haplotypes matched exactly to a single African ethnic group. The haplotypes that did match were more often found in ethnic groups of West African or Central West African than of East or South African origin.
The most extensive examination of mtDNA haplotypes in Africans and African Americans [13] used mtDNA data from a large number of African ethnic groups spread around the continent. These authors observed large similarities in mtDNA profiles among ethnic groups from West, Central West, and South West Africa, with a continuous geographic gradient. As observed previously [26], these authors also found that many mtDNA haplotypes were widely distributed across Africa, making it impossible to trace African ancestry to a particular region or group, based on mtDNA data alone.
These authors also estimated the proportionate ancestry within Africa based on African American mtDNA haplotypes as 60% from West Africa, 9% from Central West Africa, 30% from South West Africa, and minimal ancestry from North, East, Southeast, or South Africa.
These studies all suggest close genetic kinship among various West African, Central West African, and South West African ethnic groups. A prior analysis of genetic structure among the African populations included in the HGDP based on 377 autosomal STR loci was able to define distinct genetic clusters for the Biaka, Mbuti, and San; however, the study lacked the power to differentiate the Mandenka, Yoruba, and Bantu groups [27].
Similarly, another study examining two ethnic groups from Ghana (Akan and Gaa-Adangbe) and two from Nigeria (Yoruba, Igbo), based on 372 autosomal microsatellite markers in 493 individuals, did not differentiate these groups by genetic cluster analysis and found only modest genetic differences between them [28]. In contrast, greater resolution of African ethnic groups, particularly for the Mandenka and Yoruba, was possible in our analysis, based on more than 450,000 SNPs. We note that, in a recent study of malaria, PCA distinguished the HapMap YRI individuals from the Mandenka individuals in the Gambian sample on the basis of 100,715 SNPs; however, admixture analysis with a few selected markers did not reveal clear clusters that correspond to self-reported ancestry [29].
It is of interest to compare our African admixture estimates to descriptions of proportional representation of various African groups to the Middle Passage and slave trade occurring in post-Columbian America. A highly detailed census based on historic records has been documented by several authors [10-12]. Africans were deported from numerous locations along the broad western coast of Africa, ranging from Senegal in the far west all the way down to Angola in the southwest. In addition, a smaller number of slaves were taken from the southeast of Africa.
In terms of numbers, the largest group, approximately 50% to 60%, derived from Central and Southern West Africa and the Bight of Biafra; approximately 10% from Western Africa; 25% to 35% from the West Coast in between (Windward Coast, Gold Coast, and Bight of Benin); and the remaining 5% from Southeast Africa [7]. These estimates show considerable consistency with our results, which also indicated the largest ancestral component of African Americans to be from Central West Africa, followed by West Africa and Southwest Africa. However, because we did not have groups representative of Southeastern and other parts of Southern Africa, we may have underestimated their ancestral representation among African Americans.
It is important to note that considerable migration has occurred among African ethnic groups over the past three millennia or more. For example, the two Bantu groups included in our analysis originated from a more-central African location (Nigeria-Cameroon) several millennia ago, making precise geographic localization of African ancestry difficult [30].
This difficulty is also reflected in the close genetic relationships among the various West, West Central, and South West African groups, who also show considerable overlap in terms of mtDNA haplotypes.
The results of our analysis also strongly point to random mating among African Americans with respect to the African components of their ancestry. This is reflected both by the modest variances we observed in the African IA components, and also by the lack of structure in the PC analysis of African Americans with non-African genotypes removed. This conclusion is consistent with the idea that, for most African Americans, specific African origins are mixed or unknown or both and do not affect social characteristics that influence the choice of mate. It is also consistent with the notion that the African slaves brought to North America were mixed with regard to their geographic and ethnic ancestry and language [32]. By contrast, considerably greater variation in the proportion of European ancestry was found within the African Americans in our study. This high level of variation in European ancestry may reflect recent admixture or nonrandom mating (for example, as seen in Latino populations [33]), or both; these questions require additional study.