I've been fascinated by this stuff for a long time. For reference, the main resource I'm going to use in this thread is this study, published in Scientific Reports (a Nature Research journal) in 2017. This study isn't definitive, and I'm sure more accurate work will keep appearing over time, but I like it because it's one of the most comprehensive to date and appears to agree with the results of most of the other recent large-scale studies. Feel free to add other studies and information; I'll be adding to some parts as I post.
main article: Human ancestry correlates with language and reveals that race is not an objective genomic classifier
supplemental data (pdf download): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5431528/bin/41598_2017_1837_MOESM1_ESM.pdf
Genetic and archaeological studies have established a sub-Saharan African origin for anatomically modern humans with subsequent migrations out of Africa. Using the largest multi-locus data set known to date, we investigated genetic differentiation of early modern humans, human admixture and migration events, and relationships among ancestries and language groups. We compiled publicly available genome-wide genotype data on 5,966 individuals from 282 global samples, representing 30 primary language families. The best evidence supports 21 ancestries that delineate genetic structure of present-day human populations. Independent of self-identified ethno-linguistic labels, the vast majority (97.3%) of individuals have mixed ancestry, with evidence of multiple ancestries in 96.8% of samples and on all continents. The data indicate that continents, ethno-linguistic groups, races, ethnicities, and individuals all show substantial ancestral heterogeneity. We estimated correlation coefficients ranging from 0.522 to 0.962 between ancestries and language families or branches.
Basically, they're saying that mathematically it makes more sense to split mankind into 21 different ancestral groups than to split it into 4 races. Even then, the divisions between groups are so vague that they could have split humanity into 12 groups or into 30 groups and the math would still have worked out pretty similarly. And there are likely more groups than that. For example, even though they put together 282 samples from around the world with genotype data on nearly 6,000 individuals (5,966), I don't think they have any Aboriginal Australians or Andaman Islanders in the data set, and they're probably missing other genetically unique groups as well.
But with the 21-group division they ended up using, because it was slightly more mathematically supportable than the other groupings, this is the ancestry tree they got:
Notice that it's not just a straight "tree", but that there were also multiple mixing events in ancient history that helped to form those groups.
However, it must be noted that these are not modern ethnic groups but rather ancient ancestries, ones that formed perhaps 10,000 years ago at the latest. Modern people, no matter where they live, are a mix of multiple different ancient ancestries. Here is what their 5,966 individuals from 282 samples look like, with the different colors representing different ancestries.
Ancestry analysis of the global data set. The 282 samples are labeled alternating in the left and right margins. The 21 ancestral components are Kalash (black), Southern Asian (dark goldenrod), South Indian (slate blue), Central African (magenta), Southern African (dark orchid), West-Central African (brown), Western African (tomato), Eastern African (orange), Omotic (yellow), Northern African (purple), Northern European (blue), Southern European (dark olive green), Western Asian (white), Arabian (light gray), Oceanian (salmon), Japanese (red), Southeastern Asian (coral), Northern Asian (aquamarine), Sino-Tibetan (green), Circumpolar (pink), and Amerindian (gray).
That graphic is small and some of the names of the various ancestries are unclear, so I'm going to make more posts with clarifications.
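In the meantime, to make the "mix of ancestries" idea concrete, here's a tiny Python sketch of how I picture the data behind a chart like that: each person is just a list of ancestry proportions that add up to 1, and a person counts as "mixed" if more than one ancestry shows up. To be clear, this is only my toy illustration. The three people, their numbers, the function names, and the 1% cutoff are all made up by me and are not taken from the study, which may define "mixed" differently.

```python
# Toy illustration only: people, proportions, and the cutoff are invented,
# not taken from the study.
ANCESTRIES = ["Northern European", "Southern European", "Western Asian"]

# Each entry is one hypothetical individual; the values are ancestry
# proportions in the same order as ANCESTRIES and sum to 1.
individuals = {
    "person_a": [0.70, 0.25, 0.05],
    "person_b": [1.00, 0.00, 0.00],
    "person_c": [0.10, 0.45, 0.45],
}


def is_mixed(proportions, threshold=0.01):
    """Return True if more than one ancestry contributes at least `threshold`.

    The 1% threshold is an assumed cutoff chosen for this example.
    """
    return sum(p >= threshold for p in proportions) > 1


def fraction_mixed(people, threshold=0.01):
    """Return (share of mixed individuals, list of their names)."""
    mixed = [name for name, props in people.items() if is_mixed(props, threshold)]
    return len(mixed) / len(people), mixed


share, names = fraction_mixed(individuals)
print(f"{share:.1%} of the toy individuals are mixed: {names}")
```

Run the same kind of count over real proportions for thousands of people and you get a figure like the 97.3% quoted in the abstract, though again, the exact rule the authors used for "mixed" is something I'm assuming here.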