The Testimony of King Béla III's Bones: The Origin of the Árpád Dynasty
Machine learning applications play a steadily growing role in industry, in health care and bioinformatics, and in scientific research. A little more than ten years ago, American researchers published the first machine learning algorithms able to reliably predict Y-SNP-based haplogroups from Y-STR data.
The goal of the present study was to use machine learning algorithms to predict the SNP-based subgroup of three ancient DNA samples (King Béla III and two Khazar samples) belonging to Y-DNA haplogroup R1a, in order to estimate their geographic origin and mutual genetic relatedness more accurately. This is the first study to apply machine learning algorithms to research on Hungarian prehistory.
Based on the Y-STR haplotype of King Béla III, the machine learning algorithm estimated in a first step that he belonged to the R1a-Z93 subgroup, which is most common among Indo-Iranian and Turkic-speaking peoples. A second step predicted that he belonged to the Z2123 subgroup of R1a-Z93. Phylogenetic analysis then showed that King Béla III most likely belonged to the relatively rare YP451+ YP449- subgroup of Z2123, which appears almost exclusively in the North Caucasus, especially among Karachays and Balkars.
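The two-step prediction described above can be illustrated with a minimal sketch. The text does not name the algorithm that was used, so a plain k-nearest-neighbour vote over repeat-count differences stands in for it here; that choice is an assumption. The four-marker haplotypes are made-up toy values, the labels "other-R1a" and "other-Z93" are placeholders, and the function names are hypothetical.

```python
from collections import Counter

# Toy reference panel: (repeat counts at four hypothetical Y-STR markers,
# step-1 label, step-2 label). All numbers are invented for illustration
# and are NOT taken from the study.
REFERENCE = [
    ((13, 25, 16, 11), "R1a-Z93", "Z2123"),
    ((13, 25, 16, 10), "R1a-Z93", "Z2123"),
    ((13, 25, 17, 11), "R1a-Z93", "Z2123"),
    ((13, 23, 15, 11), "R1a-Z93", "other-Z93"),
    ((13, 23, 15, 10), "R1a-Z93", "other-Z93"),
    ((13, 23, 14, 11), "R1a-Z93", "other-Z93"),
    ((16, 21, 19, 14), "other-R1a", "other-R1a"),
    ((16, 21, 19, 13), "other-R1a", "other-R1a"),
    ((16, 22, 19, 14), "other-R1a", "other-R1a"),
]


def distance(a, b):
    """Sum of absolute repeat-count differences between two haplotypes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(sample, labelled, k=3):
    """Majority label among the k reference haplotypes closest to sample."""
    nearest = sorted(labelled, key=lambda item: distance(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def predict_two_step(sample, reference=REFERENCE, k=3):
    """Step 1 picks the broad subgroup; step 2 refines within that subgroup."""
    step1 = knn_predict(sample, [(h, l1) for h, l1, _ in reference], k)
    # Only reference haplotypes from the step-1 subgroup vote in step 2.
    subset = [(h, l2) for h, l1, l2 in reference if l1 == step1]
    step2 = knn_predict(sample, subset, k)
    return step1, step2


if __name__ == "__main__":
    print(predict_two_step((13, 25, 17, 10)))
```

Restricting the second vote to haplotypes from the subgroup chosen in the first step mirrors the hierarchical order of the text: a broad assignment first, then a refinement inside it. A real application would use many more markers and a reference panel of SNP-tested contemporary haplotypes.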
Based on these results, we hypothesize that the Árpád dynasty shares a common origin with one ethnic component of the Karachay people.
Our study demonstrates that the accuracy of Y-DNA haplogroup prediction for historical aDNA samples can be increased with mathematical methods that use contemporary Y-STR haplotypes. With this method, larger historical aDNA studies could save considerable research funds and DNA material by carrying out tailored deep SNP testing of samples instead of using general SNaPshot assays.
