In May 2025, the paper ''UYSD: a novel data repository accessible via public website for worldwide population frequencies of Y-SNP haplogroups'' by Arwin Ralf was published in Nature's European Journal of Human Genetics. It introduces a new Y-DNA haplogroup database, the Universal Y-SNP Database (UYSD), with a public website (ysnp.erasmusmc.nl). The database is similar to the databases and trees of FTDNA and YFull and to ISOGG's tree, but it is based only on YFull's YTree.
Although the database can be searched publicly, the platform is made exclusively for the academic community, whose members can submit sample data. It states that "data that have already been published in a peer-reviewed scientific journal are eligible for submission", but in the initial dataset ''some of the data were generated using samples from older DNA-collections and were therefore collected without ethical review or written informed consent''. The data were also generated with different technologies and make up only a minimal tree, so the Y-DNA haplogroup classification depends on the level of resolution, which is mostly low because of the older technology used. This raises doubts about the actual reliability of the data and about how this ''helps minimize errors, ensure accountability, and maintain high data quality'' without causing sampling bias and undermining the viability of the tree itself.
Dubravka Havaš Auguštin and Jelena Šarac participated from the Croatian Institute for Anthropological Research, Zagreb. The submitted Croatian samples were collected in 2000-2017, and no information is given on original publication, informed consent, ethical approval, or other comments.
The dataset contains 185 samples from Croatia, distributed among the Croatian counties (županije) of Grad Zagreb (5), Zagrebačka (27), Osječko-baranjska (9), Primorsko-goranska (23), Zadarska (47), Splitsko-dalmatinska (28), and Dubrovačko-neretvanska (46). The overall haplogroup frequencies are 31.89% R1a, 24.86% I2 (almost all I-L621), 14.59% E-V13, 9.18% R1b (roughly half R-Z2103>Z2105), 5.94% I1 (mostly I-Z58 and I-Z63, with a small I-L22>P109 minority), 4.32% G2a (mostly G-L497), 3.78% each for J2a and J2b (the latter only subclade J-L283), 1.08% Q2a1-M378 and 0.54% T1a-Z709.
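The county counts can be checked against the stated total with a short Python sketch. Grouping the three southern counties as Dalmatia is an assumption made here, chosen because their sum matches the denominator of 121 that the text uses for Dalmatia.

```python
# Per-county sample counts exactly as listed in the text.
county_samples = {
    "Grad Zagreb": 5,
    "Zagrebačka": 27,
    "Osječko-baranjska": 9,
    "Primorsko-goranska": 23,
    "Zadarska": 47,
    "Splitsko-dalmatinska": 28,
    "Dubrovačko-neretvanska": 46,
}

# Assumption (not stated in the source): "Dalmatia" means these three counties.
DALMATIAN_COUNTIES = ("Zadarska", "Splitsko-dalmatinska", "Dubrovačko-neretvanska")

total = sum(county_samples.values())
dalmatia_total = sum(county_samples[c] for c in DALMATIAN_COUNTIES)

# Print each county's count and its share of all samples.
for county, n in county_samples.items():
    print(f"{county:<24}{n:>4}{100 * n / total:>7.1f}%")
print(f"Total: {total}, Dalmatia: {dalmatia_total} "
      f"({100 * dalmatia_total / total:.1f}% of all samples)")
```

Under that grouping, the share of samples coming from Dalmatia can be read directly from the last line of output, which is useful context when judging how regionally balanced the national figure is.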
For now, the results of the Croatian dataset are somewhat controversial, because they contradict all previously sampled populations, nationally and regionally, in both scientific and citizen science research. The biggest differences concern Croatia as a whole and Dalmatia. In scientific research, for example, haplogroup I2a-L621 makes up less than 25% of the dataset (46/185), compared to a national average of 36-39% in Croatia (Mršić et al. 2012, Purps et al. 2014, Primorac et al. 2022), and less than 27% (32/121) compared to a regional average of 50-56% in Dalmatia: approximately 56% in so-called South Croatia (Ljubković et al. 2008, n=166), 54% in South Croatia (Mršić et al. 2012, n=220), 53% in Dubrovnik (Šarac et al. 2016, n=179), 50% in Split (Primorac et al. 2022, n=105), 60% in the Zadar hinterland (Šarac et al. 2016, n=25), 61% on Hvar (Šarac et al. 2022), and so on. Individual islands differ somewhat, but the cumulative figure for the Dalmatian islands, aggregated on a demographic basis, is at least 43%. Haplogroup R1a has a significantly higher frequency than usual: almost 32% (59/185) compared to a national average of 22-24% in Croatia, and 26-38%, about 29% overall (35/121), compared to a regional average of 13-21% in Dalmatia.
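The size of these gaps can be illustrated with a minimal Python sketch that recomputes the dataset percentages from the counts quoted above and attaches a 95% Wilson score interval to each. The Wilson interval is an addition for illustration only; neither the UYSD paper nor the studies cited here are claimed to use it, and it treats each dataset as a simple random sample, which is exactly what the sampling-bias concern calls into question.

```python
from math import sqrt


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 is about 95%)."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# (label, count, sample size, published range in %), all taken from the text.
comparisons = [
    ("I2a-L621, Croatia", 46, 185, (36, 39)),
    ("I2a-L621, Dalmatia", 32, 121, (50, 56)),
    ("R1a, Croatia", 59, 185, (22, 24)),
    ("R1a, Dalmatia", 35, 121, (13, 21)),
]

results = {}
for label, k, n, (lo, hi) in comparisons:
    ci_lo, ci_hi = wilson_interval(k, n)
    pct = 100 * k / n
    # Store point estimate and interval bounds, all in percent.
    results[label] = (pct, 100 * ci_lo, 100 * ci_hi)
    print(f"{label:<20} {pct:5.1f}%  "
          f"95% CI {100 * ci_lo:.1f}-{100 * ci_hi:.1f}%  "
          f"published {lo}-{hi}%")
```

Where a published range lies entirely outside the interval, sample size alone is an unlikely explanation for the difference, which is consistent with the sampling-bias reading rather than proof of it.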
For now, the designated haplogroup subclades are usually defined by old Y-SNPs, e.g. I2-L621 (TMRCA 6400 YBP), R1a-M458 (5000 YBP), R1a-CTS1211/M558 (4600 YBP), R1a-CTS3402 (4600 YBP), E-V13 etc., which are not informative enough for today's phylogenetic knowledge and understanding of Y-DNA haplogroup frequency and diversity. The results do, however, provide SNP confirmation that haplogroup R1a in Croatia is predominantly of the R1a-CTS1211/M558 clade (48/59), and that R1a-M458 probably has an R1a-L260 minority (1/10). Still, the frequency of the clade (26%) is too high, higher even than in the datasets of Slovakia and Poland, and it should not be misunderstood or used to make hasty claims, as 26% is in reality roughly the frequency of the whole of haplogroup R1a in Croatia.
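The two different denominators behind these R1a figures are easy to confuse, so here is a small Python sketch that separates them. It assumes that "(1/10)" means one R1a-L260 sample among ten R1a-M458 samples; that reading is an interpretation of the text, not something the dataset description states outright.

```python
from fractions import Fraction

# Counts quoted in the text.
croatia_total = 185   # all Croatian samples
r1a_total = 59        # all R1a samples
cts1211 = 48          # R1a-CTS1211/M558 samples
m458, l260 = 10, 1    # assumed reading of "(1/10)": 1 L260 among 10 M458

share_within_r1a = Fraction(cts1211, r1a_total)      # clade as part of R1a
share_in_dataset = Fraction(cts1211, croatia_total)  # clade as part of all samples
l260_within_m458 = Fraction(l260, m458)

print(f"CTS1211 within R1a:     {100 * float(share_within_r1a):.1f}%")
print(f"CTS1211 within dataset: {100 * float(share_in_dataset):.1f}%")
print(f"L260 within M458:       {100 * float(l260_within_m458):.1f}%")
```

The second line is the 26% figure discussed above: it is the clade's share of the whole dataset, not its share within R1a.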
In conclusion, the UYSD dataset, at least for Croatia and for now, should not be treated as representative of the Croatian population in analyses of haplogroup frequency and distribution, because of sampling bias. This does not necessarily apply to the variation of Y-DNA haplogroups at the national and regional level (haplogroup peculiarities aside). It is a commendable project, and hopefully more datasets generated with more modern technology will be added in the future.