Your analysis reads like a thesis: very interesting and informative, great work. Since you said I am one of the outliers, is it possible that 23andMe sometimes misrepresents Somali samples? Could they be using more than one Somali reference sample as the basis for identifying autosomal ancestry? I know MyHeritage is unreliable for Horn Africans, but their recent update is actually good. The updated version gave results similar to G25. You were right when you said that the samples that came out 12% Arabian were wrongly designated.
I have proved this:
View attachment 357030
I used the least Arabian samples (0-3%) as the Somali source, added Bronze Age Levantine as a stand-in for Arab (it neatly reflects this Arabian-like ancestry type), and then added Nilo-Saharan. Notice how Nilo-Saharan rises in direct proportion when the Levantine ancestry rises? Well, this is because what is being picked up is Cushitic ancestry. To know whether these people are more or less Arabian, you subtract the NS from the BA Levant, and then you will see a decrease or increase. So sample SOMALI6, which is supposedly 12% Arab, is likely just 4%.
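As a rough sketch of that subtraction (the function name and the example percentages are made up for illustration; they are not readings from any of the samples in the attachments):

```python
def net_arabian(ba_levant_pct: float, nilo_saharan_pct: float) -> float:
    """Return BA Levant minus Nilo-Saharan, both given in percent.

    The idea is that the Levant-like share that rises together with
    Nilo-Saharan is treated as Cushitic, so only the leftover is read
    as Arabian-like. The result is signed on purpose.
    """
    return ba_levant_pct - nilo_saharan_pct


# Hypothetical model outputs, not real sample values.
print(net_arabian(6.0, 2.0))   # Levant above NS: positive leftover
print(net_arabian(1.5, 3.0))   # NS above Levant: negative value
```

A negative result simply means the sample has more NS than BA Levant, so there is no Arabian-like leftover to speak of.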
This makes perfect sense because the basal similarity between the most "Arab" samples was effectively in the same range without much discernment. It's simply because Somalis don't have a lot of Arabian ancestry. Outliers will score higher. I am an outlier, and let me demonstrate how I look using that same source:
View attachment 357031
This is in the ballpark of my 23andMe readings:
View attachment 357032
See, no Nilo-Saharan.
Using the averages by Michalis:
View attachment 357033
The Somalilanders, with a sample size of 9, barely got anything higher than the selected source samples, which were the least Arabian. They're not exceptionally Arabian.
You might ask: why is the Somali number lower and the BA and NS higher? Because there is an internal signature dimension that is entirely Somali but varies within Somalis. Meaning, although the source samples represent southern, northeastern, and central Somalis, there are slight shifts in what counts as Somali ancestry. Basically, one pure sample that is fully Somali might account for only 80% of Somali genetics.
To illustrate what I mean (the outermost circle hypothetically represents the extent and boundary of the signature, whereas anything outside it is admixture):
View attachment 357036
These are overlapping circles that represent parts of Somali ancestry, with the whole thing representing the extent of the internal diversity without any admixture. However, it does not mean that if a sample overlaps 90% with the source, the other 10% is foreign. No. That sample is going to soak up the entire 100% because it is Somali ancestry, but the fit is still going to increase. The other 10% is just part of a homogeneous cluster with the remaining 90%; it's just that no one sample is perfectly representative of all non-admixed Somalis, although as far as homogeneity goes, they represent the population better than anything else, especially compared to the internal differences of other populations.
Theoretically speaking, within-signature dimensionality can exist with strictly zero admixture.
Look at the Kenyan Somalis, they have higher NS than BA Levant and this checks out. Some of those samples received increased non-Cushitic DNA that was not Eurasian.
To summarize, the Arab ancestry in non-admixed Somalis is greatly exaggerated; people who have 8% are actually outliers.
We have Giire, who is a Habar Awal (from what I recall) who clearly is very Arab:
View attachment 357038
On par with the Saho:
View attachment 357039
There are some samples taken from a study on the UAE that seem to be Somalis:
View attachment 357037
They show fluctuations. The first one clearly has more than the average Arab ancestry, but it tops out at around 7%. The second one likely fits in the normal range. The third one is similar to the Somalilander samples (which does not mean it's from there). The fourth one is very similar to the source samples but with slightly higher Arabian. Maybe it's from the south central or Puntland. The third one is a bit higher, but we have to remember that the Taforalt typically levels toward the non-NS needs, so you can roughly group it with the Levant. Then the last one, which is kind of within the norm.
It seems like Somalis are in general 0-5% Arabian.
Look at the average of all the samples (removed the samples from Kenya and likewise did not include the heavy Arab ones):
View attachment 357040
It shows parity between the Jordan EBA and Sudanese, which shows that on average, most Somalis are not that Arab when they don't have any recent admixture. But that does not mean there are not outliers. Many of them exist, but they don't make up a bulk that shifts anything tremendously. Most of the Arab stuff you see is something that was likely soaked up many centuries ago.
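For anyone who wants to reproduce that kind of average, here is a minimal sketch of the filtering step. Everything in it is an assumption for illustration: the sample records, the group labels, the 10% cutoff for "heavy Arab", and the numbers are all invented, not taken from the attachments.

```python
from statistics import mean

# Hypothetical per-sample model outputs in percent (invented values).
SAMPLES = [
    {"id": "A", "group": "Somalia", "jordan_eba": 2.0, "sudanese": 2.0},
    {"id": "B", "group": "Somalia", "jordan_eba": 4.0, "sudanese": 3.0},
    {"id": "C", "group": "Kenya", "jordan_eba": 1.0, "sudanese": 6.0},
    {"id": "D", "group": "Somalia", "jordan_eba": 20.0, "sudanese": 1.0},
]


def average_components(samples, excluded_groups=("Kenya",), levant_cutoff=10.0):
    """Average the two components after dropping excluded groups and
    any sample whose Jordan EBA share is above the (assumed) cutoff."""
    kept = [
        s for s in samples
        if s["group"] not in excluded_groups and s["jordan_eba"] <= levant_cutoff
    ]
    return {
        "n": len(kept),
        "jordan_eba": mean(s["jordan_eba"] for s in kept),
        "sudanese": mean(s["sudanese"] for s in kept),
    }


AVERAGES = average_components(SAMPLES)
print(AVERAGES)
```

With these made-up inputs, only samples A and B survive the filter, so the averages are taken over two samples.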
These are the Somali Emirati samples:
For example, this was designated as Sudanese Arab, but is really half Somali, half Arab:
My own description is on the right side.
There are Sudanese, and seemingly Beja samples available from that UAE dataset that I should also post about. They are interesting. I had to correct some of the readings from the original labelling.
Old version

Updated version
