Heiner Rindermann, Michael A. Woodley, James Stratford (2012)
Studies investigating evolutionary theories on the origins of national differences in intelligence have been criticized on the basis that both national cognitive ability measures and supposedly evolutionarily informative proxies (such as latitude and climate) are confounded with general developmental status. In this study 14 Y chromosomal haplogroups (N=47 countries) are employed as evolutionary markers. These are (most probably) not intelligence coding genes, but proxies of evolutionary development with potential relevance to cognitive ability. Correlations and regression analyses with a general developmental indicator (HDI) revealed that seven haplogroups were empirically important predictors of national cognitive ability (I, R1a, R1b, N, J1, E, T[+L]). Based on their evolutionary meaning and correlation with cognitive ability these haplogroups were grouped into two sets. Combined, they accounted in regression and path analyses for 32–51% of the variance in national intelligence relative to the developmental indicator (35–58%). This pattern was replicated internationally with further controls (e.g. latitude, spatial autocorrelation) and at the regional level in two independent samples (within Italy and Spain). These findings, using a conservative estimate of evolutionary influences, provide support for a mixed influence on national cognitive ability stemming from both current environmental and past environmental (evolutionary) factors.
1.1. Theories on the origins of international variation in intelligence
Research on international cognitive ability differences has produced important, but also ambiguous and controversial results (Hunt & Carlson, 2007). Many studies have shown that cognitive ability (intelligence, knowledge and the intelligent use of knowledge) is the most important feature of “human capital” undergirding both individual and national wealth and its growth (e.g. Jones, 2011; Rindermann, 2012; Weede, 2004), technological and scientific progress (Gelade, 2008; Rindermann & Thompson, 2011; Woodley, 2012), tolerance and democracy (Deary, Batty, & Gale, 2008), and health (e.g. Oesterdiekhoff & Rindermann, 2007; Rindermann & Meisenberg, 2009).
From an evolutionary point of view the most parsimonious explanation for the existence of individual (genetic) differences in cognitive ability is mutation selection balance. This results from the tendency for deleterious alleles of small effect size to be weeded out of a population at a rate equal to the rate at which new mutations arise. It has been argued that traits like cognitive ability function as fitness indicators in sexual and social selection, as they indicate via their levels the underlying genetic quality of a phenotype (Miller, 2000; Penke, Denissen, & Miller, 2007). The genetic component of population level differences in intelligence on the other hand is unlikely to be related to mutation load. Instead it has been hypothesized that these result from the effects of common polymorphisms with small effect sizes, which differ in frequency between populations in non-random ways (Meisenberg, 2003). In both cases (individual and population differences), environmental factors are also an important source of differences.
In order to account for this apparently non-random between-population variation in cognitive ability, a variety of proximate (non-evolutionary) theories have been proposed. Their status ranges from speculative to firmly empirically based: Frequently discussed are the effects of wealth and health, educational level and educational policy, geography and political power, modernization and Westernization, culture and global world views (e.g. Eppig, Fincher, & Thornhill, 2010; Rindermann, 2008; Rindermann & Ceci, 2009; Wicherts, Borsboom, & Dolan, 2010).
Several scholars have also developed distal or evolutionary theories, three of which are frequently cited:
(1) The oldest of the three is the cold winters theory (Hart, 2007; Lynn, 2006), which is based on the idea that colder environments are more cognitively demanding than milder environments. Lynn argues that these more cognitively demanding environments facilitated the evolution of higher intelligence among groups of Homo sapiens migrating northwards into Europe and then eastwards into Asia. The sorts of selective pressures encountered by these populations would have included being able to effectively hunt large prey (e.g. through the invention of the atlatl and bow), the use of a large range of tools, the capacity to store food, the need for shelter and clothing, and the need for more sophisticated social competences of a sort which would have permitted the emergence of social complexity.
(2) The second is the life history Differential-K theory (r–K theory within highly K selected humans; Rushton, 2004), which holds that higher stability environments such as those found in more northerly or easterly regions of the globe facilitated the evolution of cognitive mechanisms for anticipating future conditions and planning ahead, in addition to the need to deal with predictable, recurrent environmental challenges. Intelligence is believed to be a high-K trait, which, along with the capacity to delay gratification, enhanced prosociality and greater parenting effort, would have been positively selected in these stable environments. Conversely, low stability environments (such as those associated with more southerly latitudes) are thought to have selected for less differentially-K (relatively more r-selected) traits, as humans would have been less able to plan for contingencies. The observation that brain size covaries with intelligence and other supposed life history indicators has been offered as support for this theory (Beals, Smith, & Dodd, 1984; Rushton & Rushton, 2004). The intelligence-brain size correlation furthermore fits with Lynn’s model, although contrary to one aspect of the theory is the observation that at the individual differences scale, the K-factor latent in diverse measures of life history speed does not correlate with the general factor of intelligence (g; Woodley, 2011a). It must be noted however that at cross-national scales national cognitive ability does in fact load positively on a K super-factor (Templer, 2008; Woodley, 2011b).
(3) The third is the general intelligence as a domain specific adaptation theory (Kanazawa, 2004, 2008, 2010), which claims that general intelligence evolved as a domain specific adaptation to evolutionary novelty. Evolutionarily novel problems are problems not regularly encountered in the environment of evolutionary adaptedness (EEA – the environment(s) which shaped human evolution in the Pleistocene – it roughly corresponds to the African Savanna), but which were nonetheless solvable via the application of logical and inductive thinking. Environments more geographically removed from the ancestral human EEA are believed to contain a greater degree of ‘evolutionary novelty’ (i.e. they would have been colder, contained different flora and fauna and had larger seasonal food fluctuation all of which would have posed cognitively solvable challenges), so would have selected for greater general intelligence. This theory has been criticized however on the basis that g is a source of individual differences in intelligence, and appears to be highly domain general (i.e. it arises from the intersections among a large array of cognitive processes), rather than being a domain specific adaptation (i.e. human universal) in the vein of language acquisition or cheater detection (Borsboom & Dolan, 2006; Kaufman, DeYoung, Reis, & Gray, 2011; Penke et al., 2011; Woodley, 2010a).
(4) There are additional theories (and also much speculation) concerning the evolutionary forces that influenced single populations in certain historical periods such as in the case of the Ashkenazi Jews in medieval Europe (Cochran & Harpending, 2009; MacDonald, 1994), the British between years 1250 and 1850 (Clark, 2007) and the Chinese from around 200 BCE to 1950 (Unz, 1981).
All these theories could be summarized under the general assumption that environmental challenges, both natural and social, which could be mastered by intelligence, will increase genotypic intelligence in the long run because phenotypically more intelligent members of the population historically had more surviving offspring. The requirements for this are that intelligence is at least modestly heritable, that there has been sufficient time in terms of generations for the frequencies of intelligence genes to have changed, and that increases in intelligence are non-prohibitive in terms of limitations imposed by brain metabolism, infant cranial size, and susceptibility to physical and mental diseases.
Correlations between what we here term national cognitive ability scores (which combine both psychometric IQ and student assessment tests) and variables such as latitude (Kanazawa, 2008), skin reflectance (Meisenberg, 2009; Templer & Arikawa, 2006), aggregated life history indicators (Templer, 2008), temperature and distance from the ancestral environment (Kanazawa, 2008), along with correlations involving proxies for evolutionary cognitive development such as cranial capacity with variables like population density (as an indicator of cognitive demand resulting from increased competition and social complexity; Alexander, 1989; Bailey & Geary, 2009), mean temperature and latitude have all been offered as support for evolutionary theories. In particular the study of Ash and Gallup (2007), in which the co-development of cranial capacity and encephalization was examined in the context of variation in mean temperature and latitude during the transition of Homo habilis into Homo sapiens lends support to these hypotheses (the reported correlations were around r=.50 to .60).
However, different researchers have called these evolutionary theories and their supporting evidence into question (e.g. Hunt & Sternberg, 2006; Wicherts, Borsboom, & Dolan, 2010). Recently, Wicherts et al. made two significant points of criticism. Firstly, these theories assume that contemporary national cognitive ability estimates are meaningful proxies of “stable” national intelligence differences. According to Wicherts et al. these theories entirely ignore the Flynn effect: differences between countries and peoples in terms of current intelligence levels could reflect either advances or delays in a common global modernization process. In 100 years' time, national cognitive ability levels could look very different owing to the fact that developing countries now exhibit larger secular gains than developed countries (Flynn, 2007; Wicherts, Dolan, Carlson, & Maas, 2010). Secondly, these theories ignore population movements. This is especially significant as the ancestors of groups possessing a certain level of cognitive ability may have originated in a region substantively removed in geographical terms from the group’s current range, which makes evolutionary interpretations of latitude/national cognitive ability correlations problematic (e.g. prehistoric migration of Iberian Celts to Britain or more recently Chinese to Singapore and British to Australia). In relation to the first criticism, Wicherts et al. found evidence that contemporary national cognitive ability along with latitude and temperature are highly confounded with variables indicative of the developmental status of nations and together give rise to an apparent general development common factor, which is in turn a potential source of the Flynn effect.
Wicherts et al.’s criticism is significant as it suggests that any viable study of possible evolutionary influences on the causes of national intelligence differences must not only be capable of properly distinguishing between alternative explanations (i.e. environmental causes vs. evolutionary ones), but must also be sensitive to other potentially confounding factors such as population movements.
2. Haplogroups — an evolutionarily informative variable
A haplotype (haploid genotype) represents a group of linked genetic loci on a single chromosome. Haplogroups represent groups of similar haplotypes linked through a common evolutionary past (common ancestor) via a single-nucleotide polymorphism (SNP). Especially useful as markers of descent are haplogroups inherited paternally through the nonrecombining portion of the Y-chromosome or maternally through mitochondrial DNA. They are unambiguous measures of ancestry because, as a rule, they are selection neutral and neither Y-DNA nor mtDNA recombines. Their current distributions in the world’s populations are attributed to genetic drift and especially population bottlenecks and founder effects, migration, and, more importantly, their association with selected variants in neighboring genes or with larger genetic patterns. Past research indicates that certain haplogroups are associated with relevant phenotypic patterns, such as haplogroup I, which is associated with an accelerated progression from HIV to AIDS (Sezgin et al., 2009), and haplogroups E3* and K*(xP) (meaning: K excluding P), which are associated with endurance running (Entine, 2000; Moran et al., 2004).
In this study the relationship between measures of haplogroup frequency and national cognitive ability will be studied. This may be especially informative, as significant associations may indicate a relationship between the evolutionary history of haplogroups and national cognitive ability, thus satisfying Wicherts et al.’s criteria for an evolutionarily informative variable which (unlike latitude, skin reflectance or life history indicators) is not likely to simply be a proxy for developmental status. Wicherts, Borsboom, and Dolan (2010) even acknowledge that comparisons between haplogroups are a good potential basis for testing evolutionary theories; however, they question the relevance of national cognitive ability measures to this endeavor.
A further advantage to using haplogroups is their capacity to precisely identify environments and factors of potential relevance to the evolution of cognitive ability differences, as they are associated with specific points of origin in time and space and specific histories in terms of migrations along with cultural and evolutionary developments (Woodley & Stratford, 2009). This has a significant advantage over studies employing the cruder and also more controversial category of race (Lewontin, 1972; Woodley, 2010b), as races constitute broad categories, whose constituent populations have significantly disparate origins in both time and space.
Haplogroups by contrast are highly population-specific. Haplogroups are not potential genes for intelligence; any significant relationship between the frequency of these haplogroups and national intelligence merely indicates that the emergence and spread of the haplogroup might have been concomitant with selection for genes, which may have been either directly relevant to the development of intelligence (i.e. they enhanced neurological functions) or were indirectly relevant in some way (i.e. they facilitated improvements in the absorption of nutrients relevant for brain development). Robust effects hint at the possible location of associated genes for cognitive ability.
One crucial problem in need of addressing is the pattern of high magnitude correlations among important evolutionary, ethnic, cultural, social, historical, geographic and economic aspects of societies (Hunt, 2011, p. 440; Meisenberg, 2012; Wicherts, Borsboom, & Dolan, 2010). It is therefore essential to control for possible biasing variables influencing any putative haplogroup-cognitive ability relationships. We choose for this purpose measures of global developmental status, including important factors like education, wealth and health. Moreover developmental status is a potentially overly strong control (it may possibly absorb too much variance) as differences in economic and cultural development might themselves be the consequences of differences in cognitive ability and indirectly therefore an expression of genetic differences (i.e. environment as extended phenotype; Dawkins, 2008/1992). Any remaining statistical effect of haplogroups on cognitive ability would suggest evolutionary-genetic influences on national differences in this variable. Furthermore within-country analyses can be used to check the robustness of any effects.
3. Materials and methods
3.1. Haplogroup data
The data on haplogroup frequency were obtained from Eupedia (2011), an online quantitative human biodiversity resource, which collects data sets on haplogroup frequencies and other genetic variables in addition to data on the distribution and history of countries, regions and ethnicities. Eupedia has been used in various published studies (e.g. by Bembea, Patocs, Kozma, Jurca, & Skrypnyk, 2011; De Beule, 2010; Lee et al., 2008; Solovieff et al., 2010). They use a clear and homogeneous nomenclature for haplogroups, which is an advantage given that vagaries in haplogroup nomenclature can be confusing (see Karafet et al., 2008; Underhill & Kivisild, 2007 for an overview). 
As it is not open source, the Eupedia dataset also seems to be less affected by errors. For example, until 5 July 2011 the Wikipedia page on “Haplogroup T” contained an error: referring to the Moran et al. (2004) study, it claimed that K*(xP) is identical with haplogroup T (this was eventually corrected by the authors of the Moran et al. paper after hints from the authors of this paper).
The distributions of 14 (condensed to 12) Y chromosomal DNA haplogroups were available for N=47 European, Middle Eastern and North African countries. To ascertain their robustness, the distributions were compared to other published data (e.g. Karafet et al., 2008). The investigated haplogroups and their geographic origins are described in Table 1. In the case of the 11 non-European countries, Eupedia does not list separate entries for the subclade haplogroups I1, I2a, I2b, G2a, E1b1 and N1c1. Instead it lists more general frequency data for I, G, E and N. Based on their chronological development and indicating coloration (Eupedia, 2011), haplogroup I encompasses I2a, but not I1 and I2b; as a consequence these two haplogroups were assigned frequencies of 0, and I and I2a were lumped into the broad category of haplogroup I. The same procedure also indicated that G, E and N were concomitant with respect to G2a (rather than the other G subclades), E1b1 (E1b1 is the most common subclade within E) and N1c1 (no other N subclade is listed). G, E, and N are also on a direct line of descent to G2a, E1b1 and N1c1. As a consequence these were lumped into single haplogroups (G, E and N) for the purposes of this analysis. Finally, individual frequencies for the related T and L (common ancestor haplogroup LT) were combined to give a composite T(+L) frequency for these countries. Among the combinations the aligning of I with I2a is somewhat questionable; however, this haplogroup category turned out to be irrelevant upon analysis and was eliminated. Eupedia states that for each country or region the sample size is at least 100. In the case of Italy, Germany, England and Ireland, the sample sizes are over 2000 (France and Spain: more than 1000; Portugal: over 900; Belgium: over 750; Netherlands, Finland and Hungary: over 650; Greece: over 500).
Data on mtDNA haplogroups are also available from Eupedia, but these have not been used because there is substantive evidence that mtDNA is not selection neutral, as had once been widely believed (Mishmar et al., 2003). This means that mtDNA haplogroups cannot be used as reliable indicators of ancestry as positive selection would have encouraged the spread of these haplogroups between populations thus obviating any potentially evolutionarily informative shared clinality with national cognitive ability. One possible reason for this is that males were historically less mobile than females, as they typically migrated shorter distances (Cavalli-Sforza, 1997, p. 7721).
Based on their evolutionary meaning and correlation with cognitive ability we grouped the haplogroups into two supergroups: “A”: I1, R1a, R1b and N (indicators of more recent cultural progress in the Mesolithic and Holocene – these were positively correlated with national cognitive ability), and “B”: J1, E and T[+L] (indicators of older cultural progress – these were negatively correlated with national cognitive ability). An in-depth analysis of the evolutionary meaning of these supergroups can be found in the discussion.
3.2. Cognitive ability
Cognitive ability data by nation (N=47) were obtained using a compilation of Lynn’s national psychometric intelligence data updates (Lynn & Meisenberg, 2010) and the sum of student assessment tests (PISA, TIMSS, PIRLS from 1995 to 2009; updated version from Rindermann & Thompson, 2011), which were combined using the procedure described in Rindermann (2007a). Corrections (e.g. for missing data or low participation rates) were not applied. “National cognitive ability” is the mean cognitive ability level of a country found in our sum value of psychometric and student assessment studies. A frequently used synonymous term for cognitive ability would be cognitive competence.
A note on g/G: It has been argued that because a) the Flynn effect does not occur on g, instead it manifests as heterogeneous gains in specific abilities (Wicherts et al., 2004), and b) national cognitive ability measures are confounded with potential developmental facilitators of the Flynn effect, national cognitive ability measures cannot be said to substantially capture g (Wicherts, Dolan, Carlson, & Maas, 2010). In anticipating these criticisms, Rindermann (2007a, 2007b) found that in spite of flaws in sampling, standardization and measures (e.g. Hunt, 2011) the sum of different psychometric tests and the sum of different student assessment tests are strongly correlated at the national level (r>.85), and also exhibit high factor loadings on an international Big-G factor (λ=.95–1.00; Rindermann, 2007a). In addition, they correlate highly with measures of adult education (r>.70; Rindermann, 2007a). This Big-G is an index variable encompassing measures of both fluid and crystallized intelligence (intelligence, knowledge and the intelligent use of knowledge). We use the shorter term “intelligence” to avoid repeating the more complicated term “cognitive ability” in consecutive sentences. “IQ” as an acronym is used in tables and mathematical formulae and as our cognitive ability scale (M=100, SD=15; Greenwich norm).
While the Big-G has been criticized on the grounds that this common factor might not have the same psychometric properties as the individual differences level g, and could also arise from the same source as the general development factor discussed earlier (Wicherts & Wilhelm, 2007), we interpret this finding in light of strong inference to indicate that although national cognitive ability measures are by no means perfect proxies for g (fluid intelligence measures in particular exhibit a very large Flynn effect), they are nonetheless sufficiently g-loaded for our purposes. We grant that the Flynn effect has both widened ability gaps between nations and has attenuated (but not obviated) the g-loadings of national cognitive ability measures.
3.3. Environmental conditions and further controls
In order to determine the correlations with developmental factors thought to promote cognitive ability, three variables were selected: per capita nutrition (as measured by energy consumption in Kcal per capita per day 2003–2005 — the data were obtained from the Food & Agriculture Organization of the UN, 2009, Table D.1; N=46), Gross National Income (logged GNI 2008, N=46; UNDP, 2010) and the Human Development Index (HDI, N=45) which is a highly general measure of human development used by the UN (consisting of life expectancy, years of schooling, and GNI; UNDP, 2010). HDI correlates even more strongly with national cognitive ability than the general educational level of society (the sum of the rates of literate adults, of people who graduated from secondary school, and years of school attendance; see Rindermann, 2007a; Rindermann & Ceci, 2009), which was the strongest correlate in past analyses (here: HDI and cognitive ability: r=.85, general educational level and HDI r=.82, N=43 nations with data for all three variables in this data set).
Median latitude was used as a customary national climate indicator (data were obtained from The CIA World Factbook; CIA, 2011; N=47). Skin brightness (skin reflectance) was taken from Jablonski and Chaplin (2000, pp. 74f.). Owing to the fact that they provide data for only 13 countries in our sample we added a second source (Templer & Arikawa, 2006, pp. 124f.). Both indicators (high reflectance/brightness) correlate at r=|.91| (N=43, mean: α=.95). Data here are for N=46 countries. It is debated as to whether coloration has a direct or indirect causal association with cognitive ability and behavioral dispositions, although some researchers assume pleiotropic effects (Ducrest, Keller, & Roulin, 2008; Jensen, 2006). We include it as an indicator of evolutionary history; however, this variable, like race, is broad, encompassing the evolutionary histories of many disparate populations (Beaver & Wright, 2011).
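As a plausibility check on the reliability reported above, assume that α denotes the standardized Cronbach's α of a composite of k=2 indicators with mean inter-indicator correlation r̄=.91 (the text does not state which formula was used, so this reading is an assumption):

```latex
\alpha = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
       = \frac{2 \times .91}{1 + .91}
       = \frac{1.82}{1.91}
       \approx .95
```

Under that assumption the value agrees with the reported α=.95.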
Religions were weighted according to Max Weber (2001/1905) and Werner Sombart (1915/1913) in terms of their positive attitude toward education (“Bildung”, literacy), thinking, rationality and achievement (from most to least: Protestantism, Judaism, Confucianism, Catholicism, Orthodoxy, Buddhism, Hinduism and Islam, and finally Animism; see Rindermann & Meisenberg, 2009). We coded countries based on which religions were traditionally predominant, e.g. Protestantism for the Netherlands. Data on religions are from the CIA World Factbook, from the German Department for Foreign Affairs (www.auswaertiges-amt.de/www/de/laenderinfos), and from a country encyclopedia (“Länderlexikon”, Jahrbuch, 2004). N=47 countries. Religion as a belief system can also have an impact on cognitive ability via its influence on education, learning and thinking habits.
Finally, we controlled for spatial autocorrelation. This results from the non-independence of spatially distributed data-points (such as countries) owing to a) the tendency for variables to spatially cluster and b) the tendency for data points to be arbitrarily defined, such that having lots of countries within a given region with essentially arbitrary national boundaries can significantly inflate N, thus giving rise to inflated correlation magnitudes and significances (Hassall & Sherratt, 2011). To control for the effects of spatial autocorrelation a distance based spatial lag variable was incorporated into the regression and path analyses as a predictor to produce spatially autoregressive models (Anselin & Bera, 1998). The variable was produced in GeoDa (freely available software from: http://geodacenter.asu.edu/) using a shape file composed of polygons representing all UN administrative regions. Each nation was converted from a representative polygon into a geometric centroid and attributed a cognitive ability level. The distances between the centroids in longitude and latitude acted as the weights for determining the spatial distribution of IQ values. The distance threshold was set to ensure that all nations had at least one neighbor. In essence the spatial lag variable is the IQ value expected for a nation based on its position with respect to other nations and the spatial distribution of national cognitive ability. Spatially autoregressive models allow the effects of spatial distribution to directly compete for predictive power with the developmental and evolutionary factors. It is also useful to know to what extent spatial autocorrelation exists in the IQ values of all nations. 
To this end Moran’s I was calculated for the IQ values of all nations and found to be I=.556. This suggests that there is a high degree of spatial autocorrelation in national cognitive ability values, which accords with the findings of Hassall and Sherratt (2011) and provides a strong justification for controlling for spatial autocorrelation.
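The spatial lag variable and Moran's I described above can be illustrated with a minimal sketch. It is not the GeoDa procedure itself: it assumes binary distance-band weights with plain Euclidean distance on longitude/latitude coordinates, row-standardization, and a threshold set to the largest nearest-neighbor distance so that every unit has at least one neighbor. The function names and the toy coordinates and values are hypothetical and are not data from this study.

```python
import numpy as np

def threshold_weights(coords):
    """Row-standardized binary weights from a distance band.

    The band is the largest nearest-neighbor distance, so every unit
    has at least one neighbor (assumed reading of the threshold rule).
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)          # a unit is not its own neighbor
    threshold = dist.min(axis=1).max()
    w = (dist <= threshold).astype(float)
    return w / w.sum(axis=1, keepdims=True)

def spatial_lag(w, values):
    """Weighted mean of the neighbors' values for each unit."""
    return w @ np.asarray(values, dtype=float)

def morans_i(w, values):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the centered values."""
    z = np.asarray(values, dtype=float) - np.mean(values)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

# Hypothetical toy example: four units on a line with a smooth gradient.
toy_coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
toy_values = [1.0, 2.0, 3.0, 4.0]
w = threshold_weights(toy_coords)
lag = spatial_lag(w, toy_values)
moran = morans_i(w, toy_values)
print(lag, moran)
```

In a spatially autoregressive regression the `lag` vector would enter as one more predictor next to the developmental and haplogroup variables, which is how it can compete with them for predictive power.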
3.4. Within-country data for three countries
For three countries Eupedia lists haplogroup distributions for different regions and for these we have aggregated regional cognitive ability data from PISA (Programme for International Student Assessment). The PISA data were selected or aggregated according to the regions listed by Eupedia. PISA 2006 results in reading, mathematics and science data were taken from the OECD (2007, pp. 250, 304, 308). As the “Central Italy” region listed in Eupedia is associated with negligible data in PISA 2006, we added PISA 2003 data for Tuscany in reading, mathematics and science (OECD, 2004, pp. 453–457).
Germany: North (PISA 2006 results from Schleswig-Holstein, Hamburg, Lower Saxony), East (Mecklenburg-Vorpommern, Brandenburg, Saxony, Saxony-Anhalt, Thuringia), West (Saarland, North Rhine-Westphalia, Rhineland-Palatinate, Bremen) and South (Baden-Württemberg, Bavaria), not assigned (Berlin/east and west, Hesse/middle). Bremen was assigned to the West because politically and culturally (in the last 50 years) it has come to resemble North Rhine-Westphalia more than Schleswig-Holstein and Lower Saxony.
Italy: North (PISA 2006 results from Trentino, Friuli, Piedmont, Lombardy, Veneto), Central (PISA 2006 results from Emilia-Romagna, PISA 2003 results from Tuscany), South (PISA 2006 results from Campania, Apulia, Basilicata) and Sicily (ditto) and Sardinia (ditto).
Spain: Andalusia (ditto), Basque (ditto), Cantabria (ditto), Galicia (ditto).
For two of the countries (Italy and Spain) there exist evolutionary hypotheses for intelligence differences between the North and South (Lynn, 2010, 2012), but not for Germany. Within Germany in the 20th century there were large migrations due to annexations, expulsions and political suppression resulting in a new mixing of haplogroups at the level of regions. Small sample sizes within countries (number of regions) do not allow complex analyses using further controls.
For the statistical analyses SPSS 19 (correlations, regressions) and Mplus 5.21 were used (path analysis with full-information maximum likelihood, FIML, with no listwise deletion in the case of missing data). Regressions, partial correlations and path analyses were used for controlling possible biasing factors. Hypotheses are tested on the basis of effects and their robustness against controls and in different samples; at the level of countries or regions significance testing is not an appropriate method (see: Rindermann, 2008). FIML had to compensate for only two missing values for HDI (Iraq and Lebanon). We do not assume any distortion resulting from this.
4.1. Cross-country analyses
Correlations among and between haplogroup frequencies, intelligence, the developmental status variables and latitude were obtained in order to determine which of the chosen 12 haplogroups had important and high associations (see Tables 2 to 4).
The haplogroups are modestly correlated with each other, with the highest involving T(+L) (see: Table 2). National cognitive ability was very highly correlated with HDI and latitude (r=.85 and .80; Table 3). Based on Table 4 it appears that of the 12 haplogroups selected for study, eight yielded substantial correlations with national cognitive ability: I1 (r=.57), I2b (r=.56), R1a (r=.38), R1b (r=.47), J1 (r=-.73), J2 (r=-.49), E (r=-.70) and T(+L) (r=-.62). The mean of the correlations is r=|.43|. The haplogroups were also correlated with nutritional quality (mean r=|.24|), wealth (r=|.38|), HDI (r=|.39|) and latitude (r=|.46|). Additionally, the haplogroups were used in linear multiple regression analysis to determine their impacts as predictors of differences in national intelligence. HDI as the most general indicator of environmental quality and also as the strongest correlate of cognitive ability was used in the last two analyses as a further control (see Table 4).
Different stepwise regression analyses (stepwise selection, backward elimination) with only the haplogroups as predictors (first two regressions) or HDI as an additional predictor (last two) explained between 81 and 90% of the variance in national cognitive ability as the criterion variable (corrected 79 to 85%). HDI explained more variance in intelligence than the haplogroups (between 51 and 58% vs. 36 and 32%). Due to correlations among haplogroups and with HDI, the methods used (simple bivariate correlations, partialing out HDI, stepwise regression, backward regression) differentially stressed the impact of single haplogroups. There were also suppressor effects associated with I, G, J2 and N. Finally we added the spatial lag (spatial autocorrelation control) variable. Including it in both the stepwise and backward analyses did not change the results (the variable was removed by the statistical procedure).
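As an illustration of what "partialing out HDI" means, the following minimal sketch computes a first-order partial correlation from three bivariate correlations. It is not the SPSS procedure used in the study. The values r_xy=-.70 (haplogroup E with cognitive ability) and r_yz=.85 (cognitive ability with HDI) are taken from the text; the haplogroup-HDI correlation r_xz=-.60 is a hypothetical value chosen only for the example.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# r_xy and r_yz are reported in the text; r_xz is HYPOTHETICAL (assumption).
r_e_iq_given_hdi = partial_corr(r_xy=-0.70, r_xz=-0.60, r_yz=0.85)
print(round(r_e_iq_given_hdi, 2))
```

With these inputs the association shrinks in magnitude but keeps its sign, which is the kind of pattern the robustness checks in this section look for.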
Based on theoretical considerations and the empirical results, the four haplogroups that were positively correlated with cognitive ability (“A”: I1, R1a, R1b and N; see Fig. 1) and the three that were negatively correlated with cognitive ability (“B”: J1, E and T[+L]; see Fig. 2) were each combined into a set (after standardization), so as to determine their combined statistical impacts on cognitive ability. The breadth of all variables used is now similar (general HDI, general “positive” and “negative” haplogroup sets and general cognitive ability) and comparisons are also more appropriate owing to Brunswik-symmetry (Wittmann, 1991).
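The grouping step can be sketched as follows. The text states only that haplogroups were grouped after standardization; averaging the z-scores (rather than summing them) and using the sample standard deviation are assumptions made here, and the frequencies are invented for illustration, not data from the study.

```python
import numpy as np

def zscore(x):
    """Standardize a 1-D array to mean 0 and SD 1 (sample SD, ddof=1)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def composite(freqs):
    """Average the z-scored columns of a (countries x haplogroups) table."""
    columns = np.asarray(freqs, dtype=float).T
    return np.mean([zscore(col) for col in columns], axis=0)

# HYPOTHETICAL frequencies (%) for five made-up countries.
# Columns: I1, R1a, R1b, N
set_a_freqs = [
    [30.0, 20.0, 25.0, 10.0],
    [5.0, 10.0, 60.0, 0.0],
    [2.0, 5.0, 15.0, 0.0],
    [15.0, 40.0, 10.0, 20.0],
    [1.0, 2.0, 5.0, 0.0],
]
hs_a = composite(set_a_freqs)  # one composite score per country
print(np.round(hs_a, 2))
```

Standardizing first keeps a common haplogroup such as R1b from dominating the composite simply because its raw frequencies are larger.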
These two general haplogroup sets along with HDI (as the most important global development measure and strongest environmental correlate of intelligence) were used in a path analysis predicting cognitive ability (see: Fig. 3). As social conditions (captured by HDI, which is inclusive of health, education and wealth) are assumed to further cognitive development, both direct and indirect paths (where HDI mediates the relationship between haplogroup frequency and cognitive ability) were incorporated into the model.
The correlations of all three predictors with national cognitive ability levels are very high (rHDI=.86, rHsA=.81, rHsB=-.88). In the model HDI (βHDI→IQ=.41) has a strong direct effect on cognitive ability. The same is true for the combined negative haplogroups (βHsB→IQ=-.41), followed by the combined positive haplogroups (βHsA→IQ=.19). Both sets of haplogroups also have indirect effects in this model (total: βHsB→IQ=-.64 and βHsA→IQ=.29). HDI explains (statistically) 35% of the variance in cognitive ability differences (R²=β×r, using the direct effects); the combined haplogroup set B explains 36% and haplogroup set A 15%. When both sets of haplogroups are added, they explain 51% of the variance. This large fraction of the variance in international intelligence differences explained by haplogroups is independent of assumptions regarding the relationship between haplogroups and HDI. If we set only correlations and no paths between haplogroups and HDI, the fraction remains the same (as contemporary HDI cannot have had an impact on the distributions of haplogroups). The model is statistical rather than theoretical, as haplogroups are only markers of ancestry and also because reciprocal effects between HDI and intelligence can be assumed to be operating (i.e. environmental quality stimulates cognitive development in addition to cognitive ability improving environmental quality, leading to macrosocial feedback loops).
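The variance shares quoted above follow from the product of each predictor's direct standardized path coefficient and its correlation with the criterion (R²=β×r). The short sketch below reproduces that arithmetic from the reported coefficients; the variable names are illustrative and the script is not part of the original Mplus analysis.

```python
# Direct standardized effects (beta) and zero-order correlations (r)
# with national cognitive ability, as reported in the text.
predictors = {
    "HDI": {"beta": 0.41, "r": 0.86},
    "HsA": {"beta": 0.19, "r": 0.81},
    "HsB": {"beta": -0.41, "r": -0.88},
}

# Share of criterion variance attributed to each predictor: beta * r.
shares = {name: p["beta"] * p["r"] for name, p in predictors.items()}
haplogroup_share = shares["HsA"] + shares["HsB"]

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Both haplogroup sets combined: {haplogroup_share:.1%}")
```

Rounded to whole percentages, the products give the 35% (HDI), 15% (set A), 36% (set B) and 51% (both sets) reported in the text. The shares are computed with the direct effects, not the total effects.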
There is some evidence that the correlations are slightly increased due to the inclusion of Middle Eastern countries (see also scatterplots in Figs. A1 and A2). Excluding these Middle Eastern countries decreases the correlations, but does not change the correlational pattern (for all 47 countries: rHDI=.85, rHsA=.81, rHsB=-.88; without Azerbaijan, Lebanon, Iraq, Egypt, Tunisia, Syria and Morocco, N=40: rHDI=.74, rHsA=.73, rHsB=-.76). We chose to include these nations in the present analysis owing to the lack of a cogent theoretical reason for their exclusion.
Further controls: If latitude (as an indicator for historical climate and its possible cognitive demands) is used instead of HDI, the impact of the two haplogroup sets increases (they explain 80% of the variance in national cognitive ability relative to less than 1% explained by latitude; rLat=.80, rHsA=.81, rHsB=-.88, βLat→IQ=-.04, βHsA→IQ=.33, βHsB→IQ=-.65). The impact of the two haplogroup sets also remains stable when skin brightness is used (rSb=.84, rHsA=.81, rHsB=-.88, βSb→IQ=.28, βHsA→IQ=.23, βHsB→IQ=-.46). If weighted Weberian religion (weighted to achievement and educational orientation) is used, this cultural index has a strong effect, but the effect of haplogroup set B remains (rWR=.89, rHsA=.81, rHsB=-.88, βWR→IQ=.54, βHsA→IQ=-.04, βHsB→IQ=-.48). If all variables are used together in one analysis, haplogroup set B has the strongest effect (-.362), followed by religion (.361) and HDI (.264) (βHDI→IQ=.26, βLat→IQ=.07, βSb→IQ=.02, βWR→IQ=.36, βHsA→IQ=-.06, βHsB→IQ=-.36). The spatial lag variable (our spatial autocorrelation control) correlates highly with the other variables (N=47: rHDI=.69, rHsA=.72, rHsB=-.72, rIQ=.71). If we add this variable in a path analysis, the impact of the other variables slightly increases (βHDI→IQ=.43, βHsA→IQ=.21, βHsB→IQ=-.42), whereas the effect of the spatial lag variable itself is negligible (βSpat→IQ=-.05).
In summary, the impact of haplogroups (when clustered together) remains under these controls, and together they have the strongest effect. However, the meaning of the majority of variables remains open, as they are proxies rather than direct causal determinants. The statistical effects of both environmental and genetic factors show that while national cognitive ability differences have more than one cause, some of the variance can plausibly be attributed to evolutionary factors.
4.2. Within-country data for three countries
Putting all 13 regions of Germany, Italy and Spain together replicates the pattern found internationally: Haplogroup set A (I1, R1a, R1b and N) is positively correlated with cognitive ability (here PISA measures) and haplogroup set B (J1, E and T[+L]) negatively (rHsA=.62 and rHsB=-.69; βHsA→IQ=.27, βHsB→IQ=-.50). Within Germany, for which no evolutionary hypothesis existed, this pattern cannot be found (rHsA=-.73 and rHsB=.77, N=4 regions). Within Italy (rHsA=.81 and rHsB=-.61, N=5 regions) and within Spain (rHsA=.68 and rHsB=-.30, N=4 regions), for which such hypotheses existed, the pattern is detectable. This finding in two further samples supports evolutionary accounts of the origins of regional differences in cognitive ability means within Italy and Spain.
Based on our model, it appears that national cognitive ability is confounded with the general development of society. This is shown by the high correlations with HDI and the observation that in three analyses HDI accounted for the largest mean share of the variance in national cognitive ability. The mean across two regressions and one path analysis for this variable was 48%. Haplogroups do however also appear to be significant predictors of cognitive ability (mean across two regressions and one path analysis: 40%). They are especially strong predictors when grouped, with the first set (A, positively correlated) accounting for 15% of the variance and the second set (B, negatively correlated) accounting for 36% of the variance (sum: 51%). Interestingly, controlling for spatial autocorrelation had no effect on the predictive validity of the models, which suggests that the relationships are largely spatially independent.
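The two mean shares can be traced back to the figures reported in the results section. The sketch below assumes that the means combine the two regression estimates (HDI: 51% and 58%; haplogroups: 36% and 32%) with the path-analysis estimates (HDI: 35%; both haplogroup sets: 51%); this pairing is an inference from the reported numbers, not something the text states explicitly.

```python
from statistics import mean

# Percent of variance explained: two regressions, then the path analysis.
# The assignment of values to analyses is an ASSUMPTION inferred from the text.
hdi_shares = [51, 58, 35]
haplogroup_shares = [36, 32, 51]

mean_hdi = mean(hdi_shares)
mean_haplo = mean(haplogroup_shares)

print(f"HDI mean share: {mean_hdi:.1f}%")
print(f"Haplogroup mean share: {mean_haplo:.1f}%")
```

Under this assumption the rounded means agree with the 48% and 40% given above.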
The fact that these haplogroups are significant predictors of national cognitive ability given (i) the relatively large fraction of the variance attributable to the most important and general developmental indicator, (ii) the relatively small country sample size used, and (iii) the use of different controls coupled with replication at the within country scale suggests that some of the variance in contemporary national differences in cognitive ability can be plausibly attributed to evolutionary causes. The environments of origin and history of these haplogroups may therefore provide clues as to the nature of the evolutionary factors that contributed to these differences.
The positively correlating haplogroup set A was composed of I1, R1a, R1b and N. I1 is a subclade of I which arose either in Europe or Asia Minor 25–30,000 ybp. It has been suggested that the spread of this haplogroup was concomitant with the diffusion of the Gravettian culture (ca. 28–23,000 ybp), which produced some of the earliest works of figurative art, a proxy for advanced symbolic communication (Roebroeks, Mussi, Svoboda, & Fennema, 2000). A major subclade of I is I1 prevailing in Scandinavia (I1/I1a, Rootsi et al., 2004); the spread of this haplogroup was concomitant with the Ertebølle culture (ca. 5300–3950 BCE) and the successive Funnelbeaker culture (ca. 4000–2700 BCE) (Eupedia, 2011). The former culture was associated with a hunter gatherer and fishing mode of subsistence coupled with the large scale use of pottery, whereas the latter culture was associated with a transition away from a hunter–gatherer mode of subsistence toward an agrarian one and is characterized by the proliferation of numerous innovations such as advances in tool use, metallurgy, sustainable agriculture and the development of animal husbandry (Price, 2000).
R1a is principally associated with the Corded Ware culture (ca. 3200–1800 BCE), which was responsible for the introduction of metals into northern Europe and the transition from the Neolithic into the Copper Age and then into the early Bronze Age. This broad archeological horizon is additionally associated with significant innovations in tool use, farming and animal husbandry (Eupedia, 2011). R1b is the most prevalent haplogroup in Western Europe; its introduction into Europe was likely concomitant with the Neolithic and the rise of farming (Cruciani et al., 2010).
Haplogroup N appears to have been principally associated with both the Kunda (8000–5000 BCE) and Comb Ceramic (4200–2000 BCE) cultures. The former was a Mesolithic hunter–gatherer culture associated with the Baltic forest zone and extending through Latvia into Russia. The latter culture was also a hunter–gatherer culture, but it is also associated with innovations in the use of ceramics. The Comb Ceramic culture was gradually absorbed into the Corded Ware culture as it spread throughout the Baltic region and southern Finland c. 4500 ybp (Eupedia, 2011).
I1 arose in southern Scandinavia between 4000 and 6000 years ago (Rootsi et al., 2004). R1a and R1b arose in southwestern Asia (Caucasus, Pontic–Caspian steppe, Kurgan culture) around 22,000 ybp or somewhat later at 18,500 ybp. N and its relevant European subclades arose in Siberia and central Asia 12–27,000 ybp (Rootsi et al., 2007). This suggests that these environments may have been evolutionarily significant for cognitive ability: The presence of environmental harshness (i.e. extreme winter cold) suggests that factors relevant to the cold winters theory could have contributed to an increase in intelligence among the ancestors of those possessing these haplogroups. It is also likely that factors such as the development of agriculture, tools and dairy farming (milk from horses and cattle around 6000 ybp) were themselves an evolutionary catalyst for increasing cognitive ability (Cochran & Harpending, 2009; Hawks, Wang, Cochran, Harpending, & Moyzis, 2007; Wade, 2006), possibly enhancing neurological maturation via the provision of better nutrition during pregnancy, in youth and adulthood. The Neolithic transition to agriculture in cold climates would have been particularly evolutionarily demanding in terms of the need for heightened cognitive resources (e.g. farsightedness and planning).
Cochran, Harpending and their colleagues have found that human evolution, in particular among farmers and stock farmers in European populations, has been accelerating over the last 10,000 years, and that the Neolithic and later periods would have experienced a rate of adaptive evolution significantly higher than the rate characteristic of most of human evolution. This process would have been facilitated by factors such as innovation and transitions between major ecological subsistence paradigms, which would have raised the carrying capacity of the environment such that larger populations would have been possible. These larger populations would have carried more high-ability genes upon which selection could have operated. Additional novel factors would have included increased disease burdens and exposure to new diseases, along with cultural diffusion associated with the carriers of distinct haplogroups meeting and mixing. These factors would have created greater opportunities and challenges for further genetic adaptation. The idea that the association between haplogroup set A and cognitive ability may owe something to recent accelerated evolution should therefore not come as a surprise in light of these recent findings.
Finally, the steppe presents an unprotected environment. People living in such an environment differ from people living in mountains, near large oceans, in dense forests or in oases surrounded by large deserts, as they are permanently in danger of being attacked by neighboring peoples. This challenge could have selected for enhanced military preparedness, a component of which may have been higher cognitive ability.
Haplogroups J1, E and the combined haplogroups T(+L) (set B) were found to be negatively predictive of cognitive ability. J1 is believed to have emerged between 9000 and 20,000 years ago in the Arabian Peninsula (Chiaroni et al., 2010; Semino et al., 2004). Historically the peninsula was more fertile than today, and the spread of the J1e/J1c3 subclade in particular is believed to be associated with the establishment of rain-fed agriculture and semi-nomadic herders present in the Fertile Crescent (Chiaroni et al., 2010). Haplogroup E arose between 50,000 and 55,000 ybp in East Africa. A major subclade E1b1b arose about 25,000 ybp and is thought to have also originated in East Africa (Cruciani et al., 2004), spreading outwards to colonize parts of Europe in the late-Pleistocene. Its presence (subclade E1b1b1b) in Spain and southern Italy is due to recent gene flow from North Africa (Semino et al., 2004). Haplogroup T originated in West Asia and East Africa around 25–30,000 ybp (Eupedia, 2011), and is currently most common in North-East Africa and the west coast of the Arabian Peninsula. Haplogroup L is largely restricted to the Indian subcontinent, originating there some 30,000 ybp, and like haplogroup T, is relatively rare in Europe, with the highest frequencies being found in Southern Europe along the coast of the Mediterranean (Eupedia, 2011).
The populations associated with these haplogroups developed farming. However, these populations might not have been subjected to the sorts of environmental hardships (i.e. cold winters, seasonal temperature variation) believed necessary to facilitate the transition toward higher levels of cognitive ability. Additionally, certain cultural practices strongly associated with these populations (such as consanguineous marriages) may also have subsequently impaired increases in cognitive ability through negative culture-gene co-evolutionary feedback (Woodley, 2009).
The pattern of haplogroup impacts on cognitive ability found internationally was replicated at the within-country scale in the case of Italy and Spain. The stability of the findings provides additional support for the above mentioned interpretations.
6. Conclusions and limitations
The general developmental level of a society accounts for a sizable portion of the variance in national cognitive ability. Insofar as developmental status does not represent a stable pattern (due to the influence of both modifiable and faster cultural or slower genetic factors) criticisms of evolutionary theories on the origins of national cognitive ability are correct. However, it is evident that when haplogroups are selected as evolutionarily informative variables, they suggest that significant percentages of the variance in national cognitive ability can be accounted for by evolutionary factors. They enhance the presently fragmentary evidence of environmental and genetic factors relevant to the question of the origins of national cognitive ability differences. Both more recent environmental conditions reflected in physical and social conditions and more ancient environmental conditions reflected in genes (genetic heritage from an evolutionary perspective is the consequence of past environments) would seem to be relevant. Furthermore, it goes without saying that finding evidence for one factor does not deny the relevance of others.
The first limitation of this study is geographic range, as haplogroup frequency data were only available for a relatively small number of countries restricted to Europe, the Middle East and North Africa. Future studies should aim to expand upon the list of countries for which haplogroup frequency data are available. This could permit larger samples and a significantly wider range of haplogroups to be studied. The second limitation lies in the analysis of only male lineages, as genes relevant to cognitive development will stem from maternal as well as paternal lineages. Third, single haplogroups are heterogeneous, being composed of a number of diverse subclades, which may in some cases have experienced diverse selective pressures. A possible instance of this can be seen in the case of haplogroup J1: In this sample it is a marker for a negative effect on cognitive ability. However, J1 is also a marker for the Ashkenazi Jewish “Cohen” or “Cohn” lineage, an ancestry which has produced many famous intellectuals (e.g. Hermann Cohen, German philosopher; Jacob Cohen, American statistician; Leonard Cohen, Canadian poet and musician; Paul Cohen, American mathematician). Furthermore, countries and regions in the Near East, such as Jordan, Palestine and Lebanon, achieve (compared to the rest of the Middle East) good cognitive competence results (World Bank, 2008).
Since the first appearance of the studied haplogroups, enough time has passed for further substantial evolutionary change in genomes. This leads to the fourth caveat as contemporary cognitive ability distributions across countries do not necessarily correspond to past environmental challenges and human accomplishment three to thirty thousand years ago: in the meantime there have been larger migrations, in which one set of people was replaced by another or merged via admixture with others. There are also selective pressures operating within countries, which are thought to have positively or negatively altered ability levels over relatively short periods of time (Woodley, 2012). Therefore not only absolute levels (Flynn-effect) but also differences between nations are at least somewhat subject to change.
Fifth, while this study sheds light on possible evolutionary influences on national differences in cognitive ability today, it tells us nothing about the genetic basis (i.e. common polymorphisms) of intelligence differences, which thus far remain largely unidentified (those that have been identified explain only a small fraction of variance) (e.g. Deary, 2012; Meisenberg, 2003). Therefore, while these findings are intriguing, there can be no certainty about the ways in which genes influence the neurological processes and structures undergirding cognitive ability until the genes responsible are elucidated. The missing link still needs to be found.