David C. Rowe 2005
University of Arizona
Environmental and genetic explanations have been given for Black–White racial differences in intelligence and other traits. In science, viable, alternative hypotheses are ideally given equal Bayesian prior weights; but this has not been true in the study of racial differences. This article advocates testing environmental and genetic hypotheses of racial differences as competing hypotheses. Two methods are described: (a) fitting means within structural equation models and (b) predicting means of interracial children. These methods have limitations that call for improved research designs of racial differences. One improvement capitalizes on biotechnology. Genetic admixture estimates — the percentage of genes of European origin that a Black individual possesses (independent of genes related to skin coloration) — can represent genetic influences. The study of interracial children can be improved by increasing sample size and by choosing family members who are most informative for a research question. Eventually, individual-admixture estimates will be replaced by molecular genetic tests of alleles of those genes that influence traits.
This article advocates a nearly equal Bayesian treatment of genetic and environmental hypotheses of Black–White racial differences. Black refers to African Americans with Negro racial grouping and a sub-Saharan biological ancestry; White, to Caucasians with a European heritage (ignoring the complexity that some Caucasians reside in India and the Middle East). In explaining racial differences, researchers should regard the prior probability of a genetic hypothesis being true as about the same as that of an environmental hypothesis being true. Too often, this desirable goal has been ignored. In this regard, many journal editors are reluctant to publish articles providing a genetic explanation of a racial difference. Unpopular opinions about race have also led to threats on persons and their careers (e.g., Arthur Jensen; see Pearson, 1991). Yet the study of racial differences “ought to be done from whatever perspectives might promise to shed any light on the subject matter” (Loehlin, 1992, p. 1). A racial difference in intelligence (IQ), for example, could result from unequal stimulation of learning in the families of Blacks versus Whites. It could also result from different gene variant frequencies between Blacks and Whites (or from both processes). Both genetic and environmental hypotheses should be examined by scientists. This article focuses on data suggesting a partial genetic explanation of racial differences; the environmental hypothesis is a topic that is covered in many articles and book chapters (e.g., Helms, 1992; Nisbett, 1998; Suzuki & Valencia, 1997).
This article is divided into three sections. The first deals with the concept of race. It responds to a claim that race is a biologically meaningless concept — a claim that has adherents in fields as diverse as anthropology, population genetics, and medicine (Cavalli-Sforza, 2001; Marks, 2002; Schwartz, 2001). The second section presents two methods for testing racial difference hypotheses. One is based on structural equation modeling; the other, on the use of interracial children. The last section discusses ways in which to strengthen research designs about racial differences.
Race as a Biological and Social Concept
Physical differences among people reveal population origins and histories, tribal and country affiliations, and different sets of customs that are all a part of the richness of human diversity. These are genetic differences available to the eyes — for example, the tall and slight build of a Masai warrior in Kenya; the muscular Bantu farmer in West Africa; the short, fine-featured women of Thailand; the light-skinned European with blue eyes. Examples illustrating the breadth of human diversity are endless. These body builds, facial features, and skin coloration differences are a result of natural selection and genetic drift acting on different geographical populations (Loehlin, Lindzey, & Spuhler, 1975, pp. 40–48).
The melanocortin 1 receptor (MC1R) gene illustrates this principle. In Europeans, the gene has acquired a number of mutations that separately cause a loss of normal function, whereas one functional version of the gene is common throughout Africa. Specific mutations in MC1R may produce a light skin coloration and red or blonde hair; they also contribute to the greater risk of skin cancer in light-skinned, redheaded Europeans than in darker Europeans and also to variation in cancer risk among Europeans who tan (Bastiaens et al., 2001; Palmer et al., 2000). In Europe, MC1R gene mutations that create a light skin color may have been selected for their ability to enhance vitamin D production via light absorption in the skin (Rana et al., 1999). According to Rana et al., this gene also helps to explain the dark skin of Indians of the subcontinent, the white skin of the Chinese, as well as skin coloration in other populations.
Variation in most physical features is recognized as being largely genetic in origin. Yet physical appearances can be misleading because they may also suggest that genetic racial differences exist only in traits that are “no more than skin deep” and not in traits that are “under the skin.” Under evolutionary selective pressures, a time scale that produced the obvious physical differences between races would also be sufficient to produce a racial genetic difference in medical conditions and behavioral traits.
Racial Differences in Medical Traits
As was the case for the MC1R gene, medical geneticists have discovered many ways in which genes affect disease and also explain racial variation in disease prevalence. Their discoveries started with single gene disorders that were first characterized because the gene had Plomin’s (1994) OGOD effect: One Gene, One Disease. The sickle-cell anemia mutation, which arose in Africa under the selective pressure of malaria, is more common in Africans than it is in other racial or ethnic groups. This disorder of the red blood cells has an allele frequency of .10–.20 in Africans, compared with less than .001 in U.S. Whites and an even lower frequency in nonadmixed Northern Europeans. At this locus, there would be almost no genetic variation within Whites, substantial genetic variation within Blacks, and substantial between-race variation. In the Mediterranean basin region, other mutations have arisen independently that also confer protection against malaria. No population escapes susceptibility to some single gene disorder. Caucasians, for example, carry a higher frequency of an allele disposing toward cystic fibrosis, a disorder of ionic transport in the lung and other tissues, than do Africans (Weaver & Hedrick, 1989, p. 520).
Genetic influence on behavioral traits is not due to single genes but instead results from the influences of many genes. Polygenic influence also holds for many common diseases, such as hypertension, obesity, diabetes, and some cancers. It is much more difficult to look “under the skin” and find predisposing genes for complex diseases than it is to find fully penetrant single genes. Consider prostate cancer, a disease that is more common in Blacks than in Whites and that also takes a more aggressive course in Blacks. There are two regions of the androgen receptor gene on the X chromosome that raise the risk of prostate cancer by adding repeats of identical amino acids. Repeat lengths that are shorter confer a greater risk of prostate cancer. Black men are twice as likely as White men to have fewer than 20 repeats at one site in the gene (Zheng & Eastham, 1999). They also have fewer repeats at the other site. These repeat variants are thought to explain a large part of the biology of racial difference in the prevalence of prostate cancer among Blacks and Whites (Pettaway, 1999). These genes definitely operate “under the skin” to contribute to a racial difference in disease prevalence; the rapid pace of medical genetics suggests that many more such discoveries will occur in the near future.
Race as a Biological Concept
One of the most common observations about racial differences is that more variation exists within races than between races for many genetic markers. Lewontin (1982) estimated the variance components at 85% between individual people within a nation or tribe, 7.5% between nations within a race, and 7.5% between races. An oft-cited statement from the Human Genome Project is that people and chimpanzees have 98% to 99% identical DNA; people of different racial groups probably have about 99.9% identical DNA (see, among many others, Plomin & McGuffin, 2003, p. 207, for a discussion of the similarities in DNA between humans, primates, and mice). Yet there is still room for a difference. As Crow (2002) observed, if just 0.1% of DNA bases vary, the number of potential genetic differences is still huge:
Most of the differences we notice are caused by a very tiny fraction of our DNA. Given six billion base pairs per cell, a tiny fraction — 1/1000 of six billion base-pairs — is still six million different base pairs per cell. So there is plenty of room for genetic differences among us. (p. 83)
For continuous traits, the racial differences are quantitative. The mean displacement may be relatively smaller than the group difference at the left or right tail of a distribution. The attention-grabbing tails tend to emphasize racial differences. In sports, it is likely that genetic racial differences contribute to African superiority in terms of quickness, running speed, and jumping abilities (Entine, 2000).  As of this writing, no runner of Asian or European descent — a majority of the world’s population — has broken 10 seconds in the 100-meter dash, but dozens of runners of West African descent have done so. The 32 finalists in the 100-meter dash in the last four Olympics were all of West African descent. Because Olympians represent the maximal extreme of a distribution, this does not imply that the mean displacement between races is nearly so great; unfortunately, no studies of the running ability of Black and White children could be located. For IQ, a one standard deviation difference exists between Blacks and Whites — admittedly a large displacement that puts four fifths of Blacks below the White population mean of 100. At the distributional right tail, an even more disproportionate racial difference could explain why Black individuals are underrepresented in earned doctoral degrees in the natural sciences or mathematics.
 Entine (2000, pp. 249–253) reviewed some of the evidence for physiological difference in muscle fibers. West Africans possess a mean of 67.5% fast-twitch fibers, compared with a mean of 59% for French Canadians; fast-twitch fibers are essential to success in sprinting. Olympic sprinters would be found at the far right tail of the two bell curves established by this mean displacement; in the tail, racial differences would be magnified several times. By searching MEDLINE, one can locate other articles showing physiological differences in muscle tissue between Whites and Blacks.
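The relation between a mean displacement and the tails of two distributions can be checked numerically. The following is a minimal sketch, assuming two normal distributions with equal standard deviations whose means differ by d standard deviation units; the function names are illustrative and are not taken from any cited study.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def share_below_other_mean(d: float) -> float:
    """Share of the lower-mean group that falls below the higher group's mean,
    when the two means differ by d standard deviations (equal SDs assumed)."""
    return normal_cdf(d)


def tail_ratio(d: float, cutoff: float) -> float:
    """Ratio of the higher-mean group's share above a cutoff to the lower-mean
    group's share above it. The cutoff is in SD units measured from the
    higher group's mean."""
    upper_high = 1.0 - normal_cdf(cutoff)
    upper_low = 1.0 - normal_cdf(cutoff + d)
    return upper_high / upper_low


if __name__ == "__main__":
    # A one standard deviation displacement, as described for IQ in the text.
    print(round(share_below_other_mean(1.0), 3))
    # The ratio grows as the cutoff moves further into the right tail.
    for cutoff in (0.0, 1.0, 2.0, 3.0):
        print(cutoff, round(tail_ratio(1.0, cutoff), 1))
```

Under these assumptions, a displacement of one standard deviation places about 84% of the lower-mean group below the other group's mean, close to the four fifths cited in the text, and the ratio of tail proportions rises steadily as the cutoff becomes more extreme. Real trait distributions need not be normal or have equal variances, so the sketch illustrates the logic rather than estimating any observed ratio.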
Definitions of Race
Racial groups form when populations are reproductively isolated from one another over generations (Crow, 2002). Each population, exposed to somewhat different conditions of natural selection, diverges in its traits from a lack of gene flow among them. Other processes may also contribute to the formation of a racial group. Founder and genetic drift effects occur when a relatively small population breaks off from a larger one to establish a new population. The “Out-of-Africa” hypothesis of the origin of European populations is that relatively small groups migrated one or more times from Africa into Europe and Asia. They would not have carried the full genetic variability of this African parental population when they colonized these new regions. In the Northern European and Asian populations, possible bottlenecks may have been an additional source of relative genetic uniformity in a period when the human numbers fell to just sustainable levels. An analogy with an extended family can be used to illustrate the concept of race. In a large, extended family the members share physical similarities, disease risks, and behavioral traits because of their common biological heritage. The same is true of populations that form racial groups.
Race is a fuzzy concept because there is no method to count absolutely the number of different racial groups. Some populations, such as U.S. Hispanics, consist of genetic admixture from two or more races, complicating their identification and assignment. Indeed, one could regard Irish Catholics and Irish Protestants (immigrants from Scotland) as different races because they have managed to avoid intermarriage for a few hundred years. The “splitters,” on the one hand, believe that a large number of different races are useful to define; the “lumpers,” on the other hand, focus on large populations and relatively few races. The lumpers accepted the groupings made on the basis of easily observed phenotypic differences: Caucasoids, Negroids, Mongoloids, and Pacific Islanders (e.g., Australian aborigines).
Modern molecular genetic methods have confirmed these broad classifications (Cavalli-Sforza, 2001; Nei & Roychoudhury, 1974). Although he assiduously avoided racial terminology, Cavalli-Sforza’s (2001) maximum likelihood tree made on the basis of molecular genetic markers reproduced the traditional racial groups exactly (p. 70). His Australoid group included Melanesians from one island, Australian aborigines, and New Guineans. The most distant group was the Africans, with Europeans and Asians being closer. Cavalli-Sforza observed that “all world trees place the earliest split between Africans and non-Africans, which is expected given that all modern humans originated in Africa” (p. 72). Risch, Burchard, Ziv, and Tang (2002) emphasized the continental origins of the major racial groups: “namely, African, Caucasian (Europe and Middle East), Asian, Pacific Islander” (p. 3). The conceptual fuzziness of racial definitions does not negate their utility. A decision to split or lump smaller populations into racial groups will depend on the focus of a research question.
Recent genetic work has begun exploring the amount of overall genetic differences between racial groups. Regardless of the type of genetic marker employed, it is possible to re-create the five major racial groups from the clustering of genetic markers (Risch et al., 2002). Stephens et al. (2001) compared racial groups in the United States on 3,899 mutations (i.e., a change in a single DNA base) in 313 genes. These gene-based mutations correctly classified individuals into nonoverlapping Black and White groups. Because disease-promoting alleles tend to be rare, they may provide stronger racial differentiation than do more common alleles (Risch et al., 2002). Lower frequency alleles have a greater probability of being found in just one race or shared by only two races. As few as about 20 highly discriminative genetic markers can be used to distinguish Blacks from Whites, Blacks from Asians, and Asians from Whites in the United States. Risch et al. (2002) also corrected a common misunderstanding that two people within a race vary as much genetically as two people drawn from different races:
This assertion is both counter-intuitive and factually incorrect. . . . If it were true, it would be impossible to create discrete clusters of humans (that end up corresponding to the major races). . . . Two Caucasians are more similar to each other genetically than a Caucasian and an Asian. (2002, p. 5)
The Preferred Method of Racial Classification
In most U.S. studies, race is classified by respondents’ self-reports of racial group. Respondents can make a judgment from family genealogy and their own physical appearance. Typically, the respondent has the most access to information relevant to making a racial classification. Errors of classification would tend to attenuate associations. Although, as noted, racial group assignments could be done on the basis of genetic markers alone, any small gain in impartiality would not likely be worth the large additional expense. Furthermore, genetic classifications that do not mention race are not useful for identifying the cultural, behavioral, sociological, psychological, and epidemiological variables that distinguish among racial groups (Risch et al., 2002). These variables often provide an alternative hypothesis to a genetic hypothesis of a racial difference, especially a behavioral one; they permit competing hypotheses of racial differences. In practice, self-reported race is likely to remain the assessment method of choice for studies of racial differences.
Methods of Distinguishing Genetic and Environmental Hypotheses
In this section, structural equation models (SEMs) are advocated as a method with which to investigate racial differences. In psychology, most SEMs begin and end with fitting structural equations to covariance matrices. Software systems that implement maximum likelihood estimation of these models include Mx (Neale, 1997) and LISREL (Jöreskog & Sörbom, 1989). It is an uncommon practice for an SEM to include population means, but doing so facilitates the investigation of racial differences (and both Mx and LISREL accommodate including means).
In an earlier study (Rowe & Cleveland, 1996), 11-year-old children (in 1990) were sampled from among the children of the females in the National Longitudinal Survey of Youth (the NLSY–C data set; Center for Human Resource Research, 2000). The mothers of these children, sampled in 1979, originally formed a nationally representative sample. The 1990 NLSY–C children were from disproportionately poorer families, however, because they were the children of younger and less well-educated mothers. The Rowe and Cleveland study used only those children who could be classified as maternal half-siblings or as full siblings. Each child took three subtests of the Peabody Individual Achievement Test (PIAT; Dunn & Markwardt, 1970): Reading Recognition, Reading Comprehension, and Mathematics. 
 Note that academic achievement is highly correlated with intelligence (IQ). Current data suggest that the racial IQ gap is closing slightly or has remained unchanged (Lynn, 1998).
The SEM fit to the covariance matrices included four latent factors: (a) the genotype of Sibling A, (b) the genotype of Sibling B, (c) the shared environment of Sibling A, and (d) the shared environment of Sibling B. An arbitrary sibling was designated as A or B. All three PIAT subtests loaded on each factor. The shared environmental factors of the sibling pair correlated 1.0 because a shared environment is a component of variance that contributes to the similarity of children raised in the same family. It is not an estimate of cultural influences shared by most members of a culture, which would be relatively uniform across families (e.g., patriotism toward an American flag). The genetic latent factors correlated according to the genetic relatedness of the sibling pair: .50 for full siblings and .25 for half-siblings. The genetic contribution to variation in PIAT reading scores is estimable as 4(rFS – rHS).
This SEM required a number of assumptions. An important one is that the shared environmental effects relevant to achievement were no stronger for full siblings than for half-siblings. The variables were assumed to be approximately normally distributed. Finally, the SEM assumes that the genetic variation is additive (even though it may in fact also contain nonadditive variation).
The first research question was whether the covariance structures (i.e., the correlations among variables) were quantitatively the same in Blacks and Whites. For example, if Blacks had special causes of variation in mathematics that did not exist in Whites, then the total variance of math scores would be greater in Blacks than in Whites. Or if shared environmental effects were twice as strong in Blacks as in Whites, test scores would correlate more highly within sibling pairs in Blacks than in Whites. Equal covariance matrices in the two populations, however, would imply a similarity of influences on academic achievement.
The correlations for each racial group and their means on three PIAT subtests are presented in Table 1. The matrices were 6 x 6 because, in each group, there were 3 tests x 2 siblings. The correlations among the three tests were nearly identical in the four groups (2 races x 2 sibling types), with the two verbal tests correlating approximately .80, and the math test correlating with each verbal test approximately .60. Sibling correlations were also on the same order of magnitude in equivalent groups. Hence, a striking similarity of the two races was observed: They were nearly identical in the association of the variables (Rowe, Vazsonyi, & Flannery, 1994; see also Jensen, 1998, pp. 350–530). As expected under a genetic hypothesis, correlations were greater for full siblings than for half-siblings. For instance, the sibling correlations on reading comprehension were .36 and .42 in White and Black full siblings, respectively, compared with .09 and .22 in White and Black half-siblings, respectively. The correlation pattern, however, did not always support a genetic hypothesis; but the Black sample was small, and thus its correlations had large standard errors. Because the method of maximum likelihood was employed, the structural equations’ fit used all the statistical information in the covariance matrices. In the best-fitting model, both the genetic and shared environmental latent variables were retained.
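The estimator 4(rFS – rHS) given earlier can be applied directly to the reading comprehension correlations just reported. The following is a minimal sketch, not the maximum likelihood SEM that the study fit. It assumes the simple expectations rFS = .50h² + c² and rHS = .25h² + c², which follow from the genetic and shared environmental factor correlations stated for the model; the function name is illustrative.

```python
def sibling_variance_estimates(r_full: float, r_half: float) -> tuple[float, float]:
    """Naive point estimates of heritability (h2) and shared environment (c2)
    from full-sibling and maternal half-sibling correlations.

    Assumes r_full = 0.5 * h2 + c2 and r_half = 0.25 * h2 + c2, that is,
    additive genetic variation and equal shared environmental effects for
    both sibling types.
    """
    h2 = 4.0 * (r_full - r_half)
    c2 = r_full - 0.5 * h2  # algebraically equal to 2 * r_half - r_full
    return h2, c2


# Reading comprehension sibling correlations quoted in the text:
# (full siblings, half-siblings)
reading_comprehension = {
    "White": (0.36, 0.09),
    "Black": (0.42, 0.22),
}

for group, (r_fs, r_hs) in reading_comprehension.items():
    h2, c2 = sibling_variance_estimates(r_fs, r_hs)
    print(f"{group}: h2 = {h2:.2f}, c2 = {c2:.2f}")
```

The point estimates come out as 1.08 and -0.18 for the White pairs and 0.80 and 0.02 for the Black pairs. The first pair falls outside the admissible range (a proportion of variance cannot exceed 1 or be negative), which shows how noisy estimates from a single pair of correlations can be. That is one reason to fit the model to the full covariance matrices, as the study did.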
Once the equivalence of correlation matrices between Blacks and Whites has been established, a second step is fitting the racial means. In the model, the latent genetic and shared environmental factors were permitted to have a racial mean difference. The product of a test's factor loading and this mean difference should reproduce the observed PIAT mean. Because the PIAT racial differences must be proportional to the factor loadings for the model to be correct, the placement of mean differences within a model of within-group variation can be tested statistically. A good fit increases one’s confidence in the explanation of mean differences.
On the PIAT subtests, racial mean differences ranged from 0.3 to 0.5 standard deviation units. This relatively small racial difference may reflect the sampling bias noted earlier (i.e., that the siblings were the offspring of young mothers). It is possible to calculate from the factor loadings and a factor’s mean difference the percentage of a test’s mean difference due to shared environment and to genes. In the best-fit SEM, the genetic factor accounted for 66%–74% of the racial mean difference in reading comprehension and reading recognition and 36% of the racial mean difference in mathematics, which was the test most strongly loaded on the shared environment factor. The shared environmental latent factor accounted for the remainder of the mean differences.
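The calculation described above can be sketched as follows. The loadings and latent mean differences in the example are hypothetical values chosen only for illustration; they are not the estimates from Rowe and Cleveland (1996), and the function name is likewise an assumption.

```python
def mean_difference_shares(
    loading_g: float,
    loading_c: float,
    delta_g: float,
    delta_c: float,
) -> tuple[float, float]:
    """Split a test's model-implied group mean difference into the shares
    carried by the genetic and the shared environmental latent factors.

    Each factor contributes (factor loading) * (latent mean difference).
    """
    genetic = loading_g * delta_g
    shared_env = loading_c * delta_c
    total = genetic + shared_env
    return genetic / total, shared_env / total


# Hypothetical inputs, NOT values from the study: loadings of 0.6 and 0.3,
# latent mean differences of 0.5 and 0.4 (in standard deviation units).
genetic_share, env_share = mean_difference_shares(0.6, 0.3, 0.5, 0.4)
print(f"genetic share = {genetic_share:.0%}, shared environment = {env_share:.0%}")
```

Because the two shares are defined relative to the model-implied difference, they always sum to 100%. A test that loads more heavily on the shared environmental factor will show a larger environmental share, as the text reports for mathematics.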
A Critique of Structural Equation Modeling of Blacks’ Versus Whites’ IQ Means
A limitation of the Rowe and Cleveland (1996) SEM is that it assumes that one particular statistical model of Black and White verbal IQ means is the correct one. In actuality, many alternative models can be proposed. One interesting type of mean change in IQ for which some models attempt to account is called the Flynn effect, although careful treatment of the Flynn effect is not the primary topic of the current article. 
 The Flynn (1987) effect refers to increasing raw scores on post-1945 IQ tests, especially on nonverbal ones. For IQ researchers, Flynn had made a startling and challenging discovery, which can be summarized as follows: In many generation and age cohorts, raw IQ scores have increased about .3 IQ points per year in many Westernized countries, including the United States (Dickens & Flynn, 2001; Flynn, 1984). His result strongly suggests that IQ scores have undergone tremendous historical/age cohort gains.
My own explanation of the Flynn effect differs from Flynn’s and was inspired by Mingroni’s (2002) article on the Flynn effect. This is the hypothesis that the Flynn effect is due to heterosis. Another term for heterosis is hybrid vigor, as is found in plants. For example, marketed corn is hybrid corn made by mating two different types of corn plants together to yield a robust and vigorous hybrid plant.
In its application to people, hybrid vigor can occur when people of different genetic backgrounds marry and become more “outbred” — for example, even England’s population contains many pockets of genetically different (or at least slightly so) populations. This outbreeding of marriage has been happening for decades and is not an all-at-once process. Thus, it could account for the steady rise in IQ that I accept as real. In this same regard, it is worth noting that there is also a Flynn effect for height — that is, there has been a secular trend for many years in developed countries of increased height over time. But no one would argue that this secular trend invalidates the measurement of height, as some have argued in the case of IQ.
Another alternative model is one that focuses on a particular type of environmental influence. This interpretation can best be explained by thinking in terms of a fertilizer example of plant growth. In a fertilizer experiment, two types of plants are each planted in two growing conditions: high fertilizer versus low fertilizer. Thus the research design is a 2 x 2 analysis of variance, with plant growth as the dependent variable. Within-condition plant variation is treated as an error in the analysis of variance. Such an experiment always yields a huge fertilizer effect because nutrients strongly promote plant growth and development.
Now consider that instead of plants, Black and White individuals are the groups exposed to low versus high “fertilizer” (in this case, fertilizer might refer to educational support, income, etc.). What happens? Any racial differences that emerge could be completely caused by the between-race “fertilizer advantage” and have nothing at all to do with within-race variation. These results would support an environmental interpretation of racial differences. Such an interpretation is plausible in relation to results from the earlier Rowe and Cleveland (1996) study, although the discussion in that article focused on genetic interpretations. However, as summarized previously, there were strong environmental paths to the means, and not all correlations corresponded to what would be expected for a pure genetic model. Thus, the results from the Rowe and Cleveland study leave room for either a genetic or an environmental interpretation — or for both.
Interracial Children and the Genetic Hypothesis
In the following section, interracial children are considered as a window on the origin of racial differences. Half of their genes come from their Black parent, and half from their White parent. Assuming no prior genetic admixture of either parent, then the expected mean of interracial children is the average of the two parents’ means, adjusted to the parents’ population means. The relationship is summarized in the equation that follows:
M_child = h²[½(M_BP – M_POPB) + ½(M_WP – M_POPW)] + ½(M_POPB + M_POPW), (1)
where M is the mean, BP is the Black parent, WP is the White parent, POPB is the Black population, POPW is the White population, and h² is the heritability of the trait. This equation assumes, often unrealistically, that random mating has taken place between populations. Although mating may be random with regard to a trait such as low birth weight, it is certainly not so for IQ. Nonrandom mating would either lower or raise the child’s expected mean; information on the parents would be needed to estimate its effects. Hybrid vigor (also called heterosis) — the tendency of offspring of isolated and crossed genetic strains to show greater health and robustness than offspring of a single strain — would also lead to an underestimate of an offspring’s trait value. Such effects, although they do exist for IQ (Nagoshi & Johnson, 1986), are relatively small in magnitude. Much of the variation in most behavioral traits has both additive and nonadditive components. Nonadditive components can be recognized because dizygotic twin correlations are often less than one-half monozygotic twin correlations (Plomin, DeFries, McClearn, & McGuffin, 2000). Genetic admixture in the Black (or White) parent would also render Equation 1 less accurate.
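Equation 1 can be sketched in a few lines of code. The sketch assumes that the final term of the equation is the midpoint of the two population means, so that the child's expected mean regresses toward that midpoint, and it ignores nonrandom mating, heterosis, and prior admixture, as the equation itself does. The function name and the numbers in the example are hypothetical.

```python
def expected_child_mean(
    h2: float,
    m_bp: float,
    m_wp: float,
    m_popb: float,
    m_popw: float,
) -> float:
    """Expected trait mean of interracial children under Equation 1.

    h2     -- heritability of the trait
    m_bp   -- mean of the Black parents
    m_wp   -- mean of the White parents
    m_popb -- Black population mean
    m_popw -- White population mean
    """
    # Average deviation of the parents from their own population means.
    midparent_deviation = 0.5 * (m_bp - m_popb) + 0.5 * (m_wp - m_popw)
    # Assumption: the baseline is the midpoint of the two population means.
    population_midpoint = 0.5 * (m_popb + m_popw)
    return h2 * midparent_deviation + population_midpoint


# Hypothetical trait on an arbitrary scale: population means of 40 and 50,
# parents averaging 44 and 52, heritability of 0.5.
print(expected_child_mean(0.5, 44.0, 52.0, 40.0, 50.0))
```

When the parents sit exactly at their population means, or when the heritability is zero, the function returns the midpoint of the two population means. Parents who deviate from their population means shift the expected child mean by the heritability times the average parental deviation.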
The genetic hypothesis also predicts a specificity to traits involved in racial mean differences. Many racial mean differences in behavior and traits are known. Specificity implies that the traits that give rise to racial mean differences are mutually uncorrelated. For example, the genes that affect skin color and number of sexual partners are presumably different and would therefore assort independently in the genome (i.e., following Mendel’s law of independent assortment; Plomin et al., 2000). Thus, a statistical association among variables could be zero, but a racial difference on each variable could be due to genetic effects.
Three traits that have been examined in a U.S. study (Rowe, 2002) are used here as illustrative examples: intelligence (IQ), birth weight, and number of sexual partners. These traits differ in the parental populations. Africans achieve lower scores on IQ tests in comparison to Caucasians and Asians. For example, in a study of racial differences (Rushton & Skuy, 2000), White Afrikaner students and African students completed the Raven’s Standard Progressive Matrices test (Raven, Raven, & Court, 1995). Some African students came from middle-class homes in Johannesburg, South Africa; others, from poor families. The racial difference was about one standard deviation in favor of the European-descended Afrikaner students. Low birth weights are more common among Africans throughout sub-Saharan Africa than in other populations (United Nations Development Programme, 2001). Although environmental hypotheses could be constructed for each of these racial differences, these findings are also supportive of genetic hypotheses.
Using the publicly available National Longitudinal Study of Adolescent Health (Add Health) sample,  interviewers identified 116 Black–White interracial children. Interviewers rated the physical appearances of all Add Health respondents, who had a mean age of 16 years; these 116 respondents appeared to their interviewers to be racially Black. The data set included the Peabody Picture Vocabulary Test (Dunn & Dunn, 1981), birth weight (maternal report), and number of sexual partners (self-report).
 The National Longitudinal Study of Adolescent Health (Add Health) was designed by J. Richard Udry and Peter Bearman and funded by National Institute of Child Health and Human Development Grant P01-HD31921. Data sets may be obtained by contacting the Carolina Population Center, University of North Carolina at Chapel Hill, 123 West Franklin Street, Chapel Hill, NC 27516-2524. E-mail: email@example.com
A genetic prediction was that the mean of the interracial children would fall between those of the two parental groups. As shown in Table 2, this prediction was supported. On each variable, the mean of the mixed-race offspring fell between the parental means. As a statistical control variable, familial socioeconomic status was only related to IQ and not to the other two outcomes. Statistically adjusting for socioeconomic status reduced the difference between White and Black racial group means by one IQ point and left the pattern of group means unchanged. As expected under the specificity implication of a genetic hypothesis, the three outcomes were mutually independent. For the Black respondents, the average intercorrelation among the three variables was .05; for the White respondents, it was .00. In the entire Add Health sample, however, a slight curvilinear relationship existed between sexual activity and verbal IQ: Respondents at both IQ extremes were less sexually active than were the remainder of adolescents (Halpern, Joyner, Udry, & Suchindran, 2000).
A reviewer developed a persuasive argument that this empirical result could also be obtained under a genetic-plus-environment model. Unfortunately, this argument came forward too late for Rowe to be able to revise his argument or respond to the reviewer’s. Rowe’s original argument is presented as originally stated. But readers may wish to consider the following reinterpretation: Rowe was correct that a genetic model does indeed predict the observed outcome. Other models that include environmental components (models not considered by Rowe) predict the same pattern, however. Thus, Rowe’s conclusion that the data support a genetic interpretation is certainly correct, but to infer that they reject an environmental interpretation would be an overinterpretation of the empirical data. In fact, Rowe stated this very position two paragraphs earlier in a review of previous literature, and it is consistent with his theme that genetic models should be considered alongside environmental models. (J. Rodgers)
This pattern conforms to the specificity hypothesis that variables producing racial mean differences need not correlate, and it eliminates unitary explanations (e.g., socioeconomic differences between races cannot simultaneously explain both the low birth weight and IQ differences). If socioeconomic status did so, birth weight and IQ should correlate positively; they did not. It is possible to posit variable-specific causes — for example, that discrimination effects for birth weight (e.g., poverty) are independent of discrimination effects for IQ (e.g., poor schooling). This kind of approach could be used to formulate environmental hypotheses about racial differences that may compete with a genetic hypothesis.
Data from the National Center for Health Statistics (1996) on birth weights in interracial and monoracial babies are presented in Table 3. Black babies are born about half a pound (about 0.23 kilograms) lighter on average than White babies. Interracial babies, as expected under a genetic hypothesis, fell between the means of the two parental populations. The race of the mother had a greater effect than that of the father. A baby with a Black father had a mean birth weight 0.16 pounds (0.07 kilograms) lower than the mean of those with White fathers; the mean difference was about twice as much, 0.38 pounds (0.17 kilograms), for babies with a Black mother compared with those with a White mother. This implies that the uterine environment of the women is particularly important for birth weight, because a tug and push between maternal, placental genes and fetal genes may partially determine birth weight. Maternal effects have been found in other studies. For instance, Morton (1955) reported that the birth weight correlation across half-sibling pairs was .58 for maternal pairs but only .10 for paternal pairs. A single gene is also involved in racial differences in birth weight. The maternally active GNB3 gene lowers children’s birth weight (Hocher et al., 2000). The low-birth-weight-risk allele has a frequency of 80% in Africans as opposed to 30% in Caucasians (Siffert et al., 1999); hence, this gene can explain a part of the lower birth weight of Black babies.
The Minnesota Transracial Adoption Study had a small sample that included 49 mixed-race children (Weinberg, Scarr, & Waldman, 1992). The subjects were White, Black, or interracial children adopted by White families and given IQ tests when they were about 17 years old. Weinberg et al.’s results dovetail precisely with the predictions of a genetic hypothesis on the basis of an ordering of mean IQs (note that this interpretation has been debated; see Levin, 1994; Lynn, 1994; Waldman, Weinberg, & Scarr, 1994): (a) For the White biological children of the adoptive parents, the mean IQ was 109; (b) for the adoptive children with two White biological parents, 106; (c) for the interracial children with one White parent and one Black parent, 98; and (d) for the adoptive children with two Black parents, 89. Thus, the IQ mean of the mixed-race children fell between those of the homogeneous Black and White children, as expected under a genetic hypothesis. The IQ mean of Blacks in Minnesota is about 90 (Weinberg et al., 1992); hence, the intellectual stimulation provided by adoption seems to have had no lasting effect on the IQs of the Black adoptees, which is again consistent with, but not proof of, a genetic hypothesis.
 Scarr has expressed some reservations about her initial environmentalist interpretation of her transracial adoption study: “I reported the data [from the Transracial Adoption Study] accurately and as fully as possible, and then tried to make the results palatable to environmentally committed colleagues. In retrospect, this was a mistake. The results of the transracial adoption study can be used to support either a genetic difference hypothesis or an environmental difference one” (Scarr, 1998, p. 230).
Methods for Investigating Genetic and Environmental Hypotheses of Racial Differences
The studies that were presented imply that many Black–White differences in behaviors and traits could be attributable to racial genetic differences. In this section, research methods are advocated that may be used more effectively to decide between genetic and environmental influences, especially when the two are correlated. One general design is to use estimates of racial, genetic admixture in survey studies. Another design is to use interracial children and their biological relatives.
In the United States, Blacks have a common genetic heritage with both Europeans and Africans. On average, about 17% of genes possessed by Blacks are of European origin, ranging in different U.S. regions from about 12% to 23% (Parra et al., as cited in Risch et al., 2002). The percentage of European genes in Black individuals varies from near zero to more than 50%. First-generation offspring of mixed-race marriages would have half their genes of African heritage and half of European heritage, ideally, but probably have more European genes because of prior admixture in the Black parents.
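The expectation in the last sentence can be illustrated with a short worked calculation. It is only a sketch under two assumptions that are not figures from the cited studies: that the White parent has essentially 100% European ancestry, and that the Black parent carries the average 17% European ancestry reported above. The symbols m_W, m_B, and m_child are introduced here purely for illustration.

```latex
% Expected European ancestry fraction of a first-generation interracial child,
% assuming m_W = 1.00 (illustrative assumption) and m_B = 0.17 (average cited in the text).
\[
  E[m_{\text{child}}] \;=\; \frac{m_{\text{W}} + m_{\text{B}}}{2}
  \;=\; \frac{1.00 + 0.17}{2} \;=\; 0.585
\]
```

Under these assumptions the expected European ancestry is about 58.5% rather than exactly 50%. Applying the same arithmetic to the regional range of 12% to 23% gives roughly 56% to 61.5%.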
For a geneticist, viewing race as a continuous variable is straightforward: It is the estimation of individual admixture. Individual admixture is the proportion of genes that a Black individual has inherited from the European population. Correspondingly, a variable could be constructed measuring Black admixture in Whites, although only low values would be expected because of inequality in past population sizes and mating patterns. A genetic hypothesis predicts that Black individuals who possess more Caucasian genes will approach the behaviors and traits of Caucasians, to the extent that those traits and behaviors have genetic origins. The individual admixture score gives a continuous variable that can be used in conjunction with environmental, control variables assessed on the same subjects.
A research design that uses genetic admixture, for example, has been applied in the medical genetics of diabetes. Pima Indians who possessed more genes of European heritage were less likely than other Pima Indians to be overweight and to develop type 2 diabetes (Williams, Long, Hanson, Sievers, & Knowler, 2000): The Caucasian genes were protective. Individuals’ degree of risk varied linearly with the degree of previous admixture with Whites.
Admixture scores can be obtained by molecular genetic methods that reveal various genetic markers. Previous genetic studies (e.g., Scarr, Pakstis, Katz, & Barker, 1977) used inadequate genetic indicators of individual admixture by failing to choose genetic markers with large allele frequency differences between Europeans and Africans (Reed, 1973, 1997), but molecular genetic technology was primitive in the 1970s. It takes about 25 informative markers (i.e., genetic markers with large allele frequency differences between sub-Saharan Africans and Northern European Caucasians) to produce an admixture score. The DNA for genotyping them can be obtained from a single mouthwash sample. After the DNA is isolated from the buccal (cheek) cells in the mouthwashes, a laboratory can perform the genotyping at a reasonable cost. The distribution of scores cannot be predicted in advance, but in a heterogeneous Black sample, it would probably range from zero to about 60%. Shriver et al. (1997) offered one of the first genetic marker sets for calculating individual admixture in Blacks, but many more informative markers are available today.
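The article does not specify how an admixture score would be computed from the markers. One approach is a maximum likelihood estimate, and the following is a minimal sketch of that idea rather than the procedure used in any cited study. It assumes that markers are independent, that the parental-population allele frequencies are known, and that each genotype is a binomial draw of two alleles at the individual's expected frequency. All function names and the example numbers are hypothetical.

```python
import math


def admixture_log_likelihood(m, genotypes, freq_european, freq_african):
    """Log-likelihood of European ancestry proportion m for one individual.

    genotypes[j] is the count (0, 1, or 2) of a reference allele at marker j.
    freq_european[j] and freq_african[j] are that allele's frequencies in the
    two parental populations (hypothetical inputs for this sketch).
    """
    eps = 1e-9  # keeps log() finite if an expected frequency reaches 0 or 1
    total = 0.0
    for g, p_e, p_a in zip(genotypes, freq_european, freq_african):
        # Expected allele frequency for an individual with ancestry proportion m.
        q = m * p_e + (1.0 - m) * p_a
        q = min(max(q, eps), 1.0 - eps)
        # Binomial log-likelihood for two alleles (constant term omitted).
        total += g * math.log(q) + (2 - g) * math.log(1.0 - q)
    return total


def estimate_admixture(genotypes, freq_european, freq_african, steps=1000):
    """Grid-search maximum likelihood estimate of m on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(
        grid,
        key=lambda m: admixture_log_likelihood(
            m, genotypes, freq_european, freq_african
        ),
    )


# Hypothetical example with four markers (a real panel would use about 25 or more).
example_freq_european = [0.9, 0.8, 0.1, 0.2]
example_freq_african = [0.1, 0.2, 0.9, 0.7]
example_genotypes = [0, 1, 2, 2]
print(estimate_admixture(example_genotypes, example_freq_european, example_freq_african))
```

A grid search is used here only for transparency. The likelihood shows why markers with large frequency differences between populations are informative: when the two parental frequencies are similar, the expected frequency q barely changes with m, so the genotype carries little information about ancestry.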
Skin Color as a Confounding Variable
Skin color is often used in formulating environmental hypotheses of racial differences. One line of argument is that discrimination against dark-skinned individuals would produce traits found in greater frequency in Blacks (e.g., low IQs). One source of lighter skin color in Blacks is racial admixture of Blacks with lighter skinned Whites (e.g., through the inheritance of alleles of the MC1R and other skin color genes). Fortunately, genetic markers that do not determine skin color can be used to assess genetic admixture. This means that an admixture measure that is unconfounded with skin color is available to use in survey studies of racial differences.
The relationship of skin color to behavior, furthermore, is more complex than is often believed. Skin color lightens with higher IQ and socioeconomic status (Krieger, Sidney, & Coakley, 1998; Lynn, 2002). Lynn’s (2002) estimate of the correlation between IQ and skin color was .17 in a representative sample of 430 adult Blacks. The low population value easily accounts for the inconsistencies of statistical significance found in many small samples. One explanation of this correlation is that it relates to a greater racial discrimination against dark-skinned than against light-skinned Blacks. Yet no viable mechanism for this discrimination effect has been proposed. In the United States, Jews and Asians have both endured significant discrimination but without apparent harm to their IQs.
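The statistical point about small samples can be checked with an approximate power calculation. The sketch below uses the Fisher z transformation, a standard approximation that the article itself does not invoke. It assumes that the population correlation equals the reported .17 and that the test is two-sided at the .05 level. The function name and the sample size of 50 are illustrative choices.

```python
import math
from statistics import NormalDist


def correlation_power(rho, n, alpha=0.05):
    """Approximate two-sided power to detect a population correlation rho
    with n subjects, using the Fisher z approximation."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    # Under the alternative, atanh(r) is roughly normal with
    # mean atanh(rho) and standard error 1 / sqrt(n - 3).
    shift = math.atanh(rho) * math.sqrt(n - 3)
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


# Illustrative comparison: a small sample versus the sample of 430.
for sample_size in (50, 430):
    print(sample_size, round(correlation_power(0.17, sample_size), 3))
```

With these assumptions, a sample of 50 has power well below one half, whereas a sample of 430 has power above .90. That gap is consistent with the observation that small studies would often fail to reach significance.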
The self-perception of discrimination introduces further complexities. Krieger et al. (1998) measured skin color with a reflectance meter and administered a questionnaire on perceptions of discrimination. Their study’s sample was 1,844 Black women and men, from 24 to 42 years of age. Light-skinned Blacks perceived greater discrimination in school than did dark-skinned Blacks. This finding makes sense if the dark-skinned Blacks were more often in highly segregated schools, whereas the light-skinned Blacks were more often in integrated schools, where their skin color might provoke teasing and discrimination. This hypothesis also makes it less likely that a discrimination explanation can be used to explain the results of the transracial adoption study (Weinberg et al., 1992). Skin color, though, is an inadequate proxy for individual admixture with European genes. Skin color may be used to investigate social reactions to Blacks but not to test genetic hypotheses; in any new investigation, measures of both skin color and individual admixture would be desirable.
A survey study, for example, could be used to test an environmental hypothesis that racial discrimination causes IQ variation in Blacks. Among the variables that could be used would be (a) skin color, as a proxy for social reactions to the individual; (b) perceived discrimination, as a direct assessment of social reactions; and (c) individual genetic admixture. A genetic hypothesis would predict that individual admixture will give the best prediction of IQ, controlling for skin color and perceived discrimination. To the contrary, an environmental hypothesis would predict a dominant effect of perceived discrimination. With such variables, researchers with different positions on the racial difference question could come together to design a single, large-scale survey to test their hypotheses about racial differences.
A method of genetic linkage analysis may offer a way to locate the genes that produce racial differences. This method is called mapping by admixture linkage disequilibrium (MALD; Collins et al., 2002). Many researchers have explored MALD from a theoretical viewpoint, but to date there are few examples of actual applications. Although the details of MALD are beyond the scope of this article, a simplified description can be given. In a MALD study, Black individuals would be grouped into those who rate high on a trait and those who rate low on a trait. Each individual would be genotyped for 200–300 highly informative genetic markers (i.e., markers that differ greatly in frequency in sub-Saharan African and European populations) that would cover the entire genome. Genetic markers that differed significantly between the high and the low groups would indicate putative genome regions containing genes influencing the trait. MALD derives its statistical power from the genetic admixture from prior generations. Thus, admixture analysis can be extended to localizing important genome regions; positive MALD results would convincingly demonstrate a genetic origin of racial differences.
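The simplified description above (comparing marker frequencies between a high-trait and a low-trait group) can be sketched as a per-marker test. This is not a full MALD analysis, which models ancestry along chromosomes. It is only the group comparison described in the text, implemented as a 1-degree-of-freedom Pearson chi-square on allele counts. The Bonferroni correction for the number of markers is an added assumption. Function names, marker names, and counts are hypothetical.

```python
import math


def allele_count_chi_square(high_alt, high_total, low_alt, low_total):
    """Pearson chi-square (1 df) comparing allele counts in two trait groups.

    Each person contributes two alleles, so totals are twice the group sizes.
    Returns (chi2, p_value).
    """
    table = [
        [high_alt, high_total - high_alt],
        [low_alt, low_total - low_alt],
    ]
    row_totals = [high_total, low_total]
    col_totals = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    n = high_total + low_total
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 df, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def flag_markers(marker_counts, alpha=0.05):
    """Return marker names that pass a Bonferroni-corrected threshold."""
    threshold = alpha / len(marker_counts)
    flagged = []
    for name, counts in marker_counts.items():
        _, p_value = allele_count_chi_square(*counts)
        if p_value < threshold:
            flagged.append(name)
    return flagged


# Hypothetical counts: (high_alt, high_total, low_alt, low_total) per marker.
example_counts = {
    "marker_1": (150, 200, 50, 200),
    "marker_2": (100, 200, 100, 200),
}
print(flag_markers(example_counts))
```

With 200 to 300 markers, some correction for multiple testing is needed before a flagged marker is read as pointing to a genome region. Bonferroni is the simplest choice and is conservative when neighboring markers are correlated.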
Studies of Interracial Children
The study of mixed-race children gives considerable purchase to the question of racial differences. Sampling these families, though, is difficult because too few of them exist in large data sets to provide adequate samples. Thus, a study of mixed-race children needs a sampling framework for them, such as snowball sampling in which each mixed-race family nominates similar families whom they know. A study could be carried out in a location, such as Evanston, Illinois, or Montreal, Canada, with a high proportion of mixed-race families. This is unavoidably an expensive design and may require a collaboration among different sampling sites.
The ideal mixed-race design depends upon the outcome to be investigated. Because of generational changes in sexual activity, parental reports of their own sexual behavior, retrospective to their teenage years, may be unreliable and not that useful. However, an investigation of IQ would gain from having IQ tests on the Black parent, the White parent, and the interracial child. The genetic expectation for the child’s IQ is given in Equation 1. Parental IQs could be used to estimate the non-randomness of human matings between racial groups, and this effect could then be statistically controlled. Again, the study could add as a control variable physical appearance and the individual admixture of the children. (To avoid racial discrimination effects, in an earlier study [Rowe, 2002], I sampled only children whom the interviewers defined as being Black in appearance.)
Another way in which to expand this design is to include cousins. The cousins live in monoracial families; both Black and White cousins may be available. Instead of comparing the means of the interracial children with those of unmatched racial samples, researchers would compare the means of the interracial children with those of their Black and White cousins. A further expansion would be to include the cousins’ families, but this would make sampling extremely difficult.
In an interracial design, siblings are also a potentially interesting group. All siblings have the same average degree of genetic admixture; however, they may differ in overt physical traits that make them look more Black than White in physical appearance. In general, sibling difference scores control statistically for shared family environmental influences (i.e., those influences that siblings possess in common that are removed in a difference score). These scores can remove many third-variable explanations of an observed association (e.g., variables such as parental educational level or general family emotional climate cannot explain why siblings differ). Thus, sibling differences — that is, the signed difference score of Sibling A minus Sibling B (who is designated A or B is arbitrary; younger vs. older could be used as well) — are informative about within-family genetic and environmental influences only. For example, difference scores on physical appearance could be correlated with difference scores on IQ. An environmental hypothesis is that these difference scores will correlate, as they do between families. A genetic hypothesis is one of no association. Although this may seem to be an acceptance of the null hypothesis, it is actually an effort to evaluate a hypothesis of a trivial or close-to-zero effect size.  In a study with a large enough sample size to provide appropriate statistical power, these two hypotheses can be tested.
 Acceptance of the null hypothesis is often critical in scientific studies. In clinical trials, for example, one wants to accept the hypothesis that the placebo and treatment groups are equivalent for potentially confounding variables. Hence, for these variables, the null hypothesis is accepted when they do not show statistically significant differences. Although perhaps never strictly true, supporting the null hypothesis can be regarded as accepting that an effect size is trivial.
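The sibling difference-score analysis described above can be sketched in a few lines. The example assumes two measures per sibling (for instance, an appearance rating and a test score) and correlates the signed A-minus-B differences across families. All names and numbers are hypothetical and are not data from any study.

```python
import math


def sibling_differences(pairs):
    """Signed difference (Sibling A minus Sibling B) for each family."""
    return [a - b for a, b in pairs]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sibling pairs: (Sibling A, Sibling B) on each measure.
appearance_pairs = [(4.0, 2.0), (3.0, 3.5), (5.0, 1.0), (2.0, 2.5)]
score_pairs = [(101.0, 98.0), (95.0, 99.0), (110.0, 108.0), (90.0, 93.0)]

appearance_diff = sibling_differences(appearance_pairs)
score_diff = sibling_differences(score_pairs)
print(pearson_r(appearance_diff, score_diff))
```

Any influence that raises or lowers both siblings in a family by the same amount drops out of the subtraction, which is why difference scores remove shared family explanations. Evaluating a hypothesis of a trivial effect, as opposed to merely failing to reject zero, would additionally require a sample large enough to give a narrow confidence interval around the estimated correlation.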
The environmental hypothesis needs to be sharpened to account for multiple outcomes. For all hypotheses, the more relevant the measured variables, the better the prospects for deciding among them. Consider a post hoc, handwaving explanation of the fact that interracial children’s means fall between those of their parental populations. One could simply say that is because interracial children live in a mixture of the two cultures, Black and White, and that their half exposure to Black culture makes them behave more as Blacks do. This explanation is fine if culture can be put into a study as a set of measured variables that explain variance in the trait. Measured mediators are also needed on the genetic side (e.g., no one directly inherits a number of sexual partners). Thus, personality trait mediators should be included in a study of racial differences in sexual behavior.
The research designs I have proposed in this article could be made obsolete by gene discoveries. It has proven far more difficult than was expected to find the genes of small effect size that contribute to variation in such medical conditions as diabetes or to variation in psychiatric traits. Nonetheless, many research groups are successfully locating these genes, and genetic linkage and association methods are also improving (Feingold, 2002). I do not expect this article’s research designs to become obsolete in the next 10–15 years, but eventually the racial difference question will be addressed using the specific genes that contribute to variation in traits within and between racial populations, as was illustrated for prostate cancer.
At this time, the main change that is needed is in the treatment of the environmental and genetic hypotheses. By putting ideology and politics aside, researchers could treat the two hypotheses with a greater impartiality; this is a goal of scientific investigation. Lastly, in the next few decades many genetic racial differences are likely to be discovered. I share the sentiments expressed by Crow (2002) concerning society’s adjusting to this new knowledge:
“It is important for society to do a better job than it now does in accepting differences as a fact of life. New forms of scientific knowledge will point out more and more ways in which we are diverse. I hope that differences will be welcomed rather than accepted grudgingly. Who wants a world of identical people, even if they are Mozarts or Jordans?” (p. 86)