A. A. E. Vinkhuyzen, S. van der Sluis, E. J. C. de Geus, D. I. Boomsma and D. Posthuma (2009)
Childhood environment, social environment and behavior, leisure time activities and life events have been hypothesized to contribute to individual differences in cognitive abilities and physical and emotional well-being. These factors are often labeled ‘environmental’, suggesting that they shape but do not reflect individual differences in behavior. The aim of this study is to test the hypothesis that these factors are not randomly distributed across the population but reflect heritable individual differences. Self-report data on Childhood Environment, Social Environment and Behavior, Leisure Time Activities and Life Events were obtained from 560 adult twins and siblings (mean age 47.11 years). Results clearly show considerable genetic influences on these factors, with a mean broad heritability of 0.49 (range: 0.00–0.87). This suggests that what we think of as measures of ‘environment’ are better described as external factors that might be partly under genetic control. Understanding causes of individual differences in external factors may aid in clarifying the intricate interplay between genetic and environmental influences on complex traits.
Complex traits, such as cognitive ability, physical well-being or psychiatric dysfunctioning, are known to be influenced by both genetic and environmental factors. Although current research mainly targets dissecting genetic influences on complex traits, charting environmental influences seems at least of equal importance to understanding individual differences in such traits. Few studies have reported on the influence of environmental factors (such as socioeconomic status or life events) on, for example, cognitive ability (Turkheimer et al. 2003) and psychiatric dysfunctioning (Middeldorp et al. 2008). However, it has been reported that these proposed environmental factors are under genetic control themselves (Kendler & Baker 2007; Plomin et al. 1994, 1988, 1989; Rowe 1983), suggesting that these factors are not randomly distributed across the population but reflect heritable individual differences. If true, this will introduce bias to models that treat environmental factors as purely environmental in origin and may therefore impede our understanding of individual differences in complex traits.
Such bias is perhaps most notable when environmental factors are used to investigate environmental moderation of genetic effects (G × E interaction). If the environmental moderator is itself under genetic control and some of the genes that influence the environmental moderator also have a direct effect on the trait under investigation (i.e. a gene–environment correlation; rGE), ignoring genetic effects on the measured environmental factor leads to an overestimation of the moderating effect of the environmental factor (Purcell 2002). Both rGE and G × E interaction have been reported in the context of cognitive ability, physical well-being and psychiatric dysfunctioning (Boomsma et al. 1999; Plomin & Bergeman 1991; Plomin & Daniels 1987; Plomin et al. 1985; Rowe et al. 1999; Scarr & McCartney 1983; van der Sluis et al. 2008). If environmental factors are partly under genetic control, some of these reports may have overestimated the effects of the environmental moderators on the genetic influences on a trait.
Kendler and Baker (2007) recently reviewed the findings of 55 independent studies on the genetic influences on ‘environmental factors’ that are of etiological importance for psychiatric (dys)functioning. The overall weighted heritability estimate across all environmental factors was 0.27 (range: 0.07–0.47). An essential limitation of this study put forward by the authors themselves is the possibility of publication bias with respect to the studies included in the review, i.e. studies showing genetic control on external factors might be more likely to be accepted for publication than studies reporting on the absence of genetic influence. Because environmental factors are also involved in other domains, it is important to systematically study external factors that are relevant outside the psychiatric domain as well.
Measured factors in the domains of Childhood Environment, Social Environment and Behavior, Leisure Time Activities and Life Events, all generally labeled as environmental, have been hypothesized to contribute to individual differences in various complex traits. The goal of the present study is to test the hypothesis that these factors are not randomly distributed across the population but reflect heritable individual differences.
This study is part of a large ongoing project on the genetics of cognition (e.g. Posthuma et al. 2001) from the Netherlands Twin Register (NTR; Boomsma et al. 2006). The study was approved by the Central Committee on Research involving human subjects, which oversees medical research involving human subjects in the Netherlands. Information on environmental factors was gathered using the Life Experiences List (LEL), which is described in more detail below. The study was undertaken with the understanding and written consent of each participant. Data were available for 560 twins and siblings (59% females) from 256 different families: 150 complete twin pairs [55% monozygotic (MZ)], 87 incomplete twin pairs (32% MZ) and 173 siblings (number of participating siblings per family ranges from 0 to 5). From 19 families, only sibling data were available. The average age of the participants was 47.11 years [standard deviation (SD) = 12.40, range: 23.44–75.61] at the time they completed the LEL. Zygosity of same-sex twins was based on DNA polymorphisms (97 pairs, 74%) or, if information on DNA markers was not available, on questions about physical similarity and confusion of the twins by family members and strangers. Agreement between zygosity diagnoses from survey and DNA was 97% (Willemsen et al. 2005). All five zygosity groups were reasonably well represented: monozygotic males (MZMs: 21%, 119 participants), monozygotic females (MZFs: 27%, 150 participants), dizygotic males (DZMs: 12%, 66 participants), dizygotic females (DZFs: 23%, 131 participants) and dizygotic opposite sex (DOS: 17%, 94 participants). Non-twin sibling data were available for 81 (47%) brothers and 91 sisters. The non-twin siblings were included in the analyses to enhance the statistical power to detect genetic and environmental effects (Posthuma & Boomsma 2000).
The sample of participating twins and siblings was representative of the general Dutch population with regard to educational level (see Posthuma et al. 2001 for details). Prevalences and means of, among others, sport participation, having a partner and average number of children per woman were also comparable to those from large-scale national surveys (CBS 2008), implying that the sample is representative of the Dutch population.
A small, independent sample of 52 participants (26 parent-offspring pairs, 75% women; age range: 17–71, mean: 39.95, SD: 16.19) completed the survey twice in a period of 2 months. These data were used to calculate test-retest reliability.
Analyses were carried out using the raw data option in Mx (Neale 1994; Posthuma & Boomsma 2005). Age and sex were included as covariates in the model. Ordinal items were assumed to reflect an underlying normal distribution of liability (Falconer & Mackay 1989). As the liability is a theoretical construct, its scale is arbitrary. For straightforward interpretation, the liability was assumed to be standard normally distributed with zero mean and unit variance, and the number of thresholds was equal to the number of ordered categories minus 1.
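As an illustration of the liability-threshold idea, the sketch below converts the observed category proportions of one ordinal item into thresholds on a standard normal liability scale. It is a simplification and not the Mx procedure used in the study: thresholds there were estimated by maximum likelihood with age and sex as covariates. The function name and the example counts are hypothetical.

```python
from statistics import NormalDist


def liability_thresholds(category_counts):
    """Return the k - 1 thresholds on a standard normal liability scale
    implied by the observed counts of k ordered categories.

    Assumes every cumulative proportion lies strictly between 0 and 1
    (i.e. the lowest category is not empty), otherwise inv_cdf raises.
    """
    total = sum(category_counts)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    thresholds = []
    cumulative = 0
    # The highest category needs no threshold: its upper bound is +infinity.
    for count in category_counts[:-1]:
        cumulative += count
        thresholds.append(std_normal.inv_cdf(cumulative / total))
    return thresholds


# Hypothetical counts for a three-category item (not taken from Table 1).
example_counts = [50, 30, 20]
example_thresholds = liability_thresholds(example_counts)
print(example_thresholds)
```

Three ordered categories yield two thresholds, in line with the rule that the number of thresholds equals the number of categories minus 1.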
First, twin and sibling correlations for all traits were estimated. Means or thresholds, and variances, were constrained equal across twins and non-twin siblings and across all zygosity groups for all domains. Correlations for MZ twins, dizygotic (DZ) twins and siblings were allowed to differ. A difference between DZ and sibling correlations may represent a true twin environmental influence on a trait or may be induced when the environmental factor is something that happens at a fixed time-point and at the same time affects all family members (such as parental divorce).
Second, genetic models were specified in which individual differences (in liability, in case of ordinal data) were modeled as a function of genetic and environmental effects. Genetic factors A and D, and environmental factors T, C and E, were considered. ‘A’ represents additive effects of alleles summed over all genetic loci. ‘D’ represents non-additive or dominant genetic effects. ‘T’ represents a special twin environment that renders twins more alike than regular siblings. ‘C’ represents common environmental influences that render members of the same family more alike. ‘E’ represents all environmental influences that result in differences between members of a family, including measurement error. In a twin-sibling design, the effects of C and D are confounded and cannot be estimated simultaneously. In the present study, the variance (in liability, in case of ordinal data) was decomposed as due to A, C, T and E, or due to A, D, T and E. If sibling correlations were significantly different from twin correlations, a special twin environment (T) was included in the genetic model. When DZ twin correlations are at least half the MZ twin correlations, additive genetic effects are implied and an ACE or ACTE model was fitted to the data. DZ twin correlations less than half the MZ twin correlations suggest the presence of genetic dominance, in which case an ADE or ADTE model is deemed more suitable. Significance of parameters was tested by comparing the fit of nested (increasingly more restricted) models to the fit of less restricted models. Goodness-of-fit of these submodels was assessed by hierarchic likelihood ratio tests. Twice the difference in log-likelihoods between two models (which asymptotically follows a χ² distribution, with degrees of freedom equal to the difference in the number of estimated parameters) was evaluated. If the χ² difference test is significant, the constraints imposed on the nested model are not tenable. If the χ² difference test is not significant, the nested, more parsimonious model is to be preferred.
A criterion level α of 0.05 was adopted for all tests.
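The nested-model comparison described above can be sketched as follows. This is a minimal illustration and not the Mx output: the function name and the −2 log-likelihood values are hypothetical, and only 1 and 2 degrees of freedom are handled because those cases have closed-form χ² tail probabilities (a statistics library would be needed for the general case).

```python
import math


def lrt_p_value(minus2ll_restricted, minus2ll_full, df):
    """P value of a chi-square difference test between two nested models.

    minus2ll_* are -2 * log-likelihood values; df is the number of
    parameters dropped in the restricted model (only 1 or 2 supported).
    """
    # Guard against tiny negative differences caused by numerical noise.
    chi_square = max(minus2ll_restricted - minus2ll_full, 0.0)
    if df == 1:
        # P(X > x) for chi-square(1) equals erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(chi_square / 2.0))
    if df == 2:
        # P(X > x) for chi-square(2) equals exp(-x / 2).
        return math.exp(-chi_square / 2.0)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")


ALPHA = 0.05  # criterion level adopted for all tests

# Hypothetical fit values for dropping D from an ADE model (df = 1).
p_drop_d = lrt_p_value(minus2ll_restricted=1502.3, minus2ll_full=1500.1, df=1)
keep_restricted_model = p_drop_d >= ALPHA
print(round(p_drop_d, 3), keep_restricted_model)
```

With these made-up values the test is not significant, so the more parsimonious AE model would be retained.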
Table 1 lists frequencies of all ordinal measures and means and SDs of the continuous measures, as well as test-retest reliabilities and missingness. Means and thresholds could be constrained to be equal across all zygosity groups without significantly deteriorating the fit of the model.
Original categories from the LEL were maintained for the ordinal analyses, except for a few factors. Because the endorsement rate of the highest categories of the items concerning ‘parental interest in school achievement’ and ‘being bullied at primary and secondary school’ was very low, it was decided to merge the two highest categories. The ordinal items concerning ‘being read to’ and ‘current musical and physical activity’ were dichotomized because of low test-retest reliability of the higher order versions. The item concerning ‘being read to’ was categorized into ‘yes’ if being read to took place at least once a week, and ‘no’ for all other categories. Finally, items on the frequency of playing an instrument and participation in physical activity were dichotomized and should be interpreted as ‘yes’ vs. ‘no’ items.
In general, the percentage missing (see Table 1) is reasonable except for factors concerning ‘educational level of the participants’ partner’ and ‘good friend’. A relatively large proportion of participants did not know or left blank the level of education of their partner (22%) and good friend (50%). The high percentage of missingness with respect to ‘educational level partner’ was mainly attributable to the older participants of this study. Thirty-three percent of the participants above 45 years of age did not report the educational level of their partner. Most likely, the missingness was dictated by educational changes over the last decades, with the categories presented in the questionnaire not exactly matching the former educational system.
Test-retest reliability (see Table 1) was above 0.80 for the majority of the items (24 out of 34 items). Test-retest reliability within the domain of Childhood Environment was exceptionally high. Within the domain of Leisure Time Activities, the item concerning ‘number of years sport participation’ showed relatively low test-retest reliability: 0.37. Two items within the domain of Social Environment and Behavior showed relatively low test-retest reliability as well (social support numbers: r = 0.44, social support satisfaction: r = 0.46). Table 2 shows the MZ, DZ and sibling (including twin-sib) correlations for all environmental factors, with the type of correlation depending on the measurement level of the factors [tetrachoric (TC) for dichotomous items, polychoric (PC) for ordinal items and Pearson (PE) for continuous items]. Correlations for MZ, DZ and sibling pairs were based on a maximum of 83, 67 and 315 pairs, respectively.
Sibling correlations did not differ from DZ correlations except for two factors in the domain of Life Events (positive and neutral life events up to the age of 18) in which DZ correlations exceeded the sibling correlations. The factor neutral life events consists mainly of events that happen within a family at a fixed time-point. The difference in twin and sibling correlations is therefore most likely attributable to twins being of the same age when an event takes place, while regular siblings are not. For these two factors, special twin environment T was estimated in addition to environment shared by all twins and siblings (C).
In general, MZ twin correlations exceeded the DZ and sibling correlations, suggesting the presence of genetic influences. The point estimate of the DZ twin correlation of the item ‘level of education friend’ exceeds the point estimate of the MZ twin correlation. This is likely due to the relatively low number of complete DZ twin pairs with data on this item: the percentage of missingness for this environmental factor was 50%. DZ twin correlations, however, were not significantly different from sibling correlations for this factor, resulting in a lower DZ/sib than MZ correlation.
For 23 out of the 34 factors, the pattern of MZ and DZ/sib correlations suggested an ADE pattern; for the remaining 11 factors, an ACE pattern was suggested for subsequent genetic modeling. For the two environmental factors for which the DZ correlation significantly exceeded the sibling correlation, the decision between an ACTE or ADTE model was based on the difference between the MZ and DZ twin correlation. For each environmental factor, the selected model is reported in Tables 3–6 (* denotes ACE, ** denotes ADE and *** denotes ACTE).
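The decision rule for choosing between an ACE and an ADE pattern can be sketched with the classical moment-based (Falconer-type) formulas, which assume MZ pairs share all additive and dominance effects and DZ/sibling pairs share half of the additive and a quarter of the dominance effects. This is only a rough approximation of the maximum likelihood models fitted in Mx: it ignores the special twin environment T, covariates and sampling error, and the correlations used below are hypothetical.

```python
def decompose_from_correlations(r_mz, r_dz):
    """Pick an ACE or ADE pattern from twin correlations and return rough
    variance proportions based on moment estimators.

    Point estimates are not bounded, so components can fall outside the
    0-1 range for unusual correlation patterns.
    """
    e2 = 1.0 - r_mz
    if r_dz >= 0.5 * r_mz:
        # ACE: r_mz = a2 + c2 and r_dz = a2 / 2 + c2
        a2 = 2.0 * (r_mz - r_dz)
        c2 = 2.0 * r_dz - r_mz
        return {"model": "ACE", "a2": a2, "c2": c2, "e2": e2, "h2_broad": a2}
    # ADE: r_mz = a2 + d2 and r_dz = a2 / 2 + d2 / 4
    a2 = 4.0 * r_dz - r_mz
    d2 = 2.0 * r_mz - 4.0 * r_dz
    return {"model": "ADE", "a2": a2, "d2": d2, "e2": e2, "h2_broad": a2 + d2}


# Hypothetical correlations, not taken from Table 2.
ade_example = decompose_from_correlations(r_mz=0.60, r_dz=0.20)
ace_example = decompose_from_correlations(r_mz=0.60, r_dz=0.40)
print(ade_example)
print(ace_example)
```

Under the ADE pattern the broad sense heritability (a² + d²) equals the MZ correlation in this approximation, which is why it is less sensitive than the separate a² and d² estimates to how the genetic variance is split.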
Tables 3–6 list the proportions of variance explained by genetic (additive and non-additive) and environmental (special twin, shared and non-shared) influences in full and reduced models for each domain.
For some measured environmental factors, both an AE and a CE model described the observed data well. In that case, preference for an AE or a CE model was based on Akaike’s Information Criterion [AIC, computed as χ² − (2 × df)], where the preferred model was indicated by a lower AIC.
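A minimal sketch of this AIC-based choice, using the formula given above (χ² minus twice the degrees of freedom). The χ² and df values are hypothetical and the helper names are illustrative only.

```python
def aic(chi_square, df):
    """AIC as defined in the text: chi-square minus twice the df."""
    return chi_square - 2 * df


def preferred_model(fits):
    """Return the name of the model with the lowest AIC.

    fits maps a model name to a (chi_square, df) tuple, both measured
    relative to the same full model.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fit statistics for two reduced models of one factor.
example_fits = {"AE": (1.8, 1), "CE": (3.1, 1)}
best = preferred_model(example_fits)
print(best, {name: aic(*fit) for name, fit in example_fits.items()})
```

Because both reduced models here drop the same number of parameters, the comparison reduces to picking the model with the smaller χ²; the df term matters only when the candidate models differ in parsimony.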
Within the domain of Childhood Environment (Table 3), genetic influences were significant for the majority of the measured factors. Based on the full models, the mean of the broad sense heritability (i.e. a² +d²) calculated across all 14 measured childhood factors was 0.66 (range: 0.47–0.87). Genetic influences were relatively low for the item ‘school achievements discussed by parents’ and were relatively high for factors concerning ‘relative height at primary and secondary school’, ‘to be read to’ and ‘to be bullied at primary school’.
Within the domain of Social Environment and Behavior (Table 4), genetic influences were significant for the majority of the measured factors. Based on the full models, mean broad sense heritability across all nine items was 0.36 (range: 0.00–0.74). No significant genetic influences were observed for two items (‘education good friend’ and ‘duration of relationship with partner’), while relatively high heritability was observed for ‘having children’. Absence of genetic influences for ‘education good friend’ might, however, be related to the relatively high percentage of missingness of this factor. Both AE and CE models described the data well for factors concerning ‘education of partner’, ‘education of good friend’ and ‘duration of the relationship with partner’. Based on the AIC, an AE model was preferred for ‘education partner’ while CE models were preferred for ‘education good friend’ and ‘duration of relationship with partner’.
Within the domain of Leisure Time Activities (Table 5), genetic influences were significant for all measured factors. Mean broad sense heritability was 0.52 (range: 0.31–0.87). The lowest heritability was found for the factor concerning ‘number of years music lessons’, whereas highest heritability was reported for the factor concerning ‘current musical activity’. Both AE and CE models described the data well for the factors ‘number of years music lesson’ and ‘number of years sport participation’, with AE the preferred model based on AIC.
Within the domain of Life Events (Table 6), genetic influences were significant for ‘positive life events’ (≤ age 18 and ≥ age 19) and for ‘neutral life events’ (≥ age 19). Mean broad sense heritability was 0.29 (range: 0.12–0.57). In general, higher heritability estimates were reported for life events occurring later in life (from age 19). Both an AE and a CE model described the data well for ‘neutral life events ≥ age 19’, with AE the preferred model based on AIC. Special twin environmental influences were significant for ‘negative and neutral life events ≤ 18’.
In this study, the hypothesis was tested that measured environmental factors from four general domains (Childhood Environment, Social Environment and Behavior, Leisure Time Activities and Life Events) are not randomly distributed across the population, but reflect heritable individual differences. Results of this study show considerable genetic influences on factors that are often labeled as ‘environmental’, in keeping with the idea of the environment as an ‘extended phenotype’ (Dawkins 1982). Overall, mean broad sense heritability, h² (a² + d²), was 0.49 (range 0.00–0.87) (without items ‘relative height and weight’, mean broad sense heritability was 0.46). The largest estimates of the broad sense heritability were reported within the domain of Childhood Environment [mean h² = 0.66, without items ‘relative height and weight’, mean h² = 0.62 (range: 0.00–0.87)], followed by Leisure Time Activities (mean h² = 0.52) and Social Environment and Behavior (mean h² = 0.36), and the lowest heritability in the domain Life Events (mean h² = 0.29). Only two measured environmental factors, both in the domain Social Environment and Behavior, were found to be purely environmental: ‘the level of education of a good friend’ and ‘the duration of relationship with partner’. Our results suggest that what we think of as environmental factors are perhaps better described as external factors that might be partly under genetic control. Including such external factors in etiologic models of complex traits therefore necessitates a correct specification of both genetic and environmental influences on external factors. For example, external factors may be correlated with the genetic effects on complex traits (rGE), and this rGE can appear as gene–environment interaction (G × E) if the rGE is not accommodated explicitly in the model (Purcell 2002).
The finding that environmental factors are partly under genetic control therefore has major implications for studies of interactions between genes and environmental influences.
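As a rough arithmetic check, the overall mean broad sense heritability of 0.49 can be approximately recovered as the item-weighted mean of the four reported domain means. The item counts for Childhood Environment (14) and Social Environment and Behavior (9) are stated in the Results; the counts of 5 for Leisure Time Activities and 6 for Life Events are assumptions inferred from the total of 34 items and the listed life-event categories, and the domain means are themselves rounded.

```python
# (number of items, reported mean broad sense heritability) per domain.
# Counts of 5 and 6 are assumptions inferred from the total of 34 items.
domain_summary = {
    "Childhood Environment": (14, 0.66),
    "Social Environment and Behavior": (9, 0.36),
    "Leisure Time Activities": (5, 0.52),
    "Life Events": (6, 0.29),
}

total_items = sum(n_items for n_items, _ in domain_summary.values())
weighted_sum = sum(n_items * mean_h2 for n_items, mean_h2 in domain_summary.values())
overall_mean_h2 = weighted_sum / total_items

print(total_items, round(overall_mean_h2, 2))
```

Under these assumptions the weighted mean rounds to the reported overall value, which suggests the domain-level and overall figures are mutually consistent.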
Some of the external factors measured here have been investigated previously. For example, within the domain of Childhood Environment, current heritability estimates for ‘family environment’ exceeded estimates from previous studies (Jacobson & Rowe 1999; Plomin et al. 1988), while the heritability estimates for ‘being bullied’ were lower in the present study (Ball et al. 2008). No previous studies reported on the etiology of one’s intellectual environment (domain Social Environment and Behavior), i.e. the external factors ‘educational level of partner’ and ‘educational level of good friend’. The finding that the level of education of an individual’s partner is under genetic influence may be grounded in assortative mating for intelligence, i.e. non-random mating of spouse pairs. As intelligence is a highly heritable trait, and intelligence has a strong phenotypic and genotypic correlation with educational level (Rowe et al. 1998), educational level of an individual’s partner may be correlated with genes that are related to intelligence. The finding that external factors such as ‘having children’, ‘having partner’ and ‘duration of relationship with partner’ are partly under genetic control may not be surprising because these factors are likely to be related to other qualities known to be influenced by genetic factors, including conscientiousness and conservatism (Bouchard et al. 2003).
Previous studies on sport and musical participation show considerable evidence of genetic influences, comparable with the results of the Leisure Time Activities domain of the present study (Coon & Carey 1989; Stubbe et al. 2006; Vinkhuyzen et al. 2009). Studies on the heritability of Life Events were reviewed by Kendler and Baker (2007) in the context of psychiatric (dys)functioning. Life Events are related to psychiatric (dys)functioning (Middeldorp et al. 2008), but may also be related to other domains of interest in genetic epidemiology (Brandes et al. 2002; Buckley et al. 2000; Hart et al. 2008). Kendler and Baker reported mean weighted heritability estimates of 0.34, 0.39 and 0.17 for positive, negative and neutral life events, respectively. The results of the present study are partly in line with these estimates, with broad sense heritability estimates of 0.26/0.44 and 0.40/0.41 for positive/neutral life events up to age 18 and from age 19, respectively. In contrast to the findings of the studies reviewed by Kendler and Baker, genetic influences on negative life events were not significant in the present study.
First, all information on the external factors in this study was gathered through self-report. This raises the possibility that what is analyzed is the heritability of the selective recall and subjective perception of the factor, rather than of the actual factor itself. Kendler and Baker (2007) reported weighted heritability estimates for external factors by rating method; weighted heritability estimates based on self-report data (0.29) were somewhat higher than estimates based on informant report data (0.26), and substantially higher than estimates based on direct rater or videotape observation data (0.14). This suggests that genetic influences on external factors as reported in the present study might be somewhat inflated due to the use of self-report only. In future studies that aim to investigate the genetic influences on environmental factors, it would be valuable to make use of external raters in addition to self-report data to test for possible selective recall or subjective perception by the participants.
Second, it should be noted that factors of which the variance is naturally attributable to shared environmental influences – such as parental divorce or parental death – were not considered in this study.
Third, variances were assumed to be equal between MZ and DZ twins. For six items, however, the MZ variances differed significantly from the DZ variances (P values ranging from 0.00 to 0.02): ‘age leaving parental home’, ‘number of years music lessons’, ‘life events positive (≥19 years)’, ‘duration of relationship partner’ and ‘positive and negative life events (≤18 years)’. The observed pattern of MZ and DZ variances and covariances of the first three items was suggestive of competitive sibling interaction (i.e. the behavior of one child leads to opposite behavior in the other child), whereas the observed pattern of MZ and DZ variances and covariances of the latter three items was suggestive of cooperative sibling interaction (i.e. the behavior of one child leads to similar behavior in the other child). We chose not to incorporate possible sibling interaction in the genetic models for two reasons. First, sibling interaction was beyond the scope of this study, as our main aim was to establish whether external factors are under genetic control. Second, a much larger sample size is required to test both sibling interaction and genetic dominance. Consequently, as the statistical power to detect sibling interaction in the context of genetic dominance would have been very poor with the current sample size (see e.g. Rietveld et al. 2003), it is very likely that we would have ended up with the same results as presented now. Ignoring sibling interaction may lead to inflated estimates of genetic dominance and deflated estimates of additive genetic factors (Rietveld et al. 2003). It does not, however, change the broad sense heritability, which was the main focus of this study.
Fourth, in case of intermediate levels of heritability, the statistical power to resolve dominance genetic effects can be quite poor when only data from twins and siblings are available (Eaves 1969; Martin et al. 1978), and sample sizes in the order of 2000 participants are often required. The use of ordinal data necessitates even larger sample sizes to detect genetic dominance, depending on the prevalences and number of thresholds (Neale et al. 1994). In addition, the (partly retrospective) self-report method used in the questionnaire may have rendered some of the measures less reliable, which also affects the power to detect genetic effects. We tried to deal with these limitations by focusing our discussion on the broad sense heritability h², rather than distinguishing between a² and d², and on the overall heritability of the four general domains, rather than the 34 individual external factors. For reasons of power, we also adopted a somewhat liberal stance by testing all effects against a criterion level α of 0.05, rather than using e.g. Bonferroni correction to correct for multiple testing. However, as can be seen in Tables 3–6, almost all genetic effects would have been considered statistically significant if a more stringent criterion level of 0.01 or even 0.001 had been used.