# Predictors of Vegetarianism (Add Health)

Using Wave 3 of the Add Health data, I investigate which factors best predict vegetarianism. Because the dependent variable is dichotomous, I use logistic (logit) regression. The variables used in the regression are listed below.

H3GH17 – Do you consider yourself a vegetarian? 0 = No, 1 = Yes.

H3RE41 – To what extent are you a religious person? 0 = Not religious at all, 1 = Slightly religious, 2 = Moderately religious, 3 = Very religious.
H3IR1 – How physically attractive is the respondent? 1 = Very unattractive, 3 = About average, 5 = Very attractive.
H3SE1 – Have you ever had vaginal intercourse? (Vaginal intercourse is when a man inserts his penis into a woman’s vagina.) 0 = No, 1 = Yes.
H3ED1 – What is the highest grade or year of regular school you have completed? 6 = 6th grade, 22 = 5 or more years of graduate school.
BIO_SEX3 – Respondent’s Gender. 1 = Male, 2 = Female.
CALCAGE3 – Calculated Age at Time of Interview. Range = 18-28 years old.
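
Written out, the model described above can be sketched as follows. The variable names come from the list; the symbols $p$ and $\beta_0, \dots, \beta_6$ are notation introduced here for illustration, and each predictor is assumed to enter linearly with its codes as given.

$$
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1\,\mathrm{H3RE41} + \beta_2\,\mathrm{H3IR1} + \beta_3\,\mathrm{H3SE1} + \beta_4\,\mathrm{H3ED1} + \beta_5\,\mathrm{BIO\_SEX3} + \beta_6\,\mathrm{CALCAGE3},
\qquad p = \Pr(\mathrm{H3GH17} = 1)
$$

Each $\beta_k$ is the change in the log-odds of being a vegetarian for a one-unit increase in the corresponding variable, holding the others constant.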

Allocation of cases
Valid cases – 3,323
Cases excluded by filter – 1,492
Cases with invalid codes on variables in the analysis – 67
Total cases – 4,882

As we can see, being female, more attractive, and more educated are positively associated with the likelihood of being a vegetarian, while having had vaginal intercourse, being more religious, and being older are negatively associated with it (the age coefficient is not statistically significant). The negative relationship between H3RE41 and H3GH17 is somewhat surprising, since religious people might be expected to show more compassion toward all forms of life. Confounding factors may be at work.

Keep in mind that a logit coefficient gives the effect of a one-unit change, so coefficient sizes are not directly comparable across variables measured on different scales. An independent variable with few scale points (say, 2) will tend to have a larger coefficient than one with many (say, 10). A one-unit change in a variable that can take on many values (for instance, years of education, age, or income) covers only a small part of its range and therefore has very little effect on the dependent variable.
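
To make the scale point concrete, here is a minimal sketch. It assumes, purely for illustration, that three predictors each shift the log-odds by the same total amount (1.0) when moving from their lowest to their highest value. That figure is hypothetical and not taken from the regression; only the variable ranges come from the list above.

```python
import math

def per_unit_coefficient(total_log_odds_change, minimum, maximum):
    """Per-unit logit coefficient implied by a given change in log-odds
    across the full range of a predictor."""
    return total_log_odds_change / (maximum - minimum)

# Hypothetical: every predictor moves the log-odds by 1.0 over its full range.
TOTAL_CHANGE = 1.0

# Ranges as coded in the variable list.
ranges = {
    "BIO_SEX3": (1, 2),     # 2 scale points
    "H3ED1": (6, 22),       # 17 scale points
    "CALCAGE3": (18, 28),   # 11 scale points
}

coefficients = {
    name: per_unit_coefficient(TOTAL_CHANGE, low, high)
    for name, (low, high) in ranges.items()
}

for name, beta in coefficients.items():
    # exp(beta) is the odds ratio for a one-unit increase.
    print(f"{name}: coefficient per unit = {beta:.4f}, odds ratio per unit = {math.exp(beta):.4f}")
```

The total effect is identical in all three cases, yet the per-unit coefficient shrinks as the number of scale points grows. Comparing the change in log-odds across each variable's full range (coefficient times range) is one way to put the predictors on a more even footing.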

It is worth noting that the pseudo R-squared, a rough analogue of the proportion of variance in the dependent variable explained by the full set of independent variables, is very low.
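
The text does not say which pseudo R-squared was reported. As a sketch, assuming McFadden's version (one common choice), it compares the log-likelihood of the fitted model with that of an intercept-only model. The outcomes and fitted probabilities below are made up for illustration and are not Add Health results.

```python
import math

def log_likelihood(outcomes, probabilities):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )

def mcfadden_r2(outcomes, fitted_probabilities):
    """McFadden's pseudo R-squared: 1 - LL(model) / LL(intercept-only)."""
    base_rate = sum(outcomes) / len(outcomes)
    ll_null = log_likelihood(outcomes, [base_rate] * len(outcomes))
    ll_model = log_likelihood(outcomes, fitted_probabilities)
    return 1.0 - ll_model / ll_null

# Hypothetical data: one vegetarian among ten respondents.
outcomes = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# Hypothetical fitted probabilities, only slightly better than the base rate.
fitted = [0.095] * 9 + [0.145]

r2 = mcfadden_r2(outcomes, fitted)
print(f"McFadden pseudo R-squared: {r2:.3f}")
```

When the outcome is rare, as vegetarianism is likely to be, the intercept-only model already predicts most cases well, so the predictors can be statistically significant while the pseudo R-squared stays low.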