Spring 2004 (Roos, Soc. 311)


Assignment 1: Hypothesis Construction (due February 2nd)

This assignment is designed to give you practice developing hypotheses that you may want to test for your final project. Because we are starting with data that have already been collected, we are engaged in secondary analysis, and we are restricted in the kinds of hypotheses we can derive and test. Throughout the semester you will start with existing data (the 2002 General Social Survey, or GSS), generate a hypothesis about the relationship among selected variables, and test that hypothesis by finding indicators of your concepts and applying data analytic techniques. To complete the process, you will present the results of your data analysis and describe how these results affect your initial expectations.

This assignment is the first step in this semester-long process. You will use information from the GSS website to collect what you need to begin Assignment 1. The first step is to use your browser to go to the GSS website:

Click on "Analyze" (top). Go ahead and choose "Browse Codebook in this window", and then "Standard Codebook." This will allow you to browse the codebook for the cumulative General Social Survey (1972-2002). Once you feel comfortable with the codebook, you can switch to the "Codebook by Year of Interview," which gives frequencies for each year the survey was fielded. Try out all the links in the blue column, so you know what they do. My advice is to start with "Sequential Variable List." Once you've chosen your variables (and know your variable names), you can get to them more quickly by using the "Alphabetical Variable List."

My example: in the illustrative answer you'll see I'm interested in the relationship between education (my independent variable) and attitudes toward pornography (my dependent variable). Specifically, I'm interested in explaining variation in attitudes toward pornography, and I think that education probably has something to do with why some people are more tolerant of pornography than others. The GSS has a variable named "pornlaw," which investigates attitudes toward pornography laws. Clicking on the variable name will give you frequencies for that variable, variable name, survey question, and the punch codes associated with each response. Go through the same process for your independent variable. I chose "educ" (education). Use your cursor to copy the frequencies for both variables to a Word file. Print a copy and turn it in with your assignment. Don't print from the GSS site, because it will print ALL the GSS variables (and kill many trees).

Your turn: select two variables that you believe might be causally related. If you choose wisely, you'll be able to use these variables for the rest of the semester. Then construct a theory and develop a hypothesis that describes your expectation about how these two variables are related. Make your hypothesis as specific as possible. Include a short statement about how you came up with the theory and hypothesis you did. Why do you have the expectation you do? Don't use my example; develop one of your own.

Important point: please check to make sure that the variables you have chosen were asked in 2002. You can find this out by switching to "Codebook by Year of Interview" as noted above. If the variable is not available in 2002, you won't be able to use it for your final project. A check of the data reveals that both of my variables are available in 2002 (i.e., there are numbers in the 2002 row). In checking this, however, I realized that there were too many missing values for pornlaw, so in my illustrative answer I ran the analysis for 2000.

You will have to think ahead and not wait until the last minute to do this assignment. You should also make it a point to come see one of us during the next month so we can talk with you about possible "recoding" you may need for one or more of your variables. For example, if you choose "Highest Year of School Completed" ("educ") as one of your variables, you may want to recode the responses into a smaller number of categories (e.g., 0 through 11 years of schooling=1; 12 years of schooling=2, etc.). Alternatively, the "degree" variable provides a recoded education credential variable. We'll show you later how to do this recoding.
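To make the idea of recoding concrete, here is a minimal sketch in Python. It is for illustration only and is not the tool or procedure we will use in class. The first two categories follow the example above (0 through 11 years=1; 12 years=2). The cutoffs for categories 3 and 4, the function name recode_educ, and the treatment of values outside 0 through 20 as missing are all assumptions made for this sketch, so check the codebook before settling on your own scheme.

```python
def recode_educ(years):
    """Collapse years of schooling ("educ") into a few categories.

    Categories 1 and 2 follow the handout's example. Categories 3 and 4
    and the missing-value rule are assumptions made for illustration.
    """
    # Assumption: anything outside 0-20 is a missing-data code.
    if years is None or years < 0 or years > 20:
        return None
    if years <= 11:
        return 1  # 0 through 11 years of schooling
    if years == 12:
        return 2  # 12 years of schooling
    if years <= 15:
        return 3  # assumed category: 13 through 15 years
    return 4      # assumed category: 16 or more years


# Made-up values, used only to show what the recode does.
sample_years = [8, 11, 12, 14, 16, 20, 98]
recoded = [recode_educ(y) for y in sample_years]
print(recoded)  # [1, 1, 2, 3, 4, 4, None]
```

Whatever scheme you choose, keep the categories mutually exclusive and exhaustive, so that every valid response falls into exactly one category.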

Once you have developed your testable hypothesis, follow it through the research process described in class: describe your theory; the hypothesis you plan to test; your conceptualization of the two variables; and your operationalization of the variables (note: your indicators will be the questions asked in the GSS). Later in the semester you will make a crosstabulation table to test your theory with real data from the General Social Survey.
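If you are curious what a crosstabulation does under the hood, the sketch below builds one in Python. It is an illustration only, not the procedure we will use in class. The records are made-up toy values, not GSS results, and the names records and crosstab_percent are hypothetical. The sketch assumes the usual convention of percentaging within categories of the independent variable, so that each category of the independent variable sums to 100 percent.

```python
from collections import Counter

# Made-up toy records: (independent variable, dependent variable).
# These are NOT real GSS responses.
records = [
    ("low", "A"), ("low", "A"), ("low", "B"),
    ("high", "B"), ("high", "B"), ("high", "A"), ("high", "B"),
]


def crosstab_percent(pairs):
    """Return percentages of the dependent variable within each
    category of the independent variable."""
    cell_counts = Counter(pairs)                     # count per (iv, dv) cell
    iv_totals = Counter(iv for iv, _dv in pairs)     # count per iv category
    return {
        (iv, dv): 100.0 * n / iv_totals[iv]
        for (iv, dv), n in cell_counts.items()
    }


table = crosstab_percent(records)
for (iv, dv), pct in sorted(table.items()):
    print(f"{iv:>4}  {dv}  {pct:5.1f}%")
```

Comparing the percentages across categories of the independent variable (here, "low" versus "high") is how you judge whether the data are consistent with your hypothesis.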

Finally, speculate on which additional variables you might want to include in your future analyses to help you reformulate your theory.

To help you along, I have written up an illustrative answer that you might find helpful in completing Assignment 1. Avoid using my language in your write-up; write in your own words. Note that I have included a table using the 2000 GSS to demonstrate the results I found. Peruse this for future reference only. You are not required to make such a table for Assignment 1.

ALL ASSIGNMENTS MUST BE TYPED (12 point font, double spaced)!

[Don't forget to turn in a copy of the frequencies for your two variables. As noted above, use your cursor to copy the frequencies to a Word file. Don't print from the GSS site!]