Fall, 2001 (Roos, Soc. 311)
Assignment 6: Frequencies on Final Project Variables
Due: Friday, December 7th
In addition to getting you on the computer again, this assignment is designed to move you a little farther along on your final project. You'll want to do some preparatory work before meeting in the Instructional Microcomputer Lab. Our dates for the IML are Friday, November 30th, and Tuesday, December 4th (note this is a change from the original syllabus). Be sure to write out your programs before meeting your recitations in the IML. For the current assignment:
1) Make sure you have your three variables from the GSS (the ones you used for Assignment 4). If you want to choose new variables, you'll have to go back to the GSS website (see Assignment 1). But if you do this, check with us to confirm that your variables are sufficient for elaboration.
2) Describe your variables: what is your independent variable, your dependent variable, your test variable (and will your test variable be antecedent or intervening)? State your research hypothesis. So far, this just duplicates what you've already done for Assignment 4.
3) Make decisions on how you will recode your three variables (if necessary). We will talk in class about how to use SAS to recode variables. Describe what your recoding decisions are. For example, if one of your variables is "years of school completed", you will need to recode your 20 education values (where 1 = one year of schooling, 2 = two years of schooling, and so on up to 20 = eight or more years of college) into approximately three or four recoded groups (e.g., grammar school or less, some high school, high school degree, and so on). In general, to avoid running out of cases you should have no more than three to four categories for each of your three variables (especially your independent and test variables).
4) You'll also want to think about what missing values (if any) you'll assign to your variables. For example, Don't know (DK) and No answer (NA) responses should in most circumstances be assigned as missing, which would eliminate them from your analyses.
5) To complete this assignment, you will use the 1998 General Social Survey data, which are loaded on the machines at the Tillett IML. To access these data, use these two lines to begin your SAS program:
libname in v8 'c:\class';
data temp; set in.gss98;
[add other SAS programming statements to recode your variables as necessary
and run SAS procedures]
Using SAS, generate frequencies for all three variables (use the PROC FREQ command). Get frequencies for both the unrecoded and recoded versions of your variables. For the final project you'll adapt this PROC FREQ to run bivariate and trivariate crosstabs.
6) To check your work, look over your output. You should have three basic tables, one for each of your variables (more if you recode variables). Before you get too excited, however, closely peruse your output. If you didn't get any errors, it could just mean that SAS didn't find any spelling errors; it doesn't necessarily mean that you didn't make any "logical" errors. You might want to go back to the GSS website to compare your frequencies with the 1998 frequencies.
7) Use the data from the computer output to make typed versions of your univariate tables for each of your recoded variables. Label them sequentially Tables 1 through 3.
Also, use the computer-generated data to make relative frequency histograms for each of your recoded variables. Label them sequentially Figures 1 through 3.
8) Use either the tables or histograms to write up your results for the recoded variables only. In discussing your results tell us, for example, what percentage of the sample is male and what percentage is female (if sex is one of your variables). This last part of the assignment asks you to interpret your output, and to practice writing about it. Note that if you have MISSING VALUES for any of your variables, the "PERCENT" column will give you percentages computed with the missing cases left out.
[Note: this frequencies output will not yet allow you to test your research hypothesis; rather, it will simply give you percentage distributions. Your next task (for the final project) will be to actually test your hypothesis using PROC FREQ to generate bivariate and trivariate tables.]
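To make steps 3 through 5 concrete, here is a minimal sketch of what a complete program might look like for the education example. Treat it as an illustration under stated assumptions, not as the program you must write: the variable name EDUC, the DK/NA codes 98 and 99, the new variable EDUC4, the format name EDUC4F, and the cut points for the four groups are all assumptions made for this example. Check the GSS codebook for the actual names and codes of your own variables.

```sas
libname in v8 'c:\class';

data temp;
  set in.gss98;

  /* ASSUMPTION: the education variable is named EDUC, and    */
  /* 98 = Don't know, 99 = No answer. Verify in the codebook. */

  /* Leave EDUC untouched and build the recode in a NEW       */
  /* variable, so you can get both sets of frequencies.       */
  /* SAS treats a missing value as smaller than any number,   */
  /* so the missing check must come BEFORE the <= tests.      */
  if educ = . or educ in (98, 99) then educ4 = .;
  else if educ <= 8  then educ4 = 1;
  else if educ <= 11 then educ4 = 2;
  else if educ = 12  then educ4 = 3;
  else educ4 = 4;
run;

/* Labels for the recoded groups (illustrative cut points). */
proc format;
  value educ4f 1 = 'Grammar school or less'
               2 = 'Some high school'
               3 = 'High school degree'
               4 = 'Some college or more';
run;

/* Frequencies for the unrecoded and recoded versions. */
proc freq data=temp;
  tables educ educ4;
  format educ4 educ4f.;
run;
```

In the EDUC4 table, cases set to missing are left out of the percentages, and PROC FREQ reports their count separately as "Frequency Missing". Comparing the EDUC and EDUC4 tables is a quick way to catch a logical error: the cases in each recoded group should add up to the cases in the original values you combined.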
TURN IN YOUR OUTPUT AND SAS LOG. ALSO TURN IN COPIES OF THE PAGES YOU PRINTED FROM THE GSS WEBSITE. YOUR ASSIGNMENT MUST BE TYPED.
Remember, computing is fun!