Spring, 2004 (Roos, Soc. 311)

Assignment 2: Introduction to SAS: State Data (due Monday, Feb. 16th)

This assignment is intended to give you a quick immersion into doing statistical analysis using SAS in the IML. The assignment involves gathering data on the 50 states (and the District of Columbia) and doing descriptive statistical analyses on these data.

Before starting this assignment, skim through the SAS tutorial (see especially the 1st, 2nd, and 4th sets of links). MOST IMPORTANT: read a SAS Manual description of how to input data into SAS's Viewtable Window, which will be your first computer task in the IML.

Bring a 3 1/2" diskette, so you can save your work, since you may not finish in one sitting.

1) The first step involves gathering data on the internet. Your task is to gather interesting data on three (N=3) variables for each of the 50 states and Washington, D.C. You can either choose from among the websites I've noted below, or from a website of your choice (just make sure the data are from a reputable source). You may need (or wish) to collect data from two or more websites.

All of the following websites provide state data. The first two give state-level census data for a variety of possible variables (for example, population counts, education, infant mortality rates, personal income, poverty rates, race percentages). For my example, I used data from the first census website and the Children's Defense Fund (3rd website).

http://www.census.gov/statab/www/ranks.html (Census: state rankings on basic data)
http://quickfacts.census.gov/qfd/index.html (Census Quick Facts on states)
http://www.childrensdefense.org/statesdata.htm (Children's Defense Fund data, by state)
http://teacher.deathpenaltyinfo.msu.edu/overviews/statesoverview.htm (Death Penalty Statistics, by state)
http://www.acf.dhhs.gov/programs/cb/publications/cwo98/Sec5/states.html (Child Welfare Outcomes, by state)
http://www.iwpr.org/states/index.html (Institute for Women's Policy Research, state data on women's rights)
http://www.edweek.org/sreports/tc99/states/usmap.htm (Education Week on the Web's Technology Counts data, by state)
http://www.iusb.edu/~jmcintos/USA98StatesTab.htm (Suicide Data, from the American Association of Suicidology, by state)
http://www.samhsa.gov/oas/nhsda/2k1State/vol1/lot.htm (Substance Abuse Data, from the Substance Abuse and Mental Health Services Administration (SAMHSA), U.S. Department of Health and Human Services, by state)
http://www.eia.doe.gov/emeu/states/_multi_states.html (Energy Use, from the Energy Information Administration, U.S. Department of Energy, by state)

If you want to look for other variables of interest, start with a search engine, and type "state data." I found the above through google.com.

2) From these data, choose three (n=3) interval-level (or ratio) variables, in addition to the name of the state. Let your interests guide you. Choose variables that you think are associated with each other. Since we'll be dealing with social groups, we're interested in association, not causality.

To make your study more interesting, avoid variables that are raw numbers; choose rates or percentages instead. State comparisons of raw numbers (e.g., the # of children in poverty) will simply reflect the fact that bigger states like California or New York have more of just about everything (e.g., crime, children in poverty, death penalty cases). Rates and percentages control for population size (e.g., # of children in poverty per 1,000 people, or % of children in poverty).
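To see why a rate is more informative than a raw count, here is a minimal Python sketch of the arithmetic. The two states and all of their figures are made up for illustration; they are not real data, and you do not need Python for this assignment.

```python
# Hypothetical figures for two made-up states (not real data).
states = {
    "Big State": {"population": 30_000_000, "children_in_poverty": 1_500_000},
    "Small State": {"population": 600_000, "children_in_poverty": 45_000},
}


def rate_per_1000(count, population):
    """Return the count expressed per 1,000 residents."""
    return 1000 * count / population


# Convert each raw count into a rate that controls for population size.
rates = {
    name: rate_per_1000(d["children_in_poverty"], d["population"])
    for name, d in states.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} children in poverty per 1,000 people")
```

In this made-up example the bigger state has far more children in poverty in raw numbers, yet the smaller state has the higher rate, which is the comparison that matters.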

Print out the pages you use for your three variables and turn them in with your assignment. (Note: this assumes that all your state data can be summarized on a page or two; if not, do not print out 51 pages! If you need to go to separate pages to get data on each state, print the pages for 2-3 states to illustrate how the data are provided.) Important: write the URL on any web page you turn in!

3) Code the state data directly onto the SAS coding sheet, the Viewtable Window, as we talked about in class (here is an example I'll use in class: SAS Viewtable Window). To get to the Viewtable Window from within SAS, click Tools, then Table Editor. Use a period for any missing data.
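The period convention simply marks a cell as "no value" so that it is left out of calculations rather than counted as zero. The Python sketch below imitates that idea; the function name parse_cell and the sample row are hypothetical, and this is only an illustration of the convention, not how SAS works internally.

```python
def parse_cell(text):
    """Convert a coded cell to a number, treating a lone period as missing."""
    text = text.strip()
    if text == ".":
        return None  # missing value: excluded from later calculations
    return float(text)


# A hypothetical row of three coded values, the second one missing.
row = ["12.5", ".", "48"]
values = [parse_cell(cell) for cell in row]

# Keep only the values that are actually present.
present = [v for v in values if v is not None]
print(values)
print(len(present), "of", len(values), "values present")
```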

Print out a copy of your completed SAS Viewtable Window (your coding sheet) and turn it in with your assignment.

4) Once you have your data entered, you're ready to run your programs. To evaluate the data you collected, run three programs:

-List data (under Reports) (will list the data you input)
-Summary statistics (under Descriptive) (will give you basic statistics for your variables)
-Correlations (under Descriptive) (will give you correlations between your variables)

You can access each through SAS:
-Choose Solutions-->Analysis-->Analyst
-File-->Open by SAS name-->Mylib-->statessys [or whatever you named your file]
-Statistics-->[choose Reports to list your data, then Descriptive to get summary statistics and correlations]

Hint: click on the variables you want included in the statistics. Ctrl-click will let you choose more than one. List all four of your variables, but run statistics only on the numeric (interval or ratio) variables, not on character variables (such as state name).
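If you want to check what the summary statistics and correlations are computing, the following Python sketch works through them by hand. The variable names and the five data values are made up for illustration, and the choice to skip a state whenever a needed value is missing is an assumption of this sketch. Your SAS output is what you turn in.

```python
import math
import statistics

# Hypothetical values for five made-up states; None marks a missing value.
poverty_pct = [10.0, 12.0, 15.0, None, 20.0]
college_pct = [30.0, 28.0, 25.0, 22.0, 18.0]


def summarize(values):
    """Basic descriptive statistics over the non-missing values."""
    present = [v for v in values if v is not None]
    return {
        "n": len(present),
        "mean": statistics.mean(present),
        "std": statistics.stdev(present),  # sample standard deviation (n - 1)
        "min": min(present),
        "max": max(present),
    }


def pearson(xs, ys):
    """Pearson correlation, using only states with both values present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mean_x = statistics.mean([x for x, _ in pairs])
    mean_y = statistics.mean([y for _, y in pairs])
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


summary = summarize(poverty_pct)
r = pearson(poverty_pct, college_pct)
print(summary)
print(f"correlation = {r:.3f}")
```

A correlation near -1 or +1 indicates a strong linear association and a value near 0 a weak one; the sign tells you whether the two variables move together or in opposite directions.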

Get printouts of your SAS log and each output and turn them in with your assignment.

5) Interpret your results in one to two double-spaced pages. Don't just describe your results; say something interesting and substantive. Try to be sociological. What are the average values of your variables? How do your variables correlate with one another? Talk about why you think your variables might be associated with each other, but do not write about theories, hypotheses, conceptualization, and operationalization as in Assignment 1.

When using data from any source, it's important to know how reliable the source is. Please include a paragraph in your final write-up that describes where the data came from and gives your assessment of the reliability of the data source (include the URL as a citation for the data).

In sum: along with your write-up, please turn in the original printed web pages from which you collected your data (with the URL), the SAS Viewtable Window, SAS log, and the outputs from the three programs.

THIS ASSIGNMENT MUST BE TYPED (no smaller than 12-point font).