MODELING PROBLEM II
The project will consist of the following:
I. Proposal and Preparation
A. Identify the system which you plan to model. Specify what it is you wish to predict or explain (dependent variable). Develop hypotheses about the variables or factors which you feel are related to the dependent variable.
B. Express your hypothesis in the form of a multiple regression model. The model must have five or more independent variables.
D. Turn in the Modeling Problem II Proposal found on pages 4-5 to your instructor. Include the survey questionnaire if you are planning a survey.
E. Your proposal should be approved by your instructor before continuing.
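As an illustration of step B, a multiple regression model with the minimum of five independent variables can be written in the general form below. The Greek-letter symbols for the population coefficients and the error term are a notational assumption; the handout itself names only y, xj, and the sample coefficients a, b1, ..., bm.

```latex
y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon
```

Each hypothesis about how a factor affects y then corresponds to an expected sign on one of the coefficients.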
II. Collect data on your dependent variable, y, and your independent variables, xj. Data must be documented fully. A standard reference is adequate for data catalogued in the library. If the data is obtained from a non-catalogued source (such as newspapers or magazines), it must be Xeroxed and included with your paper. If you are using time series data, you should use the most recent data available. You may conduct a survey if you wish. However, you must obtain responses from at least 60 individuals. Be sure to include the 60 completed questionnaires with your paper. The questionnaire used in interviewing respondents should be typed.
III. Run the regression model on the computer using SPSS.
IV. Write up the project in a paper using the format below. This format is similar to that for the Modeling Project 1 write-up, although Part 1 is expanded so that you can explain more carefully your model, data collection procedures, and the model's hypothesized relationships.
Modeling Project II Write-Up
Part 1: Introduction and Summary
1) Specify what you wish to predict or explain (dependent variable) and those factors which you will use to do the explaining (independent variables):
   discuss the theoretical relationships between the dependent and independent variables;
   discuss the purpose of estimating the model's relationships.
2) Present the original population model:
   define all terms and take care to specify units of measure for each variable;
   note the expected signs of the coefficients for those variables for which you have a priori notions of the sign of the xj-y relationships;
   state the data sources for each variable and note any difficulties you had obtaining the data and your techniques to overcome them;
   state the associated sample equation with numerical coefficients.
3) Present the revised population model (if any):
   define all terms;
   discuss why this model is estimated;
   state the associated sample equation with numerical coefficients.
4) Summarize the conclusions regarding the model(s) you have estimated:
   discuss the usefulness of your model as a prediction tool;
   suggest an alternative model for further study.
Part 2: Body
5) Write the sample model for the first computer run with numerical coefficients.
6) State the assumptions and conditions made to use the model(s).
7) Evaluate the assumptions/conditions which you can from the computer printout:
   i) (severe) multicollinearity (from correlation matrix);
   ii) autocorrelation (from d-test; take care to note whether this test is relevant for the model estimated);
   iii) heteroscedasticity (from χ² test).
   Suppose these assumptions/conditions had not been met and state the consequences.
8) Evaluate the overall model (F-test).
9) Evaluate the goodness of fit of the model to the data (report and interpret R² and adjusted R²).
10) Interpret each regression coefficient (a, b1, ..., bm).
11) Which of the independent variables are significantly related to the dependent variable (t-test on regression coefficients)? Note any coefficients which have the wrong sign based upon your a priori knowledge of the model's relationships. (Insignificant and wrong-signed variables are to be omitted in the second computer run.)
12) Write the revised sample regression model for the second computer run with numerical coefficients.
13) Contrast the goodness of fit of the second model with the first (compare R² and adjusted R² of the two models).
14) Interpret the regression coefficients and state why they change in value.
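The quantities called for in items 7 through 13 can be recomputed by hand as a cross-check on the computer printout. The following is a minimal sketch in Python with NumPy, not the course's SPSS procedure; the function name `ols_summary` and the keys of the returned dictionary are hypothetical. It covers the correlation matrix, the d statistic, the F statistic, R², adjusted R², and the t statistics, and it does not reproduce the χ² test for heteroscedasticity.

```python
import numpy as np

def ols_summary(X, y):
    """Fit y = a + b1*x1 + ... + bm*xm by least squares and return diagnostics.

    X: (n, m) array of independent variables, without a constant column.
    y: (n,) array of the dependent variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = X.shape
    Z = np.column_stack([np.ones(n), X])           # prepend the intercept column
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)   # a, b1, ..., bm
    resid = y - Z @ coef
    sse = float(resid @ resid)                     # unexplained variation
    sst = float(((y - y.mean()) ** 2).sum())       # total variation
    df_resid = n - m - 1
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (sse / df_resid) / (sst / (n - 1))
    f_stat = ((sst - sse) / m) / (sse / df_resid)  # overall F statistic
    # Standard errors come from s^2 * (Z'Z)^-1; t = coefficient / standard error
    cov = (sse / df_resid) * np.linalg.inv(Z.T @ Z)
    t_stats = coef / np.sqrt(np.diag(cov))
    # Durbin-Watson d: squared successive residual differences over SSE
    dw = float((np.diff(resid) ** 2).sum() / sse)
    # Pairwise correlations among the independent variables (multicollinearity check)
    corr = np.corrcoef(X, rowvar=False)
    return {"coef": coef, "t": t_stats, "r2": r2, "adj_r2": adj_r2,
            "F": f_stat, "dw": dw, "corr": corr}
```

The F and t values still have to be compared with table critical values at the chosen significance level. A d value near 2 suggests little first-order autocorrelation, and the statistic is meaningful only when the observations have a natural time ordering.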
Modeling Problem II Proposal
Name:
A. Title of Project
B. (1) Are you trying to explain cross-sectional or time series variations in the dependent variable?
(2) What is your unit of observation? (That is, years, people, etc.)
(3) What is your sample size?
C. Definition of variables: Write out your hypothesized population regression model.
(1) a. Define dependent variable:
b. In what units is it measured?
c. Where will you obtain data on this variable?
(2) a. Define your first independent variable.
b. In what units is it measured?
c. What sign do you expect its coefficient to have?
d. Briefly explain why you think it affects y in the way stated.
e. Where will you obtain data on this variable?
(3) a. Define your second independent variable.
b.
c.
d.
e.
(4) a. Define your third independent variable.
b.
c.
d.
e.
(5) a. Define your fourth independent variable.
b.
c.
d.
e.
(6) a. Define your fifth independent variable.
b.
c.
d.
e.
(7) a. Define your sixth independent variable.
b.
c.
d.
e.
ECONOMICS 3992 STATISTICAL RESEARCH: FLOWCHART
Follow steps outlined below for each subject for which you need data. See Core List for call numbers and locations of recommended research tools. IGNORE CALL NUMBERS AND LOCATIONS
1. Specify the data you need.
2. Do any titles on the Core List or Extended List (see pp. 168-182) appear likely to have the data?
   Yes: Use call # given. Do the works have the data you need?
      Yes: Write down title, data, page. Return to Step 1.
      No: Go to Step 3.
   No: Go to Step 3.
3. If you need cross section data: Do County and City Data Book or State and Metropolitan Area Data Book have the data? (see handout)
   No: Go to Step 4.
3. If you need time series data: Does Statistical Abstract have data, or lead to a source which does? (see handout)
   Yes: Use earlier editions to find data back to 1971. Use Historical Statistics for pre-1971 data (see handout). Return to Step 1.
   No: Go to Step 4.
4. Do ASI or SRI lead to sources likely to have the data? (see handout)
   Yes: Do any of these publications have the data you need?
      Yes: Write down title, data, page. Return to Step 1.
      No: Go to Step 5.
   No: Go to Step 5.
5. Does librarian have any ideas for finding the data?
   Yes: Did her/his suggestions lead to the data?
      Yes: Write down title, date, page. Return to Step 1.
      No: Go to Step 6.
   No: Go to Step 6.
6. Discuss your situation with your instructor.
SOURCES FOR STATISTICAL DATA
CORE LIST
The 22 works described below provide the best and the broadest coverage for use in completing Modeling Problem II. Consulting 1 or more of these titles should be the first step in your search for the data you need for your model. Additional titles are given in the Extended List (pp. 177-186), should you not find appropriate data in any of the titles on the Core List. Note that for some titles, Superintendent of Documents call numbers (used by the Documents Library) are given as well as Dewey call numbers.
This Core List is arranged under several broad subject areas. The Extended List is organized by call number, but a subject index follows on page 186. The location codes, given in parentheses on both lists, are standard LCS codes. If you have difficulty in finding any item, consult a librarian.
Location codes:
CRR: Commerce Library, Reference Section
CRRes: Commerce Library, Reserve Section
DOC: Documents Library
ENR: Engineering Library, Reference Section
UGC: Undergraduate Library, Closed Reference Section
REX: Main Library, Reference Section
UGR: Undergraduate Library, Reference Section
GENERAL SOURCES

317.3 Un3co (CRRes, REX, UGR); DOC. C3.134+2:C83+2+
This is a supplement to the Statistical Abstract of the U.S. It gives a wide variety of demographic, social, and economic statistics for regions, divisions, states, counties and SMSAs. This is a good source for cross-sectional statistics.

q.317.3 Un315hi 1976 (REX, CRRes, UGR); DOC. C3.134+2:H62+789-970+ (DOC)
U.S. Bureau of the Census. Historical Statistics of the Colonial Times to 1970. 2 Vols. 1975.
This source contains a wide range of historical statistics for the United States. Economic, political, social, and demographic data are given. Many tables correspond with tables in the Statistical Abstract which provides data for years after 1970.

317.3 St29 (REX, UGR); DOC. C3.134+5: (DOC)
Presents data on a wide variety of social and economic topics for metropolitan areas, cities, states and regions. Good source for cross-sectional statistics.

317 Un3s (CRRes, REX, UGR); DOC. C3.134: (DOC)
"Standard summary of statistics on the social, political, and economic organization of the

353.9 B644 (DOC, REX, UGR)
Council of State Governments. The Book of the States. Council, biennial.
Provides data on state finances, state services such as education, transportation and welfare, and state economics and natural resources for all of the 50 states. A useful source of cross-sectional statistics.
SOURCES OF ECONOMIC AND FINANCIAL DATA

330.9 Or14m sup. B (CRRes)
Organization for Economic Cooperation and Development. Indicators: Historical Statistics. operation and Development, 1980.
Contains data for all OECD countries for the years 1960 through 1979. Statistics given include GNP, imports, exports, industrial production, stocks, and construction. Data is given quarterly and monthly if appropriate. This source is updated monthly by the OECD Main Economic Indicators (330.9Or14m).

336.73 T21fa (CRRes, REX, UGR, DOC)
Facts and Figures on Government Finance. biennial.
Time series data on public finance, taxes, expenditures, and indebtedness at the federal, state, and local government levels.

332.6 M775 (CRR)
Moody's Handbook of Common Stocks. Service, quarterly.
Each issue contains concise financial statistics and price charts for over 900 common stocks. Data covers a 10-year period.

332.63 St24a (CRRes)
Standard and Poor's Corporation. Analyst's Handbook. Standard and Poor's Corporation, annual.
Gives composite corporate per share data by industries. Also included is per share data on the S & P 400.

332.6 St24tl (CRRes)
Standard and Poor's Corporation. Standard and Poor's Statistical Service. cumulations.
Contains statistics covering banking and finance, production and labor, price indexes, income and trade, building, electric power and fuels, metals, transportation, textiles, chemicals, paper products, and agricultural products. Last section contains several years of statistics for bond and stock prices, sales, yields and selected ratios.

DOC. C59.9: (DOC)
Statistical tables at end of each issue provide quarterly and annual as well as monthly time series data on numerous economic and financial topics. "Series Finding Guide" tells which issues have historical tables in specific subject areas. Use to update tables in Handbook of Cyclical Indicators.

DOC C59.9+3 In2+984 (DOC); Q.338.54 Un33h 1984 (CRR); Q.338.542 B96sup. (REX)
An excellent source of time-series statistics on economics, business and financial topics. Most tables provide monthly, quarterly, and annual data from 1947-1982. Use Business Conditions Digest for data for 1983-.

q.382 Un3ds sup. (CRRes, REX); DOC. C59.11+3: (DOC)
U.S. Bureau of Economic Analysis. Business Statistics. Gov't. Printing Office, irregular.
Excellent source for time series data at the national level on a wide range of economic subjects, from production and trade of commodities to general business indicators, prices and employment. 1982 edition has 20 years of data; use 1979 edition for earlier years.

331 Un363m (CRRes); DOC. L1.42+2: (DOC)
Reports on labor force and employment trends and various government employment and training programs. Statistical Appendix provides numerous time series tables for data on employment and earnings for various population groups, as well as GNP and consumer and producer price indexes.

330.973 Un315e (CRRes, REX); DOC. PR1.9 (DOC)
Printing Office, annual.
An excellent source for time-series statistics on the production. The data includes GNP, interest rates, inflation rates, money stock, composite stock prices and yields, and some important international statistics.

331 Un342h (CRRes, REX, UGR); DOC. L2.3+5:982
Contains time series statistics on employment and unemployment as well as various demographic, social and economic characteristics of persons in the labor force. Also gives some productivity, earnings, and work stoppage data.
SOURCES OF STATISTICS ON SOCIAL TOPICS

q.312 D396 (REX, UGR)
United Nations. Department of International Economic and Social Affairs. Statistical Office. Demographic Yearbook. annual.
Provides international statistics on population, mortality, marriage and divorce, plus special topic tables each year. A historical supplement to the 1979 Demographic Yearbook presents related data for the thirty-year time period of 1948-1977.

370.973 Un3d (REX); DOC. ED1.113: (DOC)
Annual and time series tables on educational enrollment at all levels, staffing of schools, revenues and expenditures, and other topics. Most data at national level; some broken down by region or state.

DOC. HE20.6210: (DOC)
Each annual set contains 3 volumes, Natality, Mortality, and Marriage and Divorce. Offers time series and cross-sectional data on birth rates, causes of death, and characteristics of married and divorced individuals. Time series data is for

309.173 Un338s (CRRes, REX); DOC. C3.2 Si1+2+979 (DOC)
Printing Office, 1980.
Massive compendium of data concerning American social conditions. Topics covered include population characteristics, health, education, labor force characteristics, and use of leisure time. Detail and time coverage of data vary from table to table.

364.44 So844 (REX); DOC. J29.9 SD-SB- (DOC)
Criminal Justice Statistics.
Compilation of statistics on crime, the criminal justice and corrections systems, and public attitudes toward these. All varieties of crimes and police and legal activities are included. Most tables provide cross-sectional data for states or cities; a few provide time series.

364.05 Un (REX); DOC J1.14+7: (DOC)
States.
Statistics on numbers and types of crimes, broken down by age, race, sex and geographic areas. Documents Library has last 30 years of this series.
EXTENDED LIST
INDEXES TO STATISTICAL PUBLICATIONS

016.3173 St2 (DOC) Index Table
(1) Statistical Reference Index (SRI). Service, monthly with annual cumulations.
"A selective guide to American statistical publications from private organizations and state government resources." Covers all subject fields and publication formats. Organization and use identical to ASI (see below). Most indexed items are available on microfiche in Documents Library.

317.3 Am33 sup. (DOC) Index Table
(2) American Statistics Index (ASI). Service, monthly with annual cumulations.
"Master guide and index to all the statistical publications of the Government." Covers all subject fields and publication formats. Index volume contains a subject index and indexes by geographic, economic, and demographic categories. Abstracts Volume provides descriptions of publications and their contents. Many indexed items are available in Documents Library.
OTHER SOURCES PROVIDING TIME SERIES AND CROSS-SECTIONAL DATA

310 B226c (CRR)
(3) Banks, Arthur S. Cross-polity Time-series Data. 1971.
Most material covers 1900-1966, excluding two major wartime periods, 1914-1918 and 1940-1945. All commonly recognized members of the international community are included in the 10 segments, covering a wide variety of subjects.

MICROFICHE C93 (REX)
(4) Current National Statistical Compendiums.
A microfiche collection of statistical publications and yearbooks of over 100 countries.
NOTE: Paper copies of statistical yearbooks for many countries of the world are also available in the Reference Room and the Commerce Library in the 314-319 call number range.

310 In3 (REX, UGC)
(5) Information Please Almanac.
Includes statistics on a wide variety of
Many tables have time series data.

310 Un3s (CRR, REX)
(6) United Nations. Department of International Economic and Social Affairs. Statistical Office. Statistical Yearbook.
Internationally comparable economic and social statistics are given at world, regional, and national levels. Tables cover various lengths of time.

314 M69e 1981 (REX)
(7) Mitchell, Brian R. European Historical Statistics, 1750-1975. 2d rev. ed. New
In a topical arrangement, statistics on climate, population, labor force, agriculture, industry, external trade, transport, communications, finance, prices, education and national accounts are given for European countries. Tables cover long time spans.

315 M69i 1982 (CRR, REX)
(8) Mitchell, Brian R. International Historical Statistics:
A companion volume to European Historical Statistics 1750-1975, this work presents social and economic time series data for African and Asian countries. Coverage ends with 1975.

317 M692i 1983 (REX)
(9) Mitchell, Brian R. International Historical Statistics: The
This volume includes social and economic time series data for North, South and Central American countries,

317.3 Un3o (CRRes, DOC)
(10)
See especially appendix 3, "Basic Data", p. 181. Here are included time series tables on GNP, employment, finance, production and other important

324.73 Am38 (REX, UGR)
(11) Scammon, Richard M., ed. and comp. Contemporary American Election Statistics. 14 Vols. [place and publisher vary].
Volumes include voting statistics for presidential elections from 1948 through 1980 and for

324.973 C76p 1983 (REX)
(12) Presidential Elections Since 1789. 3rd edition. Quarterly, 1983.
A statistical record of the vote for president and vice-president through the 1980 election.

330.5 MONB2 (REX)
(13) United Nations. Department of International Economic and Social Affairs. Statistical Office. Monthly Bulletin of Statistics. [Vol. 1, January 1947- ].
Tables of international economic statistics with time series of varying lengths. Includes some monthly data.

330.9 Un315h (CRR, DOC)
(14) Washington:
Provides statistics on a variety of economic, financial, agricultural, and industrial topics for Communist countries and selected "non-Communist" countries. Data are adjusted to facilitate cross-sectional comparison.

q.331 In831y (REX)
(15) International Labour Office. Yearbook of Labor Statistics. International Labour Organization, annual.
Statistics for many countries on employment, unemployment, wages, consumer prices, occupational injuries and industrial disputes are included. Supplemented quarterly by the Bulletin of Labour Statistics (331In83b).

331 Un342m (CRR); DOC. L2.6 (DOC)
(16) Gov't. Printing Office, monthly.
Provides current statistics covering employment, earnings, consumer and producer prices, productivity, work stoppages, and wage and compensation data.

q.331.112 Or35m (CRR)
(17) Organization for Economic Co-operation and Development. Labour Force Statistics. annual.
Split into 2 sections, this source first gives main aggregates from 1967-1980. Also in this part are graphs showing data from 1961. The second section contains figures by country referring to the period 1960-1980. This source is updated by the Labour Force Statistics, Quarterly Supplement, also published by the OECD (331.112Or35m sup.).

332.05 UNF (CRR); DOC. FR1.3: (DOC)
(18) Bulletin.
Provides monetary statistics such as prices, labor market, construction and national income. Presents the Federal Reserve Board's index of industrial production.

(19) Four volumes:
332.1 Un32ba (CRR)
Monetary Statistics.
332.1 Un32ba 1976 (CRRes); DOC. FR1.3+2:941-970 (DOC)
Monetary Statistics, 1941-1970. Federal Reserve System, 1976.
332.1 Un32as (CRR); DOC. FR1.3+1:970-79 (DOC)
Digest, 1970-1979. System, 1981
332.1 Un32as (CRR); DOC. FR1.3+1: (DOC)
Digest. annual.
These four volumes together give a wide variety of time series data on any type of banking or monetary statistics. Included are statistics on banks in the

332.15 Un85ia (CRR)
(20) International Monetary Fund. Bureau of Statistics. International Financial Statistics Yearbook.
This is a very good source for the financial statistics of many countries starting with data from 1952. Information on exchange rates, international liquidity, bank assets, interest rates, prices, production, and imports and exports is given. This yearbook is updated monthly by the publication International Financial Statistics (332.15In85i).

332.64 D752 (CRRes, REX)
(21) Pierce, Phyllis S., ed. The Dow Jones Averages 1885-1980. Dow Jones-Irwin, 1982.
Contains daily Dow Jones averages for the industrials, the railroads (transportation), and the utilities since 1885. Since 1928, it has high, low, and closing prices. Also includes daily sales and the Dow Jones bond averages.

332.8 H75h 1977 (CRRes)
(22) Homer,
This contains a complete history of interest rates for the as well as for other countries. Included are rates on securities, bonds, real estate, and other types of issues since, in some cases, the beginning of recorded history.

336 In82g (CRR)
(23) International Monetary Fund. Bureau of Statistics. Government Finance Statistics Yearbook.
This source gives coverage on 124 countries and includes information on units of government and the accounts through which governments work; 10 years of data on government operations and expenditures is given for all countries when possible.

338 C73 (CRRes, REX)
(24) Emery, Walter L., ed. Commodity Yearbook. Research Bureau, annual.
Time series statistics from public and private sources on production, prices and trade of food and goods. Supplemented three times a year by Commodity Yearbook Statistical Abstract Service (same call number).

338.1 P9451 (REX)
(25) Food and Agriculture Organization of the United Nations. FAO Production Yearbook. annual.
Provides international statistics on crop production and livestock products. Previous title: Production Yearbook (338.1P945).

338.2 M564 (ENR)
(26) American Metal Market. Metal Statistics. annual.
Contains data on all types of metals, including production, consumption, and price information for each. Each annual volume contains several years of data.

338.27 T91 (CRRes, ENR)
(27) Twentieth Century Petroleum Statistics. MacNaughton, annual.
Provides large range of worldwide statistics and charts arranged by country. Coverage includes oil production, reserves, demand, and refining capacity.

338.4 Or14indu (CRR)
(28) Organization for Economic Cooperation and Development. Industrial Production: Historical Statistics. operation and Development, 1976.
This source complements the Main Economic Indicators published by the OECD. It contains for member countries data relating to all aspects of industrial production. The data is given quarterly and monthly if appropriate. There is a supplementary volume to this which includes data left out of the original publication. [338.4Or14indu sup. (CRR)].

338.4 Un27g (CRR)
(29) United Nations. Department of International Economic and Social Affairs. Statistical Office. Yearbook of Industrial Statistics. 2 vols. Nations, annual.
The first volume of this series contains basic data on industrial production for each member country or area and a selection of indicators showing global and regional trends in industrial activity. Volume II contains detailed time series data on world production of industrial commodities.

338.5 B29 (CRR)
(30) Predicasts. Basebook.
Each volume covers approximately 12 years of data. Arranged by Standard Industrial Classification, this source gives statistical data on all types of products and industries. Also included, at the beginning of the volumes, are statistical data on general economic and demographic indicators such as population, marriages, deaths, and income.

338.7 Ed42e (ENR)
(31) Edison Electric Institute. Statistical Yearbook of the Electric Utility Industry.
Gives data dealing with the electrical utility industry. Updates the Historical Statistics volume (below).

338.7 Ed42h (CRR)
(32) Edison Electric Institute. Historical Statistics of the Electric Utility Industry Through 1970.
This contains historical data, some tables starting in 1920, on all aspects of the electric utility industry such as generating capacity, electric power and energy resources, sales, and revenues. For data after 1970 see above title.

339.3 Or3n (CRR, REX)
(33) Organization for Economic Co-operation and Development. National Accounts Statistics. 2 vols. and Development, annual.
Main aggregates are given for each OECD country for the period 1950-1979. Volume 2 contains detailed national account statistics for each member country.

339.3 Un2y (CRR, REX)
(34) United Nations. Department of International Economic and Social Affairs. Statistical Office. Yearbook of National Accounts Statistics. 2 vols. New
Includes detailed national account statistics for approximately 155 countries for the period 1968-1979. Length of time series varies from table to table.

370 Un35s (REX, UGR)
(35) United Nations Educational, Scientific and Cultural Organization. Statistics Yearbook. Organization, annual.
Provides data on education, science and technology, and cultural activities for about 200 countries.

382 In875dal (CRR)
(36) International Monetary Fund. Bureau of Statistics. Direction of Trade Statistics Yearbook.
This volume gives data on the "distribution by trade partners of total exports and imports of 154 countries, as well as area and world aggregates showing trade flows among major areas of the world." Each year's issue covers seven years of data.

382 T6751 (REX)
(37) Food and Agriculture Organization of the United Nations. FAO Trade Yearbook. annual.
Agricultural and aggregate food product statistics are presented in tables arranged by trade index numbers, commodities, and countries. Previous title: Trade Yearbook (382T675).

Q.382 Un3ds (CRR); DOC C59.11: (DOC)
(38)
Provides current business statistics in numerous areas such as foreign trade of the
Historical data is available in its biennial supplement Business Statistics, Q.382Un3ds sup. (see Core List).

382 Un17y (REX)
(39) United Nations. Department of International Economic and Social Affairs. Statistical Office. Yearbook on International Trade Statistics. 2 vols. New
Detailed trade data for individual countries and world trade statistics for specific commodities.

614.1 W893s (REX)
(40) World Health Organization. World Health Statistics Annual. 3 vols. World Health Organization, annual.
This publication ". . . provides data on Vital Statistics and Causes of Death, Cases of Infectious Diseases and Health Personnel and Hospital Establishments." Covers many countries. Time series statistics may be gathered from volumes of previous years.

NOTE: Many books in the Documents Library have been shelved according to the Superintendent of Documents Classification System (SuDocs). SuDocs classifies documents according to the government agency which issues the document: e.g., call numbers for Department of Labor publications begin with L; call numbers for Department of Commerce publications begin with C. Government documents are thus shelved alphabetically by issuing agency. They are then shelved in numerical order according to the number which follows the agency letter. Some older documents still have Dewey Decimal Classification numbers. Thus there are two sets of shelves in the Documents Library: Dewey and SuDocs. Ask for assistance from the library staff if you have difficulty finding a government document.

DOC. A1.34:672 (DOC)
(41) Expenditures, 1960-1980.
Provides 20 years of national data on per capita consumption of various food products. Also gives consumer price indexes (1960-1980) of selected categories of food products. Breakdown of supply services and utilization characteristics for specific commodities are included as well.

DOC. A1.47: (DOC)
(42) Gov't. Printing Office, annual.
Area, yield, production, export, value and other statistics for all manner of agricultural goods. Also gives statistics on various financial aspects of farming. Most tables provide time series; number of years given varies.

DOC. C3.186: (DOC)
(43)
Several series provide annual updates, at the national level primarily, of population characteristics collected in more detail in the decennial census. Subjects covered include income (P-60 series) and family/household characteristics (P-20 series).

DOC. C59.2:In2+4 (DOC)
(44)
Gives per capita personal income, disposable personal income and total personal income for states from 1929-1982. Also includes time series data on sources of income and earnings in each state (e.g., farm/nonfarm; manufacturing; services; construction, etc.).

DOC. C59.11+4:In2+929-76 (DOC)
(45) Accounts of the United States, 1929-1976: Statistical Tables.
Provides time series data on numerous aspects of the

DOC. ED1.109: (DOC)
(46)
Presents statistics on a variety of topics and issues affecting education at all levels. Subjects covered vary from year to year. Time periods for which statistics are presented also vary.

DOC. HE3.3+3: (DOC)
(47) Statistical Supplement.
Many time series tables providing data on payments under both social security programs and related social and health programs. Some tables relate social security benefits to other areas such as employment, earnings, and poverty status.

DOC. HE19.202:F49+3+976
(48) U.S. National Institute of Education. Higher Education Financing in the Fifty States.
Appendix B contains cross-sectional data for states on amounts and sources of public funds expended for institutions of higher education.

DOC. HE19.202:T19 (DOC)
(49) U.S. National Institute of Education. Tax Wealth in Fifty States.
Statistics on tax capacity and tax collected for various types of taxation (e.g. income, property) for the 50 states. Provides cross-sectional rather than time series data.

DOC. L2.3:1312-11 (DOC)
(50) 1909-1978.
Includes historical, national data for individual non-agricultural industries. Arrangement is by Standard Industrial Classification Code.

DOC. L2.3:1370-17 (DOC)
(51) and Areas 1939-1982.
A companion volume to the above, this provides similar information for all states, the

DOC. L2.3:2096 (DOC)
(52) Current Population Survey: a Databook. 2 vols. Printing Office, 1982.
Contains many tables providing detailed demographic data related to employment, unemployment, size of the labor force, etc. Income and employment are correlated with sex, age, and racial characteristics of the population.

DOC. L2.41+2: (DOC)
(53) U.S. Bureau of Labor Statistics. Employment and Earnings. Washington: U.S. Gov't. Printing Office, monthly.
Provides monthly and quarterly statistics on employment, unemployment, and weekly earnings of persons in the labor force. Presents data for most categories by age, sex, race and marital status. Employment by industry is also given.

DOC. TD2.23: (DOC)
(54) U.S. Federal Highway Administration. Highway Statistics, Summary to 1975. Washington: U.S. Gov't. Printing Office, 1977.
Annual supplements published under title, Highway Statistics (625.7H54). Provides annual data on topics such as motor vehicle registration, motor fuel consumption, highway finance, highway usage and traffic fatality rates. Most tables have state and national data, but some contain statistics at national level only.

DOC. Y4.Ec7:Ec7+ (DOC)
(55) Economic Indicators. Washington: U.S. Gov't. Printing Office, monthly.
Each issue provides annual and monthly or quarterly data concerning employment and wages, prices, money supply and interest rates, as well as income and production. Several issues need to be used to locate 30 units of data in any of these areas.

DOC. Y4.In8+4: (DOC)
(56) U.S. House of Representatives. Committee on Interstate and Foreign Commerce. The Energy Factbook. Washington: U.S. Gov't. Printing Office, 1980.
"Data on energy resources, reserves, production, consumption, prices, processing, and industry structure." Time period covered varies considerably from table to table.

INDEX
Numbers refer to the numbers of titles in the Extended List only.
A. General Sources: 1, 2, 4, 5, 6, 7, 8, 9, 35
B. Agriculture: 41, 42
C. Economy: 3, 6, 10, 13, 14, 15, 16, 18, 30, 38, 44, 45, 55
D. Education: 35, 46, 48
E. Energy: 27, 31, 32, 56
F. Finance: 18, 19, 20, 21, 22, 33, 34, 49, 55
G. Labor: 15, 16, 17, 50, 51, 52, 53
H. Politics, Government: 11, 12, 23, 33, 34, 49
I. Productivity and Commodity Data: 24, 25, 26, 27, 28, 29, 30, 41, 42
J. Social: 6, 35, 43, 47, 52
K. Trade: 36, 37, 39
L. Transportation: 54
M. Vital and Health Statistics: 40
3.2 MODELLING PROJECT 2 (MP2)
3.2.1 SAS AND THE MP2 ASSIGNMENT. For the MP1 assignment, students are given
the following: a) the dependent and
independent variables, and by implication, the population model; b) a set of 29
observations on the system with which the student can evaluate the model; and
c) a SAS program to analyze this data.
In contrast, MP2 is essentially unstructured. Here, the student constructs a regression
model of a real-world system of interest to him/her. First, the student must decide what
is to be explained (e.g., sales of a firm, highway deaths, capital spending in the
economy) and the factors which are related to or explain this variable. Next, data must
be found to evaluate this hypothesized model. Finally, a SAS program must be written to
undertake a statistical analysis of the data.
In order to place this discussion in the context of a specific problem, suppose you are
interested in the factors which explain highway deaths. The dependent variable in this
case would be highway deaths, expressed in thousands per year. Let us refer to this
variable in the SAS program as HDEATH. After researching the problem you hypothesize
that the following factors explain highway deaths: a) the number of licensed drivers in
millions (DRIVE); b) the number of miles of limited access highway in hundreds of miles
(LWAY); c) expenditure on police enforcement in millions of dollars (POL); d) whether
or not there is a highway speed limit (a dummy variable, HSL); and e) whether there is
a seatbelt requirement (a dummy variable, BELT). After reviewing the section on "Library
Sources for Statistical Data" which appears in the Guide to Economic Statistics, you
find data on each of these variables for the past 40 years.
It will now
be necessary to change the SAS program used with MP1 to run this new data. We recommend that you change the MP1 program,
rather than attempting to write a new program from scratch. You will see shortly that only a limited
number of lines must be changed in that program to accommodate the new data and
the new model you wish to analyze.
The first
step will be to enter the new data into the file called SASDT INPUT A. Assume you use the following columns (fields)
for the variables in the highway deaths example:
Variable     Fields
HDEATH       1-10
DRIVE        11-20
LWAY         21-30
POL          31-40
HSL          41-43
BELT         44-46
Obviously, you cannot enter this data into the file which has the data from MP1.
Instead, issue the command ERASE SASDT INPUT A, and then use xedit to create a new
file called SASDT INPUT A. This is done with the following two commands:
ERASE SASDT INPUT A
XEDIT SASDT INPUT A
You will now follow the instructions given earlier for
entering the new data into this newly created file. Be sure to enter the data in the columns
specified by the fields above.
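Because the INPUT line reads each variable from fixed columns, it can help to check the
layout of a data line before typing it. The following is only a minimal sketch in
Python and is not part of the course software: the names FIELDS, format_row and
parse_row are hypothetical, and the sample observation is made up purely to
illustrate the column layout given in the field table above.

```python
# Column layout copied from the field table above (1-based, inclusive).
FIELDS = {
    "HDEATH": (1, 10),
    "DRIVE": (11, 20),
    "LWAY": (21, 30),
    "POL": (31, 40),
    "HSL": (41, 43),
    "BELT": (44, 46),
}


def format_row(values):
    """Right-justify each value inside its field and return one data line."""
    parts = []
    for name, (start, end) in FIELDS.items():
        width = end - start + 1
        text = str(values[name])
        if len(text) > width:
            raise ValueError(f"{name}={text!r} does not fit in columns {start}-{end}")
        parts.append(text.rjust(width))
    return "".join(parts)


def parse_row(line):
    """Read the values back by column position, the way column input would."""
    return {name: float(line[start - 1:end]) for name, (start, end) in FIELDS.items()}


# Made-up observation, for illustration only (not real data).
sample = {"HDEATH": 45.2, "DRIVE": 150.0, "LWAY": 420, "POL": 3100.5, "HSL": 1, "BELT": 0}
line = format_row(sample)
print(line)
print(parse_row(line))
```

If a value is too wide for its field, format_row raises an error rather than letting
it spill into the neighboring variable's columns.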
Once the
data has been typed into the file, you must now change the SAS program to read
the new data, and to construct the required regression model. Most of the lines in this job can be used as
printed to build other models. However,
lines dealing with input, specification of the model, and plots must be
changed. These lines are:
INPUT line (line 5)
MODEL line (line 8)
PLOT lines (lines 17-24)
To change these lines in the SAS program, you must enter
the file called SASPRG SAS A using the xedit command:
XEDIT SASPRG SAS A
Assuming that you are now in the SAS program file, we
now consider each of these line changes in turn.
3.2.1.1 THE NEW INPUT LINE. The input line must now tell the computer the
columns of SASDT INPUT A which contain each of your
variables. To change this line in the
program move the cursor to the INPUT line and retype the variables and fields
for the MP2 problem. For our example,
the INPUT line becomes:
INPUT HDEATH 1-10 DRIVE 11-20 LWAY 21-30 POL 31-40 HSL 41-43 BELT 44-46;
3.2.1.2 THE NEW MODEL LINE(S). To instruct the computer to build a new
regression model involving the new variables, you would change the model line
to:
MODEL HDEATH=BELT HSL POL LWAY DRIVE/R DW;
As before, the order of the independent variables is not
important.
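The DW option at the end of the MODEL line requests the Durbin-Watson statistic, which
is computed from the regression residuals as the sum of squared differences between
successive residuals divided by the sum of squared residuals. As a rough illustration
only, the calculation can be sketched in Python; this is not how SAS is invoked, the
function name durbin_watson is hypothetical, and the residuals below are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e[t] - e[t-1])**2) / sum(e[t]**2)."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Made-up residuals that alternate in sign (negative autocorrelation).
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 12 / 4 = 3.0
```

Values near 2 suggest little first-order autocorrelation in the residuals; values
toward 0 or 4 point to positive or negative autocorrelation respectively.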
SAS is structured so that one can estimate several regression models on the same
run. For example, if you wish to build a regression model involving only the
variables DRIVE, LWAY, and POL, then the following model line would be inserted
after the one above:
MODEL HDEATH=DRIVE LWAY POL/R DW;
A total of five models can be built on any given run, but there is one limitation
in specifying more than one model in a single run. If multiple models are specified,
the SASPRG program will only produce a set of plots for the last model specified in
the program. Thus, in the present case, if we request plots, they will be run for
the model line:
MODEL HDEATH=DRIVE LWAY POL/R DW;
3.2.1.3. THE NEW PLOT LINES. The graphs produced by the PLOT commands are
used primarily to look at problems of heteroscedasticity and to spot possible
specification errors. These graphs plot
the residuals against the independent and dependent variables.
Given the level of computer funds available in ECN 3972 for each student and the high
cost of computing and printing, you should obtain only one graph for each of your
dependent and independent variables. This means that if you construct a second model
which adds to the variables used in your first run, you should delete the PLOT lines
which caused graphs to be printed out for the variables used in your first model.
Assume for a minute your first run had only the following "model" line:
MODEL HDEATH=DRIVE LWAY POL/R DW;
and you obtained graphs for all the independent variables. Now if you make a second
run with the following model line:
MODEL HDEATH=BELT HSL POL LWAY DRIVE/R DW;
you would only list plot lines for variables not in the first run (i.e., BELT and
HSL). Your plot lines in this second run would be:
PROC PLOT;
PLOT RSDULS*BELT='+'/VREF=0.0;
PROC PLOT;
PLOT RSDULS*HSL='+'/VREF=0.0;
These statements will cause residual graphs to be
produced for the seatbelt and highway speed limit variables.
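The rule for choosing plot lines on a second run (plot only the variables that were
not graphed on the first run) can be written out as a short Python sketch. It is
offered only as an illustration of the rule and is not part of the SAS program; the
function name plot_lines is hypothetical, and the statements it prints follow the
pattern of the PROC PLOT lines shown above.

```python
def plot_lines(first_run, second_run):
    """Return PROC PLOT statements for variables that are new in the second run."""
    already_plotted = set(first_run)
    lines = []
    for var in second_run:
        if var not in already_plotted:
            lines.append("PROC PLOT;")
            lines.append(f"PLOT RSDULS*{var}='+'/VREF=0.0;")
    return lines


# Independent variables from the two MODEL lines in the highway deaths example.
first = ["DRIVE", "LWAY", "POL"]
second = ["BELT", "HSL", "POL", "LWAY", "DRIVE"]
for statement in plot_lines(first, second):
    print(statement)
```

For the highway deaths example this prints the same four statements listed above,
one PROC PLOT and PLOT pair each for BELT and HSL.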
Students
interested in other refinements in the use of the SAS statistical package
should purchase a SAS manual and investigate them at their convenience.
3.2.2 COMPUTER COMMANDS FOR MP2. This section lists the computer commands
required to make a first run for MP2 using a different example. To illustrate these computer commands,
suppose we wish to explain the annual crop of pigs (y) on Old MacDonald's farm
for the last 40 years. The independent
variables are the number of sows (x1) and the number of boars (x2)
in residence that year, and tons of food purchased for pig consumption during
the year (x3). In addition,
we include two dummy variables: whether
MacDonald used a temperature control system for the barn (x4), and
whether spring came early or late that year, measured by the groundhog factor
(x5).
Assuming the y-xi relationships are linear, we may write the population
regression model as:
3) y = a + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + e.
The first
run will estimate this model using 40 observations. Once you have logged on, you must erase the
data file containing the MP1 data, create a new data file for 40 lines of data
and enter the data into it. Next, you
must change the SAS program to fit the problem for MP2. This entails changing the INPUT line and the
fields, changing the MODEL line, and adding the PROC PLOT lines. Finally, the MP2 SAS program is run on the MP2
data, a file containing a listing of the regression results is checked and a
hard copy is made. Note that the
variables in this program are the ones used in the example above. You will of course replace these with your
own variables.
commands                                response

(turn terminal on) "e"                  VMD logo, then CP READ
DIAL PVM "e"                            L VMD N VME N VMC
(check cursor is under VME) "e"         VME logo, then CP READ
"e"
logon ec173xxx "e"                      PASSWORD
xxxx "e"                                VM READ
profile "e"                             R;T=stuff
erase sasdt input a "e"                 R;T=stuff
xedit sasdt input a "e"                 SASDT INPUT A, etc.
                                        00000 * * * TOP OF FILE * * *
                                              :...+....1....+..etc..:
                                        00001 * * * END OF FILE * * *
(move cursor to line 00000 prefix       00000 * * * TOP OF FILE * * *
and type): 40a "e"                            :...+....1....+..etc..:
                                        00001
                                        00002
                                          .
                                          .
                                        00040
(move cursor to line 00001; input data for each observation, taking care to
enter data in columns consistent with the fields to be specified in the INPUT
line of the MP2 SAS program. Use "r" or the cursor to move to next line. Use
ALT-PF8 to scroll the screen forward.):
commands (cont'd.)                      response

00000 * * * TOP OF FILE * * *
      :...+....1....+....2....+....3....+....4....+....+..etc..:
00001        146        91        10     500.6   1   0  "r"
00002        125        82        10     482.7   0   0  "r"
00003        139        74        13     522.9   0   1  "r"
  .
  .
00040        157        79        13     509.3   1   1  "r"
00041 * * * END OF FILE * * *

"e"                                     moves cursor to command line
file "e"                                R;T=stuff
xedit sasprg sas a "e"                  SASPRG SAS A, etc.
(move cursor to input line, line 00005, and type over):
input pigs 1-10 sows 11-20 boars 21-30 food 31-40 temp 41-44 ghog 45-48;
(move cursor to model line, line 00008, and type over):
model pigs=sows boars food temp ghog/r dw;
(move cursor to prefix of last input    adds 10 more input lines for proc
line, line 00012, and type): 10a "e"    statements
(move cursor to line 00013; input proc plot statements for each variable):
00012 option ls=130;
00013 proc plot; "r"
00014 plot rsduls*pigs='+'/vref=0.0; "r"
00015 proc plot; "r"
00016 plot rsduls*sows='+'/vref=0.0; "r"
  .
  .
00021 proc plot; "r"
00022 plot rsduls*ghog='+'/vref=0.0; "r"
00023 * * * END OF FILE * * *
commands (cont'd.)                      response

"e"                                     moves cursor to command line
file "e"                                R;T=stuff
sas sasprg "e"                          tlog option ignored
                                        R;T=stuff
xedit sasprg listing a "e"              SASPRG LISTING A, etc.
(ALT-PF8 and ALT-PF7 to check for sample size, n=40; the Durbin-Watson,
statistics, plots, etc.)
quit "e"                                R;T=stuff
nprint sasprg listing a (cc ej dest     FILE (SASPRG LISTING A) PRINTED
rm06 bin xx "e"                         BIN xx DEST=RM06
                                        R;T=stuff
                                        FROM UIUCVMD(ROUTER): JOB etc.
logoff "e"                              L VMD N VME N VMC
For the
second run, drop the variables which are insignificant or of the wrong sign and
run the regression again. The computer
procedures for this entail logging on, changing the model line and deleting all
the plot commands in the MP2 SAS program file, running the modified MP2 SAS
program on the MP2 data, checking the file which lists the results of the
regression, and printing it. For the MP2
second run, apply the procedure outlined for the second run of MP1.