Schedule
I. Introduction to the course (Class 1)
A. Who should take the class
B. The textbook and class logistics
C. Topics to be covered during the semester
D. Introduction to WebCT & CENTRA
E. Brief introduction to SPSS on the PC
II. Chapter 1 Drawing statistical conclusions (1) (Class 2)
A. Case studies
 Motivation and Creativity
 Sex discrimination in employment
B. Statistical inference and study design
C. Measuring uncertainty in randomized experiments
D. Measuring uncertainty in observational studies
E. Related issues
F. Summary
III. Chapter 2 Inference using t-distributions (28) (Class 3-4)
A. Case studies
 Bumpus’s Data on Natural Selection — An observational study
 Anatomical abnormalities associated with schizophrenia — An observational study
B. One-sample t-tools and the paired t-test
 The sampling distribution of a sample average
 The standard error of an average
 The t-ratio based on a sample average
 Unraveling the t-ratio
C. A t-ratio for two-sample inference
 Sampling distribution of the difference between two independent sample averages
 Standard error for the difference of two averages
D. Inferences in a two-treatment randomized experiment
E. Related issues
F. Summary
G. Exercises
IV. Chapter 3 A closer look at assumptions (56) (Class 5)
 Cloud seeding to increase rainfall – A randomized experiment
 Effects of Agent Orange on troops in Viet Nam – An observational study
 The Logarithmic Transformation
 Interpretation after a Log Transformation
 Other Transformations for Positive Measurements
V. Chapter 4 Alternatives to the t-tools (85) (Class 6)
A. Case studies
1. Space shuttle O-Ring Failures – An observational study
2. Cognitive Load Theory in Teaching – A Randomized Experiment
B. The rank-sum test
 The Rank Transformation
 The Rank-sum statistic
 Finding a p-value by normal approximation
 A confidence interval based on the rank-sum test
C. Other alternatives for two independent samples
 Permutation tests
 The Welch t-test for comparing two normal populations with unequal spreads
D. Alternatives for paired data
 The Sign test
 The Wilcoxon Signed Rank Test
E. Related issues
 Practical and Statistical Significance
 The Presentation of Statistical Findings
 Levene’s test for Equality of two variances
 Survey sampling
F. Summary
G. Exercises
VI. Chapter 5 Comparisons among several samples (113) (Class 7-9)
A. Case studies
 Diet restriction and longevity
 Benjamin Spock conspiracy trial
B. Comparing any two of the several means
 An ideal model
 The pooled standard deviation
 t-tests and confidence limits for differences of means
C. The one-way analysis of variance F-test
 The extra sum of squares principle
 The ANOVA table
 More applications of the extra sum of squares principle
D. Robustness and model checking
 Robustness to assumptions
 Diagnostics using residuals
E. Related issues
 Further illustrations of different sources of variability
 Kruskal-Wallis nonparametric ANOVA
 Random effects
 Separate confidence intervals and significant differences
F. Summary
G. Exercises
1. Ex 5.23 T. rex temperature
VII. Chapter 6 Linear combinations and multiple comparisons of means (149) (Class 9 & 10)
A. Case studies
 Discrimination against the handicapped
 Sexual selection in swordtails
B. Inferences about linear combinations of group means
C. Simultaneous inferences
D. Some multiple comparison procedures
E. Related issues
F. Summary
G. Exercises
VIII. Chapter 7 Simple linear regression: a model for the Mean (174) (Class 11)
A. Case studies
 The Big Bang
 Meat Processing and pH
B. The simple linear regression model
C. Least squares regression estimation
D. Inferential tools
 Tests and confidence limits for slope and intercept
 Describing the distribution of the response at some value of explanatory variable
 Prediction of a future response
 Calibration: Estimating the X that results in Y = Y_0 (see also Draper & Smith 1998, Chapter 3)
E. Related issues
F. Summary
G. Exercises
IX. Chapter 8 A closer look at assumptions for simple linear regression (206) (Class 12-14)
A. Case studies
 Island area and number of species – an observational study
 Breakdown times for insulating fluid under different voltages – a controlled experiment
B. Robustness of least-squares inferences
C. Graphical tools for model assessment
D. Interpretation after log transformations
E. Assessment of fit using the analysis of variance
F. Related issues
G. Summary
H. Exercises
X. Chapter 9 Multiple Regression (235) (Class 14)
A. Case studies
 Effect of light on meadowfoam flowering – a randomized experiment
 Why do some mammals have large brains for their size – an observational study
B. Regression coefficients
 The multiple linear regression model
 Interpretation of regression coefficients
C. Specially constructed explanatory variables
 A squared term for curvature
 An indicator variable to distinguish between two groups
 Sets of indicator variables for categorical explanatory variables with more than two categories
 A product term for interaction
 A shorthand notation for model description
D. A strategy for data analysis
E. Graphical methods for data exploration and presentation
F. Related issues
G. Summary
H. Exercises
XI. Midterm Exam (3/22/06 W) (Class 15)
XII. Chapter 10 Inferential tools for multiple regression (267) (Class 16)
A. Case studies
 Galileo’s data on the motion of falling bodies – a controlled experiment
 The energy costs of echolocation by bats – an observational study
B. Inferences about regression coefficients
 Least squares estimates and standard errors
 Tests and confidence intervals for single coefficients
 Tests and confidence limits for linear combinations of coefficients
 Prediction
C. Extra-sum-of-squares F-tests
D. Related issues
E. Summary
F. Exercises
XIII. Chapter 11 Model checking and refinement (304) (Class 17)
A. Case studies
 Alcohol metabolism in men and women – an observational study
 The blood-brain barrier – a controlled experiment
B. Residual plots
C. A strategy for dealing with influential observations
 Assessment of whether observations are influential
 What to do with influential observations
D. Case-influence statistics
 Leverages for flagging cases with unusual explanatory variable values
 Studentized residuals for flagging outliers
 Cook’s distances for flagging influential cases
 A strategy for using case influence statistics
E. Refining the model
 Testing terms
 Partial residual plots
F. Related Issues
 Weighted regression for certain types of nonconstant variance
 Measurement errors in explanatory variables
G. Summary
H. Exercises
XIV. Chapter 12 Strategies for variable selection (338) (Class 18-19)
A. Case Studies
 State average SAT scores – an observational study
 Sex discrimination in employment – an observational study
B. Specific issues relating to many explanatory variables
 Objectives
 Loss of precision
 A strategy for dealing with many explanatory variables
C. Sequential variable selection techniques
 Forward selection
 Backward elimination
 Stepwise regression
 Sequential variable selection with the SAT data
 Compounded uncertainty in stepwise procedures
D. Model selection among all subsets
E. Analysis of the Sex discrimination data
F. Related issues
G. Summary
H. Exercises
XV. Chapter 13 The Analysis of Variance for Two-way classifications (374) (Class 20-21)
A. Case studies
 Intertidal seaweed grazers – A randomized experiment
 The Pygmalion effect in training programs – A randomized experiment
B. Additive and nonadditive models for two-way tables
 The Additive Model: a regression parameterization for the additive two-way model
 The Saturated, nonadditive model
 A strategy for analyzing two-way tables with several observations per cell
 The analysis of variance F-test for additivity
C. Analysis of the seaweed grazer data
 Initial assessment of additivity, outliers and the need for transformation
 The analysis of variance table from the fit to the saturated model
 The analysis of variance table for the fit to the additive model
 Answers to specific questions of interest using contrasts
 Answers to specific questions of interest using multiple regression with indicator variables
D. Analysis of the Pygmalion data
 Initial Exploration and check on the additive model
 Answering the question of interest with regression
 A closer look at the regression estimate of treatment effect
 The p-value in the randomization distribution
E. Related Issues
 Additivity and nonadditivities
 Orthogonal contrasts
 Randomized blocks and paired-t analyses
 Should insignificant block effects be eliminated from the model?
 Multiple comparisons
 An alternate parameterization for the additive model
F. Summary
G. Exercises
XVI. Chapter 14 Multifactor studies without replication (409) (Class 22)
A. Case Studies
 Chimpanzees Learning Sign language – a controlled experiment (Fouts 1973)
 Effects of ozone in conjunction with sulfur dioxide and water stress on soybean yield – a randomized experiment
B. Strategies for analyzing tables with one observation per cell
C. Analysis of the Chimpanzee learning times study
D. Analysis of the soybean data
E. Related issues
1. Nested ANOVA
F. Summary
G. Exercises
XVII. Chapter 15 Adjustment for serial correlation (436) (Class 23)
A. Case Studies
 Logging practices and water quality – an observational study
 Measuring global warming – an observational study
B. Comparing the means of two time series
 Serial correlation and its effect on the average of a time series
 The standard error of an average in a serially correlated time series
 The first serial correlation coefficient
 Pooling estimates and comparing means of two independent time series with the same first serial correlation
C. Regression after Transformation in the AR(1) model
 The serial correlation coefficient based on regression residuals
 Regression with filtered variables
D. Determining if serial correlation is present
 An easy large-sample test for serial correlation
 The nonparametric runs test
 The Durbin-Watson test statistic
E. Diagnostic procedures for judging the adequacy of the AR(1) model
 When is a transformation of a time series indicated
 The partial autocorrelation function (PACF)
 Bayesian information criterion
F. Related Issues
G. Summary
H. Exercises
XVIII. Chapter 16 Repeated Measures (462) (Class 24)
A. Case Studies
 Sites of short- and long-term memory – A controlled experiment
 Oat Bran and cholesterol — A randomized crossover experiment
B. Tools and strategies for analyzing repeated measures
 Types of repeated measures studies
 Profile plots for graphical exploration
 Strategies for analyzing repeated measures
C. Comparing the means for bivariate responses in two groups
 Summary statistics for bivariate responses
 Pooled variability estimates
 Hotelling’s T^2 statistic
 Checking on assumptions
 Confidence ellipses
D. Related Issues
E. Summary
F. Exercises
XIX. Chapter 17 Exploratory tools for summarizing multivariate responses (497) (Class 25)
A. Case studies
 Magnetic force on rods in printers
 Love and Marriage — an observational study
B. Linear combinations of variables
C. Principal components analysis
 The PCA train
 Principal components
 Variables suggested by PCA
 Scatterplots in PCA space
 The factor analysis model and PCA
 PCA usage
D. Canonical correlation analysis
E. Introduction to other multivariate tools
 Discriminant function analysis
 Multidimensional scaling
 Correspondence analysis
 PCA and Empirical Orthogonal Functions (EOFs)
F. Summary
G. Exercises
XX. Chapter 18 Comparisons of proportions or odds (529)
A. Case Studies
 Obesity and heart disease in American Samoa
 Vitamin C and the common cold
B. Inferences for the difference of two proportions
C. Inference about the ratio of two odds
D. Inference from retrospective studies
E. Summary
F. Exercises
XXI. Chapter 19 More tools for tables of counts (552)
A. Case studies
 Sex role stereotypes and personnel decisions – a randomized experiment
 Death penalty and race of murder victim – an observational study
B. Population models for 2 x 2 tables of counts
 Hypotheses of homogeneity and independence
 Sampling schemes leading to 2 x 2 tables
 Testable hypotheses and estimable parameters
C. The chi-squared test
 The Pearson chi-squared test for Goodness of Fit
 Chi-squared test of independence in a 2 x 2 table
 Equivalence of several tests for 2 x 2 tables
D. Fisher’s exact test: the randomization (permutation) test for 2 x 2 tables
 The randomization distribution of the difference in sample proportions
 The hypergeometric formula for one-sided P-values
 Fisher’s exact test for observational studies
 Fisher’s exact test versus other tests
E. Combining results from several tables with equal odds ratios
 The Mantel-Haenszel Excess
 The Mantel-Haenszel test for equal odds in several 2 x 2 tables
 Estimate of the common odds ratio
F. Related issues
 r x c tables of counts
 Higher dimensional tables of counts
 Analysis of SUV fatalities & the Ford Explorer (new problem)
G. Summary
H. Exercises
XXII. Chapter 20 Logistic regression for binary response variables (579)
A. Case studies
 Survival in the Donner Party – An observational study
 Birdkeeping and lung cancer – A retrospective observational study
B. The logistic regression model
C. Estimation of the logistic regression coefficients
D. The drop-in-deviance test
E. Strategies for data analysis using logistic regression
F. Analysis of case studies
G. Related issues
H. Summary
I. Exercises
XXIII. Chapter 21 Logistic regression for binomial counts (609)
A. Case studies
 Island size and bird extinctions – an observational study
 Moth coloration and natural selection – A randomized experiment
B. Logistic regression for binomial responses
C. Model assessment
D. Inferences about logistic regression coefficients
E. Extra-binomial variation
F. Analysis of moth predation data
G. Related issues
H. Summary
I. Exercises
XXIV. Chapter 23 Elements of Research design (669)
A. Case study Biological control of a noxious weed – a randomized experiment
B. Considerations for forming research objectives
C. Research design tool kit
 Controls and placebos
 Blinding
 Blocking
 Stratification
 Covariates
 Randomization
 Random sampling
 Replication
 Balance
D. Design choices that affect accuracy and precision
 Attaching desired precision to practical significance
 How to improve a confidence interval
E. Choosing a sample size
 Studies with a numerical response
 Studies comparing two proportions
 Sample size for estimating a regression coefficient
F. Steps in designing a study
 Stating the objective
 Determining the scope of inference
 What experimental units will be used?
 What are the populations of interest
 Understanding the system
 Deciding how to measure a response
 Listing factors that can affect the response
 Planning the conduct of the experiment
 Outlining the statistical analysis
 Determining the sample size
G. Related issues – a factor of four
H. Summary
I. Exercises
XXV. Chapter 24 Factorial treatment arrangements and blocking designs
A. Case study
1. Amphibian crisis linked to ultraviolet – a randomized experiment
B. Treatments
 Choosing treatment levels
 The rationale for several factors
C. Factorial arrangement of treatment levels
1. Definition and terminology for a factorial arrangement
2. The 2^2 factorial structure
3. The 2^3 factorial structure
4. The 3^2 factorial structure
5. Higher order factorial arrangements
D. Blocking
 Randomized blocks
 Latin square blocking
 Split Plot designs
E. Summary
F. Exercises
XXVI. Chapter 22 Log-linear regression for Poisson counts
A. Case studies
 Age and elephant mating success
 Treatment for epileptic seizures
B. Log-linear regression for Poisson responses
C. Model assessment
D. Inferences about log-linear regression coefficients
E. Extra-Poisson variation and the log-linear model
F. Further issues
G. Summary
H. Exercises