Intermediate Statistics	27:202:543
Monday, Wednesday, 1:00-2:20	Room: CLJ 572
frank.edwards@rutgers.edu
Office hours Wednesday, 11:00 - 1:00	Room: CLJ 579B

Quick links

Lecture slides

Homework

Slack

Course description

This is the course syllabus for Intermediate Statistics, Spring 2025. Continuous outcomes that meet the assumptions of ordinary least squares regression are relatively rare in the social sciences. This course focuses our attention on how to estimate regression models for discrete outcomes including binary and count variables. These flexible tools allow us to more accurately model a wide range of outcomes.

We also introduce Bayesian inference as an alternative to frequentist inference and use simulation as a core tool for statistical research. The class ends with discussions of multilevel models and multiple imputation for missing data.

Communication

All course communication will occur over Slack. Check it routinely.

Course goals

Master data analysis with linear and generalized linear regression models
Develop expertise in advanced statistical programming and data visualization
Develop the ability to design and conduct quantitative research
Improve your ability to work in teams to conduct research

Expectations

Come prepared. This is a relatively small and advanced course. I expect everyone to participate actively.
Be respectful and professional. Be mindful of the space you take up in the classroom.
Collaborate with your colleagues. I encourage you all to work together to complete assignments. However, I do expect you each to submit your own homework writeups.

Prerequisites

A prior graduate-level course in statistics is required. This course assumes students are comfortable with multiple linear regression, basic probability, and statistical computing.

Review resources

These math camp materials from UChicago neatly cover the math you need for graduate-level statistics courses.

Khan Academy’s courses in calculus, statistics, and probability are great reviews for the course.

Jenny Bryan’s STAT 545 course at UBC provides a very comprehensive overview of programming in R and efficient data science workflows.

Rohan Alexander’s Telling Stories with Data provides a great introduction to practical data analysis with R.

Software

All instruction will be conducted in the R statistical programming language. R is free and open-source, and can be downloaded here.

We will be using the RStudio integrated development environment. RStudio provides a powerful text editor and a range of very useful utilities.

Beyond writing code, RStudio is a great tool for writing reports, papers, and slides using RMarkdown. This syllabus, most of my course materials, and most of my academic papers are written in Markdown. All course assignments should be completed in RMarkdown.

Book (required)

Gelman, Hill, and Vehtari, Regression and Other Stories. You may either purchase the book or access it as a free PDF here.

Books (recommended)

These recommended books are very useful, and some examples are pulled from them:

Wickham, R for Data Science

Healy, Data Visualization: A Practical Introduction

McElreath, Statistical Rethinking: A Bayesian Course with Examples in R and Stan

Assignments and grading

Course grading is based on a combination of homework assignments (50 percent) and a final project (50 percent).

Homework

All assignments will be posted on the course website. Homework should be completed in RMarkdown. Submit both your compiled HTML file and your source code.

Problem sets: I will assign weekly homework. Assignments are due each Monday before class.
All students get 2 no-questions-asked extensions for homework. Just let me know before the due date if you are using one.
Research project: You will write or revise a quantitative paper as your final project. This can be either collaborative or individual.

Course topics and schedule

1/22	Lab: Introduction and software: Reading Ch 1-4; HW1
1/27	Lecture: Inference and simulation: Reading Ch5; HW2
1/29	Lab: Simulation practice
2/3	Lecture: Linear regression review: Reading Ch 6, 7, 10; HW3
2/5	Lab: Linear regression practice
2/10	Lecture: Bayesian inference: Reading Ch 8, 9; HW4
2/12	Lab: Regression with stan_glm
2/17	Lecture: Advanced linear regression: Reading Ch 11, 12; HW5
2/19	Lab: Linear regression diagnostics
2/24	Lecture: Logistic regression (1): Reading Ch 13; HW6
2/26	Lab: Fitting logistic GLMs
3/3	Lecture: Logistic regression (2): Reading Ch 14; HW7
3/5	Lab: Interpreting logistic regression models
3/10	Lecture: Models for count data: Reading Ch 15.1-15.3; HW8
3/12	Lab: Fitting and interpreting Poisson and Negative Binomial models
3/17	SPRING BREAK
3/24	Lecture: Advanced count models
3/26	Canceled
3/31	NO CLASS: Eid
4/2	Lecture: Models for categorical outcomes: Reading Ch 15.4 - 15.8
4/7	Lecture: Design, statistical power, and missing data: Reading Ch 16, 17; HW10
4/9	Lab: Multiple imputation with mice
4/14	Lecture: Causal inference with experimental data: Reading Ch 18, 19; HW11
4/16	Lab: Analyzing data from an experiment
4/21	Lecture: Causal inference with observational data: Reading Ch 20, 21
4/23	Lab: Techniques in R for observational causal inference
4/28	Lecture: Introduction to multilevel models: Reading Ch 22
4/30	Lab: Basics of multilevel modeling
5/5	NO CLASS: Conference