Intermediate statistics - Rutgers School of Criminal Justice (27:202:543)
Intermediate Statistics | 27:202:543 |
Lecture: Friday, 10:00AM - 12:40PM | Room: CLJ-574 |
Lab: Tuesday 1:00PM - 2:30PM | Room: CLJ-567 |
frank.edwards@rutgers.edu | Office hours: Wednesday 12:00PM - 2:00PM |
This course introduces students to Bayesian data analysis and applied regression modeling.
I’ve set up a Slack page for us to communicate about the course. This can be a resource for you to collaborate and ask me questions about homework, and will also be a spot where course announcements are posted. Invites will be circulated before the course begins.
Come prepared. This is a relatively small and advanced course. I expect everyone to participate actively in course discussions.
Please complete and submit assignments on time.
Be respectful and professional. Be mindful of the space you take up in the classroom. Food and drink are allowed, but please keep cell phone use and non-course-related computer use to a minimum.
Bring your computer. Most of the work we’ll be doing involves writing code, so bring a computer with you to class. Let me know if access to a laptop is an issue.
Collaborate with your colleagues. I encourage you all to work together to complete assignments. However, I do expect you each to submit your own homework writeups.
A prior graduate-level course in statistics is required. This course assumes students are comfortable with multivariate linear regression, basic probability, and statistical computing.
These math camp materials from UChicago neatly cover the math you need for graduate-level statistics courses.
Jenny Bryan’s STAT 545 course at UBC provides a very comprehensive overview of programming in R and efficient data science workflows.
All instruction will be conducted in the R statistical programming language. R is free and open-source, and can be downloaded here.
We will be using the RStudio integrated development environment. RStudio provides a powerful text editor and a range of very useful utilities.
In addition to writing code, it is a great tool for writing reports, papers, and slides using RMarkdown. This syllabus, most of my course materials, and most of my academic papers are based on Markdown and occasionally LaTeX. I strongly recommend that you use RMarkdown to complete course assignments. Other plaintext editors (emacs, vim, sublime, atom, etc.) are acceptable substitutes for RStudio, but try to avoid using MS Word or other WYSIWYG editors for assignments.
Lastly, I recommend learning some form of version control to ensure your work is a) backed up, b) easily accessible to collaborators and c) reproducible. Git and GitHub are great and flexible tools for software development that have powerful applications for researchers. Here’s a useful intro to GitHub for R users.
We will work primarily from two books.
McElreath, Statistical Rethinking: A Bayesian Course with Examples in R and Stan
Wickham’s R for Data Science is available as a free online textbook, though print versions are available if you prefer to purchase a copy.
Course grading is based on a combination of course participation (20 percent) and homework assignments (80 percent).
Homework should be submitted to me via email by 10AM on the due date.
For each week, I’ll provide a list of homework questions for you to complete. Students will have a choice to attempt the medium or hard problem set. Students attempting the medium problem set can obtain a maximum grade of 90. Students attempting the hard problem set can obtain a maximum grade of 100.
Each student may request, without penalty, one 5-day extension during the semester. I must receive an email requesting this extension before the homework due date.
Late homework will be penalized at 5 points per day late.
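As a rough illustration of how the grading rules above might combine, here is a sketch under two assumptions that the policy itself does not state: the late penalty is subtracted from the score already earned on the chosen problem set, and a grade cannot fall below zero. The symbols s (score before any penalty), c (the cap of 90 for the medium set or 100 for the hard set), and d (days late) are introduced here only for illustration.

```latex
% Assumed reading of the policy, not an official formula:
% s = score before penalty, with s <= c
% c = 90 (medium problem set) or 100 (hard problem set)
% d = number of days late (0 if on time or within an approved extension)
\[
  \text{grade} = \max\bigl(0,\; s - 5d\bigr), \qquad 0 \le s \le c
\]
% Worked example: a perfect medium problem set (s = 90) submitted 2 days late
\[
  \text{grade} = \max(0,\; 90 - 5 \cdot 2) = 80
\]
```

Under the same assumptions, a perfect hard problem set submitted on time would score 100, and one submitted 2 days late would score 90.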
Date | Topic | Reading |
1/24 | Introduction | McElreath Preface, 1, 2 |
1/31 | Sampling from the posterior | McElreath 3 |
2/7 | Linear regression | McElreath 4 |
2/14 | Multiple regression | McElreath 5 |
2/21 | Causality | McElreath 6 |
2/28 | Overfitting and comparison | McElreath 7 |
3/6 | Interactions | McElreath 8 |
3/13 | Markov Chain Monte Carlo | McElreath 9 |
3/20 | Spring break | McElreath 10 |
3/27 | Generalized Linear Models (1) | McElreath 10, 11.1, 11.2 |
4/3 | Generalized Linear Models (2) | McElreath 11.3, 11.4 |
4/10 | Mixture models | McElreath 12 |
4/17 | Multilevel models (intercepts) | McElreath 13 |
4/24 | Multilevel models (slopes) | McElreath 14 |
5/1 | Measurement error and missing data | McElreath 15 |
5/8 | Bayesian data analysis using tidyverse and brms | McElreath 17 |