Please note the copyright notice below. Most of the material on this site will be part of a book on statistics/econometric methods to be published by Cambridge University Press. Read more about it at Happy to receive feedback.

Older version (pre-2020) of this class. The new version is here.

Before 2020: This class is the first of a two-class sequence on methods in health services research and policy evaluation. It emphasizes both statistical theory and its implementation. Topics are covered from different methodological traditions: econometrics, biostatistics, and (some) epidemiology. There is a lot of "translation" from one discipline to another. For example, causality is covered using the new causal inference literature but also the way economists have traditionally understood causality (i.e., the zero conditional mean assumption and selection on observables). The linear regression model is covered in the traditional econometrics way (ordinary least squares, the Gauss-Markov theorem) but also in the way the "general" linear model is presented in the analysis of experiments (ANOVA, etc.). Maximum likelihood estimation is covered early on as a general framework for model selection using likelihood ratio tests. Models are interpreted in several ways, with emphasis on analytical and numerical derivatives (that is, marginal effects; see Lecture 23). Simulations are used for every topic, including using the estimated model for Bayesian-like hypothesis testing. Other topics include exploratory data analysis, logit/probit, Poisson regression, generalized linear models (GLMs), model selection, and bootstrapping. This class is also an introduction to Stata, but we do compare Stata to R and SAS when relevant.

Email me if you want the Stata code for lectures. Lectures are written in LaTeX. Code. Problem sets, answer keys, and readings are available on Canvas for registered students, but I am happy to share them.


Lecture Notes

Helpful before the class starts: Concepts to know.

HSR syllabus (older version, before 2020)

Lecture 1: Overview of regression analysis and class

Lecture 2: Introduction to Stata

Lectures 3 and 4: Review of probability and mathematical statistics

Lecture 5: Causal inference

Lecture 6: Simple linear regression

Lecture 7: Simple linear regression (properties, testing)

Lecture 8: Simple linear regression (fit, confidence intervals, simulations)

Lecture 9: Multiple linear regression

Lecture 10: Multiple linear regression II

Lecture 11: Maximum likelihood estimation (MLE)

Lecture 12: Regression assumptions diagnostics I

Lecture 13: Regression diagnostics II

Lecture 14: Qualitative predictors (ANOVA, effects coding, etc.)

Lecture 15: Modeling I

Lecture 16: Modeling II (variable transformations, etc.)

Lecture 17: Heteroskedasticity I

Lecture 18: Heteroskedasticity II

Lecture 19: Collinearity

Lecture 20: Bias-variance, adjusting, plus other things

Lecture 21: Linear probability model, logistic, probit

Lecture 22: Logistic regression

Lecture 23: Margins and marginal effects

Lecture 24: Probit, variable selection (AIC, BIC)

Lecture 25: Bootstrap and methods II

Lecture 26: Review for final

© Marcelo Coca Perraillon, 2021. No part of the materials available through the site may be copied, photocopied, reproduced, translated or reduced to any electronic medium or machine-readable form, in whole or in part, without prior written consent of the author. Any other reproduction in any form without the permission of the author is prohibited. All materials contained on this site are protected by United States copyright law and may not be reproduced, distributed, transmitted, displayed, published or broadcast without the prior written permission of the author.