LSE MY457: Causal Inference for Observational and Experimental Studies

Course Website for Winter Term 2025

Main course repo

Moodle page

Instructors

Office hours may be booked via LSE’s StudentHub. If you have questions or concerns about class material, problem sets, or the exam, please use the class forum on Moodle. We will generally not reply to emails about the course material, but we will reply promptly to questions posted on the forum. Of course, if your questions or concerns are of a private or personal nature, please email or attend office hours.

Readings and textbooks

There is a reasonable amount of reading for this class, especially in the early weeks. You are strongly encouraged to do the reading before class, paying close attention to details (i.e., do not skim over equations). In addition to some key articles, throughout the term we will dip into three main textbooks, which we will refer to by their acronyms:

Note: The three textbooks have very different flavours, and are pitched at different technical levels. MHE is the classic graduate-level text for causal inference; it is challenging but very accessible, though now a little out of date, as it was published in 2009. CIS is very dense and very technical, and serves as a reference text for much of the foundational material in the class (weeks 1-5). TE is the most accessible textbook and is very applied, while being lighter on details and generally less technically focused. The reading list is designed to allow you to pick your own adventure, to a degree.

If you are particularly interested in the course material, there will be additional readings set from the following textbooks (as well as a few articles):

Note: if you are particularly interested in graphical models and their application to causal inference, it is strongly recommended that you do all the readings from either CMRI or CISAP. CMRI is extremely technical and dense, while CISAP is a gentler (though not that gentle) introduction to some of the basics introduced in CMRI. If there are suggested readings from both books, you should choose either, not both.

Formative problem sets

Statistics is best learned by doing. There will be five problem sets, each released at 5pm on a Wednesday. You must submit each one a week later, by 11am, on Moodle. Your submission should be written in RMarkdown, and must be a knitted .pdf, formatted as shown in this problem set template, which produces a pdf that looks like this. If you do not follow the formatting requirements, your problem set will not be marked. Comments will be returned via Moodle within two weeks of submission.

| Type | Release date | Due date |
| --- | --- | --- |
| Formative problem set 1 | 29 January 2025 - 5pm | 5 February 2025 - 11am |
| Formative problem set 2 | 12 February 2025 - 5pm | 19 February 2025 - 11am |
| Formative problem set 3 | 5 March 2025 - 5pm | 12 March 2025 - 11am |
| Formative problem set 4 | 19 March 2025 - 5pm | 26 March 2025 - 11am |
| Formative problem set 5 | 2 April 2025 - 5pm | 9 April 2025 - 11am |

| Week | Lecture topic |
| --- | --- |
| 1 | Causal Frameworks |
| 2 | Randomization |
| 3 | Selection on Observables 1 |
| 4 | Selection on Observables 2 |
| 5 | Selection on Observables 3 |
| 6 | Reading week |
| 7 | Instrumental Variables 1 |
| 8 | Instrumental Variables 2 |
| 9 | Regression Discontinuity |
| 10 | Difference-in-Differences 1 |
| 11 | Difference-in-Differences 2 |

| Week | Seminar topic |
| --- | --- |
| 2 | Causality and Randomization |
| 4 | Selection on Observables |
| 7 | Instrumental Variables |
| 9 | Regression Discontinuity |
| 11 | Difference-in-Differences |

Detailed course schedule

Note: Links to slides and code will be updated/added in advance of each week’s teaching.

1. Causal Frameworks

We begin with an introduction to the class, both substantively and administratively.

We then introduce the potential outcomes framework, which will provide the technical foundations that are used throughout the rest of the class. We will also briefly introduce the graphical model for causal inference.
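
As a rough orientation, the core objects of the potential outcomes framework can be written as follows. The notation is an assumption chosen for illustration (a binary treatment indicator $D_i$ and potential outcomes $Y_i(1)$ and $Y_i(0)$) and may differ from the notation used in the lectures and textbooks.

```latex
% Observed outcome: only the potential outcome matching the treatment status is seen.
Y_i = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0)

% Unit-level causal effect (never directly observed) and the average treatment effect.
\tau_i = Y_i(1) - Y_i(0), \qquad
\tau_{\mathrm{ATE}} = \mathbb{E}\left[ Y_i(1) - Y_i(0) \right]
```

Because each unit reveals only one of its two potential outcomes, $\tau_i$ cannot be computed from data alone; the research designs in later weeks supply assumptions under which averages such as $\tau_{\mathrm{ATE}}$ can be identified.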

Lecture
Readings
Additional readings

2. Randomization

We introduce the concept of randomization and its value for causal inference. We discuss, at a high level, design, analysis, and inference for randomized experiments.
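
To make the value of randomization concrete, here is a minimal sketch of the difference-in-means estimator on simulated data. Everything in it is an assumption made for illustration: the function name `difference_in_means`, the constant treatment effect of 2, and the sample size are hypothetical and not taken from the course materials (which use R).

```python
import random
from statistics import mean


def difference_in_means(outcomes, treated):
    """Estimate the ATE as mean(Y | D = 1) - mean(Y | D = 0)."""
    y_treated = [y for y, d in zip(outcomes, treated) if d == 1]
    y_control = [y for y, d in zip(outcomes, treated) if d == 0]
    return mean(y_treated) - mean(y_control)


# Hypothetical toy data: potential outcomes with a constant effect of 2.
rng = random.Random(457)
n = 1000
y0 = [rng.gauss(0, 1) for _ in range(n)]
y1 = [y + 2.0 for y in y0]

# Complete randomization: exactly half of the units are assigned to treatment.
treated = [1] * (n // 2) + [0] * (n // 2)
rng.shuffle(treated)

# Only the potential outcome matching the assigned treatment is observed.
observed = [a if d == 1 else b for a, b, d in zip(y1, y0, treated)]
estimate = difference_in_means(observed, treated)
print(round(estimate, 2))
```

Because assignment is independent of the potential outcomes by construction, the printed estimate should be close to the true effect of 2, up to sampling noise.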

Lecture
Seminar: Causality and Randomization
Readings
Additional readings
Problem Set 1

3. Selection on Observables 1

We depart from the safe shores of controlled randomization, into the treacherous waters of observational research design. We will begin with a theoretical exploration of the selection on observables design (SOO) – its assumptions and identification results – using both potential outcomes and graphical theory.

Lecture
Readings
Additional readings

4. Selection on Observables 2

We consider the three most frequently seen estimation strategies for selection-on-observables designs: matching (including propensity scores), weighting, and regression.
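
As a hedged illustration of the idea behind these estimators, the sketch below uses exact stratification on a single discrete covariate, which is the simplest relative of matching and weighting rather than any of the specific estimators covered in the lecture. The function name `stratified_ate` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def stratified_ate(rows):
    """rows: list of (outcome, treated, x), treated in {0, 1}, x discrete.

    Computes the within-stratum difference in means for each value of x and
    averages these differences, weighting by each stratum's sample share.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for y, d, x in rows:
        strata[x][d].append(y)

    n = len(rows)
    ate = 0.0
    for x, groups in strata.items():
        if not groups[0] or not groups[1]:
            raise ValueError(f"no overlap in stratum x={x}")
        effect = mean(groups[1]) - mean(groups[0])
        weight = (len(groups[0]) + len(groups[1])) / n
        ate += weight * effect
    return ate


# Hypothetical toy data: units with x = 1 are more likely to be treated
# and also have higher outcomes, so the naive comparison is confounded.
rows = [
    (3, 1, 0), (1, 0, 0), (1, 0, 0), (1, 0, 0),
    (6, 1, 1), (6, 1, 1), (6, 1, 1), (2, 0, 1),
]
naive = mean(y for y, d, _ in rows if d == 1) - mean(y for y, d, _ in rows if d == 0)
adjusted = stratified_ate(rows)
print(naive, adjusted)
```

In this toy example the naive difference in means is 4, while the stratified estimate is 3, the equally weighted average of the within-stratum effects of 2 and 4. The adjustment is only as credible as the assumption that treatment is as good as random within each stratum.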

Lecture
Seminar: Selection on Observables
Readings
Additional readings
Problem Set 2

5. Selection on Observables 3

We consider what happens if we are willing to weaken the assumptions underpinning our research designs, exploring partial identification and sensitivity analysis.

Lecture
Readings
Additional readings

6. Reading Week

7. Instrumental Variables 1

We now move on to a new research design: instrumental variables (IV). We introduce the basic architecture of modern IV, learn about the various assumptions needed to admit a causal interpretation, and explore some of the weaknesses and fragilities of the approach.
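
For the simplest case of a binary instrument and a binary treatment, the IV estimate can be sketched as the Wald ratio: the effect of the instrument on the outcome divided by its effect on treatment take-up. The function name `wald_estimate` and the toy data below are hypothetical and chosen so the arithmetic is easy to check by hand.

```python
from statistics import mean


def wald_estimate(rows):
    """rows: list of (outcome, treatment, instrument) with binary treatment and instrument.

    Returns the reduced form divided by the first stage.
    """
    y_z1 = mean(y for y, d, z in rows if z == 1)
    y_z0 = mean(y for y, d, z in rows if z == 0)
    d_z1 = mean(d for y, d, z in rows if z == 1)
    d_z0 = mean(d for y, d, z in rows if z == 0)

    first_stage = d_z1 - d_z0
    if first_stage == 0:
        raise ValueError("instrument has no effect on treatment take-up")
    return (y_z1 - y_z0) / first_stage


# Hypothetical toy data generated as Y = 1 + 4 * D.
rows = [
    (5, 1, 1), (5, 1, 1), (5, 1, 1), (1, 0, 1),  # encouraged: 75% take-up
    (5, 1, 0), (1, 0, 0), (1, 0, 0), (1, 0, 0),  # not encouraged: 25% take-up
]
wald = wald_estimate(rows)
print(wald)
```

Here the reduced form is 2 and the first stage is 0.5, so the ratio recovers the effect of 4. The guard against a zero first stage hints at one of the fragilities mentioned above: when the instrument barely moves the treatment, the denominator is close to zero and the estimate becomes unstable.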

Lecture
Readings
Additional readings
Seminar: Instrumental Variables
Problem Set 3

8. Instrumental Variables 2

Extending our investigation of IV designs, we focus on the interpretation and estimation of continuous IV settings, shift-share (Bartik) instruments, examiner designs, and recentered IV.

Lecture
Readings
Additional readings

9. Regression Discontinuity

We move to the next core research design, regression discontinuity (RD), considering modern approaches to both sharp and fuzzy RD settings. We briefly consider the regression kink (RK) design.

Lecture
Readings
Additional readings
Seminar: Regression Discontinuity
Problem Set 4

10. Difference-in-Differences 1

We now introduce one of the most popular research designs for applied causal inference, difference-in-differences (DiD). We consider cases in which a treatment is rolled out such that we have variation over two dimensions: time and units. We focus almost exclusively on canonical cases in which we have only two time-periods and two treatment groups of units. We will close by briefly considering falsification tests in situations with two pre-treatment periods. Diverging from the previous approaches in which we rely on assumptions about the nature of treatment assignment, we introduce new assumptions about trends in potential outcomes over time.
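
The canonical two-group, two-period case reduces to simple arithmetic on four cell means, sketched below. The function name `did_2x2` and the toy data are hypothetical; the calculation has a causal interpretation only under an assumption about trends in potential outcomes, such as parallel trends.

```python
from statistics import mean


def did_2x2(rows):
    """rows: list of (outcome, treated_group, post_period) with 0/1 flags.

    Returns (treated post - treated pre) - (control post - control pre).
    """
    def cell_mean(group, period):
        return mean(y for y, g, t in rows if g == group and t == period)

    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    return treated_change - control_change


# Hypothetical toy data: both groups trend upward by 3; the treated group
# gains an extra 5 in the post-treatment period.
rows = [
    (10, 0, 0), (12, 0, 0),  # control, pre
    (13, 0, 1), (15, 0, 1),  # control, post
    (20, 1, 0), (22, 1, 0),  # treated, pre
    (28, 1, 1), (30, 1, 1),  # treated, post
]
estimate = did_2x2(rows)
print(estimate)
```

The treated group changes by 8 and the control group by 3, so the estimate is 5. Note that the large difference in levels between the groups (roughly 10 in the pre-period) does not bias the estimate, because only the changes over time are compared.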

Lecture
Readings
Additional readings

11. Difference-in-Differences 2

We continue our exploration of DiD, broadening our focus to cases with more than 2 time periods. We discuss first the two-way fixed effects estimator that has been a dominant tool for estimating ‘generalised difference-in-differences’ and then explore the implied assumptions in this approach and its weaknesses, specifically for staggered and non-saturating treatments, and cases with heterogeneous treatment effects. We introduce alternative ‘modern’ estimators that are robust to these settings. We conclude with a very brief foray into the synthetic control method.

Lecture
Readings
Additional readings
Seminar: Difference-in-Differences
Problem Set 5

[COURSE ENDS]