Lecture notes and a sample chapter on marginal effects. The book chapter draft is the most up-to-date version, with better notation. All the data files are available online.
Sample (draft) Chapter 6: Marginal effects to interpret regression parameters
This version is more technical, including analytical and delta-method standard errors, plus interactions in logit models:
An older version with more examples:
Marginal effects and the margins command
This version has code for marginal effects using two-part models:
Interpreting Model Estimates: Marginal Effects
A brief explanation (see the sample book chapter above for details): Marginal effects are helpful for interpreting model results or, more precisely, model parameters. Marginal effects are (counterfactual) predictions. If you can obtain predictions from a statistical model, you can calculate marginal effects. Marginal effects are especially useful when you want to interpret models in the scale of interest and not in the scale of estimation, which in non-linear models are not the same (e.g., log-odds versus probabilities in logistic models; counts versus log counts in Poisson models). In essence, you use model predictions to understand what happens when covariate values change (that's the counterfactual part).
A little bit longer: It's not that difficult to interpret basic linear, additive models, but things get complicated in non-linear models. Anything other than the vanilla OLS model is a non-linear model (and an OLS model with interactions is also "non-linear," in the sense of non-constant effects). For example, in logistic regression, the estimated coefficients are in the log-odds scale, which is not that helpful. You can exponentiate them to get odds ratios, but odds ratios are often misinterpreted: under most circumstances they are not a ratio of probabilities, and many times they are not even close. With marginal effects you get estimates in the probability scale. So if you are wondering about the difference between marginal effects and odds ratios, the answer is that they are just different ways of understanding parameter estimates. One is confusing (odds ratios); the other (marginal effects) is measured in the probability scale, which is often the scale of interest. When you estimate a logistic model, you are interested in the effect of X on the probability of the outcome. See the examples in the PDF lecture notes above.
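To see why, here is a toy calculation (the numbers are made up purely for illustration): suppose the baseline probability is 0.50 and the treated probability is 0.75. The ratio of probabilities is 1.5, but the odds ratio is 3:
display (.75/.25) / (.50/.50)  /* odds ratio = 3 */
display .75/.50                /* ratio of probabilities = 1.5 */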
Although most people encounter marginal effects in the context of logistic models (the way I explained them above), marginal effects can be used with any parametric regression model (Poisson, probit, all combinations of GLMs, etc.). It's all about using a model to make predictions and then summarizing those predictions to make sense of the model. In more precise terms, it's about computing numerical rather than analytical derivatives. (Stata's margins command actually implements a two-sided derivative; see the slides above.) If the covariate is not continuous (e.g., a dummy variable like treatment or race), then you compute incremental changes rather than derivatives.
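For intuition, here is a by-hand sketch of the two-sided numerical derivative, using the lbw example data from further below (the step size h is fixed arbitrarily here, while margins picks it automatically, so the numbers will be close but not identical):
webuse lbw, clear
qui logit low i.smoke age i.ht
margins, dydx(age)  /* average marginal effect of age */
* Same idea by hand: predict at age+h and age-h, then average the slopes
preserve
local h = 0.01
replace age = age + `h'
predict double p_plus, pr
replace age = age - 2*`h'
predict double p_minus, pr
gen double dydx_age = (p_plus - p_minus) / (2*`h')
sum dydx_age
restore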
There are other important considerations to keep in mind. In non-linear models like logit or probit, interactions can look very different in different scales (affecting conclusions and statistical significance). In other words, conclusions about magnitude, sign, and statistical significance can differ between the log-odds scale and the probability scale.
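A quick sketch of the scale issue, again with the lbw data (just an illustration, not a full analysis):
webuse lbw, clear
logit low i.smoke##i.ht age  /* interaction in the log-odds scale */
margins, dydx(smoke) at(ht=(0 1))  /* effect of smoking in the probability scale, by ht */
* The difference between the two probability-scale effects need not match
* the sign or significance of the interaction coefficient above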
Stata implements marginal effects (and predictive margins) with the margins command. The margins command is great because it does a lot of very useful things; the problem is that, precisely because it does so much, it can be difficult to understand. A small change in syntax produces very different results, and you need to understand how to interpret them. Back in the day, Stata had two separate commands instead of margins: mfx and adjust.
As an example, you get completely different (but related; they are all, again, predictions) results with these statements:
qui reg y i.hispanic age i.female i.diabetes
margins, dydx(hispanic) /* incremental effect of hispanic */
margins, dydx(age) /* marginal effect of age, the numerical two-sided derivative dy/d(age) */
margins hispanic /* adjusted means for hispanic, or predictive margins */
margins, over(hispanic) /* mean of the predicted outcome by hispanic; combined with the at() option, this is a handy way of calculating predictions (see the runnable sketch below) */
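For instance, a runnable sketch of the over()/at() combination with the lbw data used below (variables and age values chosen just for illustration):
webuse lbw, clear
qui logit low i.smoke age i.ht
margins, over(smoke) at(age=(20 30 40)) /* mean predicted probability by smoking status at selected ages */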
Digression: An alternative to marginal effects for logistic models that is becoming popular in epidemiology is to use GLMs with a binomial family and different links, which changes how the parameters are interpreted, although these models and their SEs can be unstable. In particular, the log link (not logit) produces relative risks rather than odds ratios. See here for example. But... you can get relative risks with predictive margins anyway, so why bother?:
webuse lbw, clear
* ------ Logistic model
logit low i.smoke age i.ht
*** Marginal effect of smoking
margins, dydx(smoke)
*** Predictive margins
margins smoke, post
* Relative risk as the ratio of the two predictive margins
* (nlcom gives a delta-method standard error for the ratio)
nlcom (rel_risk: _b[1.smoke] / _b[0.smoke])
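And for comparison, the log-link GLM alternative mentioned above (log-link binomial models sometimes fail to converge, which is part of the instability issue):
* GLM with binomial family and log link; eform reports exponentiated
* coefficients, which here are relative risks
glm low i.smoke age i.ht, family(binomial) link(log) eform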