Complete Case Analysis
Complete case analysis is a statistical analysis that includes only those participants who have no missing data on the variables of interest.
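As a minimal sketch of the idea, assuming a hypothetical toy dataset in which `None` marks a missing value, the filter below keeps only the rows that are fully observed on the chosen variables and then computes a mean from them. The record layout and the helper name `complete_cases` are illustrative assumptions.

```python
# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 34, "income": 52.0},
    {"age": 41, "income": None},
    {"age": None, "income": 38.5},
    {"age": 29, "income": 44.0},
]


def complete_cases(rows, variables):
    """Keep only rows with an observed value on every variable of interest."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


cc = complete_cases(records, ["age", "income"])
# The analysis then uses the reduced sample only.
mean_income = sum(r["income"] for r in cc) / len(cc)
```

Note that the third record is dropped even though its income is observed, which is the loss of information this approach accepts.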
Explicit Single Imputation
Explicit single imputation denotes a method, based on an explicit model, that replaces each missing datum with a single value. The original sample size is thereby restored. However, the imputed values are treated as if they were the real values that would have been observed had the data been complete, so the uncertainty about them is ignored.
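Mean imputation is one simple example of an explicit single imputation method. The sketch below, on made-up numbers, fills each missing entry with the observed mean and compares the sample variance before and after, which may help show why treating imputed values as real understates variability.

```python
import statistics

# Made-up measurements; None marks a missing value.
values = [2.0, 4.0, None, 6.0, None, 8.0]

observed = [v for v in values if v is not None]
fill = statistics.mean(observed)  # the single value used for every gap

# Replace each missing datum with the observed mean.
imputed = [fill if v is None else v for v in values]

var_observed = statistics.variance(observed)  # based on 4 observed values
var_imputed = statistics.variance(imputed)    # based on 6 values, 2 of them imputed
```

The sample size goes back to six, but the variance of the completed data is smaller than that of the observed values because the filled-in entries sit exactly at the mean.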
Implicit Single Imputation
Implicit single imputation denotes a method, not based on an explicit model, that replaces each missing datum with a single value. The original sample size is thereby restored. However, the imputed values are treated as if they were the real values that would have been observed had the data been complete, so the uncertainty about them is ignored.
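Hot-deck imputation is a common example of an implicit method: a missing value is replaced by the observed value of a similar unit (a donor). The following is a rough sketch under the assumption that "similar" means "in the same group"; the field names and the `hot_deck` helper are hypothetical.

```python
import random

# Hypothetical records; None marks a missing outcome.
rows = [
    {"group": "a", "y": 1.0},
    {"group": "a", "y": None},
    {"group": "b", "y": 10.0},
    {"group": "b", "y": 12.0},
    {"group": "b", "y": None},
]


def hot_deck(rows, cell, target, rng):
    """Fill missing `target` values with a random observed donor from the same cell."""
    donors = {}
    for r in rows:
        if r[target] is not None:
            donors.setdefault(r[cell], []).append(r[target])
    completed = []
    for r in rows:
        r = dict(r)  # copy so the input is left untouched
        if r[target] is None:
            # Assumes every cell has at least one donor; raises KeyError otherwise.
            r[target] = rng.choice(donors[r[cell]])
        completed.append(r)
    return completed


completed = hot_deck(rows, "group", "y", random.Random(0))
```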
Available Case Analysis
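Available-case analysis uses, for each estimate, all cases with observed values on the variables that estimate requires. The term also covers the situation in which a researcher simply excludes a variable or set of variables from the analysis because of their missing-data rates.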
Inverse Probability Weighting
The inverse probability weighting (IPW) approach preserves the semiparametric structure of the underlying model of substantive interest and clearly separates the model of substantive interest from the model used to account for the missing data.
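To illustrate the weighting step only, the sketch below estimates a mean by weighting each observed outcome by the inverse of its probability of being observed. The probabilities are assumed known here, which is an assumption for illustration; in practice they would usually be estimated from a separate missingness model. The function name `ipw_mean` is hypothetical.

```python
# Each pair: (outcome, or None if missing; probability of being observed).
# The probabilities are assumed known for this illustration.
data = [(10.0, 0.5), (None, 0.5), (20.0, 1.0), (30.0, 0.25), (None, 0.25)]


def ipw_mean(data):
    """Weighted mean of observed outcomes with weights 1 / P(observed)."""
    num = sum(y / p for y, p in data if y is not None)
    den = sum(1.0 / p for y, p in data if y is not None)
    return num / den


estimate = ipw_mean(data)
# Unweighted complete-case mean, for comparison.
naive = sum(y for y, _ in data if y is not None) / sum(1 for y, _ in data if y is not None)
```

Units that were unlikely to be observed receive larger weights, so they stand in for the similar units that went missing.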
Augmented Inverse Probability Weighting
Augmented Inverse Probability Weighting (AIPW) is an IPW technique that derives estimators using a combination of the propensity score and the outcome regression model. This approach has the attractive doubly robust property that estimators are consistent as long as either the propensity score or the outcome regression model is correctly specified.
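One common form of the AIPW estimator of a mean adds an inverse-probability-weighted residual to the regression prediction for each unit. The sketch below assumes that the response indicators `r`, the observation probabilities `pi` and the regression predictions `m` have already been obtained elsewhere; all names and numbers are illustrative.

```python
def aipw_mean(y, r, pi, m):
    """AIPW estimate of a mean.

    y  : outcomes (None where missing)
    r  : 1 if the outcome is observed, 0 otherwise
    pi : probability of being observed (propensity score)
    m  : outcome-regression prediction for every unit
    """
    total = 0.0
    for yi, ri, pii, mi in zip(y, r, pi, m):
        total += mi                    # regression prediction
        if ri:
            total += (yi - mi) / pii   # weighted residual correction
    return total / len(y)


y = [1.0, None, 3.0, None]
r = [1, 0, 1, 0]
pi = [0.5, 0.5, 0.5, 0.5]

# Outcome model that fits the observed values exactly: the correction vanishes.
with_model = aipw_mean(y, r, pi, m=[1.0, 2.0, 3.0, 4.0])
# Uninformative outcome model (all zeros): the estimator reduces to plain IPW.
without_model = aipw_mean(y, r, pi, m=[0.0, 0.0, 0.0, 0.0])
```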
Weighting Adjustments
Weighting to compensate for nonresponse attaches weights to the subjects included in the analysis in order to restore the representation of the original sample, which is distorted by the missing values.
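A weighting-class adjustment is one simple way to build such weights: within each class, respondents are weighted by the ratio of sampled units to responding units. The classes and the helper `class_weights` below are hypothetical.

```python
from collections import Counter

# (adjustment class, responded?) for each sampled unit; made-up data.
sample = [
    ("young", True), ("young", False), ("young", False),
    ("old", True), ("old", True), ("old", False),
]


def class_weights(sample):
    """Weight per class = sampled units / responding units in that class."""
    n_all = Counter(c for c, _ in sample)
    n_resp = Counter(c for c, responded in sample if responded)
    # Classes without any respondent are skipped; they would need collapsing.
    return {c: n_all[c] / n_resp[c] for c in n_resp}


weights = class_weights(sample)
# The weights of the respondents add up to the original sample size.
weighted_n = sum(weights[c] for c, responded in sample if responded)
```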
Multiple Imputation by Chained Equations
Multiple Imputation by Chained Equations (MICE) allows most models to be fit to a dataset with missing values on the independent and/or dependent variables, and provides rigorous standard errors for the fitted parameters. The basic idea is to treat each variable with missing values as the dependent variable in a regression, with some or all of the remaining variables as its predictors.
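The cycling idea can be sketched for two variables as follows. This is a deliberately simplified illustration, not a full MICE implementation: it uses simple linear regression, adds residual noise to each imputation, and does not draw the regression parameters from their posterior as a proper procedure would. The function names and the data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least squares for y = a + b * x; returns (a, b, residual_sd)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / max(len(xs) - 2, 1)) ** 0.5


def chained_imputation(rows, n_cycles=10, seed=0):
    """Return one completed copy of `rows` (pairs [x, y], None = missing)."""
    rng = random.Random(seed)
    miss = [[v is None for v in r] for r in rows]
    filled = [list(r) for r in rows]
    # Start by filling every gap with the observed column mean.
    for j in (0, 1):
        col_mean = statistics.mean(r[j] for r in rows if r[j] is not None)
        for r in filled:
            if r[j] is None:
                r[j] = col_mean
    for _ in range(n_cycles):
        for j in (0, 1):      # variable currently treated as dependent
            k = 1 - j         # the other variable acts as predictor
            obs = [i for i in range(len(rows)) if not miss[i][j]]
            a, b, sd = fit_line([filled[i][k] for i in obs],
                                [filled[i][j] for i in obs])
            for i in range(len(rows)):
                if miss[i][j]:
                    filled[i][j] = a + b * filled[i][k] + rng.gauss(0.0, sd)
    return filled


rows = [[1.0, 2.1], [2.0, None], [3.0, 6.2], [None, 8.1], [5.0, 9.8], [6.0, 12.2]]
# Several imputed datasets, each from a different random seed.
imputations = [chained_imputation(rows, seed=s) for s in range(5)]
```

Each completed dataset would then be analysed separately and the results combined.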
Joint Multiple Imputation
Joint Multiple Imputation (JOMO) commonly assumes that the incomplete variables follow a multivariate normal distribution, an approach often referred to as multivariate normal imputation. Under this assumption, it provides rigorous standard errors for the fitted parameters.
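Whichever imputation model is used, the standard errors of a multiple-imputation analysis are usually obtained by pooling the per-dataset results with Rubin's rules. A small sketch, with made-up estimates and a hypothetical helper name `pool_rubin`:

```python
import statistics


def pool_rubin(estimates, variances):
    """Combine m point estimates and their squared standard errors."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)       # pooled point estimate
    within = statistics.mean(variances)      # average within-imputation variance
    between = statistics.variance(estimates) # between-imputation variance
    total = within + (1 + 1 / m) * between   # total variance
    return q_bar, total ** 0.5


# Made-up results from three imputed datasets.
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the extra uncertainty due to the missing values into the final standard error.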
Introduction to Maximum Likelihood Estimation
Maximum Likelihood Estimation (MLE) is a method of estimating the parameters of a probability distribution by maximising a likelihood function, so that under the assumed statistical model the observed data is most probable.
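As a small worked example, assume the data are independent draws from an exponential distribution with unknown rate. The log-likelihood is n * log(rate) - rate * sum(x), and it is maximised at rate = n / sum(x). The data below are made up.

```python
import math


def exp_loglik(rate, xs):
    """Log-likelihood of an exponential model with the given rate."""
    return len(xs) * math.log(rate) - rate * sum(xs)


xs = [0.5, 1.5, 2.0, 4.0]

# Closed-form maximum likelihood estimate: n / sum of observations.
rate_hat = len(xs) / sum(xs)

# Nearby rates give a lower log-likelihood.
ll_hat = exp_loglik(rate_hat, xs)
ll_lower = exp_loglik(rate_hat - 0.1, xs)
ll_higher = exp_loglik(rate_hat + 0.1, xs)
```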
Introduction to Bayesian Inference
Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available.
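A conjugate Beta-Binomial model gives perhaps the simplest illustration of such an update: a Beta(a, b) prior on a success probability combined with s successes and f failures yields a Beta(a + s, b + f) posterior. The prior and the counts below are assumptions chosen for illustration.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


# Uniform Beta(1, 1) prior, then 7 successes and 3 failures are observed.
post_a, post_b = update_beta(1, 1, successes=7, failures=3)
posterior_mean = post_a / (post_a + post_b)

# Further evidence updates the current posterior in the same way.
post_a2, post_b2 = update_beta(post_a, post_b, successes=2, failures=0)
```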
Likelihood Based Inference with Incomplete Data
When making inference with missing data, any statistical method must rely on either explicit or implicit assumptions about the mechanism that leads some of the values to be missing.
Expectation Maximisation Algorithm
An Expectation-Maximisation (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori estimates of parameters in statistical models, where the model depends on unobserved latent variables.
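A compact example, chosen here for illustration, is an exponential lifetime model in which some lifetimes are right-censored (only known to exceed a cut-off). The unobserved part of each censored lifetime plays the role of the latent variable. The E-step replaces it with its expected value, and the M-step recomputes the rate from the completed data.

```python
def em_censored_exponential(observed, censored, n_iter=200):
    """EM estimate of an exponential rate with right-censored observations.

    observed : fully observed lifetimes
    censored : censoring times (the true lifetime is larger)
    """
    n = len(observed) + len(censored)
    rate = 1.0  # arbitrary starting value
    for _ in range(n_iter):
        # E-step: by memorylessness, E[T | T > c] = c + 1 / rate.
        expected_total = sum(observed) + sum(c + 1.0 / rate for c in censored)
        # M-step: complete-data MLE of the rate.
        rate = n / expected_total
    return rate


observed = [1.0, 2.0, 3.0]
censored = [4.0, 4.0]
rate_em = em_censored_exponential(observed, censored)

# For this model the MLE is also available in closed form:
# number of uncensored lifetimes / total time at risk.
rate_direct = len(observed) / (sum(observed) + sum(censored))
```

Here the two answers agree, which is a convenient check; EM is mainly useful when no closed form exists.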
Bayesian Iterative Simulation Methods
The most popular class of Bayesian iterative methods is called Markov chain Monte Carlo (MCMC), which comprises different algorithms for sampling from a probability distribution. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution.
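A random-walk Metropolis sampler is one of the simplest MCMC algorithms. The sketch below targets a standard normal distribution through its unnormalised log density; the step size, chain length and burn-in are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def metropolis(log_target, start, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    x, lp = start, log_target(start)
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        chain.append(x)
    return chain


# Unnormalised log density of a standard normal distribution.
chain = metropolis(lambda x: -0.5 * x * x, start=0.0, n_steps=20000)
draws = chain[1000:]  # discard an initial burn-in period

sample_mean = statistics.mean(draws)
sample_var = statistics.variance(draws)
```

With more steps, the sample mean and variance should settle closer to 0 and 1, the values of the target distribution.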
Likelihood Based Inference with Incomplete Data (Nonignorable)
Specific methods are required to make inference under nonignorable nonresponse assumptions, that is, when the probability that a value is missing is related to values that are not observed by the analyst (e.g. the missing values themselves).
Selection Models
Selection Models (SM) are typically used to handle nonignorable missingness. They factorise the joint likelihood of the measurement process and the missingness process into a marginal density of the measurement process and the density of the missingness process conditional on the outcomes, which describes the missing data selection based on the complete data.
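In symbols, the selection model factorisation can be written as follows. The notation is an assumption introduced here: y denotes the complete outcomes, r the missingness indicators, and \theta and \psi the parameters of the measurement and missingness models.

```latex
f(y, r \mid \theta, \psi) = f(y \mid \theta) \, f(r \mid y, \psi)
```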
Pattern Mixture Models
Pattern Mixture Models (PMM) are typically used to handle nonignorable missingness. They factorise the joint likelihood of the measurement process and the missingness process into a marginal density of the missingness process and the density of the measurement process conditional on the missing data patterns, where the model of interest is fitted for each pattern.
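In symbols, the pattern mixture factorisation can be written as follows. The notation is an assumption introduced here: y denotes the complete outcomes, r the missing data pattern, and \theta and \psi the parameters of the conditional measurement model and of the pattern distribution.

```latex
f(y, r \mid \theta, \psi) = f(r \mid \psi) \, f(y \mid r, \theta)
```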
Shared Parameter Models
Shared Parameter Models (SPM) are typically used to handle nonignorable missingness. In these models a random effect is shared between the repeated measures model and the missing data mechanism model.
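A common way to write this down, under the additional assumption that the outcomes and the missingness indicators are independent given the shared random effect, is shown below. The notation is introduced here for illustration: y denotes the outcomes, r the missingness indicators, b the shared random effect, and \theta, \psi and \gamma the parameters of the three component models.

```latex
f(y, r \mid \theta, \psi, \gamma) = \int f(y \mid b, \theta) \, f(r \mid b, \psi) \, f(b \mid \gamma) \, db
```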