In social research and other fields, research data often have a hierarchical structure. That is, the individual subjects of study may be classified or arranged in groups which themselves have qualities that influence the study. In this case, the individuals can be seen as level-1 units of study, and the groups into which they are arranged are level-2 units. This may be extended further, with level-2 units organized into yet another set of units at a third level. Examples of this abound in areas such as education (students at level 1, schools at level 2, and school districts at level 3) and sociology (individuals at level 1, neighborhoods at level 2). It is clear that the analysis of such data requires specialized software. Hierarchical linear and nonlinear models (also called multilevel models) have been developed to allow for the study of relationships at any level in a single analysis, while not ignoring the variability associated with each level of the hierarchy.
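As an illustration only, a two-level model for students (level 1) nested within schools (level 2) can be written as below. The symbols are assumptions chosen for this sketch and do not appear in the text above: Y_ij is the outcome for student i in school j, X_ij is a student-level predictor, and W_j is a school-level predictor.

```latex
% Level 1 (students within schools)
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij},
\qquad r_{ij} \sim N(0, \sigma^2)

% Level 2 (schools): level-1 coefficients become outcomes
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
```

Here the gamma terms are fixed coefficients, while r_ij and the u terms carry the variability at level 1 and level 2 respectively.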

The HLM program fits models in which an outcome variable is a linear function of explanatory variables specified at each level, accounting for the variation at each level. HLM not only estimates model coefficients at each level, but also predicts the random effects associated with each sampling unit at every level. Although it is commonly used in education research because hierarchical structures are prevalent in data from that field, it is suitable for hierarchically structured data from any research field. This includes longitudinal analysis, in which repeated measurements are nested within the individuals being studied. In addition, although the examples above imply that members of the hierarchy at any level are nested exclusively within a single member at a higher level, HLM can also handle situations where membership is not "nested" but "crossed", as is the case when a student has been a member of several classrooms during a study period.

The HLM program allows for continuous, count, ordinal, and nominal outcome variables and assumes a functional relationship between the expectation of the outcome and a linear combination of a set of explanatory variables. This relationship is defined by a suitable link function, for example, the identity link (continuous outcomes) or logit link (binary outcomes).
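The role of the link function can be sketched in a few lines of Python. This is not HLM code; the function names and the coefficient values are hypothetical and only show how the identity and logit links relate the expected outcome to a linear combination of explanatory variables.

```python
import math

def linear_predictor(coefficients, predictors):
    """Linear combination of explanatory variables (eta = sum of b_k * x_k)."""
    return sum(b * x for b, x in zip(coefficients, predictors))

def identity_link(mu):
    """Identity link for continuous outcomes: eta = mu."""
    return mu

def logit_link(mu):
    """Logit link for binary outcomes: eta = log(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    """Map a linear predictor back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients (intercept, slope) and predictors (1, x).
eta = linear_predictor([-1.0, 0.5], [1.0, 2.0])
expected_continuous = identity_link(eta)   # expectation equals eta
expected_probability = inverse_logit(eta)  # expectation is a probability
```

With the identity link the linear predictor is the expected outcome itself, whereas with the logit link it is the log-odds of the outcome.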

What's new in HLM 8

HLM 8 adds the ability to estimate an HLM from incomplete data. This is a completely automated approach that generates and analyses multiply imputed data sets from incomplete data. The model is fully multivariate and enables the analyst to strengthen imputation through auxiliary variables. In practice, the user specifies the HLM; the program automatically searches the data to discover which variables have missing values and then estimates a multivariate hierarchical linear model (the “imputation model”) in which all variables with missing values are regressed on all variables with complete data. The program then uses the resulting parameter estimates to generate M imputed data sets, each of which is analysed in turn. Results are combined using the “Rubin rules”.
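The final combining step can be sketched as follows. This is a minimal illustration of the standard Rubin combining rules for a single scalar parameter, not HLM's internal implementation; the function name `pool_rubin` and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine M per-imputation results for one scalar parameter.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)            # average of the M estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (M - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total, math.sqrt(total)

# Hypothetical results from M = 3 imputed data sets.
pooled, total, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputed data sets, which is how the uncertainty due to missing data enters the pooled standard error.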

Another new feature of HLM 8 is that flexible combinations of Fixed Intercepts and Random Coefficients (FIRC) are now included in HLM2, HLM3, HLM4, HCM2, and HCM3. A concern that can arise in multilevel causal studies is that random effects may be correlated with treatment assignment. For example, suppose that treatments are assigned non-randomly to students who are nested within schools. Estimating a two-level model with random school intercepts will generate bias if the random intercepts are correlated with treatment effects. The conventional strategy is to specify a fixed effects model for schools. However, this approach assumes homogeneous treatment effects, possibly leading to biased estimates of the average treatment effect, incorrect standard errors, and inappropriate interpretation. HLM 8 allows the analyst to combine fixed intercepts with random coefficients in models that address these problems and that facilitate a richer summary, including an estimate of the variation of treatment effects and empirical Bayes estimates of unit-specific treatment effects. This approach was proposed in Bloom, Raudenbush, Weiss and Porter (2017).
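For the student-within-school example, a FIRC specification can be sketched as below. The notation is an assumption made for illustration and should be checked against the cited paper: Y_ij is the outcome for student i in school j, and T_ij is a treatment indicator.

```latex
% Level 1: each school j has its own fixed intercept alpha_j
Y_{ij} = \alpha_j + B_j T_{ij} + e_{ij},
\qquad e_{ij} \sim N(0, \sigma^2)

% Level 2: the treatment coefficient varies randomly across schools
B_j = \beta + b_j,
\qquad b_j \sim N(0, \tau^2)
```

In this sketch beta is the average treatment effect, tau^2 is the variance of treatment effects across schools, and the b_j are the school-specific deviations for which empirical Bayes estimates can be obtained.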