Generalized Linear Model: Example

In this example, we consider a small fictitious data set that appears in Werner and Modlin (2010). The data consist of loss and loss adjustment expenses (LossLAE), broken down by three levels of amount of insurance (AOI) and three territories (Terr). For each combination of AOI and Terr, the number of policies issued is available and serves as the exposure.


Our objective is to fit a generalized linear model (GLM) to the data using LossLAE as the dependent variable. We would like to understand the influence of the amount of insurance and territory on LossLAE.

We now specify two factors and estimate a generalized linear model using a gamma distribution with a logarithmic link function. In the R output that follows, the relevel command allows us to specify the reference level. For this example, a medium amount of insurance (AOI = ``medium'') and the second territory (Terr = 2) are chosen as the reference levels. Logarithmic exposure is used as an offset variable so that cells (combinations of the two categorical variables) with a larger number of exposures (policies) will have larger expected losses.
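As a minimal sketch of this specification, the R code below fits a gamma GLM with a log link and a log-exposure offset. The data frame `cells` and every number in it are made-up placeholders, not the Werner and Modlin table; the losses are generated from assumed multiplicative relativities purely so that the code runs on its own. Only `relevel`, `glm`, and `Gamma(link = "log")` are taken from the setup described above.

```r
# Hypothetical placeholder cells: made-up numbers, NOT the Werner and Modlin table.
cells <- expand.grid(AOI  = c("low", "medium", "high"),
                     Terr = c("1", "2", "3"))
cells$exposure <- c(10, 100, 180, 130, 125, 130, 140, 125, 40)

# Assumed "true" relativities, used only to generate illustrative losses.
aoi_rel  <- c(low = 0.8, medium = 1.0, high = 1.4)
terr_rel <- c("1" = 0.6, "2" = 1.0, "3" = 1.2)
cells$LossLAE <- cells$exposure * 50 *
  unname(aoi_rel[as.character(cells$AOI)]) *
  unname(terr_rel[as.character(cells$Terr)])

# Choose the reference levels: medium AOI and territory 2.
cells$AOI  <- relevel(cells$AOI,  ref = "medium")
cells$Terr <- relevel(cells$Terr, ref = "2")

# Gamma GLM with log link; log(exposure) enters as an offset, so expected
# losses scale proportionally with the number of policies in a cell.
fit <- glm(LossLAE ~ AOI + Terr,
           family = Gamma(link = "log"),
           offset = log(exposure),
           data   = cells)

# With a log link, exponentiated coefficients are multiplicative relativities
# (the intercept becomes the base loss per unit of exposure).
relativities <- exp(coef(fit))
print(round(relativities, 3))
```

Because the placeholder losses are exactly multiplicative, the fitted relativities should reproduce the assumed ones; on real data they would only approximate the underlying structure.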

With the relativities (the exponentiated coefficient estimates) and exposures, it is straightforward to compute predictions. For example, for a high amount of insurance in territory 1, the exposure is 179, so the fitted value is \begin{eqnarray*} 179 \times 65.366 \times 1.430 \times 0.631 \approx 10{,}562. \end{eqnarray*} This is close to the actual value of 10,565.98.
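The arithmetic can be checked with a few lines of R. The variable names are illustrative, and reading 65.366 as the base level, 1.430 as the high-AOI relativity, and 0.631 as the territory 1 relativity is an assumption based on the order of the factors in the product. Using the factors as rounded in the text, the product comes to about 10,558, slightly below the reported 10,562; the gap presumably reflects rounding in the displayed factors.

```r
# Factors as quoted in the text (rounded to three decimals).
exposure <- 179      # policies in the high-AOI, territory 1 cell
base     <- 65.366   # assumed: exponentiated intercept
aoi_high <- 1.430    # assumed: relativity for high AOI vs. medium
terr_1   <- 0.631    # assumed: relativity for territory 1 vs. territory 2

fitted_value <- exposure * base * aoi_high * terr_1
actual       <- 10565.98

# Relative difference between the actual and fitted loss.
rel_diff <- (actual - fitted_value) / actual

print(round(fitted_value, 2))
print(round(100 * rel_diff, 3))  # in percent
```

Either way, the fitted value is within about a tenth of a percent of the actual loss.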

By comparing all actual to fitted values, or the null to the residual deviance, or by examining the $t$-values or $p$-values, we see that the model fits these data remarkably well. In fact, these data were artificially constructed by Werner and Modlin to show that various univariate methods of identifying relativities can do poorly. A multivariate method such as a GLM is usually preferred in practice. Recall that the purpose of linear, as well as generalized linear, modeling is to fit several factors to a set of data simultaneously, rather than each in isolation from the others. As will be discussed in the following subsection, we should pay attention to the variability when introducing exposures. However, weighting for changing variability is not needed for this artificial example.
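To illustrate why one-way (univariate) relativities can mislead, the sketch below computes loss per exposure by AOI level while ignoring territory. The cells are made-up placeholders, not the Werner and Modlin data: losses are generated from assumed relativities of 0.8, 1.0, and 1.4 for low, medium, and high AOI, with exposures spread unevenly across territories.

```r
# Hypothetical placeholder cells: made-up numbers, NOT the Werner and Modlin table.
cells <- expand.grid(AOI  = c("low", "medium", "high"),
                     Terr = c("1", "2", "3"))
cells$exposure <- c(10, 100, 180, 130, 125, 130, 140, 125, 40)

# Assumed "true" multiplicative structure used to generate the losses.
aoi_rel  <- c(low = 0.8, medium = 1.0, high = 1.4)
terr_rel <- c("1" = 0.6, "2" = 1.0, "3" = 1.2)
cells$LossLAE <- cells$exposure * 50 *
  unname(aoi_rel[as.character(cells$AOI)]) *
  unname(terr_rel[as.character(cells$Terr)])

# One-way pure premium (loss per exposure) for each AOI level, ignoring Terr.
pure_premium <- sapply(levels(cells$AOI), function(a) {
  in_level <- cells$AOI == a
  sum(cells$LossLAE[in_level]) / sum(cells$exposure[in_level])
})

# Univariate relativities against the medium level.
univariate_rel <- pure_premium / pure_premium[["medium"]]

print(round(rbind(assumed = aoi_rel, univariate = univariate_rel), 3))
```

In this constructed example the univariate relativities come out near 0.91 and 1.20 rather than the assumed 0.8 and 1.4. High-AOI policies are concentrated in the low-cost territory, so the one-way summary mixes the territory effect into the AOI effect; fitting both factors at once separates them.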