Mortality Model

Data

To obtain the mortality data for each year, we used one table from Statistics Canada:

Past Data: 1996 - 2021

For past years, we used Table 13-10-0837-01 from StatCan.

The CSV file can be downloaded as 13100837-eng.zip and is saved as LEAP/leap/original_data/13100837.csv.

The relevant columns are:

| Column | Type | Description |
|---|---|---|
| REF_DATE | int | the calendar year |
| AGE_GROUP | str | the age of the person in years |
| GEO | str | the province or territory full name |
| SEX | str | one of “Both sexes”, “Females”, or “Males” |
| ELEMENT | str | describes what the variable of interest is; we want "Death probability between age x and x+1 (qx)" |
| VALUE | float | the probability of death between age x and x+1 in that year, province, sex, and age group |
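As a minimal sketch of how these columns might be used, the snippet below keeps only the death probability rows, using Python's standard `csv` module. The helper name `read_death_probabilities` and the inline sample rows are illustrative assumptions, not part of LEAP; real rows would come from the downloaded file.

```python
import csv
import io

# ELEMENT value quoted in the column description above.
QX_ELEMENT = "Death probability between age x and x+1 (qx)"


def read_death_probabilities(handle):
    """Yield (year, province, sex, age_group, qx) for the qx rows only.

    Hypothetical helper: `handle` is any open text file that has the
    columns listed above.
    """
    for row in csv.DictReader(handle):
        # Skip other life-table elements and rows with no value.
        if row["ELEMENT"] != QX_ELEMENT or row["VALUE"] == "":
            continue
        yield (
            int(row["REF_DATE"]),
            row["GEO"],
            row["SEX"],
            row["AGE_GROUP"],
            float(row["VALUE"]),
        )


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "REF_DATE,GEO,SEX,AGE_GROUP,ELEMENT,VALUE\n"
    '2020,Canada,Females,65 years,"Death probability between age x and x+1 (qx)",0.007\n'
    '2020,Canada,Females,65 years,"Some other element",12345\n'
)
rows = list(read_death_probabilities(sample))
```

To read the saved file instead, the same function could be passed `open("LEAP/leap/original_data/13100837.csv", newline="")`.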

Projected Data: 2021 - 2068

Statistics Canada doesn’t provide annual projections for death probabilities, but does provide a projection for specific years (which we call calibration years):

| Region | Year | Projection Scenario | Mortality Scenario |
|---|---|---|---|
| Canada | 2028 | LG | HM |
| Canada | 2028 | M1 | MM |
| Canada | 2028 | M2 | MM |
| Canada | 2028 | M3 | MM |
| Canada | 2028 | M4 | MM |
| Canada | 2028 | M5 | MM |
| Canada | 2028 | HG | LM |
| Canada | 2028 | SA | HM |
| Canada | 2028 | FA | LM |
| Canada | 2048 | LG | HM |
| Canada | 2048 | M1 | MM |
| Canada | 2048 | M2 | MM |
| Canada | 2048 | M3 | MM |
| Canada | 2048 | M4 | MM |
| Canada | 2048 | M5 | MM |
| Canada | 2048 | HG | LM |
| Canada | 2048 | SA | HM |
| Canada | 2048 | FA | LM |
| Canada | 2073 | LG | HM |
| Canada | 2073 | M1 | MM |
| Canada | 2073 | M2 | MM |
| Canada | 2073 | M3 | MM |
| Canada | 2073 | M4 | MM |
| Canada | 2073 | M5 | MM |
| Canada | 2073 | HG | LM |
| Canada | 2073 | SA | HM |
| Canada | 2073 | FA | LM |
| Provinces / Territories | 2028 | HG, FA | LM |
| Provinces / Territories | 2028 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2028 | LG, SA | HM |
| Provinces / Territories | 2033 | HG, FA | LM |
| Provinces / Territories | 2033 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2033 | LG, SA | HM |
| Provinces / Territories | 2038 | HG, FA | LM |
| Provinces / Territories | 2038 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2038 | LG, SA | HM |
| Provinces / Territories | 2043 | HG, FA | LM |
| Provinces / Territories | 2043 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2043 | LG, SA | HM |
| Provinces / Territories | 2048 | HG, FA | LM |
| Provinces / Territories | 2048 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2048 | LG, SA | HM |
| Provinces / Territories | 2053 | HG, FA | LM |
| Provinces / Territories | 2053 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2053 | LG, SA | HM |
| Provinces / Territories | 2058 | HG, FA | LM |
| Provinces / Territories | 2058 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2058 | LG, SA | HM |
| Provinces / Territories | 2063 | HG, FA | LM |
| Provinces / Territories | 2063 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2063 | LG, SA | HM |
| Provinces / Territories | 2068 | HG, FA | LM |
| Provinces / Territories | 2068 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2068 | LG, SA | HM |
| Provinces / Territories | 2073 | HG, FA | LM |
| Provinces / Territories | 2073 | M1, M2, M3, M4, M5, M6 | MM |
| Provinces / Territories | 2073 | LG, SA | HM |

This data can be found in the Statistics Canada Population Projections Technical Report: Table 3.1, Table 3.2, Table 5.2.1, Table 5.2.2, Table 5.2.3.
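The scenario table can be condensed into two small lookups. The sketch below is one possible encoding; the names `MORTALITY_SCENARIO`, `CALIBRATION_YEARS`, and `mortality_scenario` are hypothetical and are not taken from the LEAP code base.

```python
# Mortality scenario paired with each projection scenario in the table above.
MORTALITY_SCENARIO = {
    "LG": "HM",
    "SA": "HM",
    "M1": "MM",
    "M2": "MM",
    "M3": "MM",
    "M4": "MM",
    "M5": "MM",
    "M6": "MM",  # M6 is listed for provinces / territories only
    "HG": "LM",
    "FA": "LM",
}

# Calibration years listed in the table above, by region type.
CALIBRATION_YEARS = {
    "Canada": [2028, 2048, 2073],
    "Provinces / Territories": list(range(2028, 2074, 5)),
}


def mortality_scenario(projection_scenario):
    """Look up the mortality scenario code for a projection scenario code."""
    return MORTALITY_SCENARIO[projection_scenario]
```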

Model

We have mortality data for past years (1996 - 2020), and life expectancy projections for specific future years; but we would like to have mortality data for all future years in our model. Statistics Canada describes how they model mortality here: Methods for Constructing Life Tables for Canada, Provinces and Territories.

In particular, the model they use is the Kannisto-Thatcher model, described in this paper: On the use of Kannisto model for mortality trajectory modelling at very old ages.

According to the Kannisto-Thatcher model, the instantaneous rate of death (the force of mortality) at age \(x\) is given by:

\[\begin{split}\mu(x) &= \dfrac{\alpha e^{\beta x}}{1 + \alpha e^{\beta x}} \\ &= \lim_{\Delta x \to 0} \dfrac{P(\text{death between age $x$ and $x + \Delta x$} \mid \text{survived till $x$})}{\Delta x}\end{split}\]
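The hazard above is a logistic function of age, which is easy to see numerically. The following is a small sketch, with the scale parameter called `alpha` in the code; the parameter values are illustrative assumptions rather than fitted values.

```python
import math


def kannisto_hazard(x, alpha, beta):
    """Kannisto-Thatcher hazard: alpha*e^(beta*x) / (1 + alpha*e^(beta*x))."""
    z = alpha * math.exp(beta * x)
    return z / (1.0 + z)


# Illustrative parameter values only (assumed, not fitted to any data).
ALPHA, BETA = 1e-5, 0.1
ages = (0, 40, 80, 120)
hazards = [kannisto_hazard(age, ALPHA, BETA) for age in ages]

# The logit of the hazard is exactly linear in age: ln(alpha) + beta * x.
logit_at_80 = math.log(hazards[2] / (1.0 - hazards[2]))
```

Under these assumed values the hazard increases with age and stays strictly below 1, which is the property that distinguishes this model from a pure exponential hazard.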

In mathematical terms, \(\mu(x)\) is the hazard rate. Let’s break this down further. Let \(F_X(x)\) be the cumulative distribution function for age at death, \(X\):

\[\begin{split}F_X(x) :&= P(\text{age at death} \leq \text{given age}) \\ &= P(X \leq x)\end{split}\]

We want the conditional probability of death between age \(x\) and \(x + \Delta x\), given that the person has survived till age \(x\). This is given by:

\[P(x < X \leq x + \Delta x \mid X > x)\]

Recall that for a conditional probability:

\[P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}\]

and so:

\[P(x < X \leq x + \Delta x \mid X > x) = \dfrac{P(x < X \leq x + \Delta x \bigcap X > x)}{P(X > x)}\]

Since \(F_X(x) = P(X \leq x)\) is the cumulative distribution function, the probability of the complementary event is:

\[P(X > x) = 1 - F_X(x)\]

Since \(X > x\) if \(x < X \leq x + \Delta x\), we can rewrite the numerator as:

\[\begin{split}P(x < X \leq x + \Delta x \bigcap X > x) &= P(x < X \leq x + \Delta x) \\ &= F_X(x + \Delta x) - F_X(x)\end{split}\]

Putting it all together, we have:

\[P(x < X \leq x + \Delta x \mid X > x) = \dfrac{F_X(x + \Delta x) - F_X(x)}{1 - F_X(x)}\]

Now, we want the instantaneous rate of death, that is, the probability of death per unit time at age \(x\). To convert the conditional probability over the interval \(\Delta x\) into a rate, we divide by \(\Delta x\), and then take the limit as \(\Delta x \to 0\):

\[\mu(x) = \lim_{\Delta x \to 0} \dfrac{F_X(x + \Delta x) - F_X(x)}{\Delta x (1 - F_X(x))}\]

You will recognize the derivative of \(F_X(x)\):

\[\dfrac{d}{dx} F_X(x) = \lim_{\Delta x \to 0} \dfrac{F_X(x + \Delta x) - F_X(x)}{\Delta x}\]

and so:

\[\mu(x) = \dfrac{F_X'(x)}{1 - F_X(x)}\]

The data in the Statistics Canada mortality table is the probability of death between age \(x\) and \(x + 1\), which is denoted as \(q_x\). This is the same as the probability \(P(x < X \leq x + \Delta x \mid X > x)\), with \(\Delta x = 1\). We would like to solve for \(q_x\), using the Kannisto-Thatcher Equation for \(\mu(x)\). First, we can write \(q_x\) in terms of \(F_X(x)\):

\[\begin{split}q_x &= P(x < X \leq x + 1 \mid X > x) \\ &= \dfrac{F_X(x + 1) - F_X(x)}{1 - F_X(x)}\end{split}\]

Let us define \(S_X(x)\), the survival function, for convenience:

\[\begin{split}S_X(x) &:= 1 - F_X(x) \\ &= P(X > x)\end{split}\]

Then we have:

\[\dfrac{dS}{dx} = -F_X'(x)\]

and so \(\mu(x)\) can be rewritten as:

\[\mu(x) = -\dfrac{dS}{dx}\dfrac{1}{S_X(x)}\]

Solving this first order separable linear differential equation, we have:

\[\begin{split}\int \dfrac{dS}{S_X} &= -\int \mu(x) dx \\ \ln(S_X(x)) &= -\int \mu(x) dx + C \\ &= -\int \dfrac{\alpha e^{\beta x}}{1 + \alpha e^{\beta x}} dx + C\end{split}\]

Letting \(u(x) := 1 + \alpha e^{\beta x}\), so that \(du = \alpha \beta e^{\beta x} dx\), we have:

\[\begin{split}\ln(S_X(x)) &= - \dfrac{1}{\beta} \int \dfrac{du}{u} + C \\ &= - \dfrac{1}{\beta} \ln(u(x)) + C \\ S_X(x) &= e^C (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}} \\ &= k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}} \\ 1 - F_X(x) &= k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}} \\ F_X(x) &= 1 - k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}}\end{split}\]
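One way to sanity-check the solution is to differentiate the survival function numerically and confirm that the hazard is recovered. The sketch below assumes \(S_X(0) = 1\) to fix the constant \(k\) (the derivation leaves it free) and uses illustrative parameter values.

```python
import math


def survival(x, alpha, beta):
    """S(x) = k * (1 + alpha*e^(beta*x))^(-1/beta), with k fixed by S(0) = 1."""
    k = (1.0 + alpha) ** (1.0 / beta)  # assumption: everyone is alive at age 0
    return k * (1.0 + alpha * math.exp(beta * x)) ** (-1.0 / beta)


def hazard_from_survival(x, alpha, beta, h=1e-4):
    """Recover the hazard as -S'(x) / S(x) using a central finite difference."""
    ds = (survival(x + h, alpha, beta) - survival(x - h, alpha, beta)) / (2.0 * h)
    return -ds / survival(x, alpha, beta)


ALPHA, BETA = 1e-5, 0.1  # illustrative values only
x = 80.0
z = ALPHA * math.exp(BETA * x)
mu_closed = z / (1.0 + z)  # closed-form Kannisto-Thatcher hazard
mu_numeric = hazard_from_survival(x, ALPHA, BETA)
```

The two values should agree to well within the finite-difference error; the choice of \(k\) does not matter for this check because it cancels in the ratio.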

Now, we can substitute this into the equation for \(q_x\):

\[\begin{split}q_x &= \dfrac{F_X(x + \Delta x) - F_X(x)}{1 - F_X(x)} \\ &= \dfrac{ 1 - k (1 + \alpha e^{\beta (x + \Delta x)})^{-\frac{1}{\beta}} - 1 + k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}} }{k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}}} \\ &= \dfrac{ - k (1 + \alpha e^{\beta (x + \Delta x)})^{-\frac{1}{\beta}} + k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}} }{k (1 + \alpha e^{\beta x})^{-\frac{1}{\beta}}} \\ &= 1 - \left(\dfrac{1 + \alpha e^{\beta (x + \Delta x)}}{1 + \alpha e^{\beta x}}\right)^{-\frac{1}{\beta}} \\ &= 1 - \left(\dfrac{1 + \alpha e^{\beta x}}{1 + \alpha e^{\beta (x + \Delta x)}}\right)^{\frac{1}{\beta}}\end{split}\]
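The closed form for \(q_x\) can be cross-checked against a direct numerical integration of the hazard, since \(q_x = 1 - S_X(x + \Delta x) / S_X(x) = 1 - \exp\left(-\int_x^{x + \Delta x} \mu(t)\, dt\right)\). This is a sketch with illustrative parameter values; the function names are hypothetical.

```python
import math


def death_probability(x, alpha, beta, dx=1.0):
    """Closed form: 1 - ((1 + alpha*e^(beta*x)) / (1 + alpha*e^(beta*(x+dx))))^(1/beta)."""
    num = 1.0 + alpha * math.exp(beta * x)
    den = 1.0 + alpha * math.exp(beta * (x + dx))
    return 1.0 - (num / den) ** (1.0 / beta)


def death_probability_numeric(x, alpha, beta, dx=1.0, steps=10_000):
    """Cross-check: 1 - exp(-integral of the hazard over [x, x+dx]), midpoint rule."""
    h = dx / steps
    total = 0.0
    for i in range(steps):
        z = alpha * math.exp(beta * (x + (i + 0.5) * h))
        total += z / (1.0 + z) * h
    return 1.0 - math.exp(-total)


ALPHA, BETA = 1e-5, 0.1  # illustrative values only
q_closed = death_probability(80.0, ALPHA, BETA)
q_numeric = death_probability_numeric(80.0, ALPHA, BETA)
```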

If we take the logit of \(q_x\), we have:

\[\begin{split}\sigma^{-1}(q_x) &= \ln\left(\dfrac{q_x}{1 - q_x}\right) \\ &= \ln\left(\dfrac{ 1 - \left(\dfrac{1 + \alpha e^{\beta x}}{1 + \alpha e^{\beta (x + \Delta x)}}\right)^{\frac{1}{\beta}} }{ \left(\dfrac{1 + \alpha e^{\beta x}}{1 + \alpha e^{\beta (x + \Delta x)}}\right)^{\frac{1}{\beta}} }\right) \\ &= \ln\left( \left(\dfrac{1 + \alpha e^{\beta (x + \Delta x)}}{1 + \alpha e^{\beta x}}\right)^{\frac{1}{\beta}} - 1 \right) \\ &= \ln \left(\dfrac{ (1 + \alpha e^{\beta (x + \Delta x)})^{\frac{1}{\beta}} - (1 + \alpha e^{\beta x})^{\frac{1}{\beta}} }{(1 + \alpha e^{\beta x})^{\frac{1}{\beta}}}\right) \\ &= \ln\left( (1 + \alpha e^{\beta (x + \Delta x)})^{\frac{1}{\beta}} - (1 + \alpha e^{\beta x})^{\frac{1}{\beta}} \right) - \dfrac{1}{\beta}\ln(1 + \alpha e^{\beta x})\end{split}\]

Let us now look at \(\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0})\):

\[\begin{split}\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0}) &= \ln\left( (1 + \alpha e^{\beta (x + \Delta x)})^{\frac{1}{\beta}} - (1 + \alpha e^{\beta x})^{\frac{1}{\beta}} \right) - \dfrac{1}{\beta}\ln(1 + \alpha e^{\beta x}) \\ &- \ln\left( (1 + \alpha e^{\beta (x_0 + \Delta x)})^{\frac{1}{\beta}} - (1 + \alpha e^{\beta x_0})^{\frac{1}{\beta}} \right) + \dfrac{1}{\beta}\ln(1 + \alpha e^{\beta x_0})\end{split}\]

Now, based on fitting the model to empirical data, typically we have [Appendix D, Table 5, [Kannisto, 1994]]:

  1. \(\beta \approx \mathcal{O}(10^{-1})\)

  2. \(\alpha \approx \mathcal{O}(10^{-5})\)

We can use the binomial approximation to simplify the above equation. Let us take:

\[(1 + \alpha e^{\beta x})^{\frac{1}{\beta}}\]

In order to use the binomial approximation, we must have:

\[\begin{split}\left|\alpha e^{\beta x}\right| &< 1 \\ \left|\dfrac{\alpha e^{\beta x}}{\beta}\right| &\ll 1 \\\end{split}\]

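Since \(x\) represents the age in years, we have \(x \in [0, 120]\). With the typical values above, both conditions hold comfortably up to roughly age 70 (for example, \(\alpha e^{\beta x} / \beta \approx 0.1\) at \(x = 70\) with \(\alpha = 10^{-5}\) and \(\beta = 0.1\)); at the oldest ages the ratio approaches or exceeds 1, so the approximation is rougher there. Using the binomial approximation, we have: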

\[(1 + \alpha e^{\beta x})^{\frac{1}{\beta}} \approx 1 + \dfrac{\alpha e^{\beta x}}{\beta}\]
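To get a feel for how good this approximation is, the sketch below compares the exact power with its first-order binomial approximation at a few ages. The parameter values are order-of-magnitude assumptions only, so the numbers are indicative rather than definitive.

```python
import math

ALPHA, BETA = 1e-5, 0.1  # order-of-magnitude assumptions, not fitted values


def exact_power(x, alpha, beta):
    """(1 + alpha*e^(beta*x))^(1/beta), evaluated directly."""
    return (1.0 + alpha * math.exp(beta * x)) ** (1.0 / beta)


def binomial_approx(x, alpha, beta):
    """First-order binomial approximation: 1 + alpha*e^(beta*x) / beta."""
    return 1.0 + alpha * math.exp(beta * x) / beta


# Relative error of the approximation at selected ages.
relative_errors = {
    age: abs(binomial_approx(age, ALPHA, BETA) - exact_power(age, ALPHA, BETA))
    / exact_power(age, ALPHA, BETA)
    for age in (20, 40, 60, 80, 100)
}
```

Under these assumed values the relative error is tiny at younger ages and grows steadily with age, so the approximation is most trustworthy well below the oldest ages.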

Going back to our equation for \(\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0})\), we have:

\[\begin{split}\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0}) &\approx \ln\left( 1 + \dfrac{\alpha e^{\beta (x + \Delta x)}}{\beta} - 1 - \dfrac{\alpha e^{\beta x}}{\beta} \right) - \ln\left(1 + \dfrac{\alpha e^{\beta x}}{\beta}\right) \\ &- \ln\left( 1 + \dfrac{\alpha e^{\beta (x_0 + \Delta x)}}{\beta} - 1 - \dfrac{\alpha e^{\beta x_0}}{\beta} \right) + \ln\left(1 + \dfrac{\alpha e^{\beta x_0}}{\beta}\right) \\ &= \ln\left( \dfrac{\alpha e^{\beta (x + \Delta x)}}{\beta} - \dfrac{\alpha e^{\beta x}}{\beta} \right) - \ln\left(1 + \dfrac{\alpha e^{\beta x}}{\beta}\right) \\ &- \ln\left( \dfrac{\alpha e^{\beta (x_0 + \Delta x)}}{\beta} - \dfrac{\alpha e^{\beta x_0}}{\beta} \right) + \ln\left(1 + \dfrac{\alpha e^{\beta x_0}}{\beta}\right) \\ &= \textcolor{orange}{\cancel{\ln\left(\dfrac{\alpha}{\beta}\right)}} + \ln(e^{\beta x})+ \textcolor{magenta}{\cancel{\ln\left(e^{\beta \Delta x} - 1\right)}} - \ln\left(1 + \dfrac{\alpha e^{\beta x}}{\beta}\right) \\ &- \textcolor{orange}{\cancel{\ln\left(\dfrac{\alpha}{\beta}\right)}} - \ln(e^{\beta x_0}) - \textcolor{magenta}{\cancel{\ln\left(e^{\beta \Delta x} - 1\right)}} + \ln\left(1 + \dfrac{\alpha e^{\beta x_0}}{\beta}\right) \\ &= \ln(e^{\beta x})+ \ln\left(1 + \dfrac{\alpha e^{\beta x_0}}{\beta}\right) - \ln\left(1 + \dfrac{\alpha e^{\beta x}}{\beta}\right) - \ln(e^{\beta x_0}) \\ &= \beta (x - x_0) + \ln\left(\dfrac{1 + \dfrac{\alpha e^{\beta x_0}}{\beta}}{1 + \dfrac{\alpha e^{\beta x}}{\beta}}\right)\end{split}\]

The last term is much smaller than the first term, and so we can ignore it. Thus, we have:

\[\sigma^{-1}(q_x) \approx \sigma^{-1}(q_{x_0}) + \beta (x - x_0)\]
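The linear relationship can be checked against the exact closed form for \(q_x\). The following sketch compares the exact logit difference between two ages with \(\beta (x - x_0)\); the ages and parameter values are illustrative assumptions.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def death_probability(x, alpha, beta):
    """One-year death probability q_x under the Kannisto-Thatcher model."""
    num = 1.0 + alpha * math.exp(beta * x)
    den = 1.0 + alpha * math.exp(beta * (x + 1.0))
    return 1.0 - (num / den) ** (1.0 / beta)


ALPHA, BETA = 1e-5, 0.1  # illustrative values only
x0, x = 40, 60

exact_difference = logit(death_probability(x, ALPHA, BETA)) - logit(
    death_probability(x0, ALPHA, BETA)
)
linear_difference = BETA * (x - x0)
```

For these values the two differences are close, which supports dropping the logarithmic correction term in this age range.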

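If \(x_0\) is the age of the person in the starting year of the simulation, then \((\text{year} - \text{year}_0) = (x - x_0)\). Replacing \(\beta\) with a sex-specific parameter \(\beta_{\text{sex}}\), written with a minus sign so that a positive \(\beta_{\text{sex}}\) corresponds to death probabilities that decrease over time, we have: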

\[\sigma^{-1}(q_x(\text{sex}, \text{age})) = \sigma^{-1}(q_{x_0}(\text{sex}, \text{age})) - \beta_{\text{sex}}(\text{year} - \text{year}_0)\]
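Solving this relation for the projected probability gives a one-line update in code. The sketch below is a minimal illustration; the function name `project_death_probability` and the input values are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inverse_logit(y):
    """Inverse of the logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-y))


def project_death_probability(q0, beta_sex, year, year0):
    """Apply logit(q) = logit(q0) - beta_sex * (year - year0) and solve for q."""
    return inverse_logit(logit(q0) - beta_sex * (year - year0))


# Made-up inputs: a 1% baseline death probability projected 10 years ahead.
q_projected = project_death_probability(0.01, 0.02, 2030, 2020)
```

With a positive `beta_sex`, the projected probability is lower than the baseline, and projecting to the starting year returns the baseline unchanged.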

The parameter \(\beta_{\text{sex}}\) is unknown, and so we first need to calculate it. To do so, we set \(\text{year} = \text{year}_C\), the calibration year, and use the Brent root-finding algorithm to find the value of \(\beta_{\text{sex}}\) for which the predicted life expectancy in the calibration year matches the known projected life expectancy.

Once we have found \(\beta_{\text{sex}}\), we can use this formula to find the projected death probabilities.
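The calibration step described above can be sketched as a one-dimensional root-finding problem. To stay within the standard library, this sketch swaps in plain bisection as a stand-in for Brent's method (SciPy's `scipy.optimize.brentq` could be used instead). The life expectancy formula, the baseline probabilities, and all function names are assumptions made for illustration, not the LEAP implementation.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inverse_logit(y):
    return 1.0 / (1.0 + math.exp(-y))


def project(q0, beta_sex, years_ahead):
    """Apply logit(q) = logit(q0) - beta_sex * years_ahead to each age."""
    return [inverse_logit(logit(q) - beta_sex * years_ahead) for q in q0]


def life_expectancy(qx):
    """Approximate life expectancy at birth from one-year death probabilities.

    Assumption: deaths fall mid-year on average, and the table simply stops
    at the last age in `qx`.
    """
    alive, total = 1.0, 0.0
    for q in qx:
        total += alive * (1.0 - 0.5 * q)
        alive *= 1.0 - q
    return total


def calibrate_beta(q0, years_ahead, target, lo=-0.1, hi=0.1, tol=1e-10):
    """Bisection stand-in for Brent's method: solve e0(beta) = target."""

    def gap(beta):
        return life_expectancy(project(q0, beta, years_ahead)) - target

    gap_lo = gap(lo)
    if gap_lo * gap(hi) > 0:
        raise ValueError("target life expectancy is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gap_mid = gap(mid)
        if gap_lo * gap_mid <= 0:
            hi = mid
        else:
            lo, gap_lo = mid, gap_mid
    return 0.5 * (lo + hi)


# Made-up baseline death probabilities for ages 0..100 (illustration only).
baseline = [min(0.9, 1e-4 * math.exp(0.08 * age)) for age in range(101)]
# Pretend the "known" life expectancy comes from a true beta of 0.01.
target_e0 = life_expectancy(project(baseline, 0.01, 20))
beta_hat = calibrate_beta(baseline, 20, target_e0)
```

In this synthetic example the recovered `beta_hat` matches the value used to generate the target, which is the behaviour the calibration relies on: life expectancy increases monotonically with \(\beta_{\text{sex}}\), so the root is unique within the bracket.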