Mortality Model
Data
To obtain the mortality data for each year, we used one table from Statistics Canada:
Past Data: 1996 - 2021
For past years, we used Table 13-10-0837-01 from StatCan.
The *.csv file can be downloaded as 13100837-eng.zip and is saved as LEAP/leap/original_data/13100837.csv.
The relevant columns are:
| Column | Type | Description |
|---|---|---|
|  |  | the calendar year |
|  |  | the age of the person in years |
|  |  | the province or territory full name |
|  |  | one of “Both sexes”, “Females”, or “Males” |
|  |  | describes what the variable of interest is; we want the death probability \(q_x\) |
|  |  | the probability of death between age \(x\) and \(x + 1\) |
Projected Data: 2021 - 2068
Statistics Canada doesn’t provide annual projections for death probabilities, but does
provide a projection for specific years (which we call calibration years):
Region |
Year |
Projection Scenario |
Mortality Scenario |
|---|---|---|---|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Canada |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
Provinces / Territories |
|
|
|
This data can be found in the Statistics Canada Population Projections Technical Report:
Table 3.1, Table 3.2, Table 5.2.1, Table 5.2.2, Table 5.2.3.
Model
We have mortality data for past years (1996 - 2020) and life expectancy projections for
specific future years, but we would like to have mortality data for all future years in our
model. Statistics Canada describes how they model mortality here:
Methods for Constructing Life Tables for Canada, Provinces and Territories.
In particular, the model they use is the Kannisto-Thatcher model, described in this paper:
On the use of Kannisto model for mortality trajectory modelling at very old ages.
According to the Kannisto-Thatcher model, the instantaneous probability of death at age \(x\) is given by:
\[\mu(x) = \frac{\alpha e^{\beta x}}{1 + \alpha e^{\beta x}}\]
In mathematical terms, \(\mu(x)\) is the hazard rate. Let’s break this down further.
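As a quick illustration, the Kannisto-Thatcher hazard can be evaluated directly. This is a minimal sketch, not part of LEAP: the function name `kannisto_hazard` is hypothetical, and the parameter values are only illustrative, chosen to match the orders of magnitude quoted in this section.

```python
import math


def kannisto_hazard(age, alpha, beta):
    """Kannisto-Thatcher hazard: alpha*exp(beta*age) / (1 + alpha*exp(beta*age))."""
    growth = alpha * math.exp(beta * age)
    return growth / (1.0 + growth)


# Illustrative parameters only (assumed, not fitted values).
ALPHA, BETA = 1e-5, 0.1

# The hazard rises with age but always stays below 1 (logistic shape).
hazards = [kannisto_hazard(age, ALPHA, BETA) for age in (0, 40, 80, 120)]
for age, hazard in zip((0, 40, 80, 120), hazards):
    print(age, hazard)
```

Unlike a pure exponential (Gompertz) hazard, this form levels off at old ages, which is why it is used for very old age mortality.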
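Let \(F_X(x)\) be the cumulative distribution function for age at death, \(X\):
\[F_X(x) := P(X \leq x)\]
We want the conditional probability of death between age \(x\) and \(x + \Delta x\), given that the person has survived until age \(x\). This is given by:
\[P(x < X \leq x + \Delta x \mid X > x)\]
Recall that for a conditional probability:
\[P(A \mid B) = \frac{P(A \cap B)}{P(B)}\]
and so:
\[P(x < X \leq x + \Delta x \mid X > x) = \frac{P\big((x < X \leq x + \Delta x) \cap (X > x)\big)}{P(X > x)}\]
Since \(F_X(x)\) is the cumulative distribution function, the probabilities of \(X \leq x\) and \(X > x\) must sum to 1:
\[P(X \leq x) + P(X > x) = 1 \implies P(X > x) = 1 - F_X(x)\]
Since \(X > x\) if \(x < X \leq x + \Delta x\), we can rewrite the numerator as:
\[P\big((x < X \leq x + \Delta x) \cap (X > x)\big) = P(x < X \leq x + \Delta x) = F_X(x + \Delta x) - F_X(x)\]
Putting it all together, we have:
\[P(x < X \leq x + \Delta x \mid X > x) = \frac{F_X(x + \Delta x) - F_X(x)}{1 - F_X(x)}\]
Now, we want to find the instantaneous rate of death, that is, the probability of death per unit time. To get the probability of death per unit time, we divide by \(\Delta x\); taking the limit as \(\Delta x \to 0\) then gives the instantaneous rate of death at age \(x\):
\[\mu(x) = \lim_{\Delta x \to 0} \frac{P(x < X \leq x + \Delta x \mid X > x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x \, (1 - F_X(x))}\]
You will recognize the derivative of \(F_X(x)\):
\[F_X'(x) = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x}\]
and so:
\[\mu(x) = \frac{F_X'(x)}{1 - F_X(x)}\]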
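The limit defining the hazard rate can be checked numerically. The sketch below is an illustration only: it swaps in an exponential lifetime distribution, \(F_X(x) = 1 - e^{-\lambda x}\), whose hazard is known to be the constant \(\lambda\), and all names (`conditional_death_probability`, `hazard_estimate`, `RATE`) are hypothetical.

```python
import math


def conditional_death_probability(cdf, x, dx):
    """P(x < X <= x + dx | X > x) = (F(x + dx) - F(x)) / (1 - F(x))."""
    return (cdf(x + dx) - cdf(x)) / (1.0 - cdf(x))


def hazard_estimate(cdf, x, dx=1e-6):
    """Finite-difference estimate of the hazard: conditional probability per unit time."""
    return conditional_death_probability(cdf, x, dx) / dx


RATE = 0.02  # assumed constant hazard of the exponential test distribution


def exponential_cdf(x):
    return 1.0 - math.exp(-RATE * x)


# For an exponential distribution the hazard equals RATE at every age.
estimate = hazard_estimate(exponential_cdf, 50.0)
print(estimate)
```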
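The data in the Statistics Canada mortality table is the probability of death between age
\(x\) and \(x + 1\), which is denoted as \(q_x\). This is the same as the probability
\(P(x < X \leq x + \Delta x \mid X > x)\), with \(\Delta x = 1\). We would like to solve
for \(q_x\), using the Kannisto-Thatcher equation for \(\mu(x)\). First, we can
write \(q_x\) in terms of \(F_X(x)\):
\[q_x = \frac{F_X(x + 1) - F_X(x)}{1 - F_X(x)}\]
Let us define \(S_X(x)\), the survival function, for convenience:
\[S_X(x) := 1 - F_X(x) = P(X > x)\]
Then we have:
\[q_x = \frac{S_X(x) - S_X(x + 1)}{S_X(x)} = 1 - \frac{S_X(x + 1)}{S_X(x)}\]
and so \(\mu(x)\) can be rewritten as:
\[\mu(x) = \frac{F_X'(x)}{1 - F_X(x)} = -\frac{S_X'(x)}{S_X(x)} = -\frac{d}{dx} \ln S_X(x)\]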
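Solving this first-order separable differential equation, we have:
\[\ln S_X(x) = -\int \mu(x) \, dx = -\int \frac{\alpha e^{\beta x}}{1 + \alpha e^{\beta x}} \, dx\]
Letting \(u(x) := 1 + \alpha e^{\beta x}\), so that \(du = \alpha \beta e^{\beta x} \, dx\), we have:
\[\ln S_X(x) = -\frac{1}{\beta} \int \frac{du}{u} = -\frac{1}{\beta} \ln u(x) + C \implies S_X(x) = K \, u(x)^{-1/\beta}\]
where \(K = e^{C}\) is a constant of integration. Now, we can substitute this into the equation for \(q_x\), where \(K\) cancels:
\[q_x = 1 - \frac{S_X(x + 1)}{S_X(x)} = 1 - \left(\frac{u(x)}{u(x + 1)}\right)^{1/\beta}\]
If we take the logit of \(q_x\), we have:
\[\sigma^{-1}(q_x) = \ln \frac{q_x}{1 - q_x} = \ln \left[\left(\frac{u(x + 1)}{u(x)}\right)^{1/\beta} - 1\right]\]
Let us now look at \(\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0})\):
\[\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0}) = \ln \left[\frac{\left(u(x + 1) / u(x)\right)^{1/\beta} - 1}{\left(u(x_0 + 1) / u(x_0)\right)^{1/\beta} - 1}\right]\]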
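The closed form for \(q_x\) implied by the Kannisto-Thatcher hazard, \(q_x = 1 - \left(u(x)/u(x+1)\right)^{1/\beta}\) with \(u(x) = 1 + \alpha e^{\beta x}\), can be cross-checked against direct numerical integration of the hazard over one year. This is a sketch under assumed, illustrative parameter values; the function names are hypothetical and not taken from LEAP.

```python
import math


def death_probability(age, alpha, beta):
    """Closed form: q_x = 1 - (u(x) / u(x + 1))**(1 / beta), u(x) = 1 + alpha*exp(beta*x)."""
    u_now = 1.0 + alpha * math.exp(beta * age)
    u_next = 1.0 + alpha * math.exp(beta * (age + 1))
    return 1.0 - (u_now / u_next) ** (1.0 / beta)


def death_probability_numeric(age, alpha, beta, steps=10_000):
    """q_x = 1 - exp(-integral of the hazard over [x, x + 1]), using the midpoint rule."""
    width = 1.0 / steps
    cumulative_hazard = 0.0
    for i in range(steps):
        t = age + (i + 0.5) * width
        growth = alpha * math.exp(beta * t)
        cumulative_hazard += growth / (1.0 + growth) * width
    return 1.0 - math.exp(-cumulative_hazard)


ALPHA, BETA = 1e-5, 0.1  # illustrative values only
closed = death_probability(70, ALPHA, BETA)
numeric = death_probability_numeric(70, ALPHA, BETA)
print(closed, numeric)
```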
Now, based on fitting the model to empirical data, typically we have [Appendix D, Table 5, [Kannisto, 1994]]:
\(\beta \approx \mathcal{O}(10^{-1})\)
\(\alpha \approx \mathcal{O}(10^{-5})\)
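We can use the binomial approximation to simplify the above equation. Let us take:
\[\varepsilon(x) := \frac{u(x + 1)}{u(x)} - 1 = \frac{\alpha e^{\beta x} (e^{\beta} - 1)}{1 + \alpha e^{\beta x}}\]
so that \(\sigma^{-1}(q_x) = \ln \left[(1 + \varepsilon(x))^{1/\beta} - 1\right]\). In order to use the binomial approximation, we must have:
\[|\varepsilon(x)| < 1, \qquad \left|\frac{\varepsilon(x)}{\beta}\right| \ll 1\]
Since \(x\) represents the age in years, we have \(x \in [0, 120]\). For the parameter magnitudes above, the first condition holds at every age, since \(\varepsilon(x) < e^{\beta} - 1 \approx 0.1\), and the second holds for all but the most extreme ages. Using the binomial approximation, we have:
\[(1 + \varepsilon(x))^{1/\beta} \approx 1 + \frac{\varepsilon(x)}{\beta} \implies \sigma^{-1}(q_x) \approx \ln \frac{\varepsilon(x)}{\beta}\]
Going back to our equation for \(\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0})\), we have:
\[\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0}) \approx \ln \frac{\varepsilon(x)}{\varepsilon(x_0)} = \beta (x - x_0) - \ln \frac{1 + \alpha e^{\beta x}}{1 + \alpha e^{\beta x_0}}\]
The last term is much smaller than the first term, and so we can ignore it. Thus, we have:
\[\sigma^{-1}(q_x) - \sigma^{-1}(q_{x_0}) \approx \beta (x - x_0)\]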
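To get a feel for how good this linearization is, the exact logit difference can be compared with \(\beta (x - x_0)\) for one pair of ages. The sketch below assumes illustrative parameter values and mid-life ages; it does not say anything about how well the approximation holds at other ages or for fitted parameters.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def death_probability(age, alpha, beta):
    """q_x = 1 - (u(x) / u(x + 1))**(1 / beta), with u(x) = 1 + alpha*exp(beta*x)."""
    u_now = 1.0 + alpha * math.exp(beta * age)
    u_next = 1.0 + alpha * math.exp(beta * (age + 1))
    return 1.0 - (u_now / u_next) ** (1.0 / beta)


ALPHA, BETA = 1e-5, 0.1       # illustrative values only
AGE_START, AGE_END = 40, 50   # assumed ages for the comparison

# Exact difference of logits versus the linear approximation beta * (x - x0).
exact_gap = (logit(death_probability(AGE_END, ALPHA, BETA))
             - logit(death_probability(AGE_START, ALPHA, BETA)))
linear_gap = BETA * (AGE_END - AGE_START)
print(exact_gap, linear_gap)
```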
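If \(x_0\) is the age of the person in the starting year of the simulation, then \((\text{year} - \text{year}_0) = (x - x_0)\):
\[\sigma^{-1}\big(q_x(\text{sex}, \text{year})\big) \approx \sigma^{-1}\big(q_{x_0}(\text{sex}, \text{year}_0)\big) + \beta_{\text{sex}} (\text{year} - \text{year}_0)\]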
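In code, a projection that is linear in logit space might look like the sketch below. The function name `project_death_probability` and the sign convention (the slope is added, so a negative \(\beta_{\text{sex}}\) means mortality declines over time) are assumptions made for illustration; the LEAP implementation may define the slope differently.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def project_death_probability(q_base, beta_sex, year, year_base):
    """Shift the base-year death probability linearly in logit space.

    Assumed convention: logit(q_year) = logit(q_base) + beta_sex * (year - year_base).
    """
    return sigmoid(logit(q_base) + beta_sex * (year - year_base))


# Hypothetical inputs: a 1% base-year death probability and a negative slope.
q_2020 = 0.01
q_2030 = project_death_probability(q_2020, beta_sex=-0.02, year=2030, year_base=2020)
print(q_2030)
```

Working in logit space keeps every projected value strictly between 0 and 1, whatever the slope or projection horizon.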
The parameter \(\beta_{\text{sex}}\) is unknown, and so we first need to calculate it.
To do so, we set \(\text{year} = \text{year}_C\), the calibration year, and use the Brent
root-finding algorithm to optimize \(\beta_{\text{sex}}\) such that the life expectancy in the
calibration year (which is known) matches the predicted life expectancy.
Once we have found \(\beta_{\text{sex}}\), we can use this formula to find the projected death probabilities.
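The calibration step can be sketched as a one-dimensional root-finding problem. Everything below is illustrative: the base life table `Q_BASE` is synthetic, the life expectancy formula is a simplified one (deaths assumed to occur mid-year, table truncated at the last listed age), and plain bisection is used as a dependency-free stand-in for Brent's method.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def project(q_base, beta, years_ahead):
    """Shift every age's death probability linearly in logit space (assumed sign)."""
    return [sigmoid(logit(q) + beta * years_ahead) for q in q_base]


def life_expectancy(q_by_age):
    """Simplified life expectancy at birth from a list of q_x values."""
    alive, total = 1.0, 0.0
    for q in q_by_age:
        total += alive * (1.0 - 0.5 * q)  # person-years lived during this age
        alive *= 1.0 - q                  # survivors to the next age
    return total


def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection on a bracketing interval [lo, hi]; stand-in for Brent's method."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Synthetic base-year table for ages 0..110 (hypothetical numbers).
Q_BASE = [sigmoid(-9.0 + 0.085 * age) for age in range(111)]
YEARS_AHEAD = 20  # calibration year minus base year (assumed)
TRUE_BETA = -0.01
TARGET = life_expectancy(project(Q_BASE, TRUE_BETA, YEARS_AHEAD))

beta_hat = bisect_root(
    lambda b: life_expectancy(project(Q_BASE, b, YEARS_AHEAD)) - TARGET,
    -0.05, 0.05,
)
print(beta_hat)
```

If SciPy is available, `scipy.optimize.brentq(f, a, b)` accepts the same function and bracketing interval and would typically converge in fewer function evaluations.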
Processed Data
The past and projected death probabilities are combined by leap/data_generation/death_data.py into a single processed file saved as: leap/processed_data/life_table.csv.
Past data (from 13100837.csv) covers years 1996 to the last available year using
death probabilities directly from Statistics Canada.
For projected years (up to 2068), death probabilities for every annual increment are filled in by fitting a linear trend (in logit space) that connects the last historical year to Statistics Canada’s life expectancy targets at the calibration years. A separate slope is fitted for each sex and province.
| Column | Type | Description |
|---|---|---|
|  |  | the age of the person in years |
|  |  | the probability of death between age \(x\) and \(x + 1\) |
|  |  | the standard error of the probability of death; for projected years, this is scaled proportionally from the base year’s standard error |
|  |  | one of |
|  |  | the calendar year |
|  |  | the 2-letter province or territory ID (e.g., |