Immigration / Emigration Model

Data

Statistics Canada does not publish immigration/emigration data broken down by the necessary groups (age, sex, etc.), so we do not have exact data for this category. Instead, we derive net migration from the data used by the birth and death models.

Population Data

We use the Statistics Canada population data that was generated and saved as: processed_data/{time_delta_tag}/birth/initial_pop_distribution_prop.csv.

This table contains the number of people for each timepoint, age, province, and projection scenario, along with the number of births for that timepoint; the sex split is given by prop_male. The counts are net population figures, so they already reflect deaths, immigration, and emigration.

| Column | Type | Description |
|---|---|---|
| timepoint | int | the starting date / time of the time interval that the data applies to |
| age | int | the age of the person in years |
| province | str | the province of the person (e.g., AB = Alberta, BC = British Columbia, etc.) |
| n_age | int | the number of people in a given age group, time interval, province, and projection scenario |
| n_birth | int | the number of births in that time interval, province, and projection scenario |
| prop | float | the proportion of the population in that age group, time interval, province, and projection scenario relative to the number of births in that time interval, province, and projection scenario |
| prop_male | float | the proportion of the population in a given age group, time interval, province, and projection scenario who are male |
| projection_scenario | str | the projection scenario used to generate the data |
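The relationship between n_age, n_birth, prop, and prop_male can be illustrated with a small sketch. The row values are invented, the scenario label is a placeholder, and splitting n_age into male and female counts via prop_male is an assumption about how sex-specific counts might be recovered, not a description of the LEAP code.

```python
# Hypothetical row of the population table; all values are invented.
row = {
    "timepoint": 2020,
    "age": 30,
    "province": "BC",
    "n_age": 8000,
    "n_birth": 4000,
    "prop_male": 0.5,
    "projection_scenario": "example",  # placeholder label
}


def prop_relative_to_births(n_age, n_birth):
    """Population in the age group relative to births in the same interval."""
    return n_age / n_birth


def split_by_sex(n_age, prop_male):
    """Assumed split of a combined count into male and female counts."""
    n_male = n_age * prop_male
    return {"M": n_male, "F": n_age - n_male}


prop = prop_relative_to_births(row["n_age"], row["n_birth"])
counts = split_by_sex(row["n_age"], row["prop_male"])
print(prop, counts)
```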

Mortality Data

We use the Statistics Canada mortality data (life table) that was generated and saved as: processed_data/{time_delta_tag}/life_table.csv.

| Column | Type | Description |
|---|---|---|
| timepoint | int | the starting date / time of the time interval that the data applies to |
| age | int | the age of the person in years |
| province | str | the province of the person (e.g., AB = Alberta, BC = British Columbia, etc.) |
| sex | str | F = female, M = male |
| prob_death | float | the probability that a person of the given age and sex, living in the given province, will die during the given time interval |
| se | float | the standard error on the probability of death |

Model

To obtain the net migration for anyone aged > 0, we take the number of people observed in each age group and subtract the number expected to survive from the cohort that was one year younger in the previous year. In the equation below, \(\Delta n_{(a,\ s,\ t)}\) is the net migration, \(n_{(a,\ s,\ t)}\) is the number of people at the current age and year, \(n_{(a-1,\ s,\ t-1)}\) is the number of people one year younger in the previous year, and \(q_{x_{(a-1,\ s,\ t-1)}}\) is the sex-specific probability of death for that younger cohort. This is computed separately for each combination of age, sex, province, and projection scenario. Age 0 is excluded because newborns are handled separately by the birth model.

\[\Delta n_{(a,\ s,\ t)} = n_{(a,\ s,\ t)} - n_{(a-1,\ s,\ t-1)} \cdot \left(1 - q_{x_{(a-1,\ s,\ t-1)}}\right)\]

where \(a\) = age, \(s\) = sex, \(t\) = year.

If \(\Delta n > 0\), the surplus is attributed to immigration. If \(\Delta n < 0\), the deficit is attributed to emigration.
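As a worked illustration of the formula above, the following minimal sketch computes the signed net migration for a single age, sex, and year cell. The function name and the input numbers are hypothetical and chosen only to show both signs of the result.

```python
def net_migration(n_now, n_prev, prob_death_prev):
    """Signed net migration for one (age, sex, year) cell.

    n_now:            people aged a in year t
    n_prev:           people aged a-1 in year t-1
    prob_death_prev:  probability of death for age a-1 in year t-1
    """
    expected_survivors = n_prev * (1.0 - prob_death_prev)
    return n_now - expected_survivors


# Invented numbers: 1000 people last year, 1% expected to die -> 990 survivors.
delta_n_in = net_migration(n_now=1050.0, n_prev=1000.0, prob_death_prev=0.01)
delta_n_out = net_migration(n_now=970.0, n_prev=1000.0, prob_death_prev=0.01)

# A positive value is attributed to immigration, a negative one to emigration.
print(delta_n_in, delta_n_out)
```

With these inputs the first cell has more people than survival alone explains (net immigration), and the second has fewer (net emigration).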

Processed Data

The migration model produces a single processed data file generated by leap/data_generation/migration_data.py, covering CA (all of Canada) and the province BC (British Columbia).

Migration Table

Saved as: leap/processed_data/{time_delta_tag}/migration/migration_table.csv.

Each row corresponds to a unique combination of year, province, age, sex, and projection_scenario. The table records the signed net migration (\(\Delta n\)) for each group, along with derived columns used by the simulation at runtime for both immigration and emigration.

\(\Delta n\) is computed independently for each sex. Because males and females are computed separately, it is possible for one sex to have net immigration while the other has net emigration for the same age and year.

| Column | Type | Description |
|---|---|---|
| year | int | the calendar year |
| province | str | the province abbreviation (BC or CA) |
| age | int | the age in years |
| sex | str | M = male, F = female |
| projection_scenario | str | the StatCan population projection scenario |
| delta_n | float | the signed net migration for this age, sex, year, province, and projection scenario; positive values indicate net immigration, negative values indicate net emigration |
| prop_migrants_birth | float | delta_n divided by the number of births that year; signed (positive for net immigration cells, negative for net emigration cells) |
| prop_immigrants_year | float | for cells where delta_n > 0, each age and sex group’s share of all immigrants arriving in a given year (denominator is the sum of positive delta_n values only); zero for emigration cells |
| prop_emigrants_year | float | for cells where delta_n < 0, each age and sex group’s share of all emigrants leaving in a given year (denominator is the sum of negative delta_n values only); zero for immigration cells |
| prob_emigration | float | for cells where delta_n < 0, the per-person annual probability of emigrating, computed as \(\lvert \Delta n \rvert / N\); zero for immigration cells |

prop_migrants_birth is computed as:

\[\text{prop migrants birth}_{(a,\ s,\ t)} = \dfrac{\Delta n_{(a,\ s,\ t)}}{n^{\text{birth}}_{(t)}}\]

where \(a\) = age, \(s\) = sex, \(t\) = year.

The total number of immigrant agents created in a given year is:

\[\begin{split}i_{(t)} = \left\lceil n_{(t)} \cdot \sum_{\substack{a,\ s \\ \Delta n > 0}}\ \text{prop migrants birth}_{(a,\ s,\ t)} \right\rceil\end{split}\]

where \(n_{(t)}\) is the number of simulated births in that year.
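The two formulas above can be sketched together as follows. The cell values, the reference birth count, and the function names are hypothetical; the sketch only mirrors the arithmetic and is not the LEAP implementation.

```python
import math


def prop_migrants_birth(delta_n, n_birth):
    """Signed net migration of one cell relative to births in that year."""
    return delta_n / n_birth


def n_immigrant_agents(n_simulated_births, cells):
    """Ceiling of simulated births times the summed positive proportions."""
    total = sum(c["prop_migrants_birth"] for c in cells if c["delta_n"] > 0)
    return math.ceil(n_simulated_births * total)


# Invented cells for one year; 800 is a made-up reference birth count.
N_BIRTH = 800.0
cells = [
    {"age": age, "sex": sex, "delta_n": d,
     "prop_migrants_birth": prop_migrants_birth(d, N_BIRTH)}
    for age, sex, d in [(30, "F", 200.0), (30, "M", 100.0), (31, "F", -40.0)]
]

# Only the two positive cells contribute: 100 * (0.25 + 0.125) = 37.5 -> 38.
i_t = n_immigrant_agents(n_simulated_births=100, cells=cells)
print(i_t)
```

Rounding up with the ceiling means a year with any positive net immigration yields at least one immigrant agent.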

prop_immigrants_year is computed as:

\[\begin{split}\text{prop immigrants year}_{(a,\ s,\ t)} = \dfrac{\Delta n_{(a,\ s,\ t)}}{\sum_{\substack{a,\ s \\ \Delta n > 0}}\ \Delta n_{(a,\ s,\ t)}} \quad \text{if } \Delta n > 0, \text{ else } 0\end{split}\]
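A minimal sketch of this normalisation, assuming the cells of one year are held in a dictionary keyed by (age, sex). The data layout and function name are illustrative assumptions, and the numbers are invented.

```python
def prop_immigrants_year(delta_n_by_cell):
    """Share of the year's immigrants per cell; zero for emigration cells."""
    total_in = sum(d for d in delta_n_by_cell.values() if d > 0)
    return {
        cell: (d / total_in if d > 0 and total_in > 0 else 0.0)
        for cell, d in delta_n_by_cell.items()
    }


# Invented delta_n values for a single year.
delta_n_by_cell = {(30, "F"): 60.0, (30, "M"): 20.0, (31, "F"): -20.0}
shares = prop_immigrants_year(delta_n_by_cell)
print(shares)
```

The shares over the positive cells sum to 1, which is what allows them to be used as a sampling distribution for immigrant age and sex.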

prob_emigration is computed as:

\[\text{prob emigration}_{(a,\ s,\ t)} = \dfrac{|\Delta n_{(a,\ s,\ t)}|}{N_{(a,\ s,\ t)}} \quad \text{if } \Delta n < 0, \text{ else } 0\]

where \(N_{(a,\ s,\ t)}\) is the number of people in that age, sex, and year group.

At runtime, the simulation uses prob_emigration directly in a Bernoulli trial each year to determine whether an agent emigrates:

\[\text{emigrates} \sim \text{Bernoulli}(p_{\text{emigrate}}(\text{sex}, \text{age}, \text{year}))\]

Agents aged 0 are excluded — newborns never emigrate.
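The emigration step can be sketched as below. The function names are hypothetical, the population count passed as n_people is assumed to be the size of the same age, sex, and year cell, and Python's standard random module stands in for whatever random number generator the simulation actually uses.

```python
import random


def prob_emigration(delta_n, n_people):
    """Per-person annual emigration probability; zero for immigration cells."""
    if delta_n < 0 and n_people > 0:
        return abs(delta_n) / n_people
    return 0.0


def emigrates(p_emigrate, age, rng):
    """Single Bernoulli trial; agents aged 0 never emigrate."""
    if age == 0:
        return False
    return rng.random() < p_emigrate


rng = random.Random(0)                # fixed seed, for reproducibility only
p = prob_emigration(-20.0, 1000.0)    # invented cell: 20 net leavers of 1000
leaves = emigrates(p, age=30, rng=rng)
print(p, leaves)
```

Because rng.random() returns a value in [0, 1), a probability of 0 never triggers emigration and a probability of 1 always does.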

Both immigration and emigration are rooted in the same StatCan population-level counts (\(\Delta n\)), and are converted into agent-level operations to fit LEAP’s microsimulation framework:

  • For immigration, rows where delta_n > 0 are used. prop_migrants_birth (positive) determines how many immigrant agents to create at each timepoint, and prop_immigrants_year determines the age and sex of each agent.

  • For emigration, rows where delta_n < 0 are used. prob_emigration is applied to each existing agent individually via a Bernoulli trial for each timepoint.
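To make the immigration side concrete, the sketch below draws an age and sex for each new immigrant agent using the per-cell shares as sampling weights. This is one plausible reading of "determines the age and sex of each agent"; the function name, data layout, and numbers are assumptions rather than the LEAP implementation.

```python
import random


def sample_immigrants(n_immigrants, shares, rng):
    """Draw (age, sex) for each new agent, weighted by the cell shares."""
    cells = list(shares.keys())
    weights = [shares[c] for c in cells]
    return rng.choices(cells, weights=weights, k=n_immigrants)


# Invented shares for one year; the emigration cell has weight zero.
shares = {(30, "F"): 0.75, (30, "M"): 0.25, (31, "F"): 0.0}
rng = random.Random(42)  # fixed seed, for reproducibility only
new_agents = sample_immigrants(5, shares, rng)
print(new_agents)
```

Cells with a share of zero (net emigration cells) are never drawn, so immigrant agents are only created in age and sex groups with positive delta_n.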