Birth Data

To obtain the population data for each year, we used two tables from StatCan:

  1. 1999 - 2021:

    For past years, we used Table 17-10-0005-01 from StatCan.

    The *.csv file can be downloaded from here: 17100005-eng.zip

    and is saved as: LEAP/leap/original_data/17100005.csv

  2. 2021 - 2065:

    For future years, we used Table 17-10-0057-01 from StatCan.

    The *.csv file can be downloaded from here: 17100057-eng.zip

    and is saved as: LEAP/leap/original_data/17100057.csv

To run the data processing for the population data:

cd LEAP
python3 leap/data_generation/birth_data.py

This will update the following data files:

  1. leap/processed_data/birth/birth_estimate.csv

  2. leap/processed_data/birth/initial_pop_distribution_prop.csv

leap.data_generation.birth_data module

leap.data_generation.birth_data.get_projection_scenario_id(projection_scenario: str) → str

Convert the long form of the projection scenario to the 2-letter ID.

Parameters:
projection_scenario: str

The long form of the projection scenario, e.g. Projection scenario M1.

Returns:

The 2-letter ID of the projection scenario, e.g. M1.
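The conversion described above can be sketched as follows. This is not the module's actual implementation: the helper name scenario_id_sketch is hypothetical, and the sketch assumes that the 2-letter ID is simply the two characters that follow the "Projection scenario " prefix. The second input string, with a trailing description, is also an assumed format.

```python
def scenario_id_sketch(projection_scenario: str) -> str:
    """Hypothetical stand-in for get_projection_scenario_id."""
    prefix = "Projection scenario "
    # Assumption: the long form starts with this prefix.
    if projection_scenario.startswith(prefix):
        projection_scenario = projection_scenario[len(prefix):]
    # Assumption: the ID is the first two characters of what remains.
    return projection_scenario[:2]


print(scenario_id_sketch("Projection scenario M1"))               # M1
print(scenario_id_sketch("Projection scenario LG: low-growth"))   # LG
```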

leap.data_generation.birth_data.filter_age_group(age_group: str) → bool

Filter out grouped categories such as “Median”, “Average”, “All”, “to”, “over”.

Parameters:
age_group: str

The age group string.

Returns:

True if the age group is not a grouped category, False otherwise.
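A minimal sketch of such a filter is shown below, assuming a case-sensitive substring match against the five markers listed above. The helper name is_single_age_sketch and the example labels are hypothetical; they are not copied from the StatCan table or from the module's source.

```python
# Substrings that mark a grouped category, taken from the description above.
GROUPED_MARKERS = ("Median", "Average", "All", "to", "over")


def is_single_age_sketch(age_group: str) -> bool:
    """Hypothetical stand-in for filter_age_group (substring match)."""
    return not any(marker in age_group for marker in GROUPED_MARKERS)


# Made-up labels for illustration only.
labels = [
    "All ages",
    "0 years",
    "5 to 9 years",
    "42 years",
    "100 years and over",
    "Median age",
]
single_ages = [label for label in labels if is_single_age_sketch(label)]
print(single_ages)  # ['0 years', '42 years']
```

A plain substring test is simple, but it would also reject any label that happens to contain "to" or "over" inside another word, so a stricter version might match on whole words instead.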

leap.data_generation.birth_data.load_past_births_population_data() → pandas.core.frame.DataFrame

Load the past birth data from the CSV file.

Returns:

The past birth data. Columns:

  • year: The year of the data.

  • province: The 2-letter province ID.

  • N: The total number of births in that year.

  • prop_male: The proportion of births in that year that are male.

  • projection_scenario: The projection scenario; all values are "past".
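To make the column definitions concrete, the sketch below derives N and prop_male from sex-specific counts. It is an illustration under stated assumptions, not the module's code: the input layout (one row per sex, with hypothetical "sex" and "n" fields), the helper name summarize_births_sketch, and all of the numbers are made up. Plain Python is used here for brevity, whereas the module itself returns a pandas DataFrame.

```python
from collections import defaultdict

# Made-up counts; "sex" and "n" are hypothetical field names.
raw_rows = [
    {"year": 2000, "province": "BC", "sex": "M", "n": 520},
    {"year": 2000, "province": "BC", "sex": "F", "n": 480},
    {"year": 2001, "province": "BC", "sex": "M", "n": 300},
    {"year": 2001, "province": "BC", "sex": "F", "n": 300},
]


def summarize_births_sketch(rows):
    """Collapse sex-specific counts into one row per (year, province)."""
    totals = defaultdict(lambda: {"N": 0, "n_male": 0})
    for row in rows:
        entry = totals[(row["year"], row["province"])]
        entry["N"] += row["n"]
        if row["sex"] == "M":
            entry["n_male"] += row["n"]
    return [
        {
            "year": year,
            "province": province,
            "N": entry["N"],
            "prop_male": entry["n_male"] / entry["N"],
            "projection_scenario": "past",
        }
        for (year, province), entry in sorted(totals.items())
    ]


birth_rows = summarize_births_sketch(raw_rows)
for row in birth_rows:
    print(row)
```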

leap.data_generation.birth_data.load_projected_births_population_data(min_year: int) → pandas.core.frame.DataFrame

Load the projected births data from the CSV file from StatCan.

Parameters:
min_year: int

The starting year for the projected data.

Returns:

The projected births data. Columns:

  • year: The year of the data.

  • province: The 2-letter province ID.

  • N: The total number of births predicted for that year.

  • prop_male: The proportion of predicted births in that year that are male.

  • projection_scenario: The projection scenario, one of:

    • LG: low-growth projection

    • HG: high-growth projection

    • M1: medium-growth 1 projection

    • M2: medium-growth 2 projection

    • M3: medium-growth 3 projection

    • M4: medium-growth 4 projection

    • M5: medium-growth 5 projection

    • M6: medium-growth 6 projection

    • FA: fast-aging projection

    • SA: slow-aging projection
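As a rough illustration of how rows with these columns might be narrowed down, the sketch below keeps one scenario from a starting year onward. The helper name select_projection_sketch and the sample rows are hypothetical, and treating min_year as inclusive is an assumption rather than documented behaviour.

```python
# The ten scenario IDs listed above.
SCENARIO_IDS = {"LG", "HG", "M1", "M2", "M3", "M4", "M5", "M6", "FA", "SA"}


def select_projection_sketch(rows, min_year, scenario):
    """Keep rows for one scenario with year >= min_year (assumed inclusive)."""
    if scenario not in SCENARIO_IDS:
        raise ValueError(f"Unknown projection scenario ID: {scenario!r}")
    return [
        row
        for row in rows
        if row["year"] >= min_year and row["projection_scenario"] == scenario
    ]


# Made-up rows for illustration only.
projected = [
    {"year": 2021, "province": "BC", "N": 100, "prop_male": 0.51,
     "projection_scenario": "M3"},
    {"year": 2022, "province": "BC", "N": 110, "prop_male": 0.51,
     "projection_scenario": "M3"},
    {"year": 2022, "province": "BC", "N": 90, "prop_male": 0.51,
     "projection_scenario": "LG"},
]
selected = select_projection_sketch(projected, min_year=2022, scenario="M3")
print(selected)
```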

leap.data_generation.birth_data.load_past_initial_population_data() → pandas.core.frame.DataFrame

Load the past initial population data from the CSV file.

Returns:

The past initial population data. Columns:

  • year: The calendar year.

  • province: The 2-letter province ID, e.g. BC.

  • age: The age of the population.

  • prop_male: The proportion of the population in that age group that are male.

  • n_age: The total number of people in that age group for the given year, province, and projection scenario.

  • n_birth: The total number of births in the given year, province, and projection scenario.

  • prop: The ratio of the number of people in that age group to the total number of births in that year, i.e. n_age / n_birth.

  • projection_scenario: The projection scenario; all values are “past”.
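The relationship between n_age, n_birth, and prop can be shown with a small worked example. The helper name add_prop_sketch and the numbers are made up for illustration; the only thing taken from the column descriptions is that prop relates n_age to n_birth.

```python
def add_prop_sketch(rows):
    """Return copies of the rows with prop = n_age / n_birth added."""
    return [dict(row, prop=row["n_age"] / row["n_birth"]) for row in rows]


# Made-up counts for one year and province.
population = [
    {"year": 2000, "province": "BC", "age": 0, "n_age": 980, "n_birth": 1000},
    {"year": 2000, "province": "BC", "age": 1, "n_age": 1250, "n_birth": 1000},
]
with_prop = add_prop_sketch(population)
for row in with_prop:
    print(row["age"], row["prop"])
```

Because prop is measured against births rather than against the total population, it can exceed 1 when an age group is larger than that year's birth cohort.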

leap.data_generation.birth_data.load_projected_initial_population_data(min_year: int) → pandas.core.frame.DataFrame

Load the projected initial population data from the CSV file.

Parameters:
min_year: int

The starting year for the projected data.

Returns:

The projected initial population data. Columns:

  • year: The calendar year.

  • province: The 2-letter province ID, e.g. BC.

  • age: The age of the population.

  • prop_male: The proportion of the population in that age group that are male.

  • n_age: The total number of people in that age group for the given year, province, and projection scenario.

  • n_birth: The total number of births in the given year, province, and projection scenario.

  • prop: The ratio of the number of people in that age group to the total number of births in that year, i.e. n_age / n_birth.

  • projection_scenario: The projection scenario, one of:

    • LG: low-growth projection

    • HG: high-growth projection

    • M1: medium-growth 1 projection

    • M2: medium-growth 2 projection

    • M3: medium-growth 3 projection

    • M4: medium-growth 4 projection

    • M5: medium-growth 5 projection

    • M6: medium-growth 6 projection

    • FA: fast-aging projection

    • SA: slow-aging projection

leap.data_generation.birth_data.generate_birth_estimate_data()

Create/update the birth_estimate.csv file.
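One plausible shape for this step is to stack the past and projected birth rows and write them out with the documented columns. The sketch below is an assumption about the overall flow, not the function's actual code: the helper name write_birth_estimate_sketch and the sample rows are hypothetical, and the output goes to an in-memory buffer rather than to leap/processed_data/birth/birth_estimate.csv.

```python
import csv
import io

# Column names taken from the load_*_births_population_data descriptions.
COLUMNS = ["year", "province", "N", "prop_male", "projection_scenario"]


def write_birth_estimate_sketch(past_rows, projected_rows, handle):
    """Write past rows followed by projected rows as CSV to a file-like object."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(past_rows + projected_rows)


# Made-up rows for illustration only.
past_rows = [
    {"year": 2020, "province": "BC", "N": 1000, "prop_male": 0.52,
     "projection_scenario": "past"},
]
projected_rows = [
    {"year": 2021, "province": "BC", "N": 1010, "prop_male": 0.51,
     "projection_scenario": "M3"},
]

buffer = io.StringIO()
write_birth_estimate_sketch(past_rows, projected_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```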

leap.data_generation.birth_data.generate_initial_population_data()

Create/update the initial_pop_distribution_prop.csv file.