Exacerbation Calibration Data
The number of exacerbations in a given year is modelled using a Poisson distribution. The probability of observing \(n\) exacerbations is:
\[P(n; \lambda) = \frac{\lambda^n e^{-\lambda}}{n!}\]
Here \(\lambda\) is the expected number of exacerbations per year. To obtain \(\lambda\), we perform a Poisson regression, which assumes that the logarithm of the expected value is a linear combination of the predictors:
\[\ln(\lambda) = \ln(\alpha) + \beta_a a + \beta_s s + \sum_{i=1}^3 \beta_i c_i\]
where:
\(\alpha\): calibration multiplier
\(a\): age
\(\beta_a\): age constant
\(s\): sex
\(\beta_s\): sex constant
\(c_i\): relative time spent in control level \(i\)
\(\beta_i\): control level constant
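As a minimal sketch of how these terms combine, the Python below assumes the usual log link, \(\ln(\lambda) = \ln(\alpha) + \beta_a a + \beta_s s + \sum_i \beta_i c_i\), and then evaluates the Poisson probability of a given count. The function names, the 0/1 coding of sex, and all coefficient values are placeholders for illustration; they are not the fitted values used by LEAP.

```python
import math


def expected_exacerbations(alpha, age, sex, control, beta_age, beta_sex, beta_control):
    """Return lambda = alpha * exp(beta_a * a + beta_s * s + sum_i beta_i * c_i)."""
    linear = beta_age * age + beta_sex * sex
    linear += sum(b * c for b, c in zip(beta_control, control))
    return alpha * math.exp(linear)


def poisson_pmf(n, lam):
    """Probability of exactly n events for a Poisson distribution with mean lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)


# Hypothetical inputs, for illustration only (not LEAP's fitted coefficients).
lam = expected_exacerbations(
    alpha=1.0, age=40, sex=1, control=[0.5, 0.3, 0.2],
    beta_age=0.0, beta_sex=0.0, beta_control=[0.1, 0.2, 0.3],
)
# Probability of 0, 1, 2, or 3 exacerbations in the year.
probabilities = [poisson_pmf(n, lam) for n in range(4)]
```

Because \(\alpha\) enters multiplicatively, doubling it doubles \(\lambda\) while leaving the regression coefficients unchanged.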
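In the exacerbation_data.py file, we are interested in calculating \(\alpha\). If we rewrite the equation, the meaning of \(\alpha\) becomes more apparent:
\[\lambda = \alpha \cdot e^{\beta_a a + \beta_s s + \sum_{i=1}^3 \beta_i c_i}\]
That is, \(\alpha\) multiplies the rate predicted by the regression terms.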
How do we obtain \(\alpha\)? We again assume that the mean value has the same form as in a Poisson regression, with the following formula:
\[\ln(\lambda_C) = \sum_{i=1}^3 \gamma_i c_i\]
where:
\(\lambda_C\): the average number of exacerbations in a given year
\(c_i\): relative time spent in control level \(i\)
\(\gamma_i\): control level constant (different from \(\beta_i\) above)
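Here, the \(\gamma_i\) values were calculated from the Economic Burden of Asthma (EBA) study and are given by:
\begin{align*} \gamma_1 &:= 0.1880058 & \text{rate(exacerbation | fully controlled)}\\ \gamma_2 &:= 0.3760116 & \text{rate(exacerbation | partially controlled)}\\ \gamma_3 &:= 0.5640174 & \text{rate(exacerbation | uncontrolled)} \end{align*}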
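The number of exacerbations predicted by the model is then:
\[N_{\text{exac}}^{\text{(pred)}} = \lambda_C \cdot N_{\text{asthma}}\]
where:
\(N_{\text{asthma}}\): the number of people in a given year, age, and sex
and the predicted number of hospitalizations is:
\[N_{\text{hosp}}^{\text{(pred)}} = N_{\text{exac}}^{\text{(pred)}} \cdot P(\text{hosp})\]
where: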
\(N_{\text{hosp}}^{\text{(pred)}}\): the predicted number of hospitalizations for a given year, age, and sex
\(P(\text{hosp})\): the probability of hospitalization due to asthma given the patient has an asthma exacerbation
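Finally, \(\alpha\) can be computed as the ratio of the observed to the predicted number of hospitalizations:
\[\alpha = \frac{N_{\text{hosp}}^{\text{(obs)}}}{N_{\text{hosp}}^{\text{(pred)}}}\]
where \(N_{\text{hosp}}^{\text{(obs)}}\) is the observed number of hospitalizations for a given year, age, and sex.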
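The chain from \(\lambda_C\) to \(\alpha\) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in exacerbation_data.py: the function names are hypothetical, \(\lambda_C\) is taken as an input, and the numeric inputs are made up. The default `prob_hosp=0.026` matches the default documented for `exacerbation_calibrator`.

```python
def predicted_hospitalizations(lambda_c, n_asthma, prob_hosp=0.026):
    """Predicted hospitalizations for one (year, age, sex) cell."""
    n_exac_pred = lambda_c * n_asthma  # predicted number of exacerbations
    return n_exac_pred * prob_hosp     # fraction of those that need hospitalization


def calibration_multiplier(n_hosp_observed, n_hosp_predicted):
    """alpha = observed / predicted number of hospitalizations."""
    if n_hosp_predicted <= 0:
        raise ValueError("predicted hospitalizations must be positive")
    return n_hosp_observed / n_hosp_predicted


# Hypothetical inputs, for illustration only.
n_hosp_pred = predicted_hospitalizations(lambda_c=1.5, n_asthma=10_000)
alpha = calibration_multiplier(n_hosp_observed=195.0, n_hosp_predicted=n_hosp_pred)
# Here the model predicts about twice as many hospitalizations as observed,
# so alpha comes out at roughly 0.5.
```

A value of \(\alpha\) below 1 means the uncalibrated model over-predicts hospitalizations for that cell; a value above 1 means it under-predicts.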
To run the data generation for the exacerbation data:
cd LEAP
python3 leap/data_generation/exacerbation_data.py
leap.data_generation.exacerbation_data module
- leap.data_generation.exacerbation_data.exacerbation_prediction(sex: str, age: int, gamma_control: list[float] | None = None)
Calculate the mean number of exacerbations for a given age and sex.
\[\ln(\lambda_{C}) = \sum_{i=1}^3 \gamma_i c_i\]
where:
\(\lambda_{C}\) is the predicted average number of asthma exacerbations per year.
\(\gamma_i\) is the control parameter.
\(c_i\) is the relative time spent in control level \(i\).
Here the \(\gamma_i\) values were calculated from the Economic Burden of Asthma (EBA) study and are given by:
\begin{align*} \gamma_1 &:= 0.1880058 & \text{rate(exacerbation | fully controlled)}\\ \gamma_2 &:= 0.3760116 & \text{rate(exacerbation | partially controlled)}\\ \gamma_3 &:= 0.5640174 & \text{rate(exacerbation | uncontrolled)} \end{align*}
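To make the formula concrete, here is a small Python sketch that evaluates \(\lambda_C\) from the \(\gamma_i\) values listed above. It is not the documented function: `mean_exacerbations` is a hypothetical helper that takes the control-level proportions \(c_i\) directly, whereas `exacerbation_prediction` takes `sex` and `age`, and how it derives \(c_i\) from them is not shown here. The `None` default mirrors the `gamma_control` parameter in the signature.

```python
import math

# gamma_1..gamma_3 as listed above (EBA study).
DEFAULT_GAMMA_CONTROL = [0.1880058, 0.3760116, 0.5640174]


def mean_exacerbations(control_levels, gamma_control=None):
    """Return lambda_C = exp(sum_i gamma_i * c_i).

    control_levels: relative time spent in each control level
    (fully controlled, partially controlled, uncontrolled).
    """
    if gamma_control is None:
        gamma_control = DEFAULT_GAMMA_CONTROL
    if len(control_levels) != len(gamma_control):
        raise ValueError("control_levels and gamma_control must have the same length")
    return math.exp(sum(g * c for g, c in zip(gamma_control, control_levels)))


# Hypothetical split of time across the three control levels.
lambda_c = mean_exacerbations([0.3, 0.5, 0.2])
```

When the proportions sum to 1, \(\lambda_C\) lies between \(e^{\gamma_1}\) (always fully controlled) and \(e^{\gamma_3}\) (always uncontrolled).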
- leap.data_generation.exacerbation_data.parse_sex(x: str) -> str | float
Reformat a string containing sex information.
- leap.data_generation.exacerbation_data.parse_age(x: str) -> int | float
Reformat a string containing age information.
- leap.data_generation.exacerbation_data.load_hospitalization_data(province: str = 'CA', starting_year: int = 2000, min_age: int = 3) -> pandas.core.frame.DataFrame
Load the hospitalization data for the given province and starting year.
The data is from the Hospital Morbidity Database (HMDB) from the Canadian Institute for Health Information (CIHI). The hospitalization data was collected from patients presenting to a hospital in Canada due to an asthma exacerbation. We will use this data to calibrate the exacerbation model.
- Parameters:
- province: str = 'CA'
The province for which to load the hospitalization data.
- starting_year: int = 2000
The starting year for which to load the hospitalization data.
- min_age: int = 3
The minimum age to be used in the data. We assume that asthma diagnoses are made at age 3 and older, so the default is 3.
- Returns:
The hospitalization data for the given province and starting year. Columns:
year: The year of the data.
sex: One of M = male, F = female.
age: Integer age, a value in [3, 90].
hospitalization_rate: The observed number of hospitalizations per 100 000 people for a given year, age, and sex.
- leap.data_generation.exacerbation_data.load_population_data(province: str, starting_year: int, projection_scenario: str, max_year: int, min_age: int = 3, max_age: int = 90) -> pandas.core.frame.DataFrame
Load the population data for the given province, starting year, and projection scenario.
The population data was generated by the leap/data_generation/birth_data.py script.
- Parameters:
- province: str
The 2-letter abbreviation for the province.
- starting_year: int
The starting year for the population data.
- projection_scenario: str
The projection scenario for the population data.
- max_year: int
The maximum year for the population data.
- min_age: int = 3
The minimum age for the population data.
- max_age: int = 90
The maximum age for the population data.
- Returns:
A dataframe containing the Canadian population data. Columns:
year: The year of the data.
age: A value in [min_age, max_age].
province: The 2-letter province abbreviation.
sex: One of M = male, F = female.
n: The number of people in a given year, age, sex, province, and projection scenario.
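The hospitalization table reports a rate per 100 000 people, while the population table reports head counts. One way to put the two on the same footing is to convert each rate into a count for the matching (year, sex, age) cell. The sketch below uses plain dictionaries rather than the dataframes these functions return, and the helper name and all numbers are hypothetical; whether the module performs exactly this conversion is an assumption.

```python
def observed_hospitalizations(hosp_rate, population):
    """Convert rates per 100 000 people into counts.

    Both arguments map (year, sex, age) -> value. Cells missing from
    the population table are skipped.
    """
    counts = {}
    for key, rate in hosp_rate.items():
        if key in population:
            counts[key] = rate * population[key] / 100_000
    return counts


# Made-up values, for illustration only.
hosp_rate = {(2001, "F", 30): 50.0, (2001, "M", 30): 40.0, (2001, "M", 31): 45.0}
population = {(2001, "F", 30): 200_000, (2001, "M", 30): 250_000}
counts = observed_hospitalizations(hosp_rate, population)
```

The cell `(2001, "M", 31)` has no population entry in this example, so it is left out of `counts` rather than being filled with a guess.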
- leap.data_generation.exacerbation_data.exacerbation_calibrator(province: str = 'CA', starting_year: int = 2000, max_year: int = 2065, min_age: int = 3, max_age: int = 90, prob_hosp: float = 0.026, projection_scenario: str = 'M3') -> pandas.core.frame.DataFrame
Compute the ratio between the observed and predicted hospitalization rates.
- Parameters:
- province: str = 'CA'
The 2-letter abbreviation for the province.
- starting_year: int = 2000
The starting year for the calibration.
- max_year: int = 2065
The maximum year for the calibration.
- min_age: int = 3
The minimum age for the calibration.
- max_age: int = 90
The maximum age for the calibration.
- prob_hosp: float = 0.026
The probability of a very severe exacerbation, defined as an exacerbation that requires hospitalization.
- projection_scenario: str = 'M3'
The projection scenario for the population data. One of:
LG: low-growth projection
HG: high-growth projection
M1: medium-growth 1 projection
M2: medium-growth 2 projection
M3: medium-growth 3 projection
M4: medium-growth 4 projection
M5: medium-growth 5 projection
M6: medium-growth 6 projection
FA: fast-aging projection
SA: slow-aging projection
- Returns:
A dataframe with the following columns:
year: The year of the data, a value in [starting_year, max_year].
age: The integer age, a value in [min_age, max_age].
sex: One of M or F.
calibrator_multiplier: The ratio between the observed and predicted number of hospitalizations.