Antibiotic Data¶
leap.data_generation.antibiotic_data module¶
-
leap.data_generation.antibiotic_data.estimate_alpha(df: pandas.core.frame.DataFrame, formula: str, offset: numpy.ndarray | None = None, maxiter: int = 5000) → float [source]¶
Estimate the alpha parameter for the negative binomial model.
The \(\alpha\) parameter is the dispersion parameter for the negative binomial model:
\[\alpha := \dfrac{1}{\theta} = \dfrac{\sigma^2 - \mu}{\mu^2}\]
- Parameters:¶
- df: pandas.core.frame.DataFrame¶
A Pandas dataframe with data to be fitted.
- formula: str¶
The formula for the GLM model. See the statsmodels documentation for more information.
- offset: numpy.ndarray | None = None¶
The offset to use in the model, if desired.
- maxiter: int = 5000¶
The maximum number of iterations to perform while fitting the model.
- Returns:¶
The estimated alpha parameter for the negative binomial model.
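The definition of \(\alpha\) above can be illustrated with a small method-of-moments calculation. This is only a sketch of the formula, not the fitting procedure used by `estimate_alpha` (which takes a formula and iterates up to `maxiter`); the helper name `moment_alpha` and the sample counts are hypothetical.

```python
from statistics import mean, variance


def moment_alpha(counts):
    """Plug sample moments into alpha = (sigma^2 - mu) / mu^2.

    Hypothetical helper for illustration only; it is not part of leap.
    """
    mu = mean(counts)
    sigma2 = variance(counts)  # sample variance (n - 1 denominator)
    return (sigma2 - mu) / mu**2


# Made-up counts, chosen so the arithmetic is easy to follow:
# mean = 6, sample variance = 10, so alpha = (10 - 6) / 36.
alpha_hat = moment_alpha([2, 4, 6, 8, 10])
```

A value of \(\alpha\) near zero means the variance is close to the mean, as in a Poisson model; larger values indicate overdispersion.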
-
leap.data_generation.antibiotic_data.load_birth_data(province: str = 'BC', min_year: int = 2000, max_year: int = 2018) → pandas.core.frame.DataFrame [source]¶
Load the StatCan birth data.
- leap.data_generation.antibiotic_data.load_antibiotic_data() → pandas.core.frame.DataFrame [source]¶
Load the antibiotic dose data.
The antibiotic prescription data is from the BC Ministry of Health and contains the total number of courses of antibiotics dispensed to infants, stratified by year and sex, ranging from 2000 to 2018.
The birth data is from StatCan census data and contains the number of births in BC, stratified by year and sex.
- Returns:¶
A Pandas dataframe. Columns:
year (int): The calendar year.
sex (str): One of M = male, F = female.
n_abx (int): The total number of courses of antibiotics dispensed to infants in BC for the given year and sex.
n_birth (int): The number of births in BC for the given year and sex.
-
leap.data_generation.antibiotic_data.generate_antibiotic_model(df_abx: pandas.core.frame.DataFrame, formula: str = 'n_abx ~ year + sex + heaviside(year, 2005) * year', alpha: float = 1.0, maxiter: int = 1000) → statsmodels.genmod.generalized_linear_model.GLMResultsWrapper [source]¶
Generate a generalized linear model for antibiotic dose.
In this function, we fit a generalized linear model (GLM) to the antibiotic prescription data using the negative binomial family. The model predicts the number of courses of antibiotics dispensed to infants in BC, given the year and sex.
For more details, see Antibiotic Exposure Model.
- Parameters:¶
- df_abx: pandas.core.frame.DataFrame¶
The antibiotic prescription data. Contains the following columns:
year (int): The calendar year.
sex (str): One of M = male, F = female.
n_abx (int): The total number of courses of antibiotics dispensed to infants in BC for the given year and sex.
n_birth (int): The number of births in BC for the given year and sex.
- formula: str = 'n_abx ~ year + sex + heaviside(year, 2005) * year'¶
The formula for the GLM model. See the statsmodels documentation for more information.
- alpha: float = 1.0¶
The alpha parameter for the negative binomial model. This is the dispersion parameter, which controls the variance of the model.
- maxiter: int = 1000¶
The maximum number of iterations to perform while fitting the model.
- Returns:¶
The fitted GLM model.
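To make the role of `alpha` concrete: rearranging the definition \(\alpha = (\sigma^2 - \mu)/\mu^2\) given for `estimate_alpha` yields \(\sigma^2 = \mu + \alpha\mu^2\). The sketch below evaluates that relationship for a few values; `nb_variance` is a hypothetical helper, and the mean of 100 is an arbitrary illustrative number rather than a value from the antibiotic data.

```python
def nb_variance(mu: float, alpha: float) -> float:
    """Variance implied by mean mu and dispersion alpha: mu + alpha * mu^2.

    Hypothetical helper for illustration only; it is not part of leap.
    """
    return mu + alpha * mu**2


# For a fixed mean, a larger alpha gives a larger variance.
# alpha = 0 recovers the Poisson case, where the variance equals the mean.
variances = {alpha: nb_variance(100.0, alpha) for alpha in (0.0, 0.5, 1.0)}
```

With the default `alpha` of 1.0, the variance grows roughly with the square of the mean, which is why a poorly chosen `alpha` can noticeably change the uncertainty of the fitted model.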
-
leap.data_generation.antibiotic_data.get_predicted_abx_data(model: statsmodels.genmod.generalized_linear_model.GLMResultsWrapper, df: pandas.core.frame.DataFrame | None = None, min_year: int = 2000, max_year: int = 2019) → pandas.core.frame.DataFrame [source]¶
Get predicted data from a GLM model.
The GLM model must be fitted on the following columns:
year (int): The calendar year.
sex (int): One of 0 = female, 1 = male.
- Parameters:¶
- model: statsmodels.genmod.generalized_linear_model.GLMResultsWrapper¶
The fitted GLM model for predicting the number of courses of antibiotics during the first year of life, given year and sex.
- df: pandas.core.frame.DataFrame | None = None¶
(optional) If provided, the function will use this dataframe to predict the data. The dataframe must contain the following columns:
year (int): The calendar year.
sex (str): One of M = male, F = female.
If not provided, the function will generate a dataframe with all combinations of year and sex in the range of min_year to max_year.
- min_year: int = 2000¶
The minimum year to predict.
- max_year: int = 2019¶
The maximum year to predict.
- Returns:¶
A dataframe containing the predicted number of antibiotics prescribed per person during infancy for a given birth year and sex. Columns:
year (int): The calendar year.
sex (str): One of M = male, F = female.
n_abx_μ (float): The predicted number of antibiotics prescribed per person during infancy for the given birth year and sex.
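When `df` is omitted, the function is described as building every combination of year and sex between `min_year` and `max_year`. A minimal sketch of such a grid follows, using only the standard library. The helper name `year_sex_grid` is hypothetical, and treating `max_year` as inclusive is an assumption, since the text does not say whether the upper bound is included.

```python
from itertools import product


def year_sex_grid(min_year: int = 2000, max_year: int = 2019) -> list[dict]:
    """Return one record per (year, sex) pair.

    Hypothetical helper for illustration only; it is not part of leap.
    """
    years = range(min_year, max_year + 1)  # assumption: max_year is inclusive
    return [{"year": year, "sex": sex} for year, sex in product(years, ("F", "M"))]


# 20 years x 2 sexes = 40 records under the inclusive assumption.
grid = year_sex_grid()
```

A list of records like this can be passed to `pandas.DataFrame(grid)` to obtain a dataframe with `year` and `sex` columns.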
-
leap.data_generation.antibiotic_data.generate_antibiotic_data(return_type: str = 'csv') → statsmodels.genmod.generalized_linear_model.GLMResultsWrapper | None [source]¶
Fit a GLM for antibiotic prescriptions in the first year of life and generate data.
- Parameters:¶
- return_type: str = 'csv'¶
The type of data to return. If csv, the function will save a CSV file with the predicted data. If model, the function will return the fitted GLM model.
- Returns:¶
None if return_type is csv, otherwise a fitted GLM model for predicting the number of antibiotic prescriptions during the first year of life.