Home page

Objectives

The aim of this tutorial is to run scenario analyses with an individual-based model in R that accounts for location specific mixing.

In this session, you will

  1. Run an IBM with household, school/work and community mixing.
  2. Discuss the impact of transmission parameters and features

The best thing to do is to read each section and type (or copy and paste) the R commands (in grey boxes) into your own R session to check that you understand the code. Hands-on questions are indicated in red

Modelling concepts

The individual-based model will make use of a virtual population in which each individual is part of a household, school/work (based on age), and the general community. Each day (= time step), individuals can interact at home, school/work and in the community. We specify for each location a contact probability or average number of contacts per day.

Set working directory

The first step when creating a new R file is to specify a working directory. You need to tell the R session where you want it to work. To do this you use the command to “set the working directory” or setwd().

You can label this working directory as ‘home’, and have other locations stored for plotting or for getting data. This location should be an existing folder on your computer. For example

home <- "~/Documents/Modelling_intro/" ## on a OS X
home <- "C:\\Documents\\Modelling_intro\\" ## on windows
setwd(home)

Load model code

If the curl package is not installed yet, you can do so by

install.packages(curl) # install the curl library (to import data)

Please load the IBM functions we prepared on the GitHub repository. Using your skills from the previous tutorial(s), you should be able to code this model yourselves.

# load package 'curl'
library('curl')

# download R file with IBM function
curl_download("https://github.com/lwillem/modelling-intro/raw/master/lib/ibm_functions.R", destfile = "ibm_functions.R")

# load the IBM functions
source('ibm_functions.R')

Household setup

In our simulation model, we include households composed of two adults, as well as households with two adults and two children. Ages within a household are not randomly distributed. Typically, there is a limited age gap between partners, and if children are present, there is also a fertility-based age gap between the children and their parents. In addition, households tend to have specific age gaps between siblings. The location-specific IBM uses the create_population_matrix(...) function to generate a population. Inspect the initial lines of this function to determine the rules that govern the stochastic setup of the population:

head(create_population_matrix,9)
##                                                                        
## 1 function (pop_size, num_schools, target_school_ages, num_workplaces, 
## 2     bool_show_demographics = TRUE)                                   
## 3 {                                                                    
## 4     ages_adult <- 19:60                                              
## 5     ages_child <- 1:18                                               
## 6     adult_age_tolerance <- 0:5                                       
## 7     child_age_tolerance <- 1:4                                       
## 8     household_age_gap_min <- 22                                      
## 9     household_age_gap_max <- 35

School and workplace setup

In daily life, we visit locations suited to our tasks, which are age-specific. In our model, we assume age-based school classes but no age-specific selection of workplaces. We provide options to specify the number of schools (set_schools(...)) and the number of workplaces (set_workplaces(...)). Assignment to a specific school or workplace is carried out randomly using the sample() function. Inspect the functions for school classes and workplaces that govern the stochastic setup of the population:

# inspect the set_schools function (by omitting the brackets = function call)
set_schools
## function (pop_data, num_schools, target_school_ages) 
## {
##     if (num_schools > 0) {
##         pop_data$classroom_id <- paste0("class", pop_data$age, 
##             "_", sample(num_schools, nrow(pop_data), replace = TRUE))
##         boolean_school_pop <- pop_data$age %in% target_school_ages
##         pop_data$classroom_id[!boolean_school_pop] <- NA
##     }
##     else {
##         pop_data$classroom_id <- NA
##     }
##     return(pop_data)
## }
# inspect the set_workplaces function (by omitting the brackets = function call)
set_workplaces
## function (pop_data, num_workplaces, target_school_ages) 
## {
##     if (num_workplaces > 0) {
##         pop_data$workplace_id <- sample(num_workplaces, nrow(pop_data), 
##             replace = TRUE)
##         boolean_workplace_pop <- pop_data$age <= max(target_school_ages)
##         pop_data$workplace_id[boolean_workplace_pop] <- NA
##     }
##     else {
##         pop_data$workplace_id <- NA
##     }
##     return(pop_data)
## }

Base case scenario

First, let’s run the model with the default parameters:

# you can run the location-specific IBM with default parameters (see console)
run_ibm_location()

## [1] "MODEL PARAMETERS"
## [1] "pop_size: 2000"
## [1] "num_days: 50"
## [1] "num_infected_seeds: 3"
## [1] "vaccine_coverage: 0"
## [1] "rng_seed: 51"
## [1] "num_days_infected: 7"
## [1] "transmission_prob: 0.1"
## [1] "num_contacts_community_day: 4"
## [1] "contact_prob_household: 1"
## [1] "contact_prob_school: 0.5"
## [1] "contact_prob_workplace: 0.3"
## [1] "num_schools: 2"
## [1] "target_school_ages: 3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18"
## [1] "num_workplaces: 150"
## [1] "bool_show_demographics: TRUE"
## [1] "-------------"
## [1] "MODEL RESULTS"
## [1] "total incidence: 100%"
## [1] "Peak prevalence: 58%"
## [1] "Peak day:        13"
## [1] "Total run time:  3s"

What can you learn from these graphs and output? Can you confirm the population rules based on the graphs above?

Explore some parameters

The “model parameters” from the model printout above can be used as function arguments to modify the default settings. For example, you can change the number of community contacts per day by adjusting num_contacts_community_day.

# increase the number of contacts in the community per day to 6 (= 'num_contacts_community_day')
run_ibm_location(num_contacts_community_day = 6)

# set num_schools to 5
run_ibm_location(num_schools = 5)

# set transmission_prob to 0.15 (and switch off the demographic plots)
run_ibm_location(transmission_prob = 0.15,
                 bool_show_demographics = FALSE)

Scenarios

Now, what happens if you…

  • change the population size to 100 or 2000?
  • start with more infected individuals?
  • add more schools
  • decrease the number of workplaces
  • allow only household contacts?
  • allow only community contacts?
  • allow only school contacts?
  • allow only household and school contacts?
  • allow only household and workplace contacts?
  • … etc.

(TIP: the run_ibm_location function has an option to add_baseline = TRUE to add model results with the default parameters. However, this increases the run time since the model is executed twice.)