Network Generation
The Basic SEIRS Network Model and Extended SEIRS Network Model implement models of epidemic dynamics for populations with structured contact networks (as opposed to standard mean-field compartment models, which assume uniform mixing of the population). When using these network models, a graph defining the contact network must be provided, where each node represents an individual in the population and edges connect individuals who have regular interactions.
In the SEIRS+ framework, the contact network defines the set of close contacts for each individual in the population (black edges). Close contacts are individuals with whom one has non-cursory (e.g., repeated, sustained, and/or physical) interactions on a regular basis, such as housemates, family members, close coworkers, close friends, etc. Casual contacts -- individuals with whom one has incidental, brief, or superficial contact on an infrequent basis (e.g., at the grocery store, on transit, at a public event, in the elevator) -- are also represented in these models in the form of a parallel mode of mean-field global transmission. The product of the network locality parameter p and the respective global and local transmissibility parameters sets the relative frequency and weight of transmission among close (local network) and casual (global) contacts in the modeled population.
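One way to read the weighting described above is as a sum of a global (casual contact) term and a local (close contact) term. The sketch below assumes that form, with p weighting the global term and 1 - p weighting the local term; the function name, its arguments, and the exact weighting are illustrative assumptions and are not taken from the SEIRS+ source.

```python
def exposure_propensity(p, beta_global, beta_local,
                        frac_infectious_population, frac_infectious_contacts):
    """Hypothetical exposure propensity for one susceptible individual.

    ASSUMPTION: global and local transmission combine as a weighted sum,
    with p on the mean-field (casual contact) term and 1 - p on the
    network (close contact) term.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Casual contacts: anyone in the population, weighted by p.
    global_term = p * beta_global * frac_infectious_population
    # Close contacts: only this individual's network neighbors.
    local_term = (1.0 - p) * beta_local * frac_infectious_contacts
    return global_term + local_term
```

Under this reading, p = 0 reduces to purely network-based transmission and p = 1 reduces to a standard well-mixed model.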
Every population has unique patterns of interactions, and the unique properties of different contact network structures can have important impacts on epidemic dynamics and outcomes. It is important to carefully consider the contact patterns of each population of interest as well as the relevance of assumptions made by networks defined to represent them.
There are some properties that are shared by many human interaction networks, which the authors make an effort to capture in the contact networks generated for use with the SEIRS+ models:
- Heterogeneity: Degree (number of contacts per individual) varies across individuals and groups of like-individuals (e.g., age groups). Groups of individuals may differ in the numbers of within- and between-group contacts they make.
- Broad degree distribution: Most individuals have roughly average connectivity (degree), but there is individual variation around the mean degree (this is in contrast with scale-free networks, where most individuals have very low degree and the mode is often well below the mean).
- Heavy-tailed degree distribution: A small number of individuals have many more contacts than average, so the degree distribution tends to have a relatively long right tail.
- Assortativity: There tends to be correlation in degree between adjacent nodes in the contact network. That is, highly-connected individuals tend to have highly-connected contacts.
- Transitivity (aka clustering): Individuals A and B are relatively likely to be contacts of each other if they both share a mutual contact C.
- Community Structure: Contact networks often have communities of individuals (groups of nodes) that are more likely to be contacts of each other than they are to be with individuals from another community.
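Two of the properties listed above, mean degree and transitivity, are simple to measure on a candidate network. The following is a minimal pure-Python sketch that assumes the graph is stored as a dict mapping each node to a set of neighbors; the function names are illustrative. For `networkx` graphs, `networkx.transitivity` and `networkx.degree_assortativity_coefficient` compute comparable quantities directly.

```python
from itertools import combinations


def mean_degree(adjacency):
    """Average number of contacts per node."""
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


def transitivity(adjacency):
    """Fraction of connected triples (A - C - B) that are closed by an A - B edge."""
    closed = 0
    triples = 0
    for center, nbrs in adjacency.items():
        # Every pair of neighbors of `center` forms one connected triple.
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in adjacency[a]:
                closed += 1
    return closed / triples if triples else 0.0


# A triangle (0, 1, 2) with one extra node (3) attached to node 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(mean_degree(example), transitivity(example))
```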
Contact networks can be defined and generated by any method that is appropriate for representing the user's population and scenario of interest. Network generation is not a focus of the SEIRS+ package itself, but a few network tools are briefly described here.
The `networkx` package includes network generation functions for a number of classes of networks. Many of these are not particularly relevant to contact network structures of interest, but a few of them can be useful. In general it's best to closely tailor network definition to your population of interest, but `networkx` is a readily available package and its off-the-shelf generators can be handy for quick exploration.
See networkx LFR_benchmark_graph
The LFR algorithm generates networks that have a known community structure and a broad (roughly bell-shaped) degree distribution with an exponential-like right tail. For an off-the-shelf generator, the LFR network is in the ballpark of the properties listed above.
Caveats: LFR networks typically have very low assortativity and transitivity, which are important features of many real contact networks. In addition, the implementation of the LFR algorithm in the `networkx` generator function has a known bug that causes it to randomly fail to converge on a generated network in some attempts (calling the generator function until it successfully returns is one workaround).
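The retry workaround mentioned above can be sketched as a small helper. The helper and wrapper names are hypothetical, and the LFR parameter values are borrowed from the `networkx` documentation example rather than tuned for any contact network; `networkx` is imported lazily so the generic retry helper can be used on its own.

```python
def generate_with_retries(make_graph, max_attempts=20, retry_on=(Exception,)):
    """Call make_graph(attempt) until it returns, up to max_attempts times."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return make_graph(attempt)
        except retry_on as err:
            last_error = err  # remember the failure and try again
    raise RuntimeError(f"no graph generated after {max_attempts} attempts") from last_error


def lfr_graph_with_retries(n=250, max_attempts=20):
    """Hypothetical wrapper: retry networkx's LFR generator with a new seed each time."""
    import networkx as nx  # imported here so the helper above has no dependencies

    return generate_with_retries(
        lambda attempt: nx.LFR_benchmark_graph(
            n, tau1=3, tau2=1.5, mu=0.1,
            average_degree=5, min_community=20, seed=attempt),
        max_attempts=max_attempts,
        # Only retry on convergence failures, not on invalid parameters.
        retry_on=(nx.ExceededMaxIterations,),
    )
```

Changing the seed on every attempt matters here: retrying with the same seed would reproduce the same failure.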
See networkx barabasi_albert_graph
The Barabasi-Albert algorithm generates random scale-free networks using a preferential attachment mechanism. The power law degree distribution of the BA network is relevant to some human networks (e.g., the internet, citation networks, some social networks). However, BA networks do not have broad degree distributions or levels of assortativity, transitivity, or community structure that are reasonable for many such networks. That said, the `networkx` BA generator is fast and reliable, so it can sometimes be useful for rapid testing and prototyping with the SEIRS+ models.
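To illustrate the preferential attachment mechanism itself, here is a minimal pure-Python sketch. It is not the `networkx` implementation (in practice one would simply call `networkx.barabasi_albert_graph(n, m)`); the function name and the choice to seed the process with m unconnected nodes are assumptions made for illustration.

```python
import random


def preferential_attachment_edges(n, m, seed=None):
    """Return the edge list of an n-node graph grown by preferential attachment.

    Each new node attaches to m distinct existing nodes, chosen with
    probability proportional to their current degree.
    """
    if not 1 <= m < n:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))  # the first new node links to the m seed nodes
    repeated = []             # each node appears once per edge it touches
    for new_node in range(m, n):
        edges.extend((new_node, target) for target in targets)
        repeated.extend(targets)
        repeated.extend([new_node] * m)
        # Drawing uniformly from `repeated` is a degree-proportional draw.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges


edges = preferential_attachment_edges(n=200, m=3, seed=1)
```

Because early nodes accumulate entries in `repeated` fastest, a few of them end up with very high degree while most nodes stay near degree m, which is the scale-free shape described above.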
The FARZ algorithm generates networks with built-in community structure and broad, heavy-tailed distributions for the degree of nodes and sizes of communities. The FARZ algorithm has parameters for average degree, number of communities, strength of the community structure, the transitivity, assortativity (degree correlation), and the distribution of the community sizes. The tunability of these properties makes the FARZ generator an attractive method for generating contact networks for use with SEIRS+.
Code implementing the FARZ algorithm can be found on GitHub, and a version of the authors' generator function is included in the `FARZ.py` module of the SEIRS+ package.
Parameter | Description | Default Value |
---|---|---|
n | number of nodes | REQUIRED |
m | number of edges created per node | REQUIRED |
k | number of communities | REQUIRED |
beta | probability of edge formation within communities, rather than between (strength of community structure) | 0.8 |
alpha | strength of common neighbor's effect on edge formation (tunes transitivity, clustering) | 0.5 |
gamma | strength of degree similarity effect on edge formation (tunes assortativity) | 0.5 |
r | maximum number of communities each node can belong to | 1 |
q | probability of a node belonging to multiple communities | 0.5 |
phi | constant added to all community sizes; a higher number makes the communities more balanced in size, 1 results in a power law community size distribution | 10 |
epsilon | probability of noisy/random edges | 0.0000001 |
t | probability of also connecting to the neighbors of a node that each node connects to (tunes transitivity, clustering) | 0 |
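As a convenience, the parameters in the table above can be collected into a single dictionary before being handed to a generator. This is a sketch only: the helper name is hypothetical, the keys and defaults simply mirror the table, and the exact parameter names accepted by `FARZ.py` should be checked against that module.

```python
# Defaults copied from the table above (keys mirror the table's parameter names).
FARZ_DEFAULTS = {
    'beta': 0.8, 'alpha': 0.5, 'gamma': 0.5, 'r': 1,
    'q': 0.5, 'phi': 10, 'epsilon': 0.0000001, 't': 0,
}


def make_farz_params(n, m, k, **overrides):
    """Build a FARZ-style parameter dict from the required values plus overrides."""
    for name, value in (('n', n), ('m', m), ('k', k)):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} is required and must be a positive int")
    unknown = set(overrides) - set(FARZ_DEFAULTS)
    if unknown:
        raise KeyError(f"unrecognized parameters: {sorted(unknown)}")
    params = {'n': n, 'm': m, 'k': k, **FARZ_DEFAULTS}
    params.update(overrides)  # user-supplied values win over table defaults
    return params


# Example: 1000 nodes, 5 edges per node, 10 communities, weaker community structure.
params = make_farz_params(1000, 5, 10, beta=0.5)
```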
We define a function for generating community-level contact networks with realistic network properties as well as age-stratification, households, and communities (e.g., schools, workplaces) that are calibrated to demographic statistics for a population of interest.
Each node is assigned an age bracket (0-9, 10-19, … 70-79, 80+) according to a population-level age distribution (e.g., from census data). FARZ network layers are generated to represent the out-of-household regular contacts amongst individuals of certain age groups (i.e., children, adults, seniors). FARZ networks have a community structure, parameterized in this function such that half of an individual's connections are with members of their own community and half are with individuals from outside their own community. Separate FARZ network layers are generated for the 0-9 age group (communities can be thought of as primary schools), the 10-19 age group (communities can be thought of as secondary schools), the 20-59 age group (communities can be thought of as workplaces), and the 60+ age group. The degree distributions of these networks are broad with a heavy tail. The mean degree for each layer is calibrated to the average number of contacts by age group from this study.
Nodes are divvied up into households such that the distribution of household sizes and the household age demographics match the data provided to the function. All of the nodes in a household are strongly connected, which rivets together the network layers for each age group. The resulting graph ends up resembling age-age interaction matrices estimated by this study (but this data isn’t used directly).
In the SEIRS+ network models, there is also a probability p of well-mixed global interactions (nodes interacting with a randomly drawn node from anywhere in the network), which is an avenue for both within- and between-age-group contacts.
This network generation function can also return versions of the same contact network where social distancing and/or age group isolation ("cocooning") has been applied.
This function generates networks that are calibrated to age distribution, household size, and household age composition figures that are specified by the user. The function expects these statistics to be provided in a `dict` that has the following structure (figures shown are from US census data).
household_data = {
    'age_distn': {'0-9': 0.121, '10-19': 0.131, '20-29': 0.137, '30-39': 0.133, '40-49': 0.124, '50-59': 0.131, '60-69': 0.115, '70-79': 0.070, '80+': 0.038},
    'household_size_distn': {1: 0.284, 2: 0.345, 3: 0.151, 4: 0.128, 5: 0.058, 6: 0.023, 7: 0.012},
    'household_stats': {
        'pct_with_under20': 0.337,  # percent of households with at least one member under 20
        'pct_with_over60': 0.380,  # percent of households with at least one member over 60
        'pct_with_under20_over60': 0.034,  # percent of households with at least one member under 20 and at least one member over 60
        'pct_with_over60_givenSingleOccupant': 0.110,  # percent of households with a single occupant that is over 60
        'mean_num_under20_givenAtLeastOneUnder20': 1.91  # mean number of people under 20 in households with at least one member under 20
    }
}
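Before passing user-supplied figures to the generator, it can be worth checking that each distribution sums to approximately 1. The helpers below are a hypothetical sanity check and are not part of SEIRS+; the tolerance is an arbitrary choice that allows for rounding in published statistics (the household size figures above sum to 1.001).

```python
def check_distribution(name, distn, tol=0.01):
    """Raise if the probabilities in distn do not sum to 1 within tol."""
    total = sum(distn.values())
    if abs(total - 1.0) > tol:
        raise ValueError(f"{name} sums to {total:.3f}, expected about 1.0")
    return total


def mean_household_size(size_distn):
    """Expected household size, renormalizing in case of rounding error."""
    total = sum(size_distn.values())
    return sum(size * prob for size, prob in size_distn.items()) / total


# Figures copied from the household_data example above.
age_distn = {'0-9': 0.121, '10-19': 0.131, '20-29': 0.137, '30-39': 0.133,
             '40-49': 0.124, '50-59': 0.131, '60-69': 0.115, '70-79': 0.070,
             '80+': 0.038}
household_size_distn = {1: 0.284, 2: 0.345, 3: 0.151, 4: 0.128,
                        5: 0.058, 6: 0.023, 7: 0.012}

check_distribution('age_distn', age_distn)
check_distribution('household_size_distn', household_size_distn)
print(round(mean_household_size(household_size_distn), 2))
```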
The age brackets to be included in each network layer and the target mean degrees for each layer are defined by the `layer_info` dictionary. The default dictionary defining the layers is the following, which is based on degree data from this study:
layer_info = { '0-9': {'ageBrackets': ['0-9'], 'meanDegree': 8.6, 'meanDegree_CI': (0.0, 17.7) },
'10-19': {'ageBrackets': ['10-19'], 'meanDegree': 16.2, 'meanDegree_CI': (12.5, 19.8) },
'20-59': {'ageBrackets': ['20-29', '30-39', '40-49', '50-59'], 'meanDegree': ((age_distn_given20to60['20-29']+age_distn_given20to60['30-39'])*15.3 + (age_distn_given20to60['40-49']+age_distn_given20to60['50-59'])*13.8), 'meanDegree_CI': ( ((age_distn_given20to60['20-29']+age_distn_given20to60['30-39'])*12.6 + (age_distn_given20to60['40-49']+age_distn_given20to60['50-59'])*11.0), ((age_distn_given20to60['20-29']+age_distn_given20to60['30-39'])*17.9 + (age_distn_given20to60['40-49']+age_distn_given20to60['50-59'])*16.6) ) },
'60+': {'ageBrackets': ['60-69', '70-79', '80+'], 'meanDegree': 13.9, 'meanDegree_CI': (7.3, 20.5) } }
The user can provide their own dictionary with the same structure to override the default layer definitions above.
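The default `'20-59'` layer above refers to `age_distn_given20to60`, which is not defined on this page. A plausible reading, assumed here, is that it is the age distribution renormalized over the four brackets between 20 and 59; the sketch below computes it that way from the US figures shown earlier and evaluates the resulting target mean degree.

```python
# Age distribution entries for the 20-59 brackets (from the household_data example).
age_distn = {'20-29': 0.137, '30-39': 0.133, '40-49': 0.124, '50-59': 0.131}
brackets_20to59 = ['20-29', '30-39', '40-49', '50-59']

# ASSUMPTION: age_distn_given20to60 is the distribution conditioned on age 20-59.
total = sum(age_distn[b] for b in brackets_20to59)
age_distn_given20to60 = {b: age_distn[b] / total for b in brackets_20to59}


def weighted_degree(degree_20to39, degree_40to59):
    """Population-weighted blend of the 20-39 and 40-59 degree figures."""
    share_20to39 = age_distn_given20to60['20-29'] + age_distn_given20to60['30-39']
    share_40to59 = age_distn_given20to60['40-49'] + age_distn_given20to60['50-59']
    return share_20to39 * degree_20to39 + share_40to59 * degree_40to59


mean_degree_20to59 = weighted_degree(15.3, 13.8)
mean_degree_ci = (weighted_degree(12.6, 11.0), weighted_degree(17.9, 16.6))
print(round(mean_degree_20to59, 2), tuple(round(x, 2) for x in mean_degree_ci))
```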
This function can optionally return a version of the generated network where social distancing has been applied by using the edge pruning mechanism of the `custom_exponential_graph()` function (also included in this package). The user provides a list of distancing magnitude values to the `distancing_scales` argument of the `generate_demographic_contact_network()` function, which are passed to the `scale` argument of the `custom_exponential_graph()` function; the smaller the scale value, the more edge pruning and thus distancing is applied. A version of the generated network is returned for every distancing scale in the list provided to `distancing_scales` (in addition to the baseline network).
This function can optionally return a version of the generated network where some age groups have been isolated by having their out-of-household connections removed (within-household connections remain). The user provides a list of age group labels (`'0-9'`, `'10-19'`, `'20-29'`, etc.) to the `isolation_groups` argument of the `generate_demographic_contact_network()` function. If one or more age groups are provided to `isolation_groups`, a version of the generated network is returned where all specified age groups have had their out-of-household edges removed (in addition to the baseline network).
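The isolation rule described above can be sketched on a plain edge list. This is an illustrative reimplementation under stated assumptions, not the package's code: the function name is hypothetical, nodes are assumed to be integer IDs indexing into the age label list, and households are given as lists of node IDs.

```python
def isolate_age_groups(edges, age_labels, households, isolation_groups):
    """Drop out-of-household edges that touch any node in an isolated age group."""
    household_of = {node: idx
                    for idx, members in enumerate(households)
                    for node in members}
    isolated = {node for node, label in enumerate(age_labels)
                if label in isolation_groups}
    kept = []
    for u, v in edges:
        same_household = (u in household_of
                          and household_of[u] == household_of.get(v))
        touches_isolated = u in isolated or v in isolated
        # Within-household edges always remain; other edges remain only
        # if neither endpoint belongs to an isolated age group.
        if same_household or not touches_isolated:
            kept.append((u, v))
    return kept


age_labels = ['0-9', '30-39', '70-79', '30-39']
households = [[0, 1], [2, 3]]
edges = [(0, 1), (1, 2), (1, 3), (2, 3)]
print(isolate_age_groups(edges, age_labels, households, ['70-79']))
```

In this example node 2 (age 70-79) loses its contact with node 1 but keeps its housemate, node 3.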
The function that performs this network generation has the following arguments:
Argument | Description | Data Type | Default Value |
---|---|---|---|
`N` | total number of nodes in the population | int | REQUIRED |
`demographic_data` | dictionary specifying age and household composition distributions (see Demographic calibration for more info) | dict | REQUIRED |
`layer_generator` | the algorithm to use in generating each network layer (`'FARZ'` or `'LFR'`) | string | `'FARZ'` |
`layer_info` | dictionary specifying the age groups and mean degree targets for each network layer (see Layer definitions for more info) | dict | None (use default layers) |
`distancing_scales` | list of social distancing scales for which versions of the network should be returned (see Social distancing for more info) | list | [] |
`isolation_groups` | list of age groups for which a version of the network should be returned with their out-of-household edges removed (see Age group isolation for more info) | list | [] |
This function returns the following:
Returned | Description | Data Type |
---|---|---|
`graphs` | list of graphs (networks) generated; always includes the baseline network, and includes distancing and/or age group isolation versions as applicable | list of networkx Graph objects |
`individualAgeBracketLabels` | list of the age groups assigned to each of the N nodes | list of strings |
`households` | list of lists giving the node IDs for each household | list of lists of ints |
We define a function for generating contact networks that resemble workplaces and other multi-level modular groups.
FARZ network layers are generated to represent cohorts of employees (e.g., departments, floors, shifts). FARZ networks have a tunable community structure, so each cohort includes some number of communities, which can be thought to represent teams (i.e., groups of employees that work closely with each other). Employees may belong to more than one team (specified by a FARZ parameter), but employees belong to only one cohort. An employee's intra-team and intra-cohort contacts are defined by the FARZ cohort network they belong to. A specified percentage of each employee's total number of workplace contacts can be with individuals from other cohorts. An employee's inter-cohort contacts are drawn randomly from the pool of individuals outside their own cohort.
The number of cohorts, number of employees per cohort, number of teams per cohort, number of teams employees belong to, mean intra-cohort degree, percent of within- and between-team connections, and percent of intra- and inter-cohort connections can be controlled with the arguments to the `generate_workplace_contact_network()` function (some of which are passed as parameters to the FARZ generator).
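The inter-cohort step can be sketched as follows. The arithmetic assumes that intra-cohort contacts make up the remaining (1 - pct) share of an employee's total degree, so the target number of inter-cohort contacts is d * pct / (1 - pct); both function names are hypothetical and this is not the package's implementation.

```python
import random


def intercohort_contact_count(intracohort_degree, pct_contacts_intercohort):
    """Target number of inter-cohort contacts implied by the intra-cohort degree."""
    if not 0.0 <= pct_contacts_intercohort < 1.0:
        raise ValueError("pct_contacts_intercohort must be in [0, 1)")
    return (intracohort_degree * pct_contacts_intercohort
            / (1.0 - pct_contacts_intercohort))


def draw_intercohort_contacts(node, cohorts, num_contacts, rng):
    """Randomly pick contacts for `node` from everyone outside its own cohort."""
    own_cohort = set(next(members for members in cohorts if node in members))
    pool = [other for members in cohorts for other in members
            if other not in own_cohort]
    return rng.sample(pool, min(num_contacts, len(pool)))


# With the default arguments (mean intra-cohort degree 6, 20% inter-cohort),
# an average employee would have about 1.5 inter-cohort contacts.
cohorts = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
target = intercohort_contact_count(6, 0.2)
picks = draw_intercohort_contacts(1, cohorts, 2, random.Random(0))
```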
This function can optionally return a version of the generated network where social distancing has been applied by using the edge pruning mechanism of the `custom_exponential_graph()` function (also included in this package). The user provides a list of distancing magnitude values to the `distancing_scales` argument of the `generate_workplace_contact_network()` function, which are passed to the `scale` argument of the `custom_exponential_graph()` function; the smaller the scale value, the more edge pruning and thus distancing is applied. A version of the generated network is returned for every distancing scale in the list provided to `distancing_scales` (in addition to the baseline network).
The function that performs this network generation has the following arguments:
Argument | Description | Data Type | Default Value |
---|---|---|---|
`num_cohorts` | number of cohort layers to generate | int | 1 |
`num_nodes_per_cohort` | number of nodes per cohort, i.e., number of nodes per FARZ layer, FARZ param n (can provide a single value to use in all cohorts or a list of values for each cohort) | int or list | 100 |
`num_teams_per_cohort` | number of teams per cohort, i.e., number of communities per FARZ layer, FARZ param k (can provide a single value to use in all cohorts or a list of values for each cohort) | int or list | 10 |
`mean_intracohort_degree` | mean number of within-cohort contacts per individual, i.e., mean degree per FARZ layer, FARZ param m (can provide a single value to use in all cohorts or a list of values for each cohort) | int or list | 6 |
`pct_contacts_intercohort` | percentage of each employee's total workplace contacts (total degree) that are inter-cohort interactions | float | 0.2 |
`farz_params` | dictionary specifying parameters for the FARZ network generator, other than params n, k, and m, which are given by the arguments above | dict | {'alpha':5.0, 'gamma':5.0, 'beta':0.5, 'r':1, 'q':0.0, 'phi':1, 'b':0, 'epsilon':1e-6} |
`distancing_scales` | list of social distancing scales for which versions of the network should be returned (see Social distancing for more info) | list | [] |
This function returns the following:
Returned | Description | Data Type |
---|---|---|
`workplaceNetwork` | dictionary of graphs (networks) generated; always includes the baseline network, and includes distancing versions as applicable | dict of networkx Graph objects |
`cohorts_indices` | list of lists giving the node IDs belonging to each cohort | list of lists of ints |
`teams_indices` | list of lists giving the node IDs belonging to each team | list of lists of ints |
This function defines an edge pruning mechanism that returns a modified version of a graph where a subset of the original edges have been removed.
Here is the process:
- For each node:
  - Count the number of neighbors of the node, N.
  - Draw a random number R from an exponential distribution with mean M. If R > N, set R = N.
  - Randomly select R of this node’s neighbors to keep, and delete the edges to all other neighbors.
This results in a network whose set of edges is a subset of the original network's edges and where the mean degree has been decreased.
This method is useful for generating quarantine or social distancing versions of a baseline contact network.
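The pruning process described above can be sketched in pure Python on a graph stored as a dict of neighbor sets. This is an illustrative reimplementation, not the package's `custom_exponential_graph()` code: the function name is hypothetical, R is truncated to an integer, and nodes are processed in dict order against the partially pruned graph, which are all assumptions of this sketch.

```python
import random


def prune_edges(adjacency, scale, min_num_edges=0, seed=None):
    """Return a pruned copy of an undirected graph ({node: set of neighbors})."""
    rng = random.Random(seed)
    graph = {node: set(nbrs) for node, nbrs in adjacency.items()}  # leave input intact
    for node in graph:
        neighbors = sorted(graph[node])
        n = len(neighbors)
        # Draw R from an exponential distribution with mean `scale`, then
        # clamp to [min_num_edges, N]. Note that min_num_edges only bounds
        # what a node keeps on its own turn; later nodes may still drop
        # edges to it.
        r = int(rng.expovariate(1.0 / scale))
        r = min(max(r, min_num_edges), n)
        keep = set(rng.sample(neighbors, r))
        for nbr in neighbors:
            if nbr not in keep:
                graph[node].discard(nbr)
                graph[nbr].discard(node)  # keep the graph undirected
    return graph


# Complete graph on 6 nodes, pruned with a small scale (strong distancing).
base = {u: {v for v in range(6) if v != u} for u in range(6)}
pruned = prune_edges(base, scale=2, seed=1)
```

A smaller `scale` makes small values of R more likely, so more edges are removed, which matches the description of the `scale` argument in the table that follows.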
The function that performs this network modification has the following arguments:
Argument | Description | Data Type | Default Value |
---|---|---|---|
`base_graph` | the base graph that is to be modified (if None is provided, this function will generate a BA network as a starting point using the optional m and n arguments) | networkx Graph object | None |
`scale` | the scale (mean) of the exponential distribution used in the edge pruning method (denoted M above); the smaller the scale, the more edges are pruned | int | 100 |
`min_num_edges` | a minimum number of edges to ensure all nodes are left with after pruning | int | 0 |
`m` | the m argument of the networkx BA network generator (only relevant if no base_graph is provided) | int | 9 |
`n` | the size (number of nodes) of the networkx BA network to be generated (only relevant if no base_graph is provided) | int | None |