My first CTMC algorithm

My first attempt at fitting a Continuous Time Markov Chain (CTMC) to simulated data in a toy problem. Mathematical background on how to build the likelihood can be found here.

For all the models in this notebook, I will simulate data with the Gillespie direct algorithm (see Gillespie 1977, or here for a simpler explanation), and then infer the parameters of a CTMC model from those data using a Bayesian approach and HMC via the software package Stan.
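
In essence, the direct algorithm repeats two random draws: the waiting time until the next event, which is exponential with rate equal to the sum of all current event rates, and the identity of the event, chosen with probability proportional to its rate. A minimal, generic sketch of a single step is below (the function name gillespie_step and its signature are my own; the classes later in the notebook inline the same two draws):

import numpy as np

def gillespie_step(rates, rng=None):
    """Draw the waiting time to the next event and which event fires.

    rates: sequence of current event rates, one per possible transition.
    Returns (dt, event_index).
    """
    rng = rng if rng is not None else np.random.default_rng()
    rates = np.asarray(rates, dtype=float)
    total = rates.sum()
    dt = rng.exponential(1/total)                  #holding time ~ Exp(total rate)
    event = rng.choice(len(rates), p=rates/total)  #event chosen proportionally to its rate
    return dt, event

For example, gillespie_step([2.0, 0.0, 0.0]) would return a waiting time drawn from Exp(2) together with event index 0.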

First toy problem

In this toy problem, we begin with a simple 3-state model, following a single individual who can be susceptible (S), depressed (D), or happy (H).

The transitions will be defined as follows:

\begin{align} S \to D & &\text{with rate} & &\beta,\\ D \to H & &\text{with rate} & &\gamma,\\ H \to S & &\text{with rate} & &\omega.\\ \end{align}

To generate the data, I will first define a class for the model, named ctmc, with attributes for the initial state and the parameters of the CTMC. Within the class, I will also define the methods that simulate a single transition of the chain, and that run the simulation up to a given time.

In [1]:
import numpy as np

class ctmc:    
    """Class object to contain the ctmc model"""
    
    def __init__(self, beta, gamma, omega, init):
        self.parameters = (beta, gamma, omega)
        self.initial = init
        self.current = list(init)
        self.totalpop = sum(init)
    
    def transition(self):
        #First define transition movements
        
        def depression():
            #becoming depressed
            self.current[0] -= 1
            self.current[1] += 1

        def recovery():
            #Recovery to happiness 
            self.current[1] -=1
            self.current[2] +=1
            
        def waning():
            #back to susceptible
            self.current[2] -=1
            self.current[0] +=1

        #map the functions to the respective index in the transition
        map_transitions = {
            0 : depression,
            1 : recovery,
            2 : waning,
        }

        #Calculate transition rates from the model's own parameters
        beta, gamma, omega = self.parameters
        trans = beta * self.current[0]
        recov = gamma * self.current[1]
        wan  = omega * self.current[2]
        total = trans + recov + wan
        
        #Record current rates
        self.currentrates = (trans, recov, wan)
        
        #Draw a transition time via inverse transform sampling
        #(equivalently np.random.exponential(1/total); numpy uses the scale parameter, the inverse of the rate)
        dt = -np.log(np.random.random())/total
        
        #Draw a transition
        p = np.array(self.currentrates)/total
        s = np.random.choice(3, p=p)
        map_transitions[s]() #map choice to correct transition function
        #return time to transition and what kind of transition occurred
        return dt, s
    
    def simulate(self,simtime):
        time = 0
        observations = []
        while time < simtime:
            S,D,H = self.current.copy()
            dt, s = self.transition()

            time += dt
            if time > simtime:
                #transition falls beyond the simulation window; stop without recording it
                break
            else:
                #record ctmc state and clock up to the moment of transition
                observations.append((time,dt,s,S,D,H))
        return observations

Now that we have the object, let's simulate some data that we will draw inference from.

In [2]:
#Simulate a path in the CTMC
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd


init = np.array((1,0,0))
beta = 2
gamma = 0.125
omega = 0.3
nu = 1

toy = ctmc(beta, gamma, omega,init)
obs = toy.simulate(200)

data = pd.DataFrame(obs, columns = ['time','dt','transition','S','D','H'])
data.time = data.time.shift(1).fillna(0)
print(data.head())
print("We have " + format(data.shape[0]) +" transitions (data points) in this simulation.")
fig,ax = plt.subplots(figsize=(12,9))
ax.bar(data.time, data.S, width=data.dt, align='edge', label='S', color='b')
ax.bar(data.time, data.D, width=data.dt, align='edge', label='D', color = 'r')
ax.set_xlabel("Days")
ax.set_ylabel("State active")
plt.legend()
plt.show()
       time        dt  transition  S  D  H
0  0.000000  0.438339           0  1  0  0
1  0.438339  2.030657           1  0  1  0
2  2.468996  5.341321           2  0  0  1
3  7.810318  0.022213           0  1  0  0
4  7.832530  5.123805           1  0  1  0
We have 50 transitions (data points) in this simulation.

The above plot visualises this specific instance of the CTMC, with a solid bar for each duration spent in each state; the white gaps represent time spent in the H state. Remember that the CTMC is a random process, defined only by its states and transition rates, and this is just one possible realisation or path.

With this data, we will now take only the observed holding times and transitions to infer the parameters that were used to generate the data. In the next block of code, I construct the Bayesian model in a Stan object and, using the data dataframe, draw samples from the posterior for the parameters $\beta, \gamma$ and $\omega$. We can then compare the distribution of the fitted parameters to our original values.
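
Note that in this first model each state has exactly one way out, so the likelihood factorises into independent exponential terms: a sojourn of length $dt$ in S contributes $\beta e^{-\beta\, dt}$, a sojourn in D contributes $\gamma e^{-\gamma\, dt}$, and a sojourn in H contributes $\omega e^{-\omega\, dt}$. This is why the Stan model below can use a plain exponential sampling statement for each observed holding time.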

In [3]:
import pystan

stancode = """
data {
    int<lower=0> N; //length of data
    real<lower=0> dt[N]; //time
    int<lower=0> transitions[N]; //transition code
}
parameters {
    real<lower=0> bet;
    real<lower=0> gamm;
    real<lower=0> omeg;
}
model {
    bet ~ gamma(1.5,1.5); //priors
    gamm ~ gamma(1.5,1.5);
    omeg ~ gamma(1.5,1.5);
    
    for (n in 1:N){
        if (transitions[n] == 0) {
            //Susceptible to Depressed
            dt[n] ~ exponential(bet);
            }
        else if (transitions[n] == 1) {    
            //Depressed to Happy
            dt[n] ~ exponential(gamm);
            }
        else if (transitions[n] == 2) {
            //Happy to Susceptible
            dt[n] ~ exponential(omeg);
        }
    } 
}
"""

data_dict = {
    'N': data.shape[0],
    'dt': data['dt'].values,
    'transitions': data.transition.astype(int).values
}

sm = pystan.StanModel(model_code=stancode)
iterations = 1000
chains = 2

fit = sm.sampling(data=data_dict, iter=iterations, chains=chains)
print(fit)
print(toy.parameters)
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_2a7a4c6f792505cb2e73917e7953371c NOW.
Inference for Stan model: anon_model_2a7a4c6f792505cb2e73917e7953371c.
2 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=1000.

       mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
bet    2.65    0.02   0.62   1.54   2.22   2.61   3.05   4.01    983    1.0
gamm   0.15  1.3e-3   0.03   0.09   0.12   0.14   0.17   0.22    674    1.0
omeg   0.24  1.6e-3   0.05   0.14   0.21   0.24   0.28   0.36   1126    1.0
lp__  -97.9    0.06   1.23 -100.9 -98.47 -97.55 -97.02 -96.56    483   1.01

Samples were drawn using NUTS at Tue Feb  4 14:55:34 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
(2, 0.125, 0.3)
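
The posterior means (2.65, 0.15, 0.24) are in the right neighbourhood of the true values (2, 0.125, 0.3), and all three true values fall inside the 95% credible intervals, but with only 50 observed transitions those intervals are still fairly wide.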

Simulate a (harder) toy problem

Now let us extend our simple 3-state CTMC model, before moving on to epidemic-specific models.

We will again model a single individual suffering from depression; however, the individual can now also take precautionary measures (i.e. seek help) and move directly from susceptible to happy with rate $\nu$.

The transitions will be defined as follows:

\begin{align} S \to D & &\text{with rate} & &\beta,\\ D \to H & &\text{with rate} & &\gamma,\\ H \to S & &\text{with rate} & &\omega,\\ S \to H & &\text{with rate} & &\nu.\\ \end{align}

This time the CTMC is more complex: when in state S, the chain can jump to either state D or state H. This has to be accounted for in the likelihood: the holding time in S follows an exponential clock whose rate is the sum of all rates out of S, and its density is multiplied by the probability that the transition ends up specifically at D or at H.
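
Concretely, for a sojourn in S of length $dt$ that ends with a jump to D, the likelihood contribution is the exponential density at the total exit rate $\beta + \nu$, multiplied by the probability that the jump goes to D rather than H:

\begin{align} f_{S \to D}(dt) & = \frac{\beta}{\beta + \nu} \, (\beta + \nu)\, e^{-(\beta + \nu)\, dt} = \beta\, e^{-(\beta + \nu)\, dt}, \end{align}

and analogously $f_{S \to H}(dt) = \nu\, e^{-(\beta + \nu)\, dt}$. Transitions out of D and H each still have a single exponential clock.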

For simplicity, we can define the set of parameters to be $\boldsymbol{\theta} = (\beta, \gamma, \omega, \nu)$. For our toy problem, let us set $\boldsymbol{\theta} = ( 2, 0.125, 0.3, 1)$, with an initial state of S, i.e. $\boldsymbol{n} = (1,0,0)$. Note that our units of time are days.

In [4]:
import numpy as np

class ctmc:    
    """Class object to contain the ctmc model"""
    
    def __init__(self, beta, gamma, omega, nu, init):
        self.parameters = (beta, gamma, omega, nu)
        assert len(init)==3, "Number of states given is not 3"
        self.initial = init
        self.current = list(init)
        self.totalpop = sum(init)
    
    def transition(self):
        #First define transition movements
        
        def depression():
            #becoming depressed
            self.current[0] -= 1
            self.current[1] += 1

        def recovery():
            #Recovery to happiness 
            self.current[1] -=1
            self.current[2] +=1
        def waning():
            #back to susceptible
            self.current[2] -=1
            self.current[0] +=1
        def vacc():
            #susceptible to happy
            self.current[0] -=1
            self.current[2] +=1

        #map the functions to the respective index in the transition
        map_transitions = {
            0 : depression,
            1 : recovery,
            2 : waning,
            3 : vacc,
        }

        #Calculate transition rates from the model's own parameters
        beta, gamma, omega, nu = self.parameters
        trans = beta * self.current[0]
        recov = gamma * self.current[1]
        wane = omega * self.current[2]
        vac = nu * self.current[0]
        total = trans + recov + wane + vac
        
        #Record current rates
        self.currentrates = (trans, recov, wane, vac)
        
        #Draw a transition time via inverse transform sampling
        #(equivalently np.random.exponential(1/total); numpy uses the scale parameter, the inverse of the rate)
        dt = -np.log(np.random.random())/total
        
        #Draw a transition
        p = np.array(self.currentrates)/total
        s = np.random.choice(4, p=p)
        map_transitions[s]() #map choice to correct transition function
        #return time to transition and what kind of transition occurred
        return dt, s
    
    def simulate(self,simtime):
        time = 0
        observations = []
        while time < simtime:
            S,D,H = self.current.copy()
            dt, s = self.transition()

            time += dt
            if time > simtime:
                #transition falls beyond the simulation window; stop without recording it
                break
            else:
                #record ctmc state and clock up to the moment of transition
                observations.append((time,dt,s,S,D,H))
        return observations

Now that we have our functions and simulation coded, let's run a simulation and observe a possible path for this kind of CTMC. Let's plot the path of the state space over 5,000 days.

In [5]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd


init = np.array((1,0,0))
beta = 2
gamma = 0.125
omega = 0.3
nu = 1

toy = ctmc(beta, gamma, omega,nu, init)
obs = toy.simulate(5000)

data = pd.DataFrame(obs, columns = ['time','dt','transition','S','D','H'])
data.time = data.time.shift(1).fillna(0)
#print(data)
print(data.shape)
fig,ax = plt.subplots(figsize=(12,9))
ax.bar(data.time, data.S, width=data.dt, align='edge', label='S', color='b')
ax.bar(data.time, data.D, width=data.dt, align='edge', label='D', color = 'r')
ax.bar(data.time, data.H, width=data.dt, align='edge', label='H', color='g')
ax.set_xlabel("Days")
ax.set_ylabel("State active")
plt.legend()
plt.show()
(1420, 6)
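
There is no longer a single exit rate from state S, so the Stan program below defines a custom ctmc_lpdf: the log of the rate of the observed jump, minus the log of the total exit rate out of the current state, plus the exponential log-density of the holding time at that total rate. This is exactly the log of the likelihood contribution derived above; transitions with a single exit (D to H, H to S) can still use plain exponential statements.
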
In [6]:
import pystan

stancode = """
functions {
    real ctmc_lpdf( real dt, real bet, real total) {
        return log(bet) - log(total) + exponential_lpdf( dt | total);
    } 
}
data {
    int<lower=0> N; //length of data
    real<lower=0> dt[N]; //time
    int<lower=0> transitions[N]; //transition code
}
parameters {
    real<lower=0> bet;
    real<lower=0> gamm;
    real<lower=0> omega;
    real<lower=0> nu;
}

model {
    bet ~ gamma(1.5,1.5);
    gamm ~ gamma(1.5,1.5);
    omega ~ gamma(1.5,1.5);
    nu ~ gamma(1.5,1.5);
    
    for (n in 1:N){
        if (transitions[n] == 0) {
            //Susceptible to Depressed
            dt[n] ~ ctmc(bet, bet+nu);
            }
        else if (transitions[n] == 1) {    
            //Depressed to Happy
            dt[n] ~ exponential(gamm);
            }
        else if (transitions[n] == 2) {
            //Happy to Susceptible
            dt[n] ~ exponential(omega);
        }
        else if (transitions[n] == 3) {
            //Susceptible to Happy
            dt[n] ~ ctmc(nu, nu+bet);
        }
    } 
}
"""

data_dict = {
    'N': data.shape[0],
    'dt': data['dt'].values,
    'transitions': data.transition.astype(int).values
}

sm = pystan.StanModel(model_code=stancode)
iterations = 1000
chains = 2

fit = sm.sampling(data=data_dict, iter=iterations, chains=chains)
print(fit)
print(toy.parameters)
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_1fc4828fe1988abf5de967e02af143f1 NOW.
Inference for Stan model: anon_model_1fc4828fe1988abf5de967e02af143f1.
2 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=1000.

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
bet     2.18  3.5e-3   0.11   1.96    2.1   2.17   2.25    2.4   1077    1.0
gamm    0.12  2.2e-4 6.6e-3   0.11   0.12   0.12   0.13   0.14    935    1.0
omega   0.28  3.6e-4   0.01   0.26   0.27   0.28   0.29    0.3   1100    1.0
nu      0.97  2.7e-3   0.08   0.83   0.92   0.97   1.02   1.13    842    1.0
lp__   -2583    0.06    1.5  -2587  -2584  -2583  -2582  -2581    646    1.0

Samples were drawn using NUTS at Tue Feb  4 14:56:25 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
(2, 0.125, 0.3, 1)
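
With 1,420 observed transitions, the posterior is much tighter than in the first toy problem, and the posterior means (2.18, 0.12, 0.28, 0.97) sit close to the generating values $\boldsymbol{\theta} = (2, 0.125, 0.3, 1)$.
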
In [7]:
fit.plot(pars=['bet'])
fit.plot(pars=['gamm'])
fit.plot(pars=['omega'])
fit.plot(pars=['nu'])
plt.show()
WARNING:pystan:Deprecation warning. In future, use ArviZ library (`pip install arviz`)
WARNING:pystan:Deprecation warning. In future, use ArviZ library (`pip install arviz`)
WARNING:pystan:Deprecation warning. In future, use ArviZ library (`pip install arviz`)
WARNING:pystan:Deprecation warning. In future, use ArviZ library (`pip install arviz`)

Maximum Likelihood Estimate

We know from our mathematical source (equation 13) that the Maximum Likelihood Estimates for the rates $r_{i,j}$ are:

\begin{align} \hat{r}_{i,j} & = \frac{N_{i,j}}{\sum dt_{i}},\\ \end{align}

where $N_{i,j}$ is the number of transitions from $i$ to $j$ in the data, and $\sum dt_i$ is the total length of time spent in state $i$.
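
In our model, for example, $\hat{\beta}$ is the number of S $\to$ D transitions divided by the total time spent in S, where the denominator counts every sojourn in S, including those that ended with a jump to H instead of D.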

In [8]:
##MLE: number of transitions of each type, divided by the total time spent in the originating state
bet_hat = data.loc[data.transition==0].S.sum() / data.loc[data.S==1].dt.sum()
gamm_hat = data.loc[data.transition==1].D.sum() / data.loc[data.D==1].dt.sum()
omeg_hat = data.loc[data.transition==2].H.sum() / data.loc[data.H==1].dt.sum()
nu_hat = data.loc[data.transition==3].S.sum() / data.loc[data.S==1].dt.sum()
print(bet_hat, gamm_hat, omeg_hat, nu_hat)
print(toy.parameters)
2.1812218168173647 0.12386175615806391 0.28014523557057047 0.9740798798389876
(2, 0.125, 0.3, 1)
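
The maximum likelihood estimates land essentially on top of the posterior means from the Bayesian fit (for instance 2.18 for $\beta$ and 0.97 for $\nu$ in both cases), which is what we would expect given the relatively flat priors and the large number of observations.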

Simulate epidemic problem

First we will simulate some data. We will use a simple 3-state model, a typical SIRS epidemic model with waning immunity, on a closed population of 100 people. This is similar to our Depression-Happiness model earlier, but now with 100 people, and with a transmission rate that depends on the current numbers of S and I.

The transitions will be defined as follows:

\begin{align} S \to I & &\text{with rate} & &\beta \frac{SI}{N},\\ I \to R & &\text{with rate} & &\gamma I,\\ R \to S & &\text{with rate} & &\omega R.\\ \end{align}

For simplicity, we can define the set of parameters to be $\boldsymbol{\theta} = (\beta, \gamma, \omega)$. For our toy problem, let us set $\boldsymbol{\theta} = ( 2, 0.125, 1/20)$, with an initial state of 95 susceptible and 5 infected individuals, i.e. $\boldsymbol{n} = (95, 5, 0)$. Note that our units of time are days.

In [9]:
import numpy as np

class sirvs:
    """Class object to contain the ctmc model"""
    
    def __init__(self, beta, gamma, omega, init):
        self.parameters = (beta, gamma, omega)
        assert len(init)==3, "Number of states given is not 3"
        self.initial = init
        self.current = list(init)
        self.totalpop = sum(init)
    
    def transition(self):
        #First define transition movements
        #0 = S, 1 = I, 2 = R, 3 = V 
        def infection():
            #Infected transmission
            self.current[0] -= 1
            self.current[1] += 1

        def recovery():
            #Recovery 
            self.current[1] -=1
            self.current[2] +=1
        def waning():
            #waning immunity
            self.current[2] -=1
            self.current[0] +=1
        def vacc():
            #vaccination (defined for later models but never selected here,
            #since the vaccination rate is commented out below)
            self.current[0] -=1
            self.current[3] +=1

        #map the functions to the respective index in the transition
        map_transitions = {
            0 : infection,
            1 : recovery,
            2 : waning,
            3 : vacc,
        }

        #Calculate transition rates from the model's own parameters
        beta, gamma, omega = self.parameters
        trans = beta * self.current[0]*self.current[1]/self.totalpop
        recov = gamma * self.current[1]
        wane = omega * self.current[2]
        #vac = nu * self.current[1] * self.current[0]/self.totalpop  #vaccination disabled in this model
        total = trans + recov + wane #+ vac
        
        #Record current rates
        self.currentrates = (trans, recov, wane)
        
        #Draw a transition time via inverse transform sampling
        #(equivalently np.random.exponential(1/total); numpy uses the scale parameter, the inverse of the rate)
        dt = -np.log(np.random.random())/total
        
        #Draw a transition
        p = np.array(self.currentrates)/total
        s = np.random.choice(3, p=p)
        map_transitions[s]() #map choice to correct transition function
        #return time to transition and what kind of transition occurred
        return dt, s
    
    def simulate(self,simtime):
        time = 0
        observations = []
        while time < simtime:
            S,I,R = self.current.copy()
            dt, s = self.transition()

            time += dt
            if time > simtime:
                #transition falls beyond the simulation window; stop without recording it
                break
            else:
                #record ctmc state and clock up to the moment of transition
                observations.append((time,dt,s,S,I,R))
        return observations

Now that we have our functions and simulation coded, let's run a simulation and observe a possible path for this kind of CTMC, or epidemic. Let's see how the state space looks after 100 days.

In [10]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd


init = np.array((95,5,0))
beta = 2
gamma = 0.125
omega = 1/20

toy = sirvs(beta, gamma, omega, init)
obs = toy.simulate(100)

data2 = pd.DataFrame(obs, columns = ['time','dt','transition','S','I','R'])
print(data2.tail())
print(data2.shape)
fig,ax = plt.subplots(figsize=(12,9))
ax.plot(data2.time, data2.S, label='S')
ax.plot(data2.time, data2.I, label='I', color = 'r')
ax.plot(data2.time, data2.R, label='R', color='g')
ax.set_xlabel("Days")
ax.set_ylabel("Number of people")
plt.legend()
plt.show()
           time        dt  transition  S   I   R
1106  99.782295  0.238649           1  6  37  57
1107  99.789082  0.006787           1  6  36  58
1108  99.791667  0.002585           1  6  35  59
1109  99.933892  0.142226           0  6  34  60
1110  99.990288  0.056396           1  5  35  60
(1111, 6)

Now that we have our data, let's see how we would do inference. As a reminder, the parameters we want to be inferring are $\boldsymbol{\theta} = (\beta, \gamma, \omega)$.
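
Unlike the single-individual models, the rates now depend on the population counts, so the total exit rate $\beta S I / N + \gamma I + \omega R$ has to be recomputed from the recorded (S, I, R) at every observation. The same custom ctmc_lpdf then gives each transition type its jump probability times the exponential density of the holding time.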

In [11]:
import pystan

stancode = """
functions {
    real ctmc_lpdf( real dt, real bet, real total) {
        return log(bet) - log(total) + exponential_lpdf( dt | total);
    }   
}
data {
    int<lower=0> N; //length of data
    int<lower=1> Pop; //total size of population
    real<lower=0> dt[N]; //time
    int<lower=0> Sus[N];
    int<lower=0> I[N];
    int<lower=0> R[N];
    int<lower=0> transitions[N]; //transition code
}
parameters {
    real<lower=0> bet;
    real<lower=0> gamm;
    real<lower=0> omega;
}
model {
    int state;
    real total;
    bet ~ gamma(1,1.5);
    gamm ~ gamma(1,1.5);
    omega ~ gamma(1,1.5);
    
    for (n in 1:N){
        state = transitions[n];
        total = bet*Sus[n]*I[n] / Pop + gamm*I[n] + omega*R[n];
        if (state == 0) {
            //Infection
            dt[n] ~ ctmc(
            bet*Sus[n]*I[n] / Pop, 
            total
            );
            }
        else if (state == 1) {    
            //Recovery
            dt[n] ~ ctmc(
                gamm * I[n],
                total
            );
            }
        else if (state == 2) {
            //Waning
            dt[n] ~ ctmc(
                omega * R[n],
                total
            );
            }
    } 
}
"""

data_dict = {
    'N': data2.shape[0],
    'Pop': sum(init),
    'dt': data2['dt'].values,
    'Sus': data2.S.values.astype(int),
    'I': data2.I.values.astype(int),
    'R': data2.R.values.astype(int),
    'transitions': data2.transition.astype(int)
}

sm = pystan.StanModel(model_code=stancode)
iterations = 1000
chains = 2

fit = sm.sampling(data=data_dict, iter=iterations, chains=chains)
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_3ffd571e602e697a4bdd265537910246 NOW.
In [12]:
print(fit)
fit.plot(pars=['bet','gamm'])

plt.show()
WARNING:pystan:Deprecation warning. In future, use ArviZ library (`pip install arviz`)
Inference for Stan model: anon_model_3ffd571e602e697a4bdd265537910246.
2 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=1000.

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
bet     2.08  3.2e-3   0.11   1.88   2.01   2.08   2.15    2.3   1104    1.0
gamm    0.11  2.1e-4 6.1e-3    0.1   0.11   0.11   0.12   0.13    861    1.0
omega   0.05  8.8e-5 3.0e-3   0.05   0.05   0.05   0.06   0.06   1137    1.0
lp__   539.0    0.06   1.31 535.62 538.39 539.35 539.96 540.48    508    1.0

Samples were drawn using NUTS at Tue Feb  4 14:57:08 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
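
Once again the posterior means (2.08, 0.11, 0.05) recover the generating values $\boldsymbol{\theta} = (2, 0.125, 1/20)$ reasonably well, this time from a single simulated epidemic of 1,111 recorded transitions over 100 days.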