Section 28.2 Logistic fit
If we simplify the SIR model (Section 20.1) by removing the recovery process, the proportion of infected people in the population grows according to the ODE \(y' = c y(1-y)\) with some coefficient \(c\text{.}\) This ODE has an explicit solution
\[ y(t) = \frac{1}{1 + a e^{-kt}} \tag{28.2.1} \]
with \(a, k > 0\) depending on the coefficient \(c\) (which we usually don't know in practice) and the initial condition \(y(0)\) (which we might not know either). But if we can collect some data, we can fit a logistic function (28.2.1) to it. As written, the model depends on the parameters \(a, k\) in a nonlinear way. But applying the “logit” transformation \(\log(y/(1-y))\) we obtain
\[ \log\frac{y}{1-y} = kt - \log a\text{,} \]
which is a linear function of \(t\text{.}\)
Thus, we can fit a linear function to the values \(\log(y_i/(1-y_i))\) computed from the data points \(y_i\text{,}\) and then apply to the fitted line the inverse of the transformation \(z = \log(y/(1-y))\text{,}\) which is \(y = 1 / (1+e^{-z})\text{.}\)
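The linearization can be checked numerically. The following is a minimal sketch, assuming the logistic function has the form \(1/(1+ae^{-kt})\text{,}\) in which case \(k = c\) and \(a = (1-y(0))/y(0)\text{.}\) The values of c and y0 are made up for illustration and do not come from any data.

```matlab
% Illustrative values only (assumptions, not taken from the text)
c = 0.8;      % coefficient in y' = c*y*(1-y)
y0 = 0.05;    % initial proportion y(0)

% Numerical solution of the ODE with tight tolerances
opts = odeset('RelTol', 1e-10, 'AbsTol', 1e-12);
[t, y_num] = ode45(@(t, y) c*y.*(1-y), [0 15], y0, opts);

% Closed form 1/(1 + a*exp(-k*t)) with k = c and a = (1-y0)/y0
k = c;
a = (1 - y0)/y0;
y_exact = 1 ./ (1 + a*exp(-k*t));
err = max(abs(y_num - y_exact));
fprintf('max |ode45 - closed form| = %.2e\n', err);

% The logit of the closed form should be the line k*t - log(a)
z = log(y_exact ./ (1 - y_exact));
p = polyfit(t, z, 1);     % p(1) is the slope, p(2) is the intercept
fprintf('slope = %.6f, k = %.6f\n', p(1), k);
fprintf('intercept = %.6f, -log(a) = %.6f\n', p(2), -log(a));

% The inverse transformation recovers y from z
y_back = 1 ./ (1 + exp(-z));
```

If the assumed form is right, the slope and intercept found by polyfit agree with \(k\) and \(-\log a\) up to rounding, and y_back reproduces y_exact.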
The logistic model can reasonably apply to other processes which are slow initially, speed up, and then slow down again. The following example, where the “population” consists of Covid-positive students, models the process of their recovery.
Example 28.2.1. Fit a logistic function to Covid data.
There were at least 121 reported Covid infections among SU students between October 8 and October 26 of 2020. The following are the cumulative counts of people recovered, day by day, starting with October 9.
R = [1 1 1 5 8 10 27 44 62 93 101 110 112 115 116 117 118 120]';
Fit a logistic function to the proportion of the total “population” of 121 that had recovered by each day.
We need to compute the proportions y = R/121 and transform them by yt = log(y./(1-y)). Then the usual linear regression is applied.
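x = (1:numel(R))';              % day numbers 1, 2, ...
y = R/121;                      % proportion recovered
yt = log(y./(1-y));             % logit transformation
X = x.^(0:1);                   % columns: ones and x
beta = X\yt;                    % least-squares linear fit
f = @(x) 1 ./ (1 + exp(-(x.^(0:1))*beta));   % back to the logistic scale
t = linspace(min(x), max(x), 1000)';
plot(t, f(t), 'b', x, y, 'r*')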
Here the logistic function is written as \(1/(1+\exp(-\beta_1 - \beta_2 x))\) with parameters \(\beta_1, \beta_2\) contained in the vector beta.
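The fitted coefficients can be translated into the parameters of a logistic function written as \(1/(1+ae^{-kx})\text{:}\) comparing exponents gives \(k = \beta_2\) and \(a = e^{-\beta_1}\text{.}\) The sketch below repeats the fit from the example and reports these values together with the day on which the fitted curve reaches \(1/2\) and the largest deviation from the data. The names x_half, y_fit and max_err are introduced here for illustration.

```matlab
R = [1 1 1 5 8 10 27 44 62 93 101 110 112 115 116 117 118 120]';
x = (1:numel(R))';
y = R/121;
beta = (x.^(0:1)) \ log(y./(1-y));   % same least-squares fit as in the example

k = beta(2);                 % growth rate
a = exp(-beta(1));           % 1/(1+exp(-b1-b2*x)) = 1/(1+a*exp(-k*x))
x_half = -beta(1)/beta(2);   % day at which the fitted curve equals 1/2

y_fit = 1 ./ (1 + a*exp(-k*x));      % fitted proportions on the data days
max_err = max(abs(y_fit - y));       % largest deviation in the original scale
fprintf('k = %.4f, a = %.4f\n', k, a);
fprintf('midpoint day = %.2f, max |fit - data| = %.4f\n', x_half, max_err);
```

Measuring the deviation in the original scale, rather than after the logit transformation, shows how well the curve follows the proportions themselves.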
The result in Example 28.2.1 is not entirely satisfactory. The logit transformation over-emphasizes the values near 0 and 1, and de-emphasizes those in between. One way to address this issue is to introduce weights, which we do in the next section.
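One way to see the uneven emphasis is through the derivative of the logit transformation, \(dz/dy = 1/(y(1-y))\text{:}\) a small error \(\delta\) in \(y\) becomes an error of about \(\delta/(y(1-y))\) in \(z\text{.}\) The following sketch tabulates this factor for the data of Example 28.2.1. The name mag is chosen here for illustration.

```matlab
R = [1 1 1 5 8 10 27 44 62 93 101 110 112 115 116 117 118 120]';
y = R/121;
mag = 1 ./ (y.*(1-y));     % derivative of z = log(y/(1-y)) with respect to y
disp([R y mag])            % columns: count, proportion, magnification factor
fprintf('smallest factor %.2f, largest factor %.2f\n', min(mag), max(mag));
```

The factor is smallest, about 4, for the proportion closest to \(1/2\text{,}\) and it is about 122 for the proportions \(1/121\) and \(120/121\) at the two ends of the data, so discrepancies at the ends count far more in the linear regression than those in the middle.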