Modeling the 2019 novel coronavirus (Wuhan pneumonia) (Ⅱ) - epidemic prediction based on the Logistic model

Introduction

The model was completed on February 6. In choosing a model, the author mainly consulted articles by experienced CSDN bloggers and papers on CNKI (HowNet) and, after careful consideration, decided to use the Logistic model and the SEIR latent infection model to analyze and predict the epidemic. Readers are welcome to point out any shortcomings.

Data scraping and analysis

First, the author takes the data from China's daily list, extracts the confirm column (the number of confirmed cases each day) and the heal column (the number of recovered cases), and draws a simple trend chart.

# Imports
import pandas as pd
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit

matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # Font that can render Chinese labels

data = pd.read_csv('china_DailyList_2020_02_03.csv')
print(data.head())
x = np.arange(1, len(data) + 1)  # Day index starting at 1, one entry per row of the table
y = data['confirm']  # Confirmed cases
z = data['heal']     # Recovered cases
plt.plot(x, y, color='green', marker='o', linestyle='solid', label='Number of confirmed cases')
plt.plot(x, z, color='red', marker='o', linestyle='solid', label='Number of recovered cases')
plt.title('Epidemic trend chart')  # Add title
plt.ylabel('Number of people')
plt.xlabel('Day')
plt.legend(loc=0)  # Show the labels defined above
plt.show()

A note on the x-axis: it is a day index counted from the first day of the scraped data, not a calendar date. The author maps indices 1-25 to the dates 2020-01-23 through 2020-02-05.

Connecting the confirmed-case counts into a continuous curve shows the trend:

plt.plot(x,y,'r',label='Number of confirmed cases')
plt.ylabel('Number')
plt.xlabel("date")
plt.legend(loc=0)
plt.show()

Logistic model prediction

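The mathematical derivation of the Logistic model is omitted here; interested readers can look it up themselves.
In short, we fit the existing data to the logistic equation, written here in the form used by the code in this section:

$$P(t) = \frac{K P_0 e^{r(t - t_0)}}{K + P_0 \left(e^{r(t - t_0)} - 1\right)}$$

P(t): population function (here, the number of confirmed cases at time t); K: maximum value (capacity); P_0: initial value at t = t_0; r: growth-rate parameter, which the author calls the growth resistance (for example, the resistance to epidemic growth caused by medical isolation).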

# Logistic model
from datetime import date, timedelta

R = 0.29  # Growth rate, held fixed at the value the author chose
T0 = 1    # Day index of the first data point


def logistic_increase_function(t, K, P0, r=R):
    """Logistic curve with capacity K, initial value P0 at t = T0, growth rate r."""
    exp_value = np.exp(r * (t - T0))
    return (K * exp_value * P0) / (K + (exp_value - 1) * P0)


def fixed_r_model(t, K, P0):
    """Model passed to curve_fit: only K and P0 are free, r stays at R."""
    return logistic_increase_function(t, K, P0, R)


# Day index and number of confirmed cases
P = np.array(data['confirm'], dtype=float)
t = np.arange(1, len(P) + 1)

# Least-squares fit; the starting guess (twice the largest count, first count) is an assumption
popt, pcov = curve_fit(fixed_r_model, t, P, p0=[2 * P.max(), P[0]])
K_fit, P0_fit = popt
print("K: capacity  P0: initial value  r: growth rate (fixed)")
print(K_fit, P0_fit, R)

# Fitted values, near-term forecast, and longer-term forecast
P_predict = logistic_increase_function(t, K_fit, P0_fit)
tomorrow = np.arange(25, 32)
tomorrow_predict = logistic_increase_function(tomorrow, K_fit, P0_fit)
future = np.array([25, 27, 29, 40, 60, 71, 76, 82])
future_predict = logistic_increase_function(future, K_fit, P0_fit)

# Plot the data, the fitted curve, and the near-term forecast
plt.plot(t, P, 's', label='Number of confirmed infections')
plt.plot(t, P_predict, 'r', label='Fitted curve of the number of people infected')
plt.plot(tomorrow, tomorrow_predict, 's', label='Near-term prediction of the number of people infected')
plt.xlabel('Day')
plt.ylabel('Confirmed number')
plt.legend(loc=0)
plt.show()

# Plot the longer-term forecast in its own figure
plt.plot(future, future_predict, 's', label='Future prediction of the number of people infected')
plt.xlabel('Day')
plt.ylabel('Confirmed number')
plt.legend(loc=0)
plt.show()

# Day index 25 corresponds to February 6
for i in range(25):
    day = date(2020, 2, 6) + timedelta(days=i)
    people_sick = int(logistic_increase_function(i + 25, K_fit, P0_fit))
    print("%s estimated number of confirmed cases: %d people" % (day.isoformat(), people_sick))

Selection of resistance r

In the Python fit, different values of the growth resistance r affect both the quality of the curve fit and the resulting forecasts. How to choose r and K is worth studying. For convenience, the author discusses r first.
r = 0.6: [figure]

r = 0.1: [figure]

Fitting result: [figure]

The author sees two ways to optimize r:
1;
2. Binary search (bisection); here the author takes r = 0.29 as the optimized value.
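One way to read the second option is as an interval-shrinking search over r. The sketch below is only an illustration of that idea, not the author's procedure: it uses ternary search, a close relative of bisection, to minimise the sum of squared errors as a function of r. The data are synthetic and noise-free, generated from a logistic curve with assumed values K = 50000, P0 = 40 and r = 0.29, and K and P0 are held at those values so that the error is unimodal in r. All function names are hypothetical.

```python
import math

T0 = 1  # day index of the first data point, as in the article's code


def logistic(t, K, P0, r):
    """Logistic curve with capacity K, initial value P0 at t = T0, growth rate r."""
    e = math.exp(r * (t - T0))
    return K * e * P0 / (K + (e - 1.0) * P0)


def sse(r, days, observed, K, P0):
    """Sum of squared errors between the curve for this r and the observations."""
    return sum((logistic(t, K, P0, r) - p) ** 2 for t, p in zip(days, observed))


def ternary_search(f, lo, hi, tol=1e-7):
    """Minimise a unimodal function f on [lo, hi] by shrinking the interval."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2  # the minimum cannot lie to the right of m2
        else:
            lo = m1  # the minimum cannot lie to the left of m1
    return (lo + hi) / 2.0


# Synthetic data from assumed parameters (not the real case counts)
K_TRUE, P0_TRUE, R_TRUE = 50000.0, 40.0, 0.29
days = list(range(1, 25))
observed = [logistic(t, K_TRUE, P0_TRUE, R_TRUE) for t in days]

# Search between the two values tried in this section, r = 0.1 and r = 0.6
best_r = ternary_search(lambda r: sse(r, days, observed, K_TRUE, P0_TRUE), 0.1, 0.6)
print("best r: %.4f" % best_r)
```

With real data, K and P0 would have to be refitted for each candidate r, and the error may then no longer be unimodal, so a coarse scan over r before the search is a sensible precaution.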

Forecast of the future number of patients and inflection point analysis:


The predicted inflection point falls between late February and early March, and the peak number of patients is about 50,000.
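For a logistic curve, the inflection point in the strict mathematical sense (the day on which growth is fastest) is reached when P = K/2, which gives t* = t0 + ln((K - P0) / P0) / r. The sketch below computes it with the standard library only. The parameter values are assumptions chosen for illustration and are not meant to reproduce the forecast above; the function names are hypothetical.

```python
import math


def logistic(t, K, P0, r, t0=1.0):
    """Logistic curve with capacity K, initial value P0 at t = t0, growth rate r."""
    e = math.exp(r * (t - t0))
    return K * e * P0 / (K + (e - 1.0) * P0)


def inflection_day(K, P0, r, t0=1.0):
    """Day index at which the logistic curve reaches K / 2 (fastest growth)."""
    return t0 + math.log((K - P0) / P0) / r


# Assumed, illustrative parameters (not the author's fitted values)
K, P0, r = 50000.0, 40.0, 0.29
t_star = inflection_day(K, P0, r)
print("inflection at day index %.2f, P = %.1f" % (t_star, logistic(t_star, K, P0, r)))
```

Because t* depends on K, P0 and r together, any uncertainty in the fitted r shifts the estimated inflection date directly.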

Summary

In terms of model selection, the Logistic model can only predict the date of the inflection point and the peak number of patients; it cannot describe the whole course of the epidemic. The choice of the r and K values also still needs improvement.

References

CNKI (HowNet)
https://blog.csdn.net/z_ccsdn/article/details/104134358


Tags: Python Java

Posted on Thu, 06 Feb 2020 22:06:26 -0800 by acirilo