Modeling the 2019 novel coronavirus (Wuhan pneumonia) (Ⅱ): epidemic prediction based on the Logistic model
The model was completed on February 6. In choosing a model, the author mainly consulted related articles by experienced contributors on CSDN and papers on CNKI (HowNet), and after careful consideration decided to use the Logistic model and the SEIR model (which includes a latent infection stage) to analyze and predict the epidemic. Readers are welcome to point out any shortcomings.
First, the author examines the data in China's daily list, extracts the confirm column from the table (the number of confirmed cases each day), and draws a simple trend chart.
# Imports
import pandas as pd
import numpy as np
import matplotlib as mpl
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit

# Font setting so that Chinese characters in labels render correctly
mpl.rcParams['font.sans-serif'] = ['SimHei']

data = pd.read_csv('china_DailyList_2020_02_03.csv')
print(data.head())

# Day index 1..24, one entry per row of the daily list
x = list(range(1, 25))
y = data['confirm']
z = data['heal']

plt.plot(x, y, color='green', marker='o', linestyle='solid', label='Number of confirmed cases')
plt.plot(x, z, color='red', marker='o', linestyle='solid', label='Number of cured cases')
plt.title('Epidemic trend chart')
plt.ylabel('Number of people')
plt.xlabel('date')
plt.legend(loc=0)  # the labels above are only shown once a legend is drawn
plt.show()
A note on the x-axis: the values are day indices counted from the first day of the scraped data, not calendar dates. Days 1 to 24 end on 2020-02-05, which is why the forecast code below labels day 25 as February 6.
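The day-index convention can be made explicit with a small helper. This is a minimal sketch, assuming (as the forecast loop later in the article does) that day 25 corresponds to February 6, 2020. The name index_to_date is hypothetical and does not appear in the original code; adjust the anchor if your data starts on a different day.

```python
from datetime import date, timedelta

# Assumption: day index 25 is 2020-02-06, as implied by the forecast loop.
ANCHOR_INDEX = 25
ANCHOR_DATE = date(2020, 2, 6)

def index_to_date(day_index):
    """Convert a 1-based day index into a calendar date (hypothetical helper)."""
    return ANCHOR_DATE + timedelta(days=day_index - ANCHOR_INDEX)

for idx in (1, 24, 25):
    print(idx, index_to_date(idx).isoformat())
```

Using one anchor keeps the plot axis and the printed forecast dates consistent with each other.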
Connecting the confirmed-case counts into a continuous curve gives:
plt.plot(x, y, 'r', label='Number of confirmed cases')
plt.ylabel('Number')
plt.xlabel('date')
plt.legend(loc=0)
plt.show()
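The mathematical derivation of the Logistic model is omitted here; interested readers can look it up on their own.
In short, we fit the logistic function P(t) = K·P0·e^(r(t−t0)) / (K + P0·(e^(r(t−t0)) − 1)), as implemented in the code below, to the existing data. Here P(t) is the population function (the number of confirmed cases at time t), K is the maximum (capacity), P0 is the initial value at t0, and r is the growth-rate parameter, which reflects growth resistance (such as the resistance to epidemic growth caused by medical isolation).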
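Before trusting a fit on real data, it can help to check that the fitting procedure recovers known parameters. The following is a minimal sketch on synthetic, noise-free data. The "true" values, the starting guess p0, and the name logistic are assumptions made for illustration, not the author's data or results.

```python
import numpy as np
from scipy.optimize import curve_fit

T0 = 1.0  # day index of the first observation, as in the article's code

def logistic(t, K, P0, r):
    """Logistic growth curve with capacity K, initial value P0 and rate r."""
    growth = np.exp(r * (t - T0))
    return K * P0 * growth / (K + P0 * (growth - 1.0))

# Synthetic data generated from made-up parameters (not real case counts).
t = np.arange(1, 25, dtype=float)
true_K, true_P0, true_r = 50000.0, 40.0, 0.4
P = logistic(t, true_K, true_P0, true_r)

# A rough starting guess helps curve_fit when parameters differ in scale.
p0 = [1.5 * P.max(), P[0], 0.3]
popt, pcov = curve_fit(logistic, t, P, p0=p0, maxfev=10000)
K_fit, P0_fit, r_fit = popt
print("fitted K, P0, r:", K_fit, P0_fit, r_fit)
```

If the recovered values are far from the ones used to generate the data, the starting guess or the data range is probably the problem, not the model.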
# Logistic model
from datetime import date, timedelta

def logistic_increase_function(t, K, P0, r):
    # r is fixed at 0.29 (the value chosen by the author), so the r passed in
    # by curve_fit is ignored and only K and P0 are effectively fitted.
    r = 0.29
    t0 = 1
    exp_value = np.exp(r * (t - t0))
    return (K * exp_value * P0) / (K + (exp_value - 1) * P0)

# Day index and number of confirmed cases
t = np.arange(1, 25)
P = np.array(data['confirm'])

# Least-squares fit; popt holds the fitted coefficients in the order K, P0, r
popt, pcov = curve_fit(logistic_increase_function, t, P)
print("K: capacity  P0: initial_value  r: increase_rate")
print(popt)

# Fitted values on the observed days
P_predict = logistic_increase_function(t, popt[0], popt[1], popt[2])

# Longer-term forecast
future = np.array([25, 27, 29, 40, 60, 71, 76, 82])
future_predict = logistic_increase_function(future, popt[0], popt[1], popt[2])

# Near-term forecast
tomorrow = np.array([25, 26, 27, 28, 29, 30, 31])
tomorrow_predict = logistic_increase_function(tomorrow, popt[0], popt[1], popt[2])

# Plots
plt.plot(t, P, 's', label='Number of confirmed infections')
plt.plot(t, P_predict, 'r', label='Fitted curve of the number of infections')
plt.plot(tomorrow, tomorrow_predict, 's', label='Near-term forecast of the number of infections')
plt.xlabel('date')
plt.ylabel('Confirmed number')
plt.legend(loc=0)
plt.show()

plt.plot(future, future_predict, 's', label='Longer-term forecast of the number of infections')
plt.legend(loc=0)
plt.show()

# Day 25 corresponds to February 6; print the forecast for the next 25 days
for i in range(25):
    day = date(2020, 2, 6) + timedelta(days=i)
    people_sick = int(logistic_increase_function(np.array(i + 25), popt[0], popt[1], popt[2]))
    print("%s estimated number of confirmed cases: %d" % (day.isoformat(), people_sick))
In the Python code, the value chosen for r affects both the quality of the curve fit and the predictions that follow, so the selection of r and K is worth studying. For convenience, the author discusses r first.
The author sees two ways to optimize r: 1;
2. Carry out a binary optimization (here the author takes r = 0.29 as the optimized result).
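As one hedged illustration of tuning r, the sketch below scans a grid of candidate values, fits K and P0 with r held fixed, and keeps the r with the smallest sum of squared errors. This is a plain grid scan, not necessarily either of the author's two methods. The data are synthetic and noise-free, and the names logistic and sse_for_r are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

T0 = 1.0  # day index of the first observation, as in the article's code

def logistic(t, K, P0, r):
    """Logistic growth curve with capacity K, initial value P0 and rate r."""
    growth = np.exp(r * (t - T0))
    return K * P0 * growth / (K + P0 * (growth - 1.0))

def sse_for_r(r, t, P):
    """Fit K and P0 with r held fixed; return the sum of squared errors."""
    model = lambda tt, K, P0: logistic(tt, K, P0, r)
    try:
        popt, _ = curve_fit(model, t, P, p0=[2.0 * P.max(), P[0]], maxfev=10000)
    except RuntimeError:
        return np.inf  # the fit did not converge for this r
    return float(np.sum((model(t, *popt) - P) ** 2))

# Synthetic data generated with r = 0.29 (made-up K and P0, not real counts).
t = np.arange(1, 25, dtype=float)
P = logistic(t, 50000.0, 40.0, 0.29)

candidates = np.linspace(0.20, 0.40, 21)  # step of 0.01
errors = [sse_for_r(r, t, P) for r in candidates]
best_r = float(candidates[int(np.argmin(errors))])
print("best r on the synthetic data:", round(best_r, 2))
```

On real data the error curve is noisier, so it is worth plotting errors against candidates instead of relying only on the minimum.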
Forecast of future case numbers and inflection-point analysis:
The predicted inflection point falls between late February and early March, and the peak number of patients is about 50,000.
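For a logistic curve, the inflection point is where P(t) reaches K/2, which gives t* = t0 + ln((K − P0) / P0) / r. The sketch below computes it for illustrative parameters: K = 50000 and r = 0.29 echo figures mentioned in the article, while P0 = 40 and the function names are assumptions, so the result is not the author's forecast.

```python
import math

def logistic(t, K, P0, r, t0=1.0):
    """Logistic growth curve with capacity K, initial value P0 and rate r."""
    growth = math.exp(r * (t - t0))
    return K * P0 * growth / (K + P0 * (growth - 1.0))

def inflection_day(K, P0, r, t0=1.0):
    """Day index at which the curve reaches K / 2 (hypothetical helper)."""
    return t0 + math.log((K - P0) / P0) / r

# Illustrative values only; P0 in particular is made up.
K, P0, r = 50000.0, 40.0, 0.29
t_star = inflection_day(K, P0, r)
print("inflection day index:", t_star)
print("value at inflection:", logistic(t_star, K, P0, r))
```

Because t* depends on ln((K − P0) / P0) / r, small changes in r shift the predicted inflection date noticeably, which is one reason the choice of r matters.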
In terms of model selection, the Logistic model can only predict the date of the inflection point and the peak number of patients; it cannot describe the whole course of the epidemic. The optimization of the r and K values also still needs improvement.