Introduction to pyecharts and visualization of epidemic data (drawing geographic charts)

Preface

Visualization in Python is usually done with matplotlib or seaborn. So why choose pyecharts?

  • It was developed in China!
  • More importantly, it is very easy to use and can produce a lot of cool interactive, animated charts. For example, when you scroll through epidemic news, you often see charts like this one:

    For a pyecharts tutorial, start with the official Chinese documentation. If you want to get started quickly, read on.
  • Installation: I search for pyecharts in PyCharm's package manager and install it directly. In my experience the download often fails without a (paid) VPN.
  • Python version: 3.5 or above. I recommend installing the latest version directly to avoid a lot of trouble.

Preparation

This post follows on from the previous one (New coronavirus data capture and collation detailed flow), in which we scraped epidemic data from Tencent News and stored what we need in three DataFrames:

  • Historical epidemic data for China: chinaDayData
  • Same-day epidemic data for the cities in each Chinese province: china_info
  • Overall epidemic data for other countries: foreigns
    (The data above will be uploaded to the resources. If you don't have time to read that post, you can download the data directly and follow along.)

To make the data easy to reload, we save it to CSV files. Note that because the city and province names contain Chinese characters, encoding='gbk' must be specified both when saving and when reading.

import pandas as pd

# Save data (double-quoted raw strings, because the path itself contains an apostrophe)
chinaDayData.to_csv(r"F:\Let's start the class.\RS Basic course\2019_CoV_chinaDayData.csv", encoding='gbk', index=False)
china_info.to_csv(r"F:\Let's start the class.\RS Basic course\2019_CoV_china_info.csv", encoding='gbk', index=False)
foreigns.to_csv(r"F:\Let's start the class.\RS Basic course\2019_CoV_foreigns.csv", encoding='gbk', index=False)

# Read data
chinaDayData = pd.read_csv(r"F:\Let's start the class.\RS Basic course\2019_CoV_chinaDayData.csv", encoding='gbk')
china_info = pd.read_csv(r"F:\Let's start the class.\RS Basic course\2019_CoV_china_info.csv", encoding='gbk')
foreigns = pd.read_csv(r"F:\Let's start the class.\RS Basic course\2019_CoV_foreigns.csv", encoding='gbk')
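
To see why the encoding has to match on both sides, here is a minimal round-trip sketch. The tiny DataFrame, its rows and the temporary file name are made up for illustration; only the encoding='gbk' argument comes from the code above.

```python
import os
import tempfile

import pandas as pd

# Toy data with Chinese text (made-up rows, not the real epidemic data)
df = pd.DataFrame({'province': ['四川', '湖北'], 'confirm': [10, 20]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'demo.csv')
    # Write and read with the same encoding
    df.to_csv(csv_path, encoding='gbk', index=False)
    df_back = pd.read_csv(csv_path, encoding='gbk')

# The same bytes are not valid UTF-8, so decoding with the wrong codec fails
try:
    '四川'.encode('gbk').decode('utf-8')
    wrong_codec_ok = True
except UnicodeDecodeError:
    wrong_codec_ok = False

print(df_back.equals(df), wrong_codec_ok)
```

If your files are only ever read by Python, encoding='utf-8' on both sides would work just as well; gbk is mainly convenient when the CSV is also opened in Excel on a Chinese-locale Windows machine.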

Visualization with pyecharts

The pyecharts tutorial shows many interesting chart types, such as gauge, liquid (water ball), sunburst and word cloud charts. Wouldn't it be cool to present your analysis results with charts like these?

Enough chatter. Here are the charts we will use (see the corresponding pages of the official tutorial):

# Import the chart types we need from the pyecharts third-party library
from pyecharts.charts import Pie, Line, Map
from pyecharts import options as opts

Pie chart

  • Objective: to show the proportions of deaths, cured cases and cases still to be cured.
  • Explanation: some people make pie charts from the numbers of dead, cured and confirmed cases, which is confusing. The confirmed count already includes the deaths and the cured cases, so the slice shown alongside deaths and cured cases should be the number still to be cured, where
Number to be cured = number confirmed - number of deaths - number cured
  • Data: chinaDayData records the national epidemic data for each day. Here we use the most recent day as the pie chart data.
# Compute the number of people still to be cured
chinaDayData['to_be_healed'] = chinaDayData['confirm'] - chinaDayData['dead'] - chinaDayData['heal']
# Use the last day's data as the pie chart data
pie_data = chinaDayData.loc[len(chinaDayData)-1, ['to_be_healed', 'suspect', 'dead', 'heal']]
pie_data.index = ['To be cured', 'Suspected', 'Deaths', 'Cured']
print(pie_data)
To be cured    37693.0
Suspected      21675.0
Deaths          1017.0
Cured           3998.0
Name: 28, dtype: float64

Next, we will make a pie chart.

# Pie chart
my_pie = (
    Pie()
    # Load the data as [name, value] pairs
    .add("", [list(z) for z in zip(pie_data.index, pie_data.values)])
    # Title
    .set_global_opts(title_opts=opts.TitleOpts(title='Proportion of epidemic population in China (%)'))
    # Label format: item name and percentage
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}"))
)
# The generated pie chart is saved to my_pie.html
my_pie.render(r"F:\Let's start the class.\RS Basic course\my_pie.html")

Open the local file my_pie.html and you can see the pie chart.


The pie chart's label format formatter="{b}: {d}" may look a little strange. For reference: {a} is the series name, {b} the data item name, {c} the value and {d} the percentage.
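
To make the placeholders concrete, here is a small pure-Python sketch of roughly what the substitution produces for a pie series. The function expand_label and the demo values are hypothetical and only imitate the behaviour; the real substitution happens inside ECharts in the browser, and its exact number formatting may differ.

```python
def expand_label(template, series_name, items):
    """Illustrative stand-in for the {a}/{b}/{c}/{d} substitution on a pie series."""
    total = sum(value for _, value in items)
    labels = []
    for name, value in items:
        # {d} is the share of this item in the whole series, in percent
        percent = round(value / total * 100, 2)
        labels.append(
            template.replace('{a}', series_name)
                    .replace('{b}', name)
                    .replace('{c}', str(value))
                    .replace('{d}', str(percent))
        )
    return labels

# Made-up values chosen so the percentages are easy to check by hand
demo_items = [('Deaths', 1), ('Cured', 3)]
labels = expand_label('{b}: {d}%', 'demo', demo_items)
print(labels)
```

Note that the template here ends with a literal % sign. Doing the same in the chart, formatter="{b}: {d}%", would print the unit next to each slice instead of only in the title.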

Line chart

  • Objective: to observe how the numbers of confirmed cases, suspected cases, deaths and cured cases change over time
  • Data: chinaDayData

First, check the data types:

# info() prints its report itself, so no print() is needed
chinaDayData.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 29 entries, 0 to 28
Data columns (total 9 columns):
date            29 non-null float64
confirm         29 non-null int64
suspect         29 non-null int64
dead            29 non-null int64
heal            29 non-null int64
deadRate        29 non-null float64
healRate        29 non-null float64
add             29 non-null int64
to_be_healed    29 non-null int64
dtypes: float64(3), int64(6)
memory usage: 2.2 KB

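As you can see, the "date" column of chinaDayData has type float64 (it was parsed as a number when the CSV file was read back), so we need to convert it to str for plotting. Formatting with two decimals keeps the trailing zero in dates such as 1.20 and 2.10, which a plain str() would drop.

chinaDayData['date'] = chinaDayData['date'].map('{:.2f}'.format)
print(list(chinaDayData.date))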
['1.13', '1.14', '1.15', '1.16', '1.17', '1.18', '1.19', '1.20', '1.21', '1.22', '1.23', '1.24', '1.25', '1.26', '1.27', '1.28', '1.29', '1.30', '1.31', '2.01', '2.02', '2.03', '2.04', '2.05', '2.06', '2.07', '2.08', '2.09', '2.10']
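
One detail is worth checking when a "month.day" date has been stored as a float: converting with str() drops trailing zeros, so 1.20 and 2.10 would not come out as in the list above. The sketch below compares str() with a two-decimal format on a few sample values taken from that list.

```python
# A few "month.day" values as they look after being parsed as floats
dates_as_float = [1.13, 1.2, 2.01, 2.1]

# str() keeps only the digits needed to represent the float
with_str = [str(d) for d in dates_as_float]

# A fixed two-decimal format restores the day part
with_format = ['{:.2f}'.format(d) for d in dates_as_float]

print(with_str)
print(with_format)
```

Another option would be to avoid the float conversion in the first place by reading the column as text, for example with dtype={'date': str} in pd.read_csv.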

After the above data processing, it is now possible to start drawing.

# Line chart of the numbers of confirmed and suspected cases
my_line1 = (
    Line()
    # x axis
    .add_xaxis(chinaDayData['date'].tolist())
    # y axis (tolist() yields plain Python numbers)
    .add_yaxis('Confirmed', chinaDayData['confirm'].tolist())
    .add_yaxis('Suspected', chinaDayData['suspect'].tolist())
    # Title
    .set_global_opts(title_opts=opts.TitleOpts(title="Time trend of the number of confirmed cases and suspected cases in China"))
)
# Line chart of the numbers of deaths, cured cases and new cases
my_line2 = (
    Line()
    # x axis
    .add_xaxis(chinaDayData['date'].tolist())
    # y axis
    .add_yaxis('Deaths', chinaDayData['dead'].tolist())
    .add_yaxis('Cured', chinaDayData['heal'].tolist())
    .add_yaxis('New cases', chinaDayData['add'].tolist())
    # Title
    .set_global_opts(title_opts=opts.TitleOpts(title="Time trend of deaths, cured cases\nand new cases in China"))
)
# Save the two line charts to separate html files
my_line1.render(r"F:\Let's start the class.\RS Basic course\my_line1.html")
my_line2.render(r"F:\Let's start the class.\RS Basic course\my_line2.html")

Open the local files my_line1.html and my_line2.html to see the line charts:

Geographic Map

  • Objective: to view the epidemic situation on the world map and on the map of China
  • Data: china_info for the distribution within China and foreigns for the distribution around the world

First, using the domestic data, we map the distribution of the epidemic in Sichuan Province.

  • Note that the city names must be normalized first. The city names built into the pyecharts maps may differ from those in the data, and any name that does not match shows no data on the map.
# Count the confirmed cases by city in Sichuan Province
# The city names are normalized here. In the original data they are Chinese names,
# and the last branch appends the "city" suffix (市) that the built-in map uses.
def city_name(city):
    if city == 'Aba':
        city = 'Aba Tibetan and Qiang Autonomous Prefecture'
    elif city == 'Liangshan':
        city = 'Liangshan Yi Autonomous Prefecture'
    elif city == 'Ganzi':
        city = 'Ganzi Tibetan Autonomous Prefecture'
    else:
        city = city + ' City'
    return city

# Extract the Sichuan rows from china_info and drop the columns we do not need.
# copy() avoids pandas' SettingWithCopyWarning when the subset is modified below.
sichuanMap_data = china_info[china_info['province']=='Sichuan'].copy()
sichuanMap_data = sichuanMap_data.drop(['province', 'healRate(%)', 'deadRate(%)'], axis=1)
sichuanMap_data = sichuanMap_data.reset_index(drop=True)
sichuanMap_data['city'] = sichuanMap_data['city'].map(city_name)
#print(sichuanMap_data)

# Mapping
sichuan_map = (
    Map()
    .add('', [list(z) for z in zip(sichuanMap_data.city, sichuanMap_data.confirm)], 'Sichuan')
    .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
    .set_global_opts(
            # Title
            title_opts=opts.TitleOpts(title="Map of epidemic situation in Sichuan"),
            tooltip_opts=opts.TooltipOpts(formatter='{b}: {c}'),
            # Visual effect
            visualmap_opts=opts.VisualMapOpts(is_piecewise=True,
                                              pieces=[
                                                  {'min': 51, 'label': '>50', "color": "#ff585e"},
                                                  {'min': 21, 'max': 50, 'label': '21-50', "color": "#ffb248"},
                                                  {'min': 11, 'max': 20, 'label': '11-20', "color": "#ffb248"},
                                                  {'min': 1, 'max': 10, 'label': '1-10', "color": "#fff2d1"}
                                              ]),
        )
)

# Store in html file
sichuan_map.render(r"F:\Let's start the class.\RS Basic course\sichuan_map.html")
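
Because a mismatched name simply shows no data, it can help to list the names that would not match before rendering. The sketch below is a plain-Python check under the assumption that you have the map's region names available as a list; the helper unmatched_names and both name lists are hypothetical and not part of pyecharts.

```python
def unmatched_names(data_names, map_names):
    """Return the names in the data that the map would not recognise, in input order."""
    known = set(map_names)
    return [name for name in data_names if name not in known]

# Hypothetical name lists for illustration
map_names = ['Chengdu City', 'Mianyang City', 'Liangshan Yi Autonomous Prefecture']
data_names = ['Chengdu City', 'Mianyang', 'Liangshan']

missing = unmatched_names(data_names, map_names)
print(missing)
```

An empty result means every row in the data has a region to land on; anything else is a name that still needs a rule in the normalization function.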


Next, using the domestic data, we map the epidemic situation in each province of China.

# Number of confirmed people by province
chinaMap_data = china_info[['province', 'confirm', 'heal', 'dead']].groupby('province').sum().sort_values(by='confirm', ascending=False);
#print(chinaMap_data);

china_map = (
    Map()
    .add('', [list(z) for z in zip(chinaMap_data.index, chinaMap_data.confirm)], 'china')
    .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
    .set_global_opts(
            # Title
            title_opts=opts.TitleOpts(title="Map of epidemic situation in China"),
            tooltip_opts=opts.TooltipOpts(formatter='{b}: {c}'),
            # Visual effect
            visualmap_opts=opts.VisualMapOpts(is_piecewise=True,
                                              pieces=[
                                                  {'min': 20001, 'label': '>20000', "color": "#893448"},
                                                  {'min': 1000, 'max': 20000, 'label': '1000-20000', "color": "#ff585e"},
                                                  {'min': 500, 'max': 999, 'label': '500-999', "color": "#ffb248"},
                                                  {'min': 100, 'max': 499, 'label': '100-499', "color": "#ffb248"},
                                                  {'min': 1, 'max': 99, 'label': '1-99', "color": "#fff2d1"}
                                              ]),
        )
)

china_map.render(r"F:\Let's start the class.\RS Basic course\china_map.html")
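
The pieces list decides which colour band each province falls into. As a rough illustration, the sketch below looks up the band for a value, assuming that both min and max are inclusive and that a missing bound means "unbounded". The helper piece_label is hypothetical, the pieces repeat the ranges of the China map above without the colours, and the sample values are made up.

```python
def piece_label(value, pieces):
    """Return the label of the first piece whose [min, max] range contains value."""
    for piece in pieces:
        lower = piece.get('min', float('-inf'))
        upper = piece.get('max', float('inf'))
        if lower <= value <= upper:
            return piece['label']
    return None  # value falls outside every piece

china_pieces = [
    {'min': 20001, 'label': '>20000'},
    {'min': 1000, 'max': 20000, 'label': '1000-20000'},
    {'min': 500, 'max': 999, 'label': '500-999'},
    {'min': 100, 'max': 499, 'label': '100-499'},
    {'min': 1, 'max': 99, 'label': '1-99'},
]

# Made-up confirmed counts
for confirmed in (31728, 1000, 99, 0):
    print(confirmed, piece_label(confirmed, china_pieces))
```

A count of 0 matches no piece here, which is worth keeping in mind when choosing the ranges: consecutive pieces should meet without gaps, and regions with no cases need their own piece if they should get a colour.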

The image is as follows:

Finally, using the international data, we map the distribution of the epidemic around the world.

  • Note that here we also need to convert the Chinese country names into English. The country names in the pyecharts world map are in English, so if the data uses Chinese names, nothing is displayed on the chart.

Following my post "Capturing Chinese and English names of countries around the world from 360 Library", we can build a dictionary that maps each Chinese country name to its English name. We call it dict_countries:

# Keys: the Chinese country names from the source data (shown here in translation); values: the English names
dict_countries = {'China': 'China', 'Mongolia': 'Mongolia', 'Korea': 'Dem. Rep. Korea', 'The Republic of Korea': 'Korea', 'Japan': 'Japan', 'The Philippines': 'Philippines', 'Vietnam?': 'Vietnam', 'Laos': 'Laos', 'Cambodia': 'Cambodia', 'Myanmar': 'Myanmar', 'Thailand': 'Thailand', 'Malaysia': 'Malaysia', 'Brunei': 'Brunei Darussalam', 'Singapore': 'Singapore', 'Indonesia': 'Indonesia', 'Nepal': 'Nepal', 'Bhutan': 'Bhutan', "The People's Republic of Bangladesh": 'Bengal', 'India': 'India', 'Pakistan': 'Pakistan', 'Sri Lanka': 'Sri Lanka', 'Maldives': 'Maldives', 'Kazakhstan': 'Kazakhstan', 'Kyrgyzstan': 'Kyrgyzstan', 'Tajikistan': 'Tajikistan', 'Uzbekistan': 'Uzbekistan', 'Turkmenistan': 'Turkmenistan', 'Afghanistan': 'Afghanistan', 'Iraq': 'Iraq', 'Iran': 'Iran', 'Syria': 'Syria', 'Jordan': 'Jordan', 'Lebanon': 'Lebanon', 'Israel': 'Israel', 'Palestine': 'Palestine', 'Saudi Arabia': 'Saudi Arabia', 'Bahrain': 'Bahrain', 'Qatar': 'Qatar', 'Kuwait': 'Kuwait', 'United Arab Emirates, United Arab Emirates, Arabia': 'United Arab Emirates', 'Oman': 'Oman', 'Yemen': 'Yemen', 'Georgia': 'Georgia', 'Armenia': 'Armenia', 'Azerbaijan': 'Azerbaijan', 'Turkey': 'Turkey', 'Cyprus': 'Cyprus', 'Finland': 'Finland', 'Sweden': 'Sweden', 'Norway': 'Norway', 'Iceland': 'Iceland', 'Denmark': 'Danmark Faroe Islands', 'Estonia': 'Estonia', 'Latvia': 'Latvia', 'Lithuania': 'Lithuania', 'Belarus': 'Belarus', 'Russia': 'Russia', 'Ukraine': 'Ukraine', 'Moldova': 'Moldova', 'poland': 'Poland', 'Czech': 'Czech', 'Slovakia': 'Slovakia', 'Hungary': 'Hungary', 'Germany': 'Germany', 'Austria': 'Austria', 'Switzerland': 'Switzerland', 'Liechtenstein()': 'Liechtenstein L I E', 'Britain': 'Britain', 'Ireland': 'Ireland', 'Netherlands': 'Holand', 'Belgium': 'Belgium', 'Luxembourg': 'Luxemburg', 'France': 'France', 'Monaco': 'Monaco', 'Romania': 'Romania', 'Bulgaria': 'Bulgaria', 'Serbia': 'Serbia', 'Macedonia': 'Macedonia', 'Albania': 'Albania', 'Greece': 'Greece', 'Slovenia': 'Slovenia', 'Croatia': 
'Croatia', 'Bosnia and Mexico(Bosnia). ': 'Bosnia Herzegovina', 'Italy': 'Italy', 'Vatican': 'Vatican', 'San Marino': 'San Marino', 'Malta': 'Malta', 'Spain': 'Spain', 'Portugal': 'Portugal', 'Andorra': 'Andorra', 'Egypt': 'Egypt', 'Libya': 'Libya', 'Sultan': 'Sudan', 'Tunisia': 'Tunis', 'Algeria': 'Algeria', 'Morocco': 'Morocco', 'Azores()Portuguese': 'Azores Portugal', 'Portugal, Madeira': 'M A D E I R A I S L A N D S', 'Ethiopia': 'Ethiopia', 'Eritrea': 'Eritrea', 'Somalia': 'Somalia', 'Djibouti': 'Djibouti', 'Kenya': 'Kenya', 'Tanzania': 'Tanzania', 'Uganda': 'Uganda', 'Rwanda': 'Rwanda', 'burundi': 'Burundi', 'Seychelles': 'Seychelles', 'Chad': 'Chad', 'Central African': 'Central Africa', 'Cameroon': 'Cameroon', 'Equatorial Guinea': 'Equatorial Guinea', 'Gabon': 'Gabon', 'Republic of Congo: Congo cloth': 'Republicof Congo', 'The Democratic Republic of the Congo: the DRC': 'Democratic Republicof Congo', 'Sao Tome and Principe': 'Sao Tomeand Principe', 'Mauritania': 'Mauritania', 'Western Sahara note: not independent, see:': 'Western Sahara', 'Senegal': 'Senegal', 'Gambia': 'Gambian', 'Mali': 'Mali', 'burkina faso ': 'Burkina Faso', 'Guinea': 'Guinea', 'Guinea-Bissau-': 'Guinea Bissau', 'Cape Verde-': 'Cape Verde', 'sierra leone': 'Sierra Leone', 'Liberia': 'Liberia', "Cote d'Ivoire'": 'Coted Ivoire', 'Ghana': 'Ghana', 'Togo': 'Togo', 'Benin': 'Benin', 'Niger': 'Niger', 'Canary Islands West': 'Canary Islands', 'Zambia': 'Zambia', 'Angola': 'Angola', 'zimbabwe': 'Zimbabwe', 'Malawi': 'Malawi', 'Mozambique': 'Mozambique', 'botswana': 'Botswana', 'Namibia': 'Namibia', 'South Africa': 'South Africa', 'Swaziland': 'Swaziland', 'Lesotho': 'Lesotho', 'Madagascar': 'Madagascan', 'Comoros': 'Comorin', 'Mauritius': 'Mauritius', 'Reunion method': 'Reunion', 
'St. Helena': 'Saint Helena', 'Australia': 'Australia', 'New Zealand': 'New Zealand', 'papua new guinea': 'Guinea', 'Solomon Islands': 'Archipelago', 'Vanuatu': 'Vanuatu', 'Micronesia': 'Micronesia', 'Marshall Islands': 'Marshall Islands', 'Palau': 'Palau', 'Nauru': 'Nauru', 'Kiribati': 'Kiribati', 'Tuvalu': 'Tuvalu T V', 'Samoa': 'Samoa', 'Fiji Islands': 'Fiji Islands', 'Tonga': 'Tonga', 'Cook Islands NEW': 'Cook Islands', 'Guam beauty': 'Guam', 'New Caledonian method': 'New Caledonia', 'French Polynesia': 'French Polynesia', 'Pitcairn Island UK': 'Pitcairn Island', 'Wallis and Futuna /method': 'Wallis Futuna', 'Neo NEW': 'Niue', 'Tokelau': 'Tokelau', 'American Samoa ': 'American Samoa', 'North marianami': 'Mariana', 'Canada': 'Canada', 'U.S.A': 'America', 'Mexico': 'Mexico', 'Greenland Dan': 'Greenland', 'Guatemala': 'Guatemala', 'Belize': 'Belize', 'Salvatore': 'Salvador', 'Honduras': 'Honduras', 'Nicaragua': 'Nicaragua', 'Costa Rica(another)': 'Costarica Costa Rica', 'Panama': 'Panama', 'Bahamas': 'Bahamas', 'Cuba': 'Cuba', 'Jamaica': 'Jamaica', 'Haiti': 'Haiti', 'dominican republic': 'Dominican Republic', 'Antigua and Barbuda ': 'Antiguaand Barbuda', 'Saint Kitts and Nevis': 'Saint Kittsand Nevis', 'Dominica': 'Dominica', 'Saint Lucia': 'Saint Lucia', 'Saint Vincent and the Grenadines': 'Saint Vincentandthe Grenadines', 'Grenada': 'Grenada', 'Barbados': 'Barbados', 'Trinidad and Tobago ': 'Trinidadand Tobago', 'Puerto Rico': 'Porto Rico', 'British Virgin Islands': 'British Virgin Islands', 'Virgin Islands': 'Virgin Islandsofthe United States', 'Anguilla': 'Anguilla', 'Montserrat UK': 'Montserrat', 'Guadeloupe method': 'Guadeloupe', 'Martinique method': 'Martinique', 'Netherlands Antilles': 'Netherlands Antilles', 'Aruba': 'Aruba', 'Turks and Caicos Islands': 'Turks And Caicos Islands', 'Cayman Islands UK': 'Cayman Islands', 'Bermuda English': 'Bermuda', 'Columbia': 'Colombia', 'Venezuela': 'Venezuela', 'Guyana': 'Guyana', 'French Guiana': 'French Guiana', 
'Suriname': 'Suriname', 'Ecuador': 'Ecuador', 'Peru': 'Peru', 'bolivia': 'Bolivia', 'Chile': 'Chile', 'Argentina': 'Argentina', 'Uruguay': 'Uruguay', 'Paraguay': 'Paraguay'}
# Chinese-to-English country name conversion function
def country_name(country):
    # Returns None for names that are missing from dict_countries
    return dict_countries.get(country)
# Select the country and confirm columns from foreigns
# (copy() avoids pandas' SettingWithCopyWarning on the assignment below)
world_data = foreigns[['country', 'confirm']].copy()
# Convert the country names
world_data['country'] = world_data['country'].map(country_name)
# Add the China total
china_world_data = china_info['confirm'].sum()
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead
china_row = pd.DataFrame([{'country': 'China', 'confirm': china_world_data}])
world_data = pd.concat([world_data, china_row], ignore_index=True)
print(world_data)


world_map = (
    Map()
    .add('', [list(z) for z in zip(world_data.country, world_data.confirm)], 'world')
    .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
    .set_global_opts(
            # Title
            title_opts=opts.TitleOpts(title="World epidemic distribution map"),
            tooltip_opts=opts.TooltipOpts(formatter='{b}: {c}'),
            # Visual effect
            visualmap_opts=opts.VisualMapOpts(is_piecewise=True,
                                              pieces=[
                                                  {'min': 201, 'label': '>200', "color": "#893448"},
                                                  {'min': 51, 'max': 200, 'label': '51-200', "color": "#ff585e"},
                                                  {'min': 21, 'max': 50, 'label': '21-50', "color": "#ffb248" },
                                                  {'min': 11, 'max': 20, 'label': '11-20', "color": "#ffb248"},
                                                  {'min': 1, 'max': 10,'label': '1-10', "color" : "#fff2d1" }
                                              ]),
        )
)


# Store the world epidemic distribution map in world_map.html
world_map.render(r"F:\Let's start the class.\RS Basic course\world_map.html")
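
The name conversion step is where rows most easily go missing, since any country that is absent from the dictionary ends up without a name. The sketch below shows one way to list those rows before plotting and to add the extra China row with pd.concat. The small mapping name_map, the DataFrame foreign_demo and all the numbers are made up for illustration.

```python
import pandas as pd

# Made-up mapping and rows; '未知' stands for a name missing from the mapping
name_map = {'日本': 'Japan', '泰国': 'Thailand'}
foreign_demo = pd.DataFrame({'country': ['日本', '泰国', '未知'], 'confirm': [26, 32, 1]})

converted = foreign_demo.copy()
# Series.map with a dict yields NaN for keys that are not in the dict
converted['country'] = converted['country'].map(name_map)

# List the original names that could not be converted, then drop those rows
unmapped = foreign_demo.loc[converted['country'].isna(), 'country'].tolist()
converted = converted.dropna(subset=['country'])

# Add one extra row with pd.concat
china_row = pd.DataFrame([{'country': 'China', 'confirm': 100}])
world_demo = pd.concat([converted, china_row], ignore_index=True)

print(unmapped)
print(world_demo)
```

Printing the unmapped names makes it obvious which dictionary entries still need fixing, instead of leaving the gap to be discovered on the rendered map.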

The resulting map looks like this:

Tags: encoding Pycharm VPN Python

Posted on Tue, 11 Feb 2020 05:23:37 -0800 by aboldock