Crawling World Epidemic Data and Visualizing It on a Map

Find a target site

First, find a data interface.

For example, open this page: https://xw.qq.com/act/qgfeiyan?pgv_ref=3gqtb&ADTAG=3gqtb

Then right-click the page and choose Inspect to open the developer tools.

Open the Network tab and select the XHR filter.

Then refresh the page.

Select the ranklist request.

The Preview tab shows that this response contains the data we need.

The address of this interface can then be found in the Headers tab:

https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist

You can also see that the response is in JSON format.

Import related libraries

First, install the libraries we need, such as requests and jsonpath.

In PyCharm, open the Terminal panel and run pip install XXX, for example pip install jsonpath.

We also need the pyecharts library, which is a bit more involved to install; see:

https://www.cnblogs.com/cyx-b/p/12815433.html

Get data

First, fetching the data as JSON text is easy:

import json
import requests
import jsonpath
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
print(resp.text)

The printed output is a string in JSON format.

We can convert it to a Python dictionary with json.loads:

import json
import requests
import jsonpath
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
print(type(data))
print(data['data'][0]['name'])
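
As a minimal offline sketch of this conversion, the snippet below parses a small hand-written JSON string instead of the live response. The sample payload is made up for illustration; it only imitates the shape that the indexing above relies on, a data key holding a list of per-country records.

```python
import json

# Hand-written sample that imitates the response shape (all values are made up).
sample_text = '{"data": [{"name": "A", "confirm": 10}, {"name": "B", "confirm": 5}]}'

parsed = json.loads(sample_text)   # JSON text -> Python objects
first_record = parsed["data"][0]   # "data" holds a list of per-country dicts

print(type(parsed).__name__)       # dict
print(first_record["name"])        # A
```

Each nested level needs its own index or key, which is why this style gets tedious once you want a field from every record.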

Indexing level by level like this is cumbersome, so we use the jsonpath library installed earlier.

Use jsonpath.jsonpath(data,"$..name")

Here data is the parsed content, $ stands for the root node, and .. matches at any depth below it.

Looking closely at the Preview tab, you can see that the keys are nested hierarchically, so $..name selects the value of every name key at any level under the root node.
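
To make the meaning of $..name concrete, here is a small pure-Python sketch of the same idea: walk the whole structure and collect every value stored under a given key. This is not how the jsonpath library is implemented; find_key_anywhere is a hypothetical helper and the sample data is made up.

```python
def find_key_anywhere(node, key):
    """Collect the values stored under `key` at any depth, like `$..key`."""
    found = []
    if isinstance(node, dict):
        for k, value in node.items():
            if k == key:
                found.append(value)
            # Keep descending: the key may appear again deeper down.
            found.extend(find_key_anywhere(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_key_anywhere(item, key))
    return found

sample = {"data": [{"name": "A", "confirm": 10}, {"name": "B", "confirm": 5}]}
names = find_key_anywhere(sample, "name")
print(names)  # ['A', 'B']
```

One difference worth knowing: jsonpath.jsonpath returns False when nothing matches, whereas this sketch returns an empty list.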

import json
import requests
import jsonpath
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
name = jsonpath.jsonpath(data,"$..name")
print(name)

This gives the list of country names in the ranking.

Similarly, we can build a list of confirmed case counts. Checking the Preview tab shows that the key used for confirmed cases is confirm.

We only need to replace name in the expression with confirm.

import json
import requests
import jsonpath
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
name = jsonpath.jsonpath(data,"$..name")
print(name)
confirm = jsonpath.jsonpath(data,"$..confirm")
print(confirm)
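
Because $..name and $..confirm are collected independently, the two lists only line up if every record carries both keys. A hedged alternative, assuming the records sit in the list under data['data'] as the indexing example earlier suggests, is to read both fields from each record in one pass. The sample records below are made up.

```python
sample = {"data": [
    {"name": "A", "confirm": 10},
    {"name": "B", "confirm": 5},
    {"name": "C"},               # a record without a confirm value
]}

# Keep only records that carry both fields, so names and counts stay aligned.
records = [r for r in sample["data"] if "name" in r and "confirm" in r]
name = [r["name"] for r in records]
confirm = [r["confirm"] for r in records]

print(name)     # ['A', 'B']
print(confirm)  # [10, 5]
```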

Visualization of data

This part depends on the installed pyecharts version: the 0.5.x and 1.x releases have incompatible APIs, so each needs different code.
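
To find out which of the two methods below applies, you can check the installed version first. The following is a minimal sketch using the standard library (Python 3.8+); installed_version and api_style are hypothetical helper names, and the split at major version 1 reflects the fact that pyecharts 0.5.x imports charts with from pyecharts import Map while 1.x moved them to pyecharts.charts.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def api_style(version):
    """Classify a pyecharts version string as the 0.5.x or the 1.x+ API."""
    major = int(version.split(".")[0])
    return "legacy (0.5.x)" if major < 1 else "v1 or newer"

version = installed_version("pyecharts")
print(version, api_style(version) if version else "pyecharts is not installed")
```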

Method 1:

This is someone else's code, and it doesn't work on my computer, probably because it is written for a different pyecharts version than the one I have installed.

You can combine the two lists with Python's zip function.

For details on zip(), see:

https://www.cnblogs.com/cyx-b/p/12818426.html
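
One detail of zip() is easy to miss: in Python 3 it returns a lazy iterator that can be consumed only once. The sketch below, with made-up names and counts, shows that a second pass over the same zip object yields nothing, which matters when the pairs are both printed and passed on to a chart.

```python
name = ["A", "B", "C"]       # made-up country names
confirm = [10, 5, 2]         # made-up confirmed counts

pairs = zip(name, confirm)   # a lazy, one-shot iterator
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # nothing is left the second time

print(first_pass)   # [('A', 10), ('B', 5), ('C', 2)]
print(second_pass)  # []

# Materialize once when the pairs are needed more than once.
data_list = list(zip(name, confirm))
```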

import json
import requests
import jsonpath
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
name = jsonpath.jsonpath(data,"$..name")
print(name)
confirm = jsonpath.jsonpath(data,"$..confirm")
print(confirm)
data_list = list(zip(name,confirm))
print(data_list)

Note the third line of the output: each country name is now paired with its confirmed count.

import json
import requests
import jsonpath
from pyecharts.charts import Map  # pyecharts 1.x import path, matching the add() call below
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
name = jsonpath.jsonpath(data,"$..name")
print(name)
confirm = jsonpath.jsonpath(data,"$..confirm")
print(confirm)
data_list = list(zip(name,confirm))  # materialize: a zip object can only be consumed once
print(data_list)
# 4.Visualization with pyecharts
world_map = Map().add(series_name='World Epidemic Distribution',
                      data_pair=data_list,
                      maptype='world',
                      is_map_symbol_show=False
)
world_map.render('World Epidemic Distribution.html')

Method 2:

import json
import requests
import jsonpath
from pyecharts import Map
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
name = jsonpath.jsonpath(data,"$..name")
print(name)
confirm = jsonpath.jsonpath(data,"$..confirm")
print(confirm)
map = Map("World Epidemic Distribution",width=1200,height=600)
map.add("COVID19",name,confirm,maptype='world',is_map_symbol_show=False)
map.render('World Epidemic Distribution.html')

After a successful run, an HTML file appears in the project folder.

You can right-click it and select Show in Explorer.

Opened in a browser, the result looks like this:

However, hovering over a country shows only its English name, with no confirmed count.

The map labels its regions in English, while the API returns country names in Chinese, so the data never matches a region. The fix is to map each English region name to the name used in the data.

Introduce a dictionary as follows, where each key is a region name on the map and each value must match the corresponding name in the API data exactly:

nameMap = {
        'Singapore Rep.':'Singapore',
        'Dominican Rep.':'Dominican',
        'Palestine':'Palestine',
        'Bahamas':'Bahamas',
        'Timor-Leste':'Timor-Leste',
        'Afghanistan':'Afghanistan',
        'Guinea-Bissau':'Guinea-Bissau',
        "Côte d'Ivoire":"Cote d'Ivoire",
        'Siachen Glacier':'Siyaquin Glacier',
        "Br. Indian Ocean Ter.":'British Indian Ocean Territory',
        'Angola':'Angola',
        'Albania':'Albania',
        'United Arab Emirates':'The United Arab Emirates',
        'Argentina':'Argentina',
        'Armenia':'Armenia',
        'French Southern and Antarctic Lands':'French Southern and Antarctic Territories',
        'Australia':'Australia',
        'Austria':'Austria',
        'Azerbaijan':'Azerbaijan',
        'Burundi':'burundi',
        'Belgium':'Belgium',
        'Benin':'Benin',
        'Burkina Faso':'burkina faso',
        'Bangladesh':"The People's Republic of Bangladesh",
        'Bulgaria':'Bulgaria',
        'The Bahamas':'Bahamas',
        'Bosnia and Herz.':'Bosnia and Herzegovina',
        'Belarus':'Belarus',
        'Belize':'Belize',
        'Bermuda':'Bermuda',
        'Bolivia':'bolivia',
        'Brazil':'Brazil',
        'Brunei':'Brunei',
        'Bhutan':'Bhutan',
        'Botswana':'botswana',
        'Central African Rep.':'Central African',
        'Canada':'Canada',
        'Switzerland':'Switzerland',
        'Chile':'Chile',
        'China':'China',
        'Ivory Coast':'Ivory Coast',
        'Cameroon':'Cameroon',
        'Dem. Rep. Congo':'Democratic Republic of the Congo',
        'Congo':'Congo',
        'Colombia':'Columbia',
        'Costa Rica':'Costa Rica',
        'Cuba':'Cuba',
        'N. Cyprus':'northern cyprus',
        'Cyprus':'Cyprus',
        'Czech Rep.':'Czech Republic',
        'Germany':'Germany',
        'Djibouti':'Djibouti',
        'Denmark':'Denmark',
        'Algeria':'Algeria',
        'Ecuador':'Ecuador',
        'Egypt':'Egypt',
        'Eritrea':'Eritrea',
        'Spain':'Spain',
        'Estonia':'Estonia',
        'Ethiopia':'Ethiopia',
        'Finland':'Finland',
        'Fiji':'Fiji',
        'Falkland Islands':'Falkland Islands',
        'France':'France',
        'Gabon':'Gabon',
        'United Kingdom':'Britain',
        'Georgia':'Georgia',
        'Ghana':'Ghana',
        'Guinea':'Guinea',
        'Gambia':'Gambia',
        'Guinea Bissau':'Guinea-Bissau',
        'Eq. Guinea':'Equatorial Guinea',
        'Greece':'Greece',
        'Greenland':'Greenland',
        'Guatemala':'Guatemala',
        'French Guiana':'French Guyana',
        'Guyana':'Guyana',
        'Honduras':'Honduras',
        'Croatia':'Croatia',
        'Haiti':'Haiti',
        'Hungary':'Hungary',
        'Indonesia':'Indonesia',
        'India':'India',
        'Ireland':'Ireland',
        'Iran':'Iran',
        'Iraq':'Iraq',
        'Iceland':'Iceland',
        'Israel':'Israel',
        'Italy':'Italy',
        'Jamaica':'Jamaica',
        'Jordan':'Jordan',
        'Japan':'Native Japan',
        'Kazakhstan':'Kazakhstan',
        'Kenya':'Kenya',
        'Kyrgyzstan':'Kyrgyzstan',
        'Cambodia':'Cambodia',
        'Korea':'The Republic of Korea',
        'Kosovo':'Kosovo',
        'Kuwait':'Kuwait',
        'Lao PDR':'Laos',
        'Lebanon':'Lebanon',
        'Liberia':'Liberia',
        'Libya':'Libya',
        'Sri Lanka':'Sri Lanka',
        'Lesotho':'Lesotho',
        'Lithuania':'Lithuania',
        'Luxembourg':'Luxembourg',
        'Latvia':'Latvia',
        'Morocco':'Morocco',
        'Moldova':'Moldova',
        'Madagascar':'Madagascar',
        'Mexico':'Mexico',
        'Macedonia':'Macedonia',
        'Mali':'Mali',
        'Myanmar':'Myanmar',
        'Montenegro':'Montenegro',
        'Mongolia':'Mongolia',
        'Mozambique':'Mozambique',
        'Mauritania':'Mauritania',
        'Malawi':'Malawi',
        'Malaysia':'Malaysia',
        'Namibia':'Namibia',
        'New Caledonia':'New Caledonia',
        'Niger':'Niger',
        'Nigeria':'Nigeria',
        'Nicaragua':'Nicaragua',
        'Netherlands':'Netherlands',
        'Norway':'Norway',
        'Nepal':'Nepal',
        'New Zealand':'New Zealand',
        'Oman':'Oman',
        'Pakistan':'Pakistan',
        'Panama':'Panama',
        'Peru':'Peru',
        'Philippines':'The Philippines',
        'Papua New Guinea':'papua new guinea',
        'Poland':'poland',
        'Puerto Rico':'Puerto Rico',
        'Dem. Rep. Korea':'Korea',
        'Portugal':'Portugal',
        'Paraguay':'Paraguay',
        'Qatar':'Qatar',
        'Romania':'Romania',
        'Russia':'Russia',
        'Rwanda':'Rwanda',
        'W. Sahara':'Western Sahara',
        'Saudi Arabia':'Saudi Arabia',
        'Sudan':'Sudan',
        'S. Sudan':'South Sudan',
        'Senegal':'Senegal',
        'Solomon Is.':'Solomon Islands',
        'Sierra Leone':'sierra leone',
        'El Salvador':'El Salvador',
        'Somaliland':'somaliland',
        'Somalia':'Somalia',
        'Serbia':'Serbia',
        'Suriname':'Suriname',
        'Slovakia':'Slovakia',
        'Slovenia':'Slovenia',
        'Sweden':'Sweden',
        'Swaziland':'Swaziland',
        'Syria':'Syria',
        'Chad':'Chad',
        'Togo':'Togo',
        'Thailand':'Thailand',
        'Tajikistan':'Tajikistan',
        'Turkmenistan':'Turkmenistan',
        'East Timor':'Timor-Leste',
        'Trinidad and Tobago':'Trinidad and Tobago',
        'Tunisia':'Tunisia',
        'Turkey':'Turkey',
        'Tanzania':'Tanzania',
        'Uganda':'Uganda',
        'Ukraine':'Ukraine',
        'Uruguay':'Uruguay',
        'United States':'U.S.A',
        'Uzbekistan':'Uzbekistan',
        'Venezuela':'Venezuela',
        'Vietnam':'Vietnam',
        'Vanuatu':'Vanuatu',
        'West Bank':'West Bank',
        'Yemen':'Yemen',
        'South Africa':'South Africa',
        'Zambia':'Zambia',
        'Zimbabwe':'zimbabwe'
    }
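
Before wiring such a dictionary into the chart, it can help to check which names in the data have no entry, since those countries would stay blank on the map. The following is a rough sketch; unmapped_names is a hypothetical helper, and the sample mapping and names are illustrative values only, not taken from the real response.

```python
def unmapped_names(data_names, name_map):
    """Return the data names that no map region is mapped to."""
    known = set(name_map.values())
    return [n for n in data_names if n not in known]

# Illustrative values only: keys are map region names, values are data names.
sample_map = {"United States": "美国", "Brazil": "巴西"}
sample_names = ["美国", "巴西", "钻石号邮轮"]

print(unmapped_names(sample_names, sample_map))  # ['钻石号邮轮']
```

Any name reported here either needs a new entry in the dictionary or has no region on the world map at all.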

Then pass name_map=nameMap to map.add, as follows:

import json
import requests
import jsonpath
from pyecharts import Map
# 1.Target Site
url='https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist'
# 2.Request Resources
resp=requests.get(url)
# 3.Extract data
# Type Conversion json-->dict
data=json.loads(resp.text)
name = jsonpath.jsonpath(data,"$..name")
print(name)
confirm = jsonpath.jsonpath(data,"$..confirm")
print(confirm)
nameMap = {
        'Singapore Rep.':'Singapore',
        'Dominican Rep.':'Dominican',
        'Palestine':'Palestine',
        'Bahamas':'Bahamas',
        'Timor-Leste':'Timor-Leste',
        'Afghanistan':'Afghanistan',
        'Guinea-Bissau':'Guinea-Bissau',
        "Côte d'Ivoire":"Cote d'Ivoire",
        'Siachen Glacier':'Siyaquin Glacier',
        "Br. Indian Ocean Ter.":'British Indian Ocean Territory',
        'Angola':'Angola',
        'Albania':'Albania',
        'United Arab Emirates':'The United Arab Emirates',
        'Argentina':'Argentina',
        'Armenia':'Armenia',
        'French Southern and Antarctic Lands':'French Southern and Antarctic Territories',
        'Australia':'Australia',
        'Austria':'Austria',
        'Azerbaijan':'Azerbaijan',
        'Burundi':'burundi',
        'Belgium':'Belgium',
        'Benin':'Benin',
        'Burkina Faso':'burkina faso',
        'Bangladesh':"The People's Republic of Bangladesh",
        'Bulgaria':'Bulgaria',
        'The Bahamas':'Bahamas',
        'Bosnia and Herz.':'Bosnia and Herzegovina',
        'Belarus':'Belarus',
        'Belize':'Belize',
        'Bermuda':'Bermuda',
        'Bolivia':'bolivia',
        'Brazil':'Brazil',
        'Brunei':'Brunei',
        'Bhutan':'Bhutan',
        'Botswana':'botswana',
        'Central African Rep.':'Central African',
        'Canada':'Canada',
        'Switzerland':'Switzerland',
        'Chile':'Chile',
        'China':'China',
        'Ivory Coast':'Ivory Coast',
        'Cameroon':'Cameroon',
        'Dem. Rep. Congo':'Democratic Republic of the Congo',
        'Congo':'Congo',
        'Colombia':'Columbia',
        'Costa Rica':'Costa Rica',
        'Cuba':'Cuba',
        'N. Cyprus':'northern cyprus',
        'Cyprus':'Cyprus',
        'Czech Rep.':'Czech Republic',
        'Germany':'Germany',
        'Djibouti':'Djibouti',
        'Denmark':'Denmark',
        'Algeria':'Algeria',
        'Ecuador':'Ecuador',
        'Egypt':'Egypt',
        'Eritrea':'Eritrea',
        'Spain':'Spain',
        'Estonia':'Estonia',
        'Ethiopia':'Ethiopia',
        'Finland':'Finland',
        'Fiji':'Fiji',
        'Falkland Islands':'Falkland Islands',
        'France':'France',
        'Gabon':'Gabon',
        'United Kingdom':'Britain',
        'Georgia':'Georgia',
        'Ghana':'Ghana',
        'Guinea':'Guinea',
        'Gambia':'Gambia',
        'Guinea Bissau':'Guinea-Bissau',
        'Eq. Guinea':'Equatorial Guinea',
        'Greece':'Greece',
        'Greenland':'Greenland',
        'Guatemala':'Guatemala',
        'French Guiana':'French Guyana',
        'Guyana':'Guyana',
        'Honduras':'Honduras',
        'Croatia':'Croatia',
        'Haiti':'Haiti',
        'Hungary':'Hungary',
        'Indonesia':'Indonesia',
        'India':'India',
        'Ireland':'Ireland',
        'Iran':'Iran',
        'Iraq':'Iraq',
        'Iceland':'Iceland',
        'Israel':'Israel',
        'Italy':'Italy',
        'Jamaica':'Jamaica',
        'Jordan':'Jordan',
        'Japan':'Native Japan',
        'Kazakhstan':'Kazakhstan',
        'Kenya':'Kenya',
        'Kyrgyzstan':'Kyrgyzstan',
        'Cambodia':'Cambodia',
        'Korea':'The Republic of Korea',
        'Kosovo':'Kosovo',
        'Kuwait':'Kuwait',
        'Lao PDR':'Laos',
        'Lebanon':'Lebanon',
        'Liberia':'Liberia',
        'Libya':'Libya',
        'Sri Lanka':'Sri Lanka',
        'Lesotho':'Lesotho',
        'Lithuania':'Lithuania',
        'Luxembourg':'Luxembourg',
        'Latvia':'Latvia',
        'Morocco':'Morocco',
        'Moldova':'Moldova',
        'Madagascar':'Madagascar',
        'Mexico':'Mexico',
        'Macedonia':'Macedonia',
        'Mali':'Mali',
        'Myanmar':'Myanmar',
        'Montenegro':'Montenegro',
        'Mongolia':'Mongolia',
        'Mozambique':'Mozambique',
        'Mauritania':'Mauritania',
        'Malawi':'Malawi',
        'Malaysia':'Malaysia',
        'Namibia':'Namibia',
        'New Caledonia':'New Caledonia',
        'Niger':'Niger',
        'Nigeria':'Nigeria',
        'Nicaragua':'Nicaragua',
        'Netherlands':'Netherlands',
        'Norway':'Norway',
        'Nepal':'Nepal',
        'New Zealand':'New Zealand',
        'Oman':'Oman',
        'Pakistan':'Pakistan',
        'Panama':'Panama',
        'Peru':'Peru',
        'Philippines':'The Philippines',
        'Papua New Guinea':'papua new guinea',
        'Poland':'poland',
        'Puerto Rico':'Puerto Rico',
        'Dem. Rep. Korea':'Korea',
        'Portugal':'Portugal',
        'Paraguay':'Paraguay',
        'Qatar':'Qatar',
        'Romania':'Romania',
        'Russia':'Russia',
        'Rwanda':'Rwanda',
        'W. Sahara':'Western Sahara',
        'Saudi Arabia':'Saudi Arabia',
        'Sudan':'Sudan',
        'S. Sudan':'South Sudan',
        'Senegal':'Senegal',
        'Solomon Is.':'Solomon Islands',
        'Sierra Leone':'sierra leone',
        'El Salvador':'El Salvador',
        'Somaliland':'somaliland',
        'Somalia':'Somalia',
        'Serbia':'Serbia',
        'Suriname':'Suriname',
        'Slovakia':'Slovakia',
        'Slovenia':'Slovenia',
        'Sweden':'Sweden',
        'Swaziland':'Swaziland',
        'Syria':'Syria',
        'Chad':'Chad',
        'Togo':'Togo',
        'Thailand':'Thailand',
        'Tajikistan':'Tajikistan',
        'Turkmenistan':'Turkmenistan',
        'East Timor':'Timor-Leste',
        'Trinidad and Tobago':'Trinidad and Tobago',
        'Tunisia':'Tunisia',
        'Turkey':'Turkey',
        'Tanzania':'Tanzania',
        'Uganda':'Uganda',
        'Ukraine':'Ukraine',
        'Uruguay':'Uruguay',
        'United States':'U.S.A',
        'Uzbekistan':'Uzbekistan',
        'Venezuela':'Venezuela',
        'Vietnam':'Vietnam',
        'Vanuatu':'Vanuatu',
        'West Bank':'West Bank',
        'Yemen':'Yemen',
        'South Africa':'South Africa',
        'Zambia':'Zambia',
        'Zimbabwe':'zimbabwe'
    }
map = Map("World Epidemic Distribution",width=1200,height=600)
map.add("COVID19",name,confirm,maptype='world',name_map=nameMap,is_map_symbol_show=False)
map.render('World Epidemic Distribution.html')

After the script runs successfully, refresh the HTML page.

Tags: JSON pip network Pycharm

Posted on Sun, 03 May 2020 23:21:34 -0700 by nalleyp23