Crawling simple web pages with Python (getting started with Python crawlers)

Today, let's take a look at using Python to crawl some simple web pages.

Tools used: IDLE (Python 3.6 64-bit) and the third-party requests library, which every example below imports.

1. Crawl a JD.com (Jingdong) product page

I'm going to crawl the information on a JD.com product page. The code is as follows:

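import requests

url = "https://item.jd.com/6957643.html"
try:
    r = requests.get(url, timeout=10)
    r.raise_for_status()  # raise an HTTPError for 4xx/5xx responses
    r.encoding = r.apparent_encoding  # decode with the encoding guessed from the body
    print(r.text[:1000])  # show only the first 1000 characters
except requests.RequestException:
    print("Crawl failed")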
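
The line r.encoding = r.apparent_encoding matters because requests falls back to ISO-8859-1 when a text response does not declare a charset, while apparent_encoding is a guess made from the response body itself. The sketch below uses only the standard library to show what goes wrong when bytes are decoded with the wrong codec. It assumes, purely for illustration, that the server sent GBK-encoded text; the sample string is made up and is not taken from the page above.

```python
# Bytes as a server might send them: Chinese text encoded as GBK
# (an assumed encoding, chosen only for this illustration).
raw = "京东商城".encode("gbk")

# Decoding with the wrong codec does not raise an error; it silently
# produces unreadable characters, one per byte.
wrong = raw.decode("iso-8859-1")

# Decoding with the codec that matches the bytes recovers the text.
right = raw.decode("gbk")

print(repr(wrong))
print(right)
```

Setting r.encoding before reading r.text plays the role of choosing the second decode call instead of the first.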

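2. Crawl an Amazon product page

Next, I'm going to crawl an Amazon product page. Some sites, Amazon among them, may refuse requests that identify themselves as coming from a script, so this time the request carries a browser-like user-agent header. The code is as follows: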

import requests

url = "https://www.amazon.cn/gp/product/B00W2T39C8/ref=cn_ags_s9_asin?pf_rd_p=33e63d50-addd-4d44-a917-c9479c457e1a&pf_rd_s=merchandised-search-3&pf_rd_t=101&pf_rd_i=1403206071&pf_rd_m=A1AJ19PSB66TGU&pf_rd_r=FQQGZ7T42BF03V117HRD&pf_rd_r=FQQGZ7T42BF03V117HRD&pf_rd_p=33e63d50-addd-4d44-a917-c9479c457e1a&ref=cn_ags_s9_asin_1403206071_merchandised-search-3"
try:
    kv = {'user-agent': 'Mozilla/5.0'}  # present the request as coming from a browser
    r = requests.get(url, headers=kv, timeout=10)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[1000:2000])  # show characters 1000 to 1999 of the page
except requests.RequestException:
    print("Crawl failed")
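
Sections 1 and 2 follow the same pattern: send the request, check the status, fix the encoding, read the text. A minimal sketch of that pattern as a reusable function could look like the following. The name fetch_text and its parameters are hypothetical, chosen for this example; they are not part of requests.

```python
import requests


def fetch_text(url, headers=None, timeout=10):
    """Return the decoded page text, or None if the request fails.

    fetch_text is a hypothetical helper name, not part of requests.
    """
    try:
        r = requests.get(url, headers=headers, timeout=timeout)
        r.raise_for_status()              # treat 4xx/5xx responses as failures
        r.encoding = r.apparent_encoding  # decode with the guessed encoding
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, bad URLs and HTTP errors.
        return None
```

With this helper, the JD.com example becomes fetch_text(url), and the Amazon example becomes fetch_text(url, headers={'user-agent': 'Mozilla/5.0'}). A return value of None means the crawl failed.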
    

3. Submit a search keyword to Baidu or 360 and print the length of the results page that comes back. The code is as follows:

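import requests

keyword = "Python"
try:
    # For 360 search, use the key 'q' instead of 'wd'
    # and the address "http://www.so.com/s" instead of Baidu's
    kv = {'wd': keyword}
    r = requests.get("http://www.baidu.com/s", params=kv, timeout=10)
    print(r.request.url)  # the full URL that was actually requested
    r.raise_for_status()
    print(len(r.text))  # length of the returned results page
except requests.RequestException:
    print("Crawl failed")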
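
To see how the params dictionary ends up in the URL without sending anything over the network, a request can be prepared but not sent. The sketch below does that for both search engines, following the comment in the code above (wd for Baidu, q for 360). The ENGINES table and the build_search_url function are hypothetical names introduced for this example.

```python
import requests

# Search address and query-parameter name for each engine
# (wd for Baidu, q for 360). ENGINES is a made-up name for this sketch.
ENGINES = {
    "baidu": ("http://www.baidu.com/s", "wd"),
    "360": ("http://www.so.com/s", "q"),
}


def build_search_url(engine, keyword):
    """Return the URL requests would send, without contacting the server."""
    base, key = ENGINES[engine]
    prepared = requests.Request("GET", base, params={key: keyword}).prepare()
    return prepared.url


print(build_search_url("baidu", "Python"))  # http://www.baidu.com/s?wd=Python
print(build_search_url("360", "Python"))    # http://www.so.com/s?q=Python
```

The printed URL has the same form as the one r.request.url reports after a real request.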

4. Crawl an image and save it to a specified location (E://hh/abc.jpg). The code is as follows:

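import requests
import os

url = "https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec=1533128040259&di=601acd33bcb188bfeb41cb50bc51ed41&imgtype=0&src=http%3A%2F%2Fs1.sinaimg.cn%2Fmw690%2F006LDoUHzy7auXElZGE40%26690"
path = "E://hh/abc.jpg"
try:
    # Create the target folder if it does not exist yet
    os.makedirs(os.path.dirname(path), exist_ok=True)
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    # r.content is the raw bytes of the response; the with block closes the file
    with open(path, 'wb') as f:
        f.write(r.content)
    print("File saved")
except (requests.RequestException, OSError):
    print("Crawl failed")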
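
The code above hard-codes the file name. One possible variation, sketched below, derives the name from the last segment of the URL path and skips the download when the file is already on disk. The names local_path_for, save_image and default_name are hypothetical and were chosen for this example.

```python
import os
from urllib.parse import urlparse

import requests


def local_path_for(url, root, default_name="image.jpg"):
    """Build a save path under root from the last segment of the URL path."""
    name = os.path.basename(urlparse(url).path)
    # Fall back to default_name when the URL path ends with a slash.
    return os.path.join(root, name or default_name)


def save_image(url, root):
    """Download url into root, skipping the download if the file exists."""
    path = local_path_for(url, root)
    if os.path.exists(path):
        return path
    os.makedirs(root, exist_ok=True)
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    with open(path, "wb") as f:
        f.write(r.content)  # raw bytes of the image
    return path
```

This works best for URLs whose path ends in a real file name such as abc.jpg. For the Baidu image URL used above, the path ends in timg, so the saved file would have no extension and a fixed name like the one in the original code is the simpler choice.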

How did it go? Did you learn something?

Tags: Python encoding

Posted on Wed, 15 Jan 2020 08:13:43 -0800 by mikeT