Crawling pictures of girls with Python's requests and BeautifulSoup libraries

1: Python crawler - using the requests library to fetch web page content

A record of my crawler-writing process
I originally wanted to use jsonpath and json for this crawl, but after opening the Meituba website (meituba.com) the only approach I could think of was the requests library plus the BeautifulSoup library (dog head)

Without further ado, here is the code:

import requests
from bs4 import BeautifulSoup

def getHtml(url):
    cookie = {
    'UM_distinctid': '1727e776cfc7c6-0de60d097b0f15-c373667-144000-1727e776cfd9a6',
    'CNZZDATA1256622196': '1626070704-1591255197-null%7C1591255197',
    'Hm_lvt_1941ba27d34dec171a181ef89e310488': '1591259656',
    'Hm_lpvt_1941ba27d34dec171a181ef89e310488': '1591260180',
    }
    header = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36'}
    r = requests.get(url, headers=header, cookies=cookie)
    r.encoding = r.apparent_encoding
    return r.text

def getImage(html, result):
    soup = BeautifulSoup(html, 'html.parser')
    # find_all returns a list of the matching <img> tags (not a dictionary)
    targets = soup.find_all("img", {'style':"display: inline;"})
    for each in targets:
        result.append(each['src'])
    return result


def main():
    result = []
    num = 1   # running picture number, used as the file name
    url = 'http://www.meituba.com/xinggan/list{}.html'
    # Number of list pages to crawl, starting at page 81
    depth = 5
    for i in range(81, 81+depth):
        u = url.format(i)
        html = getHtml(u)
        getImage(html, result)
    #print(result)
    
    # Download every collected image; files are saved as 1.jpg, 2.jpg, ...
    for i in result:
        a = requests.get(i)
        with open(r'D:\pythonproject\A small project of your own or a small project of your own\A picture of a girl\{}.jpg'.format(num), 'wb') as f:
            f.write(a.content)
            num += 1

if __name__ == '__main__':
    main() 
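
The page loop in main() can be checked without touching the network. Below is a minimal sketch that is not part of the script above: page_urls and absolute_links are hypothetical helper names I made up for illustration. It also assumes that some src values on the page might be relative links, which I have not verified for this site; if they are all absolute, the second helper simply returns them unchanged.

```python
from urllib.parse import urljoin


def page_urls(template, first_page, depth):
    """Return the list-page URLs that the loop in main() would visit."""
    return [template.format(i) for i in range(first_page, first_page + depth)]


def absolute_links(page_url, srcs):
    """Resolve each src against the page it came from.

    Absolute URLs pass through unchanged; relative ones are completed.
    """
    return [urljoin(page_url, s) for s in srcs]


template = 'http://www.meituba.com/xinggan/list{}.html'
urls = page_urls(template, 81, 5)
print(urls[0])
print(urls[-1])

# The src values below are invented examples, not real paths from the site.
links = absolute_links(urls[0], ['/uploads/a.jpg', 'http://example.com/b.jpg'])
print(links)
```

Printing the first and last URL is a quick way to confirm which pages a given depth covers before starting the actual downloads.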

Third-party libraries used: requests and beautifulsoup4 (imported as bs4)

PS: To run the code, change the path where the pictures are saved to a folder on your own computer and install the required libraries (pip install requests beautifulsoup4)
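
One possible way to make the save location easier to change is to build the file path with pathlib instead of hard-coding it in the open() call. This is only a sketch under my own assumptions: image_path is a hypothetical helper, and the temporary folder stands in for whatever directory you actually want to use.

```python
import tempfile
from pathlib import Path


def image_path(save_dir, num):
    """Build the file path for picture number `num`, creating the folder if needed."""
    folder = Path(save_dir)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / '{}.jpg'.format(num)


# A temporary folder is used here so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp:
    target = image_path(Path(tmp) / 'pictures', 1)
    # In the crawler this would be the downloaded bytes (a.content).
    target.write_bytes(b'fake image bytes')
    print(target.name, target.exists())
```

In the script, the open(...) call could then take image_path(your_folder, num), so only one variable needs editing when the code moves to another computer.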
PS: I'm a beginner and this is my first article, so the code is rough. I just want to keep a record of my crawlers and watch myself improve! Keep going, stranger!

Tags: Python JSON Windows encoding

Posted on Thu, 04 Jun 2020 08:39:09 -0700 by Jay87