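This article covers how to scrape JD.com (京東) product reviews with Python. Many people run into trouble with this in practice, so the walkthrough below shows one way to handle it. The script requests review pages from JD's comment endpoint, strips the JSONP wrapper from each response, and writes selected fields of every review to an Excel file using xlwt.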
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
import json
import random
import time

import requests
import xlwt

# Spreadsheet setup: a cell style that uses the SimSun font for Chinese text.
style = xlwt.XFStyle()
font = xlwt.Font()
font.name = 'SimSun'
style.font = font

# Create a workbook and add a sheet.
w = xlwt.Workbook(encoding='utf-8')
ws = w.add_sheet('sheet 1', cell_overwrite_ok=True)

# Comment fields to export, one column each.
FIELDS = ['content', 'userClientShow', 'creationTime', 'userLevelName',
          'productColor', 'userLevelId', 'score', 'referenceName',
          'referenceTime', 'isMobile', 'nickname']

# Write the header row.
for col, name in enumerate(FIELDS):
    ws.write(0, col, name, style)

# Next sheet row to write to.
row = 1


def write_json_to_xls(dat):
    """Write one page of comments (a decoded JSON object) to the sheet."""
    global row
    for comment in dat['comments']:
        for col, name in enumerate(FIELDS):
            # .get() leaves the cell blank if a comment lacks the field.
            ws.write(row, col, comment.get(name), style)
        row += 1


# Request headers: the Referer is the product page the comments belong to.
headers = {
    "Referer": "https://item.jd.hk/2990360.html",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
}

# Earlier URL variants, kept for reference:
# url = 'https://club.jd.com/comment/productPageComments.action?productId=1475512465&score=0&sortType=5&page=%d&pageSize=100&isShadowSku=0&fold=' % i
# url = 'https://sclub.jd.com/comment/productPageComments.action?callback=jQuery2663266&productId=2990360&score=2&sortType=5&page=%d&pageSize=10&pin=null&_=1563330030798' % i
url = 'https://sclub.jd.com/comment/productPageComments.action'
auto_jquery = 8809536

# Fetch the comment pages in a loop.
for i in range(1, 10 + 1):
    try:
        # Mimic the page's JSONP call: an incrementing callback name and a
        # millisecond-style timestamp (seconds plus three random digits).
        auto_jquery += 1
        jquery = 'jQuery%d' % auto_jquery
        true_string = '%d%d' % (int(time.time()), random.randint(100, 999))
        params = {'callback': jquery, 'productId': '2990360', 'sortType': '5',
                  'page': i, 'pageSize': '10', 'pin': 'null',
                  '_': true_string, 'score': '2'}
        json_req = requests.get(url, params=params, headers=headers)
        print(json_req.url)
        # Strip the JSONP wrapper "jQueryNNN(...);". Slice between the first
        # "(" and the last ")" because comment text may contain parentheses.
        text = json_req.text
        json_flag = json.loads(text[text.index('(') + 1:text.rindex(')')])
        write_json_to_xls(json_flag)
        print('Wrote page %d' % i)
    except Exception as e:
        print('Failed to fetch data:', e)
    time.sleep(0.5)

# Save the workbook to disk.
w.save('result.xls')
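Because the request passes a `callback` parameter, the response body arrives wrapped in a function call such as `jQuery8809537({...});` rather than as bare JSON. The following is a minimal sketch of unwrapping that kind of body so that parentheses inside the review text do not break the parse. `strip_jsonp` and the sample payload are hypothetical names and data made up for illustration; they are not part of the original script or of JD's API.

```python
import json
import re

# Optional callback name, then "(...)" and an optional trailing ";".
# The greedy (.*) backtracks to the LAST ")" in the body, so parentheses
# inside the JSON payload stay intact.
_JSONP_RE = re.compile(r'^\s*[\w$.]*\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def strip_jsonp(text):
    """Decode a JSONP response body (hypothetical helper).

    Accepts either `callback({...});` or plain JSON and returns the
    decoded Python object.
    """
    match = _JSONP_RE.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Made-up sample body in the same shape as the wrapped response.
sample = 'jQuery8809537({"comments": [{"content": "good (really)", "score": 2}]});'
data = strip_jsonp(sample)
print(data['comments'][0]['content'])
```

In the scraper this would take the place of the manual string handling, for example `write_json_to_xls(strip_jsonp(json_req.text))`. Falling back to plain JSON keeps the helper usable if the `callback` parameter is ever dropped from the request.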