使用python3怎么爬取數據至mysql

發布時間：2021-05-01 09:27:26 來源：億速云閱讀：160 作者：Leah 欄目：開發技術

使用python3怎么爬取數據至mysql？針對這個問題，這篇文章詳細介紹了相對應的分析和解答，希望可以幫助更多想解決這個問題的小伙伴找到更簡單易行的方法。

Python的優點有哪些

1、簡單易用，與C/C++、Java、C# 等傳統語言相比，Python對代碼格式的要求沒有那么嚴格；2、Python屬于開源的，所有人都可以看到源代碼，并且可以被移植在許多平臺上使用；3、Python面向對象，能夠支持面向過程編程,也支持面向對象編程；4、Python是一種解釋性語言，Python寫的程序不需要編譯成二進制代碼，可以直接從源代碼運行程序；5、Python功能強大，擁有的模塊眾多，基本能夠實現所有的常見功能。

#!/usr/local/bin/python3.5 
# -*- coding:UTF-8 -*- 
from urllib.request import urlopen 
from bs4 import BeautifulSoup 
import re 
import datetime 
import random 
import pymysql 
 
connect = pymysql.connect(host='192.168.10.142', unix_socket='/tmp/mysql.sock', user='root', passwd='1234', db='scraping', charset='utf8') 
cursor = connect.cursor() 
cursor.execute('USE scraping') 
 
random.seed(datetime.datetime.now()) 
 
 
def store(title, content): 
 
  execute = cursor.execute("select * from pages WHERE `title` = %s", title) 
  if execute <= 0: 
    cursor.execute("insert into pages(`title`, `content`) VALUES(%s, %s)", (title, content)) 
    cursor.connection.commit() 
  else: 
    print('This content is already exist.') 
 
 
def get_links(acticle_url): 
  html = urlopen('http://en.wikipedia.org' + acticle_url) 
  soup = BeautifulSoup(html, 'html.parser') 
  title = soup.h2.get_text() 
  content = soup.find('div', {'id': 'mw-content-text'}).find('p').get_text() 
  store(title, content) 
  return soup.find('div', {'id': 'bodyContent'}).findAll('a', href=re.compile("^(/wiki/)(.)*$")) 
 
links = get_links('') 
 
try: 
  while len(links) > 0: 
    newActicle = links[random.randint(0, len(links) - 1)].attrs['href'] 
    links = get_links(newActicle) 
    print(links) 
finally: 
  cursor.close() 
  connect.close()

關于使用python3怎么爬取數據至mysql問題的解答就分享到這里了，希望以上內容可以對大家有一定的幫助，如果你還有很多疑惑沒有解開，可以關注億速云行業資訊頻道了解更多相關知識。

向AI問一下細節

91超碰碰碰碰久久久久久综合_超碰av人澡人澡人澡人澡人掠_国产黄大片在线观看画质优化_txt小说免费全本

使用python3怎么爬取數據至mysql

Python的優點有哪些

猜你喜歡

91超碰碰碰碰久久久久久综合_超碰av人澡人澡人澡人澡人掠_国产黄大片在线观看画质优化_txt小说免费全本

使用python3怎么爬取數據至mysql

Python的優點有哪些

猜你喜歡

最新資訊

相關推薦

相關標簽