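【Introduction】
Puppeteer is a Node library that provides a high-level API for controlling Chromium or Chrome over the DevTools Protocol. Puppeteer runs in headless mode by default, but it can run in "headful" mode (with a visible browser window) by passing headless: false in the launch options.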
GitHub repository: https://github.com/GoogleChrome/puppeteer
Official documentation: https://pptr.dev/
Chinese translation of the official documentation: https://zhaoqize.github.io/puppeteer-api-zh_CN/#?product=Puppeteer
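The headless switch mentioned above is an option passed to puppeteer.launch() rather than a separate settings file. Here is a minimal sketch, assuming a hypothetical HEADFUL environment variable (not part of Puppeteer) that selects the mode; buildLaunchOptions is an illustrative helper name, not a Puppeteer API:

```javascript
// Hypothetical helper: builds the options object for puppeteer.launch().
// The HEADFUL environment variable is an assumption made for this sketch.
function buildLaunchOptions(env = process.env) {
  const headful = env.HEADFUL === '1';
  return {
    headless: !headful,      // false opens a visible browser window
    args: ['--no-sandbox'],  // same flag the script in this post uses
  };
}

// Usage (requires puppeteer to be installed):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(buildLaunchOptions());
console.log(buildLaunchOptions({ HEADFUL: '1' }));
```

Keeping the options in a plain function makes it easy to switch modes from the command line, for example HEADFUL=1 node 51cto.js, without editing the script.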
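【Preparation】
Download and install Node.js; the LTS release is fine: https://nodejs.org
Install puppeteer (the --registry flag points npm at a mirror; omit it to use the default registry):
npm install puppeteer --registry=https://registry.npm.taobao.org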
【51cto.js】
/*!
 * walker@2019-07-13: test puppeteer; save a screenshot and the HTML of each page
 */
'use strict';

const puppeteer = require('puppeteer');
const fs = require('fs');

// Return a random integer between min and max, both ends included
function GetRandInt(min, max) {
    min = Math.ceil(min);
    max = Math.floor(max);
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Sleep for ms milliseconds with a plain timer
// (page.waitFor was deprecated in later Puppeteer releases)
function sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Download one article
async function DownOneArticle(page, rawid) {
    console.log('DownOneArticle %s ...', rawid);
    const url = 'https://blog.51cto.com/walkerqt/' + rawid;
    console.log('goto %s ...', url);
    await page.goto(url, {
        // timeout: 90 * 1000,
        referer: 'https://blog.51cto.com/walkerqt'
    });
    const selector = 'div.artical-copyright';
    console.log('waitForSelector: %s ...', selector);
    await page.waitForSelector(selector, {      // wait for a CSS selector
        timeout: 10 * 1000
    });
    await sleep(GetRandInt(2, 5) * 1000);               // sleep a few seconds at random
    await page.screenshot({ path: rawid + '.png' });    // save the screenshot
    const html = await page.content();
    fs.writeFileSync(rawid + '.html', html);            // save the page HTML
}

(async () => {
    const browser = await puppeteer.launch({    // launch the browser
        headless: false,    // false = show the browser window; true = headless
        args: [
            '--no-sandbox',
            // '--proxy-server=http://192.168.30.3:8080'   // proxy
        ]
    });
    const page = await browser.newPage();
    const url = 'https://blog.51cto.com/walkerqt';      // blog home page
    console.log('goto %s ...', url);
    await page.goto(url);
    const xpath = '//*[@id="Tab"]/div[@class="artical-tit"]';
    console.log('waitForXPath: %s ...', xpath);
    // waitForXPath is the call available in the 2019 Puppeteer release
    // this script was written for; newer releases may need a different call
    await page.waitForXPath(xpath, {            // wait for an XPath expression
        timeout: 10 * 1000
    });
    await sleep(GetRandInt(2, 5) * 1000);       // sleep a few seconds at random
    const rawidArray = ['2419918', '2415142', '2413401', '2396430'];
    for (const rawid of rawidArray) {
        try {
            await DownOneArticle(page, rawid);
        } catch (error) {
            console.log('* stack:\n %s', error.stack);
        }
    }
    console.log('Good boy! Game over!');
    await browser.close();      // close the browser
})();
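The script logs a stack trace when one article fails and then moves on to the next one. If a summary of which ids failed is also useful, the loop can be factored out as in the sketch below. This is one possible arrangement, not the author's code: processSequentially is an illustrative name, and the stub worker stands in for a call such as DownOneArticle(page, rawid).

```javascript
// Runs `worker` on each id in order and records which ones failed.
// `worker` stands in for a function such as DownOneArticle(page, rawid).
async function processSequentially(ids, worker) {
  const succeeded = [];
  const failed = [];
  for (const id of ids) {
    try {
      await worker(id);
      succeeded.push(id);
    } catch (error) {
      // Keep going after a failure, as the original loop does.
      failed.push({ id, message: error.message });
    }
  }
  return { succeeded, failed };
}

// Demo with a stub worker that rejects one id (no browser needed).
processSequentially(['2419918', '2415142'], async (id) => {
  if (id === '2415142') throw new Error('timeout');
}).then((result) => console.log(result));
```

In the real script the call would look like processSequentially(rawidArray, (rawid) => DownOneArticle(page, rawid)), and the failed list could be printed before browser.close().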
【Run】
Run the script:
node 51cto.js
【Related reading】
MDN on generating random numbers in JavaScript: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/Math/random
Node.js Tips
*** walker ***