
How to fetch the full source of a web page in Python

Views: 10 | Date: 2022-07-15 13:20:58

1. Code to fetch an entire page in Python:

import requests

# Download the page and decode the response body as UTF-8.
res = requests.get('https://blog.csdn.net/yirexiao/article/details/79092355')
res.encoding = 'utf-8'
print(res.text)  # the full HTML source of the page

2. Result: the script prints the page's full HTML source to the console.
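Hard-coding 'utf-8' can garble pages served in another encoding. The following is a minimal sketch of the same fetch that reads the charset from the response's Content-Type header instead. It uses the standard library's urllib.request rather than requests, and the helper name fetch_html, the 10-second timeout, and the default_encoding fallback are all assumptions made for illustration, not part of the original code.

```python
import urllib.request


def fetch_html(url, timeout=10, default_encoding="utf-8"):
    """Return the decoded body of `url` (hypothetical helper for illustration)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        raw = resp.read()
        # Use the charset declared in the Content-Type header, if any;
        # otherwise fall back to the caller-supplied default.
        charset = resp.headers.get_content_charset() or default_encoding
    # errors="replace" keeps going if a few bytes do not match the charset.
    return raw.decode(charset, errors="replace")


# Example call (needs network access):
# print(fetch_html("https://blog.csdn.net/yirexiao/article/details/79092355"))
```

If the server declares no charset and the page is not UTF-8, pass the expected encoding explicitly, for example default_encoding="gbk".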

Extended example:

import time
import urllib.request

from bs4 import BeautifulSoup

websiteurls = {}  # links to skip; empty in this example


def scanpage(url):
    """Collect the links on `url` that contain the site URL and report each one's HTTP status."""
    start = time.time()
    n = 0
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, 'html.parser')
    # Keep each distinct href that contains the site URL.
    Upageurls = {}
    for link in soup.find_all('a', href=True):
        href = link.get('href')
        if url in href and href not in Upageurls and href not in websiteurls:
            Upageurls[href] = 0
    # Request every collected link and print its index, URL, status code, and timing.
    for link in Upageurls:
        t2 = time.time()
        try:
            Upageurls[link] = urllib.request.urlopen(link).getcode()
        except Exception:
            print('connect failed')
        else:
            print(n, link, Upageurls[link])
            print(time.time() - t2)  # seconds taken by this request
        n += 1
    print('total is ' + repr(n) + ' links')
    print(time.time() - start)  # total elapsed seconds


scanpage('http://news.163.com/')
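The substring check in the example above skips relative links such as /sports/ and also accepts any external link that merely contains the site URL in its query string. One possible way to tighten it is sketched below: resolve every href against the page URL with urljoin, then compare host names. To stay self-contained the sketch uses the standard library's HTMLParser instead of BeautifulSoup, and the names _LinkCollector and same_site_links are hypothetical, introduced only for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, base_url):
    """Return unique absolute links in `html` on the same host as `base_url`."""
    parser = _LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # turns relative hrefs into full URLs
        # Keep the link only if it points at the same host and is not a repeat.
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

The returned list could then be passed to a status-checking loop like the one in scanpage.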
