Python Web Crawler

Size: 1 KB

File type: .py

Published: 2021-08-21
Language: Python
Tags: python, WebSpider, web crawler

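Resource Description

A web crawler written in Python that collects all the links found within a website. The collected links can be used for data analysis of how popular a site is.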
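As one way to picture "collecting all the links within a website", here is a minimal breadth-first sketch. It is not the downloadable file's actual code: the names `crawl`, `fetch`, and `max_pages` are hypothetical, the link pattern is borrowed from the code preview on this page, and counting how often each URL is linked to is only an assumed, rough stand-in for "popularity". The page fetcher is passed in as a function so the loop can be tried without network access.

```python
import re
from collections import deque

# Assumed pattern: absolute links on the target site, as in the code preview
LINK_RE = re.compile(r'href="(http://www\.cnpythoner\.com.+?)"')


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns {url: number of times it was linked to}."""
    link_counts = {}
    visited = set()
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            contents = fetch(url)
        except Exception:
            continue  # skip pages that cannot be fetched or decoded
        for link in LINK_RE.findall(contents):
            link_counts[link] = link_counts.get(link, 0) + 1
            if link not in visited:
                queue.append(link)
    return link_counts


if __name__ == "__main__":
    # Tiny in-memory "site" standing in for real HTTP responses
    pages = {
        "http://www.cnpythoner.com/": '<a href="http://www.cnpythoner.com/a.html">a</a>',
        "http://www.cnpythoner.com/a.html": '<a href="http://www.cnpythoner.com/">home</a>',
    }
    print(crawl("http://www.cnpythoner.com/", pages.__getitem__))
```

In real use, `fetch` could be any function that downloads a URL and returns its decoded HTML; the `max_pages` cap is there so a large site cannot keep the loop running indefinitely.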

Code Snippet and File Information

# encoding: utf-8
# Function: acquire the links on a web page

import urllib.request
import re

# Regex: absolute links under http://www.cnpythoner.com inside href attributes
r = re.compile(r'href="(http://www\.cnpythoner\.com.+?)"')


def get_urls_and_save_from_contents(url):  # Open the current page and filter the URLs that match
    try:
        req = urllib.request.Request(url)
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586')
        response = urllib.request.urlopen(req)
        contents = response.read().decode('utf-8')
        g = []
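The preview stops right after the page has been read and decoded. A minimal sketch of what the extraction step might look like, assuming the matched links are simply collected into a list with duplicates dropped; the function name `extract_links` and the sample HTML are hypothetical and do not come from the original file:

```python
import re

# Same pattern as the listing: absolute links on www.cnpythoner.com
LINK_RE = re.compile(r'href="(http://www\.cnpythoner\.com.+?)"')


def extract_links(contents):
    """Return matching links in order of first appearance, without duplicates."""
    seen = set()
    links = []
    for link in LINK_RE.findall(contents):
        if link not in seen:
            seen.add(link)
            links.append(link)
    return links


if __name__ == "__main__":
    sample = (
        '<a href="http://www.cnpythoner.com/post/1.html">one</a>'
        '<a href="http://example.com/">external</a>'
        '<a href="http://www.cnpythoner.com/post/1.html">one again</a>'
    )
    print(extract_links(sample))
```

Because the pattern requires at least one character after `.com`, a bare `href="http://www.cnpythoner.com"` with no trailing slash is not matched cleanly; an HTML parser such as the standard library's `html.parser` would be a more robust choice than a regex for real pages.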
