• Size: 6.9MB
    File type: .rar
    Coins: 1
    Downloads: 0
    Release date: 2023-08-24
  • Language: Python
  • Tags: python  crawler

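Resource description

A simple crawler written in Python. Starting from the Baidu Baike entry for Python, it fetches the content behind every link on that page and presents the results as a web page.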
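
The two steps named in the description, collecting links from a page and writing the results out as a web page, can be sketched as follows. This is a minimal illustration and not the contents of the archive's html_parser.py or html_output.py, which are not shown here. The names `LinkCollector`, `extract_links`, and `render_html` are hypothetical, and the `/item/` path prefix for entry links is an assumption about how Baidu Baike URLs are laid out.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs of <a href> links whose path starts with a prefix."""

    def __init__(self, base_url, path_prefix="/item/"):
        super().__init__()
        self.base_url = base_url
        self.path_prefix = path_prefix  # assumed pattern for entry links
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.startswith(self.path_prefix):
            url = urljoin(self.base_url, href)  # make the link absolute
            if url not in self.links:           # skip duplicates, keep order
                self.links.append(url)


def extract_links(base_url, html_text):
    """Return the de-duplicated entry links found in html_text."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links


def render_html(rows):
    """Render (url, title) pairs as a minimal HTML table."""
    body = "".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(escape(url), escape(title))
        for url, title in rows
    )
    return "<html><body><table>{}</table></body></html>".format(body)
```

The standard-library `html.parser` keeps the sketch dependency-free; a real crawler might prefer a third-party parser for messy markup.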

Code snippet and file information

import urllib.request


class HtmlDownloader(object):
    def download(self, new_url):
        # Nothing to fetch without a URL.
        if new_url is None:
            return None
        response = urllib.request.urlopen(new_url)
        # Check whether the request succeeded.
        if response.getcode() != 200:
            return None
        # Return the raw response body as bytes.
        return response.read()
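
The downloader above has no timeout and lets network errors propagate to the caller. A slightly more defensive variant might look like the sketch below. It is not part of the original archive: the function-style `download`, the `timeout` default, the `User-Agent` value, and the injectable `opener` parameter (added so the function can be exercised without network access) are all assumptions made for illustration.

```python
import urllib.request


def download(url, timeout=10, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the fetch fails."""
    if url is None:
        return None
    try:
        # Some sites reject the default urllib User-Agent; this value is a placeholder.
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with opener(req, timeout=timeout) as response:
            if response.getcode() != 200:
                return None
            return response.read()
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and socket timeouts;
        # ValueError covers malformed URLs such as a missing scheme.
        return None
```

Returning `None` on every failure matches the original snippet's convention, so calling code only needs a single check.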


Attribute        Size  Date       Time   Name
----------- ---------  ---------- -----  ----
File              306  2018-05-07 19:24  pypachong\.idea\misc.xml
File              270  2018-05-06 10:34  pypachong\.idea\modules.xml
File              517  2018-05-07 19:24  pypachong\.idea\pypachong.iml
File              180  2018-05-07 19:10  pypachong\.idea\vcs.xml
File            25324  2018-05-19 14:46  pypachong\.idea\workspace.xml
File              354  2018-05-07 19:29  pypachong\baidu_baike\html_downloader.py
File              735  2018-05-07 21:28  pypachong\baidu_baike\html_output.py
File             1263  2018-05-07 21:20  pypachong\baidu_baike\html_parser.py
File               82  2018-05-09 10:54  pypachong\baidu_baike\output.html
File             1579  2018-05-07 21:22  pypachong\baidu_baike\spider_main.py
File              668  2018-05-06 18:45  pypachong\baidu_baike\url_manager.py
File                0  2018-05-06 10:38  pypachong\baidu_baike\__init__.py
File              570  2018-05-16 10:48  pypachong\baidu_baike\__pycache__\html_downloader.cpython-36.pyc
File             1029  2018-05-16 10:48  pypachong\baidu_baike\__pycache__\html_output.cpython-36.pyc
File             1322  2018-05-16 10:48  pypachong\baidu_baike\__pycache__\html_parser.cpython-36.pyc
File             1165  2018-05-07 19:24  pypachong\baidu_baike\__pycache__\url_manager.cpython-36.pyc
File              119  2018-05-07 19:24  pypachong\baidu_baike\__pycache__\__init__.cpython-36.pyc
File               54  2018-05-06 10:34  pypachong\venv\Lib\site-packages\easy-install.pth
File                1  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\dependency_links.txt
File               68  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\entry_points.txt
File                1  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\not-zip-safe
File             2639  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\PKG-INFO
File               64  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\requires.txt
File            10147  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\SOURCES.txt
File                4  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\EGG-INFO\top_level.txt
File            11910  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\pip\basecommand.py
File            10465  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\pip\baseparser.py
File            16474  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\pip\cmdoptions.py
File             1382  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\pip\commands\check.py
File             2453  2018-05-06 10:34  pypachong\venv\Lib\site-packages\pip-9.0.1-py3.6.egg\pip\commands\completion.py

............ (information for 342 more files omitted)
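
The listing names url_manager.py and spider_main.py but does not show their contents. A crawler split into modules this way typically keeps one set of pending URLs and one set of visited URLs, and loops until the pending set is empty or a page limit is reached. The sketch below illustrates that pattern only. `UrlManager`, its method names, `crawl`, and the `fetch_links` callback are hypothetical and are not taken from the archive.

```python
class UrlManager:
    """Track which URLs are still pending and which were already handed out."""

    def __init__(self):
        self.new_urls = set()  # discovered but not yet crawled
        self.old_urls = set()  # already handed out for crawling

    def add_new_url(self, url):
        # Ignore empty values and anything seen before.
        if url and url not in self.new_urls and url not in self.old_urls:
            self.new_urls.add(url)

    def add_new_urls(self, urls):
        for url in urls or ():
            self.add_new_url(url)

    def has_new_url(self):
        return bool(self.new_urls)

    def get_new_url(self):
        # Move one pending URL to the visited set and return it.
        url = self.new_urls.pop()
        self.old_urls.add(url)
        return url


def crawl(root_url, fetch_links, max_pages=10):
    """Visit pages reachable from root_url, at most max_pages of them.

    fetch_links(url) is a caller-supplied function returning the links
    found on that page (for example, a download step plus a parse step).
    """
    manager = UrlManager()
    manager.add_new_url(root_url)
    visited = []
    while manager.has_new_url() and len(visited) < max_pages:
        url = manager.get_new_url()
        visited.append(url)
        manager.add_new_urls(fetch_links(url))
    return visited
```

Because the pending URLs live in a set, the visiting order is not defined; a queue would be the usual replacement if breadth-first order matters.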
