干it的小张

February 3, 2024

Summary: import requests, time, os, pandas as pd, numpy as np from bs4 import BeautifulSoup def download(url, page): html = requests.get(url).text soup = Beautif...

posted @ 2024-02-03 08:00 干it的小张 Views(11) Comments(0) Recommended(0)

December 13, 2023

Resolving module-related problems

Summary: 1. Python: checking whether Python modules are at their latest version: https://geek-docs.com/python/python-ask-answer/467_python_check_if_requirements_are_up_to_date.html https://blog.csdn.net/p

posted @ 2023-12-13 19:49 干it的小张 Views(26) Comments(0) Recommended(0)
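
As a rough illustration of the version check this post is about, the sketch below reads the installed version of a package with the standard-library `importlib.metadata` and compares it against a minimum. The helper names (`installed_version`, `version_key`, `is_outdated`) and the minimum versions are assumptions made for the example, not taken from the post, and the dotted-number comparison is a simplification that ignores pre-release tags.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_key(text: str) -> Tuple[int, ...]:
    """Turn '2.31.0' into (2, 31, 0); non-numeric parts are skipped (simplification)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def is_outdated(package: str, minimum: str) -> bool:
    """True when the package is missing or older than the given minimum."""
    current = installed_version(package)
    return current is None or version_key(current) < version_key(minimum)


if __name__ == "__main__":
    # Hypothetical minimum versions, chosen only for the demonstration.
    for name, minimum in {"requests": "2.0.0", "lxml": "4.0.0"}.items():
        print(name, installed_version(name), "outdated:", is_outdated(name, minimum))
```

This only compares against a pin you supply. To compare against the newest release on the package index, `pip list --outdated` is the usual command-line route.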

November 18, 2023

Scraping 明朝那些事 with etree and coroutines, scraping 网吧电影 with coroutines and decryption, scraping 4399 games with scrapy

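Summary: 1. Scraping 明朝那些事 with etree and coroutines: import requests from lxml import etree import asyncio import aiohttp import aiofiles import os # 1. Fetch the source of the main page (no async needed) # 2. Once the page source is available, parse out...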

posted @ 2023-11-18 14:26 干it的小张 Views(44) Comments(0) Recommended(0)
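
The coroutine pattern named in this post (fetch the main page once, then download the parsed items concurrently) might look roughly like the sketch below. It uses only the standard-library `asyncio`. The chapter titles and the `download_chapter` stand-in are hypothetical, and `asyncio.sleep` takes the place of the real `aiohttp` request and `aiofiles` write so that the example runs without network access.

```python
import asyncio


async def download_chapter(title: str, delay: float = 0.01) -> str:
    """Stand-in for one chapter download; asyncio.sleep simulates the network wait."""
    await asyncio.sleep(delay)
    return f"{title}: done"


async def download_all(titles):
    """Schedule every chapter at once, then wait for all of them to finish."""
    tasks = [asyncio.create_task(download_chapter(title)) for title in titles]
    # gather returns results in the same order as the tasks were passed in.
    return await asyncio.gather(*tasks)


def main():
    # Hypothetical result of parsing the main page (that step needs no async).
    titles = ["chapter-1", "chapter-2", "chapter-3"]
    return asyncio.run(download_all(titles))


if __name__ == "__main__":
    for line in main():
        print(line)
```

Because all tasks are created before any is awaited, the simulated waits overlap instead of running one after another, which is the point of using coroutines for many small downloads.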

November 8, 2023

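No dawdling, just scrape: a wallpaper site with bs4, an animation review site with re, China box office data with xpath, 梨视频, movie box office data with a ThreadPoolExecutor thread pool, 斗图网 with a Queue and a thread pool, 美图网 with asyncio coroutines and the async packages aiohttp and aiofiles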

Summary: 1. Scraping a wallpaper site with bs4: import requests from bs4 import BeautifulSoup # import BeautifulSoup from urllib.parse import urljoin # dedicated to joining URL paths import time header = { "user-a...

posted @ 2023-11-08 12:58 干it的小张 Views(53) Comments(0) Recommended(0)
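
The summary above imports `urljoin` for joining URL paths. A minimal sketch of how it resolves links found on a page is shown below. The page URL, the `href` values, and the `absolute_links` helper are hypothetical examples, not taken from the post.

```python
from urllib.parse import urljoin

# Hypothetical listing page whose links we want to turn into absolute URLs.
page_url = "https://example.com/wallpapers/list_2.html"


def absolute_links(base, hrefs):
    """Resolve each href against the URL of the page it was found on."""
    return [urljoin(base, href) for href in hrefs]


links = absolute_links(
    page_url,
    [
        "detail_7.html",                   # relative to the current directory
        "/img/7.jpg",                      # relative to the site root
        "../about.html",                   # one directory up
        "https://cdn.example.com/7.jpg",   # already absolute, left unchanged
    ],
)

for link in links:
    print(link)
```

Resolving against the page URL rather than concatenating strings avoids doubled or missing slashes when a site mixes relative and absolute links.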

Common crawler idioms and usage

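Summary: 1. Find all matches: result = re.findall(pattern, string) => returns a list. Usage: r"" is the form meant for writing regular expressions, with no escaping headaches: result = re.findall(r"\d+", "我有1000万，不给你花，我有1块我给你") 2. result = re.finditer(pattern, string) => returns an iterator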

posted @ 2023-11-08 12:30 干it的小张 Views(41) Comments(0) Recommended(0)
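
To make the difference between `re.findall` and `re.finditer` concrete, here is a small sketch. The sample sentence is an English stand-in chosen for this example, not the string from the post.

```python
import re

# Hypothetical sample text for the demonstration.
text = "I have 1000 coins and you have 1"

# A raw string keeps the backslash literal, so r"\d+" is the same as "\\d+".
pattern = r"\d+"

# findall returns a plain list of the matched substrings.
numbers = re.findall(pattern, text)
print(numbers)

# finditer returns an iterator of match objects, which also carry positions.
matches = list(re.finditer(pattern, text))
for match in matches:
    print(match.group(), match.start(), match.end())
```

`findall` is convenient when only the matched text matters. `finditer` is the better fit when positions or named groups are needed, or when the input is large and the matches should be consumed lazily.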
