SeleniumPyTest

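Unofficial usage guide

Environment setup

Install Python

Visit:

Download page

Windows direct download

During installation, remember to tick the option that adds Python to PATH!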

Install pip

Python 3.4 and later ship with pip.

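Download Chrome Driver

If you don't have Google Chrome, install it first, then download the Chrome Driver version that matches your Chrome version.

Put the Chrome Driver executable (chromedriver.exe) in the Python installation directory.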
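
A quick way to confirm that the driver is reachable is to look it up on PATH from Python. Since the Python installation directory was added to PATH during setup, a driver placed there should be found. This is a minimal sketch using only the standard library; find_driver is a hypothetical helper and not part of this repo, and the name chromedriver is assumed to be the file name of the download.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable if it is on PATH, else None."""
    # On Windows, shutil.which also tries the extensions listed in PATHEXT,
    # so "chromedriver" matches "chromedriver.exe" there.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH")
    else:
        print("chromedriver found at", path)
```

If it prints "not found", check that the file really sits in a directory listed in PATH.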

Install Selenium

Run pip install selenium to install Selenium via pip (used to scrape read counts).
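
Once Selenium is installed, a small smoke test can confirm that it can drive Chrome. The sketch below is not taken from this repo's scripts; page_title is a hypothetical helper, and it assumes Chrome plus a matching Chrome Driver are already set up as described above.

```python
def page_title(url):
    """Open `url` in Chrome through Selenium and return the page title."""
    # Imported here so that merely loading this file does not require Selenium.
    from selenium import webdriver

    driver = webdriver.Chrome()  # expects Chrome and a matching driver to be available
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always close the browser, even if loading fails


if __name__ == "__main__":
    print(page_title("https://weibo.cn"))
```

If a Chrome window opens and closes again and a title is printed, the environment is most likely ready.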

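Install lxml (both are the letter L)

Run pip install lxml (used to scrape post content and repost/comment/like counts).

(If anything else turns out to be missing, just pip install it.)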
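
To see what is still missing before running anything, the imports can be probed up front. This is a rough sketch with the standard library only; missing_packages is a hypothetical helper, and the list of names passed to it is simply the two packages installed above.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # selenium and lxml are the two packages this guide installs.
    for name in missing_packages(["selenium", "lxml"]):
        print("missing:", name, "-> run: pip install", name)
```

Note that the import name and the pip package name are not always identical, so this only works as-is for packages where they match.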

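Disclaimer (no initiative, no refusal, no responsibility)

I take no responsibility for any bugs; this was thrown together casually. If there is one thing worth updating, it would be to learn more XPath so that TestUnit.py can scrape everything directly and write it to an .xls file.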

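Running the cases

In PowerShell or any other terminal, run py <script name>.py to start a program.

weiboSpider.py scrapes the post time, device model, location, post text, and repost/comment/like counts.
TestUnit01.py is one I wrote myself to scrape read counts; it produces an .xls file at the end.

As you have probably gathered by now, merging the data is still a manual step. But this already cuts the workload by about 80%, and that is good enough! Modify it if you are interested. PRs welcome!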

TestUnit01_py

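This one scrapes read counts.

After launching it, scan the QR code. It then scrapes everything automatically and generates an Excel spreadsheet.

To choose which page to scrape, change pageIndex. Once the terminal shows <<<<< done >>>>, look in the folder for the Excel file named 微博数据-2019xxxx.

If you don't want the three-character prefix 阅读量 ("read count") in the cells, use =RIGHT(A1,LEN(A1)-3)

Note: A1 is cell A1, and 3 is the number of leading characters to remove.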
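
The same cleanup can be done in Python instead of Excel. This is a small sketch, not code from TestUnit01.py; both helper names are hypothetical, and the sample value in the usage lines is made up for illustration.

```python
def strip_leading(text, count=3):
    """Drop the first `count` characters, like =RIGHT(A1,LEN(A1)-count)."""
    return text[count:]


def strip_label(text, label="阅读量"):
    """Drop `label` only when the text actually starts with it."""
    return text[len(label):] if text.startswith(label) else text


if __name__ == "__main__":
    sample = "阅读量1234"  # made-up cell value
    print(strip_leading(sample))
    print(strip_label(sample))
```

strip_label is the safer of the two when some cells might not carry the prefix, since it leaves those untouched.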

For details, see:

Baidu Jingyan 1

Baidu Jingyan 2

Baidu Jingyan 3

weiboSpider_py

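This one scrapes most of the content and produces a CSV at the end, which can also be opened in Excel. After that, merge the data by hand.

You need to change a few parameters in weiboSpider.py: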

cookie
user_id
filter

First log in at https://passport.weibo.cn/signin/login, then enter https://weibo.cn in the address bar.

Press F12 and click "Headers". Under "Request Headers", the value after "Cookie" is the cookie you need; copy it out.
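
To illustrate what the copied value is for: a logged-in request simply sends it back in the Cookie header. The sketch below builds such a request with the standard library without sending it. It is not how weiboSpider.py is implemented (its internals are not shown here); build_request is a hypothetical helper and the cookie string is a placeholder.

```python
import urllib.request


def build_request(url, cookie):
    """Build (but do not send) a GET request that carries the copied Cookie value."""
    # Strip stray whitespace or newlines picked up while copying from the browser.
    headers = {"Cookie": cookie.strip()}
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    req = build_request("https://weibo.cn", "your cookie")  # placeholder value
    print(req.get_header("Cookie"))
```

The cookie identifies your login session, so treat it like a password and keep it out of commits.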

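For the detailed steps, see the original author's GitHub.

Note: the CSV is saved every 99 posts. If you want to stop scraping, just close the terminal.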
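
Saving in batches is presumably what makes closing the terminal safe: anything already flushed to the CSV stays on disk. A minimal sketch of that pattern with the csv module follows; it is an illustration under that assumption, not the code in weiboSpider.py, and write_in_batches and _append are hypothetical names.

```python
import csv


def _append(path, batch):
    # newline="" is what the csv module expects when writing files.
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(batch)
    return len(batch)


def write_in_batches(rows, path, batch_size=99):
    """Append rows to a CSV file, writing to disk after every `batch_size` rows."""
    batch = []
    written = 0
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            written += _append(path, batch)
            batch = []
    if batch:  # write whatever is left at the end
        written += _append(path, batch)
    return written
```

With this layout, an interrupted run loses at most the rows collected since the last write.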

webScpTest01.py

Weibo-related reference material

Using m.weibo.cn / old web-scraping tutorial / weiboSpider.py
Using XPath
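
As a taste of what an XPath query looks like, here is a small self-contained sketch. To keep it dependency-free it uses the standard library's xml.etree.ElementTree, which supports only a limited XPath subset, rather than lxml; with lxml the same kind of path expression would be passed to its xpath() method. The markup and class names are made up and do not reflect Weibo's real pages.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup; the class names are hypothetical.
SAMPLE = """
<div>
  <div class="post"><span class="text">first post</span></div>
  <div class="post"><span class="text">second post</span></div>
  <div class="ad"><span class="text">not a post</span></div>
</div>
"""


def post_texts(markup):
    """Return the text of every span with class "text" inside a div with class "post"."""
    root = ET.fromstring(markup)
    spans = root.findall(".//div[@class='post']/span[@class='text']")
    return [span.text for span in spans]


if __name__ == "__main__":
    print(post_texts(SAMPLE))
```

Real pages are rarely well-formed XML, which is one reason a tolerant HTML parser such as lxml is the usual choice for scraping.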

Notice

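My Chrome version is 74.
I have forgotten which Chrome Driver version I used. The download URL is http://chromedriver.storage.googleapis.com/index.html
Apart from weiboSpider.py, whose copyright belongs to its original developer, all code in this repo is CC0; use it however you like. (If it is of any use, that is. lol)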

sannnyu/CCYLDataTools
