Randomly rotating the User-Agent in Scrapy

Use a downloader middleware to swap in a random UA for each request

Install the fake-useragent library, which generates random UA strings

pip install fake-useragent (see the project's GitHub page)

Create the class in middlewares.py

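from fake_useragent import UserAgent


# Downloader middleware that sets a random User-Agent on each request
class RandomUserAgentMiddleware(object):
    def __init__(self, crawler):
        super(RandomUserAgentMiddleware, self).__init__()
        # Build the UserAgent helper once instead of on every request
        self.ua = UserAgent()

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def process_request(self, request, spider):
        # Ask fake-useragent for a random UA; setdefault keeps a UA already set on the request
        request.headers.setdefault('User-Agent', self.ua.random)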

Configure DOWNLOADER_MIDDLEWARES in settings.py

DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'lagou.middlewares.RandomUserAgentMiddleware': 500,
}

Note: Scrapy's built-in User-Agent middleware has to be disabled first, which is what mapping it to None in the setting does.
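Why the built-in middleware needs to be disabled can be sketched without Scrapy at all. The snippet below is only a rough model, under some assumptions: a plain dict stands in for Scrapy's request headers, `UA_POOL` is a hypothetical fixed list standing in for fake-useragent, and `apply_default_ua` / `apply_random_ua` are made-up helper names. It relies on the fact that `setdefault` only fills in a header that is not there yet, so if a fixed default UA gets written first, the random one is silently skipped.

```python
import random

# Hypothetical stand-in for fake-useragent: a small fixed pool of UA strings.
UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

# Placeholder for a fixed default UA, not the real Scrapy version string.
DEFAULT_UA = "Scrapy/VERSION (+https://scrapy.org)"


def apply_default_ua(headers, default_ua=DEFAULT_UA):
    """Model of a middleware that fills in a fixed UA when none is set."""
    headers.setdefault("User-Agent", default_ua)


def apply_random_ua(headers, pool=UA_POOL):
    """Model of RandomUserAgentMiddleware.process_request."""
    headers.setdefault("User-Agent", random.choice(pool))


# Case 1: the fixed default is applied first, so the random UA is skipped.
headers_both = {}
apply_default_ua(headers_both)
apply_random_ua(headers_both)

# Case 2: the default step is disabled, so the random UA is used.
headers_random_only = {}
apply_random_ua(headers_random_only)

print(headers_both["User-Agent"])
print(headers_random_only["User-Agent"])
```

In the first case the header keeps the fixed default value; in the second it holds one of the strings from the pool. If a request should always get a fresh UA even when one is already present, assigning the header directly instead of using `setdefault` would be the alternative, at the cost of overriding any UA a spider sets on purpose.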