Random User-Agent middleware based on fake-useragent. It picks User-Agent strings based on usage statistics from a real-world database.
The simplest way is to install it via pip:
pip install scrapy-fake-useragent
Turn off the built-in UserAgentMiddleware and add RandomUserAgentMiddleware.
In Scrapy >=1.0:
DOWNLOADER_MIDDLEWARES = {
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}
In Scrapy <1.0:
DOWNLOADER_MIDDLEWARES = {
'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}
There's a configuration parameter RANDOM_UA_TYPE, defaulting to random, which is passed verbatim to fake-useragent. You can therefore set it to, say, firefox to mimic only Firefox browsers. Most useful, though, are the desktop and mobile values, which send desktop or mobile User-Agent strings respectively.
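For instance, to restrict the middleware to Firefox strings, you could add the following to your project's settings.py (firefox is just one of the values mentioned above):

# settings.py
RANDOM_UA_TYPE = 'firefox'   # or 'desktop' / 'mobile'; default is 'random'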
To use it together with a random-proxy middleware such as scrapy-proxies, you need to:

- set RANDOM_UA_PER_PROXY to True to allow switching the User-Agent per proxy
- set the priority of RandomUserAgentMiddleware to be greater than that of scrapy-proxies, so that the proxy is set before the User-Agent is handled (see the sketch below)
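A minimal sketch of such a combined configuration, assuming scrapy-proxies exposes its middleware as scrapy_proxies.RandomProxy (that path and the priority numbers are assumptions taken from that project, not from this one):

# settings.py -- sketch; scrapy_proxies.RandomProxy and the order numbers are assumptions
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    # proxy middleware gets a lower number, so the proxy is chosen first
    'scrapy_proxies.RandomProxy': 100,
    # RandomUserAgentMiddleware gets a higher number, so the UA is picked afterwards, per proxy
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}
RANDOM_UA_PER_PROXY = True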
There's a configuration parameter FAKEUSERAGENT_FALLBACK, defaulting to None. You can set it to a string value, for example Mozilla or Your favorite browser; with a fallback configured, the annoying exception that fake-useragent can otherwise raise is completely avoided.
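For example, using one of the illustrative string values mentioned above:

# settings.py
FAKEUSERAGENT_FALLBACK = 'Mozilla'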