Skip to content

infin8/scrapy-fake-useragent

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PyPI version Requirements Status

scrapy-fake-useragent

Random User-Agent middleware based on fake-useragent. It picks up User-Agent strings based on usage statistics from a real world database.

Installation

The simplest way is to install it via pip:

pip install scrapy-fake-useragent

Configuration

Turn off the built-in UserAgentMiddleware and add RandomUserAgentMiddleware.

In Scrapy >=1.0:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}

In Scrapy <1.0:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}

Configuring User-Agent type

There's a configuration parameter RANDOM_UA_TYPE defaulting to random which is passed verbatim to the fake-user-agent. Therefore you can set it to say firefox to mimic only firefox browsers. Most useful though would be to use desktop or mobile values to send desktop or mobile strings respectively.

Usage with scrapy-proxies

To use with middlewares of random proxy such as scrapy-proxies, you need:

  1. set RANDOM_UA_PER_PROXY to True to allow switch per proxy
  2. set priority of RandomUserAgentMiddleware to be greater than scrapy-proxies, so that proxy is set before handle UA

Configuring Fake-UserAgent fallback

There's a configuration parameter FAKEUSERAGENT_FALLBACK defaulting to None. You can set it to a string value, for example Mozilla or Your favorite browser, this configuration can completely disable any annoying exception.

About

Random User-Agent middleware based on fake-useragent

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%