Random User-Agent in Requests (Python)

When using the Python requests library to extract data from websites, you may want to minimise the chances of your scraping activity being detected.

Setting a Custom User-Agent

To lower the chances of detection, it is often recommended that you set a custom User-Agent header. The requests library makes this very easy: you pass a dictionary of headers along with the request. Often this is enough to avoid detection, as many system administrators only look for default user-agents (such as python-requests) when adding server-side blocking rules.
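
As a minimal sketch of the idea, assuming a single hard-coded browser string: the names CUSTOM_HEADERS and fetch, and the User-Agent value itself, are made up for illustration and are not taken from any particular project.

```python
# Illustrative value only: swap in the User-Agent of a browser you actually want to mimic.
CUSTOM_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
}


def fetch(url):
    """Fetch url, sending CUSTOM_HEADERS instead of the default python-requests User-Agent."""
    # Imported here so the header dictionary can be inspected without requests installed.
    import requests  # third-party: pip install requests

    return requests.get(url, headers=CUSTOM_HEADERS, timeout=10)
```

Calling fetch("https://example.com") and then reading response.request.headers["User-Agent"] should show the custom string rather than the library default.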

Setting a Random User-Agent

If you are engaged in commercially sensitive scraping, you may want to take additional precautions and randomise the User-Agent sent with each request.
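
A minimal sketch of this approach, assuming a short hand-picked list of desktop user-agent strings. The names DESKTOP_AGENTS, CHROME_ACCEPT, random_headers and fetch are hypothetical, and the strings are illustrative examples that will date quickly, not a ranked list of the most common browsers.

```python
import random

# Illustrative desktop user-agent strings (assumptions); refresh them from time to time.
DESKTOP_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.1 Safari/605.1.15",
]

# One shared 'Accept' value, sent regardless of which user-agent is picked.
CHROME_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"


def random_headers():
    """Return headers with a randomly chosen User-Agent and a fixed Accept value."""
    return {"User-Agent": random.choice(DESKTOP_AGENTS), "Accept": CHROME_ACCEPT}


def fetch(url):
    """Fetch url with a freshly randomised set of headers on every call."""
    import requests  # third-party: pip install requests

    return requests.get(url, headers=random_headers(), timeout=10)
```

Because random_headers() is called inside fetch, each request gets its own draw from the list.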

The above snippet of code returns a random user-agent and Chrome’s default ‘Accept’ header. When writing this code snippet I made an effort to include the ten most commonly used desktop browsers. It is worth updating the browser list from time to time, to ensure that the user agents included in the list are up to date.

I have seen others load large lists containing hundreds of user-agents. I think this approach is misguided, as your crawlers may end up making thousands of requests with very rarely used user agents.

Anyone looking closely at the ‘Accept’ headers will quickly realise that all of the different user agents are using the same ‘Accept’ header. Thankfully, the majority of system administrators completely ignore the intricacies of the sent ‘Accept’ headers and simply check if browsers are sending something plausible. Should it really be necessary, it would also be possible to send accurate ‘Accept’ headers with each request. I have never personally had to resort to this extreme measure.
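
Should it be needed, one way to do this is to store each user-agent together with the 'Accept' value that browser sends and pick the pair as a unit. This is a rough sketch under that assumption: BROWSER_PROFILES and random_profile_headers are hypothetical names, and the 'Accept' strings are illustrative, so capture the real values from each browser's developer tools before relying on them.

```python
import random

# Each profile pairs a user-agent with an 'Accept' value for that browser.
# All values are illustrative assumptions; verify them against real browser traffic.
BROWSER_PROFILES = [
    (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    ),
    (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
        "(KHTML, like Gecko) Version/17.1 Safari/605.1.15",
        "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    ),
]


def random_profile_headers():
    """Return a User-Agent and the Accept value stored alongside it."""
    user_agent, accept = random.choice(BROWSER_PROFILES)
    return {"User-Agent": user_agent, "Accept": accept}
```

Keeping the two values in one tuple means a Safari user-agent can never be sent with a Chrome-style 'Accept' header.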

3 thoughts to “Random User-Agent in Requests (Python)”

  1. I would personally do it like this:

    from fake_useragent import UserAgent

    def random_headers():
        # The random user-agent needs an explicit 'User-Agent' key
        return {'User-Agent': UserAgent().random, 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'}

    Hope it helps
