Detecting Selenium

When looking to extract information from sites that are harder to scrape, many programmers turn to browser automation tools such as Selenium and iMacros. At the time of writing, Selenium is by far the most popular option for those looking to use browser automation for information retrieval. However, Selenium is very detectable, and site owners could block a large percentage of Selenium users.

Selenium Detection with Chrome

When using Chrome, the Selenium driver injects a webdriver property into the browser’s navigator object. This means a couple of lines of JavaScript are enough to detect that the user is using Selenium: the check simply tests whether navigator.webdriver is set to true and redirects the user if it is. I have never seen this technique used in the wild, but I can confirm that it seems to successfully redirect those using Chrome with Selenium.
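The check described above might look something like the following. This is a minimal sketch rather than the exact code a site would deploy: the function name isWebdriverFlagged and the redirect path /automation-blocked are placeholders chosen for illustration.

```javascript
// Returns true when a navigator-like object reports that it is under
// WebDriver control. The function name is hypothetical.
function isWebdriverFlagged(nav) {
  return Boolean(nav) && nav.webdriver === true;
}

// Browser-only part: send flagged visitors elsewhere.
// "/automation-blocked" is a placeholder path, not a real endpoint.
if (typeof window !== "undefined" && isWebdriverFlagged(window.navigator)) {
  window.location.href = "/automation-blocked";
}
```

Keeping the test in a small function that takes the navigator object as an argument makes it easy to try outside a browser, for example by passing in a plain object with a webdriver field.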

Selenium Detection with Firefox

Older versions of Firefox used to inject a webdriver attribute into the HTML document. This means that older versions of Firefox could be detected very simply by checking whether that attribute is present. At the time of writing, Firefox no longer adds this attribute to pages when using Selenium.
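A check for that attribute could be sketched as follows. It assumes the attribute was placed on the document’s root element, which the text above does not state explicitly; the function name hasWebdriverAttribute and the redirect path are likewise placeholders.

```javascript
// Returns true when the document's root element carries a "webdriver"
// attribute. Assumption: the attribute lives on the root <html> element.
function hasWebdriverAttribute(doc) {
  const root = doc && doc.documentElement;
  return (
    Boolean(root) &&
    typeof root.getAttribute === "function" &&
    root.getAttribute("webdriver") !== null
  );
}

// Browser-only part: redirect when the attribute is present.
// "/automation-blocked" is a placeholder path.
if (typeof document !== "undefined" && hasWebdriverAttribute(document)) {
  window.location.href = "/automation-blocked";
}
```

Against current Firefox builds this would be expected to return false, in line with the observation that the attribute is no longer added.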

Additional methods of detecting Selenium when using Firefox have also been suggested. Testing suggests that these do not work with the latest builds of Firefox. However, the WebDriver standard suggests that a detectable marker of this kind may eventually be implemented in Firefox again.

Selenium Detection with PhantomJS

All current versions of PhantomJS add attributes to the window object. This allows site owners to simply check whether these PhantomJS-specific attributes are set and redirect the user away if they are. It should also be noted that support for the PhantomJS project has been rather inconsistent, and the project makes use of an outdated WebKit version, which is also detectable and could present a security risk.
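A sketch of such a check is shown below. The text above does not name the attributes, so the two property names used here, callPhantom and _phantom, are an assumption based on the ones most commonly cited for PhantomJS; the function name and redirect path are placeholders as well.

```javascript
// Property names commonly associated with PhantomJS. These are an
// assumption for illustration; verify them against the build you target.
const PHANTOM_PROPERTIES = ["callPhantom", "_phantom"];

// Returns true when a window-like object exposes any of the listed properties.
function looksLikePhantom(win) {
  if (win === null || typeof win !== "object") {
    return false;
  }
  return PHANTOM_PROPERTIES.some((name) => name in win);
}

// Browser-only part: redirect visitors that look like PhantomJS.
// "/automation-blocked" is a placeholder path.
if (typeof window !== "undefined" && looksLikePhantom(window)) {
  window.location.href = "/automation-blocked";
}
```

Listing the property names in one array keeps the check easy to extend if further PhantomJS-specific attributes turn up.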

Avoiding Detection

Your best chance of avoiding detection when using Selenium is to use one of the latest builds of Firefox, which don’t appear to give off any obvious sign that you are using Selenium. Additionally, it may be worth experimenting with Safari and Opera, which are much less commonly used by those scraping the web. It also seems likely that Firefox gives off some less obvious footprint, which would need further investigation to discover.
