A web scraper built to search for specific information on a given compound (and its pseudonyms)

TODO on spoofing user agent

+3 -3
FourmiCrawler/settings.py
@@ -18,10 +18,10 @@
 FEED_URI = 'results.json'
 FEED_FORMAT = 'jsonlines'
 
-USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.137 Safari/537.36'
-
-
 # Crawl responsibly by identifying yourself (and your website) on the
 # user-agent
 
+# [todo] - Check for repercussions on spoofing the user agent
+
 # USER_AGENT = 'FourmiCrawler (+http://www.yourdomain.com)'
+USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.137 Safari/537.36'
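
For comparison while the TODO is open, here is a minimal sketch of what a non-spoofed configuration might look like. It is not part of this commit. `CONTACT_URL` is a hypothetical helper name, the URL is the placeholder from the commented-out template line, and the delay value is an arbitrary assumption. `ROBOTSTXT_OBEY` and `DOWNLOAD_DELAY` are standard Scrapy settings.

```python
# Hypothetical alternative for FourmiCrawler/settings.py (not in this commit).

# Placeholder taken from the commented-out template line; replace with a
# real project or contact URL before use.
CONTACT_URL = 'http://www.yourdomain.com'  # hypothetical helper name

# Identify the crawler honestly instead of imitating a desktop browser.
USER_AGENT = 'FourmiCrawler (+%s)' % CONTACT_URL

# Respect robots.txt and throttle requests so the crawler stays polite.
ROBOTSTXT_OBEY = True
DOWNLOAD_DELAY = 1.0  # seconds between requests; value is an assumption
```

Some sites may answer an unfamiliar user agent differently than they answer a browser, so the results of both configurations would need to be compared before switching.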