Proxies, also known as proxy servers, are intermediate servers that receive web requests and forward them to their destination on the client’s behalf. Because the destination site typically sees the proxy’s IP address rather than the visitor’s, proxies matter when a site blocks or monitors traffic from specific locations, or serves IP-based messaging or content tailored to specific audiences. Routing requests through a proxy lets a visitor get around these restrictions.
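As a rough illustration of routing requests through a proxy, the sketch below builds an HTTP client with Python’s standard `urllib.request` module. The helper name `build_proxy_opener` and the address `proxy.example.com:8080` are placeholders invented for this example, not real endpoints.

```python
import urllib.request


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    # ProxyHandler maps each URL scheme to the proxy that should carry it.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    # Passing our handler replaces the default, environment-based proxy lookup.
    return urllib.request.build_opener(handler)


# Hypothetical proxy address, used only for illustration.
opener = build_proxy_opener("http://proxy.example.com:8080")

# With a reachable proxy, a request would then be issued like this:
# response = opener.open("http://example.com/", timeout=10)
```

Building the opener does not contact the network; a request is only sent through the proxy when `opener.open(...)` is called.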
Diffbot’s web extraction technologies use proxies to provide predictable crawling access to properties across the web. By default, Diffbot’s crawlers respect robots.txt on crawled domains.
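To show what respecting robots.txt means in practice, here is a minimal sketch using Python’s standard `urllib.robotparser` module. It is not Diffbot’s implementation; the helper `is_allowed`, the user agent `ExampleBot`, and the sample rules are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content: everything under /private/ is off limits.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

blocked = is_allowed(SAMPLE_ROBOTS, "ExampleBot", "https://example.com/private/page")
permitted = is_allowed(SAMPLE_ROBOTS, "ExampleBot", "https://example.com/public/page")
```

A polite crawler would run a check like this before each fetch and skip any URL for which it returns `False`.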