Turn websites into data in seconds. Crawly spiders and extracts complete structured data from an entire website.
Miles Grimshaw of Thrive Capital recently used Crawlbot and our Product API to analyze product availability and extract pricing data from a number of online fashion marketplaces. The goal was to help determine the scale, margins, customer profile and trends of each site, and to inform their investment decision-making. Miles writes about his experience and analysis on his blog. Nice […]
What could you build if the entire web were your database? Could you do it in a day? We’re glad to be working with the fine folks at SemanticWeb.com to host the inaugural Semantic Hack at the Semantic Technology & Business Conference in San Francisco on June 1, 2013. See additional details and registration at http://semantichack.eventbrite.com, and more […]
In a recent benchmark, Diffbot placed first overall among text extraction APIs on two evaluation sets: an academic one and one sampled from Google News. Tomaz Kovacic, a university student in artificial intelligence, conducted the comprehensive benchmark of text extraction methods as part of his thesis. Included in the study are commercial vendors as well as open-source APIs […]