Diffbot’s New Product API Teaches Robots to Shop Online

Diffbot’s human wranglers are proud today to announce the release of our newest product: an API for… products! The Product API can be used for extracting clean, structured data from any e-commerce product page. It automatically makes available all the product data you’d expect: price, discount/savings amount, shipping cost, product description, any relevant product images, SKU and/or other […]

Diffbot APIs Are Getting Very META

We noticed recently that a common use for our Custom API Toolkit was augmenting Diffbot’s Automatic APIs with custom fields that return <META> tag data: meta descriptions, OpenGraph and Twitter Card tags, Schema.org microdata, etc. We figured we’d save you the trouble of hand-curating rules, so we added the <META> parameter across all of our […]

New Feature: Correct and *Concatenate* Multi-Page Articles

Our Article API automatically joins multiple-page articles into a single “text” or “html” field. On some sites, though, our algorithm is unable to concatenate the pages for various reasons (typically a non-standard pagination design). Furthermore, any site with an overridden “text” field (via a Custom API rule) will no longer automatically concatenate multiple pages. We’re happy to […]

Diffbot’s HackerNews Trend Analyzer

Like any good developer service, we’re fans of Hacker News. Making the vaunted Frontpage is a, well, vaunt-worthy accomplishment (we’ve been there once), so we thought we’d use our APIs to analyze and identify any trends in what content makes the Frontpage. The result is Diffbot’s HackerNews Trend Analyzer. Feel free to click that link and […]

Announcing Semantic Hack (June 1, 2013)

What could you build if the entire web was your database? Could you do it in a day? We’re glad to be working with the fine folks at SemanticWeb.com to host the inaugural Semantic Hack at the Semantic Technology & Business Conference in San Francisco on June 1, 2013. See additional details and registration at http://semantichack.eventbrite.com, and more […]

New Feature: Custom Timeouts

The slowest part of any Diffbot API request is the call-response to third-party content. Depending on the third-party server’s responsiveness and location, it could be anywhere from a third of a second to tens of seconds before we receive content to process. (Diffbot internal rendering and processing, by comparison, averages just over 100 milliseconds.)
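
To make the idea concrete, here is a minimal sketch of how a client might attach a per-request timeout to an API call. The endpoint URL, the "timeout" parameter name, and the millisecond unit are all assumptions made for illustration; they are not taken from this post, so check the API documentation for the real names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
API_ENDPOINT = "https://api.example.com/article"


def build_request_url(token, page_url, timeout_ms=None):
    """Build a request URL, adding a timeout only when one is supplied.

    The "timeout" parameter name and its millisecond unit are assumptions.
    """
    params = {"token": token, "url": page_url}
    if timeout_ms is not None:
        if timeout_ms <= 0:
            raise ValueError("timeout_ms must be positive")
        params["timeout"] = str(timeout_ms)
    # urlencode percent-escapes the target page URL so it is safe as a query value.
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Allow a slow third-party server up to 15 seconds to respond.
    print(build_request_url("MY_TOKEN", "http://example.com/story", timeout_ms=15000))
```

Leaving the timeout out of the URL when none is given lets the service apply its own default, which is usually the safer choice for a client library.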
