API Features – Page 5

Crawlbot Updates: Webhooks and Preventing Duplicate Content

We added a couple of frequently requested Crawlbot features this week: webhook notifications and much smarter content de-duplication.

John Davi

New Crawlbot Features: API Parameters and Product Crawl CSVs

We added a couple of frequently requested features to Crawlbot this week: the ability to pass in Diffbot API parameters to tailor the output of…

John Davi

Diffbot’s New Product API Teaches Robots to Shop Online

Diffbot’s human wranglers are proud today to announce the release of our newest product: an API for… products! The Product API can be used for extracting clean,…

John Davi

Diffbot APIs Are Getting Very META

We noticed recently that a common use for our Custom API Toolkit was augmenting Diffbot’s Automatic APIs with custom fields to return markup <META> tag data:…

John Davi

Announcing Crawlbot: Smart Site Spidering and Extraction

Today we’re happy to announce the public availability of Crawlbot, our computer-vision-powered site crawler and extractor. If you want structured data from an entire site,…

John Davi

Setting up a Machine Learning Farm in the Cloud with Spot Instances + Auto Scaling

Previously, I wrote about how Amazon EC2 Spot Instances + Auto Scaling are an ideal combo for machine learning loads. In this post, I’ll provide…

Mike Tung

Machine Learning in the Cloud

Machine Learning Loads Are Different than Web Loads. One of the lessons I learned early is that scaling a machine learning system is a different…

Mike Tung

New Feature: Correct and Concatenate Multi-Page Articles

Our Article API automatically joins multiple-page articles into a single “text” or “html” field. On some sites, though, our algorithm is unable to concatenate for…

John Davi

World of Web Data

The Semantic Web is a dream that many are attempting to make into reality through the use of machine-readable metadata. Web developers worldwide would use…

Diffy

Diffbot’s HackerNews Trend Analyzer

Like any good developer service, we’re fans of Hacker News. Making the vaunted Frontpage is a, well, vaunt-worthy accomplishment (we’ve been there once), so we thought…

John Davi