How Computer Vision Helps Get You Better Web Data

In 1966, AI pioneer Marvin Minsky instructed a graduate student to “connect a camera to a computer and have it describe what it sees.” Unfortunately, nothing much came of it at the time.

But it did trigger further research into the computer’s ability to replicate the human brain. More specifically, how the eyes see, how that information gets processed in the brain, and how the brain uses that information to make intelligent decisions.

The process of copying the human brain is incredibly complicated, however. Even a simple task, like catching a ball, involves intricate neural networks in the brain that are nearly impossible to replicate (so far).

But some processes are more successfully duplicated than others. For instance, just as the human eye has the ability to see the ball, computer vision enables machines to extract visual data in the same way.

It can also analyze and, in some cases, understand the relationships within the visual data it receives from images, making it the closest thing we have to a machine brain. While it can't recreate the visual cortex or replicate the brain (yet), it already offers serious benefits for data users in its current state.

Computer Vision and Artificial Intelligence

In order to understand exactly how valuable computer vision can be in gathering web data, you first need to understand what makes it unique – that is to say, what separates it from general AI.

According to GumGum VP Jon Stubley, AI is simply the use of computer systems to perform tasks and functions that usually require human intelligence. In other words, “getting machines to think and act like humans.”

Computer vision, on the other hand, describes the ability of machines to process and understand visual data, automating the types of tasks the human eye can do. Or, as Stubley puts it, “Computer vision is AI applied to the visual world.”

One thing that it does particularly well is gather structured or semi-structured data. This makes it extremely valuable for building databases or knowledge graphs, like the one Google uses to power its search engine. Those knowledge graphs can in turn be used to build more intelligent systems and other AI applications.

Advantages of the Knowledge Graph

Knowledge graphs contain information about entities (objects that can be classified) and their relationships to one another (e.g., a Corolla is a type of car, a wheel is a part of a car, etc.).

Google uses its knowledge graph to recognize search queries as distinct entities, not just keywords. When you type in “car,” it won’t just pull up images that are labeled as “car.” It will also use computer vision to recognize items that look like cars, tag them as such, and feature them, too.

This can be helpful when searching for data, as it enables you to create targeted queries based on entities, not just keywords, giving you more comprehensive (and more accurate) results.
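To make the idea of entities and relationships concrete, here is a minimal sketch of a toy knowledge graph stored as (subject, relation, object) triples. The relation names `is_a` and `part_of`, the extra "Civic" and "vehicle" entries, and the helper functions are illustrative assumptions, not a description of how Google's knowledge graph is actually built.

```python
from collections import defaultdict

# Hypothetical toy graph: each fact is a (subject, relation, object) triple.
TRIPLES = [
    ("Corolla", "is_a", "car"),
    ("Civic", "is_a", "car"),
    ("wheel", "part_of", "car"),
    ("car", "is_a", "vehicle"),
]


def build_index(triples):
    """Index subjects by (relation, object) so entity lookups are direct."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(relation, obj)].add(subject)
    return index


def instances_of(index, entity):
    """Return every subject that is, directly or transitively, an `entity`."""
    found = set()
    frontier = [entity]
    while frontier:
        current = frontier.pop()
        for subject in index.get(("is_a", current), ()):
            if subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


index = build_index(TRIPLES)
print(sorted(instances_of(index, "car")))      # entities that are cars
print(sorted(instances_of(index, "vehicle")))  # follows is_a links transitively
```

A keyword search for "vehicle" would miss a page that only mentions a Corolla. An entity query over the graph finds it because the relationship is stored explicitly.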

How Computer Vision Impacts Your Data

Computer vision also helps you identify web pages quickly, allowing you to strategically pull product information, images, videos, articles and other data without having to sort through unnecessary information.

Computer vision techniques enable you to accurately identify key parts of a website and extract those fields as structured data. This structured data then enables you to search for specific image types or text, or even specific people.
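As a rough illustration of extracting fields from the way a page looks rather than from its markup, the sketch below picks a title and a body from rendered text blocks using only visual cues (font size, position, and area). The `Block` type, its fields, and the two heuristics are assumptions made up for this example. Real systems typically rely on trained models, not hand-written rules.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A rendered text block (hypothetical structure for this sketch)."""
    text: str
    x: float
    y: float          # distance from the top of the page
    width: float
    height: float
    font_size: float


def extract_fields(blocks):
    """Pick a title and a body block using only visual features."""
    if not blocks:
        return {"title": None, "body": None}
    # Assumed heuristic: the title has the largest font; ties go to the
    # block closest to the top of the page.
    title = max(blocks, key=lambda b: (b.font_size, -b.y))
    rest = [b for b in blocks if b is not title]
    # Assumed heuristic: the body is the remaining block with the largest area.
    body = max(rest, key=lambda b: b.width * b.height, default=None)
    return {"title": title.text, "body": body.text if body else None}


page = [
    Block("Home | About | Contact", 0, 0, 800, 20, 12),
    Block("Acme Kettle", 50, 80, 400, 40, 32),
    Block("Boils a litre of water in two minutes.", 50, 140, 600, 300, 14),
]
print(extract_fields(page))
```

The output is a small structured record, which is the form that can then be searched or loaded into a database.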

Computer vision also allows you to (among other things):

  • Analyze images – Using tagging, descriptions, and domain-specific models, it can identify content and label it accordingly, apply filters and settings, and separate images by type or even color scheme
  • Read text in images – Using OCR (Optical Character Recognition), it can recognize words even if they are embedded within images or otherwise cannot be extracted, copied or pasted into a text document
  • Read handwriting – If information on a page is handwritten, or appears as an image of handwriting, it can recognize that too and convert it into text (OCR)
  • Analyze video in real time – Computer vision enables you to extract frames from video captured on any device and analyze them
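One item from the list above, separating images by color scheme, can be sketched in a few lines. This is a deliberately coarse illustration that assumes each image has already been decoded into a list of (r, g, b) pixel tuples. The bucket names, the gray threshold of 30, and the function names are all assumptions for the example, not a production technique.

```python
from collections import Counter, defaultdict


def colour_bucket(rgb):
    """Map one (r, g, b) pixel to a coarse colour name (assumed buckets)."""
    r, g, b = rgb
    if max(rgb) - min(rgb) < 30:   # channels nearly equal: treat as gray
        return "gray"
    if r >= g and r >= b:
        return "red"
    if g >= b:
        return "green"
    return "blue"


def dominant_colour(pixels):
    """Return the most common colour bucket among the pixels."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = Counter(colour_bucket(p) for p in pixels)
    return counts.most_common(1)[0][0]


def group_by_colour(images):
    """Group image names by their dominant colour bucket."""
    groups = defaultdict(list)
    for name, pixels in images.items():
        groups[dominant_colour(pixels)].append(name)
    return dict(groups)


# Tiny hand-made "images" standing in for decoded pixel data.
images = {
    "sunset.jpg": [(200, 40, 30)] * 3 + [(20, 20, 200)],
    "forest.jpg": [(30, 160, 40)] * 4,
    "ocean.jpg": [(10, 60, 200)] * 4,
}
print(group_by_colour(images))
```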

For example, certain ecommerce sites use computer vision for image analysis as part of their predictive analytics, forecasting what their customers will want next. This can save an enormous amount of time when it comes to pulling, analyzing and using that data effectively.

Because it produces structured data, computer vision also gives you cleaner data that you can then use to build applications and inform your marketing decisions. You can quickly see patterns in data sets and identify entities that you may have otherwise missed.

Final Thoughts

Computer vision is a field that continues to grow at a rapid pace alongside AI as a whole. One of its biggest boons is the ability to build the knowledge databases that power search engines. The more that machines learn to recognize entities on sites and in images, the more accurate the results are.

But more importantly, computer vision can be used to drive better results when data is extracted from the Web, enabling users to pull structured data from a wide range of sites without sacrificing quality or accuracy in the process.