Stories By DQL: Tracking the Sentiment of a City


The story: the sentiment of news mentions of Gaza fluctuates by as much as 2,000% a week. 90% of news mentions about Minneapolis have had negative sentiment through the first week in June 2020 (they’re typically about 50% negative). Positive sentiment news mentions about New York City have steadily increased week by week through the pandemic.

Locations are important. They help form our identities. They bring us together or apart. Governance organizations, journalists, and scholars routinely need to track how one location perceives another. From threat detection to product launches, news monitoring in Diffbot’s Knowledge Graph makes it easy to take a truly global news feed and dissect how entities are being talked about.

In this story by DQL, discover ways to query millions of articles that feature location data (towns, cities, regions, nations).

How we got there: One of the most valuable aspects of Diffbot’s Knowledge Graph is the ability to utilize the relationships between different entity types. You can look for news mentions (article entities) related to people, products, brands, and more. You can look for what skills (skill or people entities) are held by which companies. You can look for discussions on specific products.

In this Stories by DQL we’ll look at a set of queries that highlight how to gain a sentiment score regarding a physical location (a city). This is a useful lens in numerous news monitoring, governance, social science, threat detection, and journalism use cases.

There are numerous ways to examine sentiment about other entity types within Diffbot’s Knowledge Graph. Each varies in the specificity and timeliness of the data it returns, which makes it a better or worse fit for particular use cases. We’ll work through a number of these queries below.

Articles About a Location


First, we’ll need to line up the subset of article entities that we’ll draw sentiment from. There are a few ways to do this:
// obtain articles tagged with a precise location entity
type:Article tags.{uri:"http://www.diffbot.com/entity/As73-XUShMSuy4pkWakFzbQ"}

// obtain articles tagged with a city or location name
type:Article tags.label:"Gaza"

// obtain articles with a city name in the title
type:Article title:"Gaza"

// obtain articles with a city name in the text
type:Article text:"Gaza"

As you may imagine, these queries will return a range of articles. In particular, the above queries are arranged from most to least specific. The first query, which uses a Diffbot URI, will return articles deemed to be primarily about a given Diffbot entity. In relation to locations, entities can be neighborhoods, counties, regions, and truly any location commonly used in written text on the web. (There’s even a Diffbot entity for Narnia.)

The second query differs only in that it filters returned articles by tags with the label “Gaza.” Instead of one specific entity, this may catch numerous related entities (as well as regional spellings and so forth).

The third query is more general still, but a location name in the title is a fairly good indication that an article is centered on that location. In this case the third query returned hundreds of thousands of article results while the second returned 100,000+ results.

Finally, the fourth query matches any article that mentions the location in its text. If a phrase is mentioned in an article, it is likely to be mentioned with some form of sentiment, so this is one of the broader nets we can cast for news mentions of a location. There are obvious trade-offs, though: just because a location is mentioned within the text of an article does not mean the article is about that location.

Articles About a Location From a Specific Publisher or Country

In the case of governance, social science, and journalistic use cases, you’ll often be searching for more specific markers related to sentiment. Who exactly is portraying positive or negative sentiment? That’s where you can bound any of the above queries by two particularly useful parameters: publisherCountry and pageUrl.

// get all articles with 'Gaza' in text published in Turkey
type:Article text:"Gaza" publisherCountry:"Turkey"

// get all articles with 'Gaza' in text published on nytimes.com
type:Article text:"Gaza" pageUrl:"nytimes.com"

You can also exclude a given publisherCountry or pageUrl:
// get all articles with 'Gaza' in text not published in Turkey
type:Article text:"Gaza" not(publisherCountry:"Turkey")
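
When these filters are generated by a script rather than typed by hand, the query strings above can be assembled from their parts. The following is a minimal Python sketch, not part of Diffbot’s tooling: the function name `article_query` and its parameters are hypothetical, and it only uses the DQL pieces already shown in this section (`text`, `publisherCountry`, `pageUrl`, and `not(...)`).

```python
def article_query(text, include=None, exclude=None):
    """Compose a DQL string for articles mentioning `text`.

    `include` and `exclude` map a DQL field name (for example
    "publisherCountry" or "pageUrl") to the value to match.
    Illustrative helper only: it does not escape quotes in values.
    """
    parts = ["type:Article", f'text:"{text}"']
    for field, value in (include or {}).items():
        parts.append(f'{field}:"{value}"')
    for field, value in (exclude or {}).items():
        # Exclusion wraps the same field:value pair in not(...)
        parts.append(f'not({field}:"{value}")')
    return " ".join(parts)


print(article_query("Gaza", include={"publisherCountry": "Turkey"}))
print(article_query("Gaza", exclude={"publisherCountry": "Turkey"}))
```

The two printed strings match the Turkey queries shown above, so the helper can be checked against them before it is pointed at other locations or publishers.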

Articles About a Location From a Specific Time Frame

Timeliness is important when tracking sentiment. Moments of abrupt sentiment change are actionable for many organizations. There are several syntaxes that can be used to bound article mentions by a time frame. There are also ways to facet sentiment values by a time frame (more on that in the next section).

To return articles from a date range relative to today you can use the following syntax:
// return articles with 'Gaza' in text that are more than 30 days old
type:Article text:"Gaza" date>30d

// return articles with 'Gaza' in text that are less than 2 days old
type:Article text:"Gaza" date<2d

“Greater than” and “less than” selectors for date ranges can also be used together to bound dates on either side.

// return articles with 'Gaza' in text dated before 6/11/2020 (Unix timestamp 1591888517, 15:15:17 UTC)
type:Article text:"Gaza" date<1591888517

Additionally, if you’re searching for a definitive time range (not relative to today) you can use Unix epoch time, as in the query above. Unix epoch time is the number of seconds elapsed since 00:00:00 UTC on 1 January 1970, and a quick search will turn up a number of useful tools for converting human-readable dates to this format.
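
The conversion to Unix epoch time can also be done in a few lines of code instead of an online tool. The sketch below uses only Python’s standard library. The helper names `to_epoch` and `between` are hypothetical, and the combined query assumes that absolute timestamps can be paired with `>` and `<` in the same way as the relative ranges described earlier.

```python
from datetime import datetime, timezone


def to_epoch(year, month, day):
    """Return Unix epoch seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def between(text, start, end):
    """Compose a DQL string bounded on both sides by epoch timestamps.

    Assumes, as in the timestamp query above, that date<N selects
    articles dated before timestamp N and date>N those dated after it.
    """
    return f'type:Article text:"{text}" date>{start} date<{end}'


# One week of articles: 1 June 2020 up to 8 June 2020 (UTC)
start = to_epoch(2020, 6, 1)
end = to_epoch(2020, 6, 8)
print(between("Gaza", start, end))
```

Fixing the time zone to UTC keeps the result independent of the machine the script runs on, which matters when the same bounds are reused across weekly reports.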

Sentiment of Articles About a Location

Now that we’ve seen some basics on how to return articles centered around a location, let’s jump into how to parse the sentiment of these articles. Note that each article entity within the Knowledge Graph includes all pertinent images, text, and discussions. This means you can also pass elements from article entities in bulk to sentiment analysis tools of your choice.

Using Diffbot’s built-in sentiment scores, we have a few views through which we can filter our location article data. First, we can find all positive articles (articles with a sentiment score greater than 0). Or all negative articles (sentiment scores <0).

// return all articles with positive sentiment with "Gaza" in the text
type:Article text:"Gaza" sentiment>0.0

// return all articles with negative sentiment with "Gaza" in the text
type:Article text:"Gaza" sentiment<0.0

With a facet search, we can also see a summary view of just how positive or negative articles about a location are:
// counts of articles with 'Gaza' in text, grouped by how positive or negative they are
type:Article text:"Gaza" facet:sentiment

// how many positive and negative articles with 'Gaza' in text were published in the last 7 days
type:Article text:"Gaza" facet:sentiment date<7d

Additionally, we can track counts of how many positive or negative articles there are about a location by time frame:
// count of positive articles with 'Gaza' in text by week
type:Article text:"Gaza" sentiment>0.0 facet[week]:date


// count of negative articles with 'Gaza' in text by week
type:Article text:"Gaza" sentiment<0.0 facet[week]:date
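
Weekly counts like these are the raw material for figures such as “90% negative” or a week-over-week percentage swing. As a rough sketch of that arithmetic in Python, the functions below take two dictionaries of weekly counts. The function names are hypothetical, the counts are made-up placeholder numbers rather than real query results, and the weeks are assumed to be labeled with ISO dates so that they sort chronologically.

```python
def negative_share(positive, negative):
    """Fraction of scored articles that are negative, per week.

    Articles with a sentiment of exactly 0 appear in neither input,
    because the two queries use sentiment>0.0 and sentiment<0.0.
    """
    shares = {}
    for week in sorted(set(positive) | set(negative)):
        pos = positive.get(week, 0)
        neg = negative.get(week, 0)
        total = pos + neg
        shares[week] = neg / total if total else None
    return shares


def percent_change(counts):
    """Percent change in a count from each week to the next."""
    weeks = sorted(counts)
    changes = {}
    for prev, cur in zip(weeks, weeks[1:]):
        if counts[prev]:
            changes[cur] = (counts[cur] - counts[prev]) / counts[prev] * 100
        else:
            changes[cur] = None  # undefined when the earlier week is zero
    return changes


# Placeholder counts for illustration only (not real Knowledge Graph data)
positive = {"2020-05-18": 120, "2020-05-25": 40, "2020-06-01": 30}
negative = {"2020-05-18": 130, "2020-05-25": 360, "2020-06-01": 270}

print(negative_share(positive, negative))
print(percent_change(negative))
```

Returning `None` for weeks with no scored articles, or for a change measured from zero, avoids reporting a misleading 0% or an infinite swing.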

Armed with these starter queries you can start to explore a range of sentiment and time-related news monitoring queries within Diffbot’s Knowledge Graph. Why not try to emulate the queries used for this story’s initial observations? Or point them at your own locations of interest. With a free 14-day trial you can gain immediate access to global news mention sentiment today.

Need some ideas? Check out the Knowledge Graph Documentation, or other stories by DQL.