Web Pages Suck

At Diffbot HQ we operate under the general premise that web pages… sort of suck.

This is not meant to demean Sir Berners-Lee, former Vice-President Gore, our own talented web designer, or even the good folks at Macromedia whose Dreamweaver made so much possible when we knew so little about tables. But to our gleaming, unfeeling, robotic eye, web pages really are bad news.

Web pages are inherently built for monitors — still

Featuring Mauna Kea in stunning Super VGA.

When the web emerged, the notion of multiple screens was laughable. Heck, the notion of multiple monitors was laughable, with motherboard video chips struggling to keep pace with the rapid leaps being made from EGA to VGA to SVGA.

Now connected screens are everywhere, and nearly every screen carries the same expectation: that it display web content in its own crazy way, shape or form.

But while videos and images can scale-to-fit (though codecs often stand in the way, and scaling-up is a nonstarter), for web content the only way to handle additional screen sizes is through hand-crafted, specialized templates and stylesheets — effectively, totally different sites. Or you can throw up your hands and collapse content in the name of “responsiveness,” all the while sacrificing your overall aesthetics, or ignoring them completely, for anything but the screen size your designer is using.

The net: web pages on anything but “standard” monitors — a decreasing percentage of the consumption environment — are an exercise in compromise.

Web pages represent a hybrid of flexible design plus flexible data — but our devices aren’t flexible

The web represents a weird stage in information delivery, with the medium suddenly needing to be as malleable as the content within it. Beforehand, content design needed only target a fixed output entirely in the control of the content creator: a 4:3 television screen, a broadsheet or tabloid newspaper front page, the 32-42 lines on a page of a trade paperback, the three seconds of airtime for “station identification.” In these, traditional graphic design loosely accommodated dynamic information (the length of any given day’s headlines, for instance), but more typically the content was curated to fit design templates (giving unto the world Headlinese).

media='screen and (max-width: 87px)'
max-width: 87px

With the web, the delivery mechanism has to be responsive not only to dynamic content, but for the first time to dynamic consumption environments. This started with different monitor sizes and resolutions, then operating system window sizes, then different browser interpretations of markup, then TVs, then phones, then suddenly anything that was connected… which is to say, nearly everything, from your car to your watch to the little printer in your kitchen.

Templates used to be sufficient. And in moving content online, we’ve replicated the notion of a design template. But in a world where the consumption environment has control, our best efforts to accommodate this have basically been to create additional templates for the templates (@media queries, etc.).
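To make the "templates for the templates" problem concrete, here is a toy model of it in Python. It is a sketch under loud assumptions: the breakpoint ranges and stylesheet names are made up, and real browsers evaluate @media rules through the cascade rather than by picking a single file. The point is only that any hand-maintained list of screen ranges leaves gaps.

```python
from typing import List, Optional, Tuple

# Hypothetical hand-crafted breakpoints: (min width, max width, stylesheet).
# These numbers and file names are illustrative assumptions, not a standard.
BREAKPOINTS: List[Tuple[int, int, str]] = [
    (0, 480, "phone.css"),
    (768, 1024, "tablet.css"),
    (1200, 1920, "desktop.css"),
]


def pick_stylesheet(viewport_width: int) -> Optional[str]:
    """Return the first stylesheet whose range contains the width, else None."""
    for min_width, max_width, name in BREAKPOINTS:
        if min_width <= viewport_width <= max_width:
            return name
    return None


def uncovered_widths(widths: List[int]) -> List[int]:
    """Return the widths that no hand-crafted template was written for."""
    return [w for w in widths if pick_stylesheet(w) is None]


if __name__ == "__main__":
    # A few device widths, including some that fall between the templates.
    for width in [320, 600, 800, 1100, 2560]:
        print(width, pick_stylesheet(width))
```

Every width that comes back as `None` is a device the designer did not anticipate, and the usual fix is to add yet another row to the list.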

No longer in charge. Except when on his Nexus 10.
Nope. Except when on his Nexus 10.

Instead of trying to outmaneuver the future of dimensions and display, content producers ought to (be able to) deliver content in a way that allows devices to present it as appropriately as possible. The device / the environment / the screen is in charge, which is really just another way of saying the consumer is.

Web pages present content with little to no structural differentiation; ad hoc hierarchy; and zero standardization

Looking at the markup of any given web page, you’d be hard-pressed to figure out which content was “important.” And I say “hard-pressed” because “f**king hosed” doesn’t belong in the vocabulary of a gentrified robot.

With rare exceptions, markup differentiation exists solely for the sake of easing stylesheet and script management, content population and Ajax behaviors. But class names of .content and IDs of #main, while admittedly better than .stuff and #things, are specific to designer/developer whims. One man’s #main is another man’s #primary.

And while English seems to be the universal language of markup… says who? How do we semantically accommodate <div id="contenido">?
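Here is roughly what a naive robot is reduced to: guessing at the important block by ID name. This is a minimal sketch using Python's standard `html.parser`. The `CANDIDATE_IDS` list is our own invented guess list, not any kind of standard, which is exactly the problem.

```python
from html.parser import HTMLParser
from typing import List, Optional

# Hypothetical guess list of "main content" ids. Purely an assumption:
# nothing in HTML says these names mean anything.
CANDIDATE_IDS = {"main", "content", "primary", "article"}


class IdCollector(HTMLParser):
    """Collects every id attribute in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.ids: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def guess_main_id(html: str) -> Optional[str]:
    """Return the first id that matches the guess list, or None."""
    collector = IdCollector()
    collector.feed(html)
    for element_id in collector.ids:
        if element_id.lower() in CANDIDATE_IDS:
            return element_id
    return None


if __name__ == "__main__":
    print(guess_main_id('<div id="main"><p>hello</p></div>'))
    print(guess_main_id('<div id="contenido"><p>hola</p></div>'))
```

The first page matches by luck of vocabulary. The second holds the same kind of content and the guess list finds nothing, because the name was only ever meaningful to the person who wrote it.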

Now, you might say, these classes and IDs, when paired with scripts and stylesheets, do the job. They visually make the web page appear as it’s supposed to, so: problem solved. (Unless of course you’re blind, or on some sites colorblind, or you’re looking at the site on your new phablet which is just between the specified sizes of the site’s hand-crafted responsive stylesheets, so the price of the product you’re browsing is slightly off-screen.)

The web ought to be more than that. It’s more than the dumb, fixed visual medium of its predecessors, and its pages should enable identification and consumption of all of the “important stuff” without a visual element.

(And yes, this is where the vaunted semantic web should exist, but with the exception of Facebook-driven OpenGraph tags for the purposes of easy Timeline sharing, and SEO abuse for the purposes of Google rank jockeying, it doesn’t. Even where it does exist it’s for a limited data set, largely in the “metadata” realm, and largely for blog posts. There are lots of other web page types out there.)
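For a sense of how limited that data set is, here is a minimal sketch that pulls the OpenGraph tags out of a page with Python's standard `html.parser`. The sample markup, including the product price, is invented for illustration. The only convention assumed is the OpenGraph one of `<meta>` tags whose `property` starts with `og:`.

```python
from html.parser import HTMLParser
from typing import Dict


class OpenGraphCollector(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            self.tags[prop] = content


def extract_open_graph(html: str) -> Dict[str, str]:
    """Return every OpenGraph property found in the markup."""
    collector = OpenGraphCollector()
    collector.feed(html)
    return collector.tags


# Invented sample page: two OpenGraph tags, plus a price that lives
# only in the visual markup.
SAMPLE_PAGE = (
    '<html><head>'
    '<meta property="og:title" content="Web Pages Suck">'
    '<meta property="og:type" content="article">'
    '<meta name="viewport" content="width=device-width">'
    '</head><body><span class="price">$19.99</span></body></html>'
)

if __name__ == "__main__":
    print(extract_open_graph(SAMPLE_PAGE))
```

The title and type come back cleanly. The price, arguably the most important thing on a product page, is nowhere in the result, because nothing in the metadata describes it.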

Don’t get us wrong. (We’re robots, so we’re often gotten wrong.) The web and most of its pages are incredible. The web’s existence is letting me deliver this important message from my pajamas. Web pages and their takeover of the stock market enabled our investors to fund our competitive salaries, amazing health benefits, first-class office space and huge-ass monitors (hey, we’re hiring), monitors that make viewing web pages such a pleasure, except of course if those pages don’t scale well to “huge-ass.”

’Cause web pages also suck.

In our next post, because this is long and we have to think some more: what do we wish for the web to help its pages suck less?

John Davi

John runs everything product for Diffbot. Drop him a line at john at diffbot if you have questions.