Inner nerd and Semantic Web: The gory details

What we just heard in the introduction means that the Semantic Web solves, once and for all (or at least for a while), the data modeling problem we face today. Data is no longer proprietary to an application or use case. It describes itself, independent of any specific application. Can you imagine what this means for data re-use?

But why is re-use so important? Let me explain with a post by Tim Berners-Lee from 2007:

The word Web we normally use as short for World Wide Web. The WWW increases the power we have as users again. The realization was “It isn’t the computers, but the documents which are interesting”. Now you could browse around a sea of documents without having to worry about which computer they were stored on. Simpler, more powerful. Obvious, really.

Looking back, this fits perfectly with a revelation I had myself one day. In the early 90s, before I started using the web, my computer was the center of my digital universe. Everything was on my OS/2 box and I was happy. Now I have multiple devices, with data on each of them and on various sites somewhere on the Internet. Is this better? I’m not so sure, as most of these devices and sites act as small islands in a wide ocean, with no way to get from one island to the other. So let us quote Tim again:

[…] The Net links computers, the Web links documents. Now, people are making another mental move. There is realization now, “It’s not the documents, it is the things they are about which are important”. Obvious, really.

There are some important remarks in here. The first: while we (or rather our brains) can make the link between things, the computer cannot. If you don’t believe me, search Google for something like Jaguar and try to get only the sites about the animal. That turns out to be pretty hard for Google.

Biologists are interested in proteins, drugs, genes. Businesspeople are interested in customers, products, sales. We are all interested in friends, family, colleagues, and acquaintances. There is a lot of blogging about the strain, and total frustration that, while you have a set of friends, the Web is providing you with separate documents about your friends. One in Facebook, one on LinkedIn, one in LiveJournal, one on advogato, and so on. The frustration that, when you join a photo site or a movie site or a travel site, you name it, you have to tell it who your friends are all over again. The separate Web sites, separate documents, are in fact about the same thing — but the system doesn’t know it.

The other remark concerns what Tim calls separate documents. You can think of sites like LinkedIn or Facebook as separate documents in that sense. Why? Simple: these sites pervert the original design idea of the web. Each one creates something like a giant document, or a black hole, that sucks data in and exposes only a few things to the outside world through proprietary APIs. Sounds like what? Right, sooo 90s! Been there, done that, just with Microsoft products back then. Doing the same thing in the web browser and calling it Web 2.0 doesn’t really make it any better.

So how is the Semantic Web gonna make this better? Pretty simple: the data is the API! If you describe information the Semantic Web way, you use RDF as the lingua franca of the web, and RDF is designed to provide a uniform, unambiguous way of accessing and querying it. No more lock-in, no more data proprietary to an application or use case, just data re-use. And another thing I really love about it: for transport it uses the same foundation the web has been running on for 20 years: HTTP. Good times!
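
To make "the data is the API" a bit more tangible, here is a minimal sketch in plain Python. It is not a real RDF library, just a toy model of the idea: every statement is a (subject, predicate, object) triple, and things are named by URIs. The example.org URIs, the two "sites" and the `match` helper are made up for illustration; only the RDF, RDFS and FOAF property URIs are real vocabulary terms.

```python
# Toy model of RDF-style triples using only the Python standard library.
# The example.org URIs and the two "sites" are hypothetical.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

ME = "http://example.org/people/me"
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
ANIMAL = "http://example.org/types/Animal"
CAR = "http://example.org/types/Car"
JAGUAR_CAT = "http://example.org/things/jaguar-the-cat"
JAGUAR_CAR = "http://example.org/things/jaguar-the-car"

# Two independent sites publish statements. Both use the same URI for "me",
# and each "Jaguar" gets its own URI even though the label is identical.
site_a = {
    (ME, FOAF_KNOWS, ALICE),
    (JAGUAR_CAT, RDF_TYPE, ANIMAL),
    (JAGUAR_CAT, RDFS_LABEL, "Jaguar"),
}
site_b = {
    (ME, FOAF_KNOWS, ALICE),
    (ME, FOAF_KNOWS, BOB),
    (JAGUAR_CAR, RDF_TYPE, CAR),
    (JAGUAR_CAR, RDFS_LABEL, "Jaguar"),
}


def match(graph, s=None, p=None, o=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])
    ]


# Merging data from different sources is just a set union.
merged = site_a | site_b

# Who are my friends, no matter which site said so?
friends = sorted(o for _, _, o in match(merged, s=ME, p=FOAF_KNOWS))

# Which things are labelled "Jaguar" AND are animals?
labelled = {s for s, _, _ in match(merged, p=RDFS_LABEL, o="Jaguar")}
animals = {s for s, _, _ in match(merged, p=RDF_TYPE, o=ANIMAL)}
jaguar_animals = sorted(labelled & animals)

print(friends)
print(jaguar_animals)
```

Because both sites use the same URI for the same person, the friend lists merge without either site knowing about the other. And because the cat and the car have different URIs and types, the Jaguar question gets a precise answer. A real setup would use an RDF store and a query language such as SPARQL instead of this hand-rolled helper.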

If you want to get your hands dirty now, check out our gentle introduction to the technology behind the Semantic Web.