Inner nerd and Semantic Web: The gory details

What we just heard in the introduction means that the Semantic Web once and for all (at least for a while) solves the data modeling problem we face today. Data is no longer proprietary to a single application or use case. It describes itself, regardless of which application or use case it came from. Can you imagine what this means for data re-use?

But why is re-use so important? Let me explain with a post Tim Berners-Lee wrote in 2007:

The word Web we normally use as short for World Wide Web. The WWW increases the power we have as users again. The realization was “It isn’t the computers, but the documents which are interesting”. Now you could browse around a sea of documents without having to worry about which computer they were stored on. Simpler, more powerful. Obvious, really.

Looking back, this fits perfectly with one of the revelations I had myself one day. In the early 90s, before I started using the web, my computer was the center of my digital universe. Everything was on my OS/2 box and I was happy. Now I have multiple devices, with data on each one of them and on various sites somewhere on the Internet. Is this better? I’m not so sure, as most of these devices and sites act as small islands somewhere in the wide ocean, and there is no way to get from one island to the other. So let us quote Tim again:

[…] The Net links computers, the Web links documents. Now, people are making another mental move. There is realization now, “It’s not the documents, it is the things they are about which are important”. Obvious, really.

There are two important remarks in here. The first: while we (or rather our brains) can make the link between things, the computer cannot. If you don’t believe me, google for something like Jaguar and try to get only the sites which are about the animal. That seems to be pretty hard for Google.

Biologists are interested in proteins, drugs, genes. Businesspeople are interested in customers, products, sales. We are all interested in friends, family, colleagues, and acquaintances. There is a lot of blogging about the strain, and total frustration that, while you have a set of friends, the Web is providing you with separate documents about your friends. One in Facebook, one on LinkedIn, one in LiveJournal, one on advogato, and so on. The frustration that, when you join a photo site or a movie site or a travel site, you name it, you have to tell it who your friends are all over again. The separate Web sites, separate documents, are in fact about the same thing — but the system doesn’t know it.

The other remark is related to what Tim calls separate documents. You can think of sites like LinkedIn or Facebook as separate documents in that regard. Why? Simple: those sites pervert the original design idea of the web, as each of them creates something like a giant document or black hole which sucks data in and opens up just a few things to the outside world over proprietary APIs. Sounds like what? Right, sooo 90s! Been there, done that, just with Microsoft products back then. Doing the same thing in the web browser and calling it Web 2.0 doesn’t really make it any better.

So how is the Semantic Web gonna make this better? Pretty simple: the data is the API! If you describe information the Semantic Web way, you use RDF as the lingua franca of the web, and this, by definition, provides a universal, unambiguous way of accessing and querying it. No more lock-in, no more data proprietary to one application or use case, just data re-use. And another thing I really love about it: as transport it uses the same foundation the web has been running on for 20 years: HTTP. Good times!
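To make "the data is the API" a bit more tangible, here is a minimal sketch in plain Python. It is not a real RDF library, only an illustration of the idea: every statement is a (subject, predicate, object) triple, and things are named by URIs. All URIs, the tiny vocabulary and the match helper are made up for this example. Because both hypothetical sites use the same identifier for the same person, merging their data is nothing more than a set union, and one generic query pattern works on the result.

```python
# Hypothetical URIs and vocabulary, invented for this illustration only.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
CAROL = "http://example.org/people/carol"
KNOWS = "http://example.org/vocab/knows"
NAME = "http://example.org/vocab/name"

# Each statement is a (subject, predicate, object) triple.
# Two imaginary sites each hold part of what is known about Alice.
site_a = {
    (ALICE, NAME, "Alice"),
    (ALICE, KNOWS, BOB),
}
site_b = {
    (BOB, NAME, "Bob"),
    (ALICE, KNOWS, CAROL),
}

# Both sites use the same URI for Alice, so merging is a plain set union.
graph = site_a | site_b


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern, sorted; None is a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# One generic query, no site-specific API: whom does Alice know?
friends = [o for _, _, o in match(graph, s=ALICE, p=KNOWS)]
print(friends)
```

In a real setup the triples would be written in RDF and queried with something like SPARQL, and the URIs would be fetched over HTTP. The sketch only shows why shared identifiers and one uniform statement shape make re-use cheap.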

If you want to get your hands dirty now, you might check out our gentle introduction to the technology behind the Semantic Web.

How my inner nerd got hooked by the Semantic Web

This was supposed to be the first post for this blog, but I never published it until now. The second part is a pretty technical explanation of why I started to love the Semantic Web, which might also explain the subtitle of this blog ;) I have gotten way better at explaining it in the meantime, but I still think it makes sense to post it for historical and nerdy reasons, so here we go.

About four years ago a friend of mine and I were having dinner at my place and I tried to explain to him what we were aiming at. Our vision was still very abstract back then, but he told me that the stuff I was talking about sounded a lot like something which goes under the name Semantic Web. Some time later he gave a short presentation to our team. I remember sitting there and hearing about things like triples, SPARQL, the giant global graph and so on.

To be honest, I didn’t get it at all at first. But somehow it stuck in my head. I had the idea that this technology might indeed be a part of the puzzle we were trying to solve. A few months later it was summer and I was looking for a good excuse to sit at the lake in the sun instead of working at the computer. So I printed a bunch of papers from the W3C explaining the Semantic Web and its components and started reading.

I was amazed. I mean, I was seriously amazed. I had spent quite some time in the IT business and had done quite a bit of data modeling, programming and all that stuff, but what I read was sexy, super sexy (inner nerd speaking here). I still didn’t understand the whole thing, but it seemed like there was something out there with the potential to solve all the nasty technical problems I had sooner or later run into in the past.

So what is so sexy about it? Just as the Internet (or rather its protocol suite, TCP/IP) connected computers through a universal language in the 70s, and the Web connected documents in the 90s, the Semantic Web connects things in the same way. What things? Anything! Seriously!

If this sounds fair enough, you can stop reading now. If you don’t believe me yet, read on.