Fly Me to the Moon

It’s interesting to see how people talk about Linked Data & RDF these days. Most of the time the discussions focus on one specific feature of the technology stack, which either rocks or sucks depending on which side the author stands.

Let’s start with what are, for me, the best two pages about RDF I’ve read since I started working with the technology five years ago. In my opinion, Irene Polikoff summarizes perfectly what RDF is about:

The ability to combine multiple data sources into a whole that is greater than the sum of its parts can let the business glean new insights. So how can IT combine multiple data sources with different structures while retaining the flexibility to add new ones into the mix as you go along? How do you query the combined data? How do you look for patterns across subsets of data that came from different sources?

The article gives a very good idea of which parts of the RDF stack you need, and when, to tackle these kinds of questions. The reason I started reading into RDF & Linked Data is that I think RDF can answer these kinds of questions in a time- and cost-efficient way, up to the scale of global companies and governments. And this is the scale I’m really interested in.
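To make the idea of combining sources a bit more concrete, here is a minimal sketch in plain Python. It deliberately uses no RDF library, and everything in it is an assumption made up for illustration: the two sources (a CRM export and a billing export), the `http://example.org/` identifiers, and the helper names. It only shows the principle that differently structured data can be mapped to subject/predicate/object triples, merged with a simple union, and then queried across the former source boundaries.

```python
# Toy illustration only: a real RDF stack uses IRIs, typed literals and SPARQL.
# All identifiers and both data sources below are hypothetical.

EX = "http://example.org/"

# Source 1: a CRM export, one dict per customer.
crm_rows = [
    {"id": "c1", "name": "ACME Corp", "country": "CH"},
    {"id": "c2", "name": "Globex", "country": "DE"},
]

# Source 2: a billing export with a different structure (plain tuples).
billing_rows = [
    ("inv-1", "c1", 1200),
    ("inv-2", "c1", 300),
    ("inv-3", "c2", 950),
]


def crm_to_triples(rows):
    """Map CRM rows to (subject, predicate, object) triples."""
    for row in rows:
        subject = EX + "customer/" + row["id"]
        yield (subject, EX + "name", row["name"])
        yield (subject, EX + "country", row["country"])


def billing_to_triples(rows):
    """Map billing rows to triples, reusing the same customer identifiers."""
    for invoice_id, customer_id, amount in rows:
        subject = EX + "invoice/" + invoice_id
        yield (subject, EX + "billedTo", EX + "customer/" + customer_id)
        yield (subject, EX + "amount", amount)


# Combining the sources is a plain set union of their triples.
graph = set(crm_to_triples(crm_rows)) | set(billing_to_triples(billing_rows))


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def revenue_by_country(graph):
    """A question neither source can answer alone: billed amount per country."""
    totals = {}
    for invoice, _, customer in match(graph, p=EX + "billedTo"):
        amount = match(graph, s=invoice, p=EX + "amount")[0][2]
        country = match(graph, s=customer, p=EX + "country")[0][2]
        totals[country] = totals.get(country, 0) + amount
    return totals


print(sorted(revenue_by_country(graph).items()))
```

In this sketch, adding a third source only means writing one more small mapping function; the query stays the same as long as the new source reuses the same identifiers.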

And this brings us to the other end of what we need to become mainstream with a technology: the average (web) developer. It’s still painfully hard to use the highly flexible data model you get with RDF to create user interfaces. I know this because some colleagues and I have been working on this for some time now, and it’s also the domain where we see a lot of (often negative) postings about Linked Data and RDF. Some examples:

What they have in common is that they only look at the Semantic Web stack from their particular, limited perspective. The things they criticize are mostly correct, in their own small world. What they fail to see is that the Semantic Web does not try to solve a problem that is easy but one that is pretty hard: find a way to split up the web of documents into a web of data and make sure that machines can help us interpret it and make our lives easier. I wasn’t aware of the real complexity of this before I started working with the RDF stack.

Now there are several options to handle this:

  • Ignore everything other than what you are trying to solve: JSON-LD is great, and it probably does make things easier for a lot of developers. Manu states that he never had the need for a quadstore and SPARQL in 7+ years of working with the technology stack. Good for him, but then we obviously don’t solve the same kind of problems. This is not a problem at all, but it’s important to keep in mind when we compare technologies.
  • Reinvent the wheel: Jens Ohlig first rants about the Semantic Web and then explains for 30 minutes why Wikidata is so much work: unique identifiers, relationships between data, ontologies, provenance, multiple languages etc. I understand that Wikidata decided against using RDF and went with what they know best, which is probably PHP & MySQL. But it doesn’t help your point if you show me that in the end you solve exactly the same kind of problems RDF already addresses in W3C standards. You just build yet another data silo.
  • Not invented here: The Nepomuk project was funded by an EU FP7 research grant and I guess that none of the guys who originally worked on the RDF code are still there. The new guys probably mainly know key/value stores and didn’t understand RDF or graphs. The normal reaction in this case is to throw things away and start from scratch, instead of learning something which looks unfamiliar at first.
  • Accept that the world is complicated and continue working on the missing parts of the stack.

Manu Sporny:

TL;DR: The desire for better Web APIs is what motivated the creation of JSON-LD, not the Semantic Web. If you want to make the Semantic Web a reality, stop making the case for it and spend your time doing something more useful, like actually making machines smarter or helping people publish data in a way that’s useful to them.

I fully agree with Manu, but again, there are more problems out there than the ones JSON-LD tries to address. I think Brian Sletten summarized this best in a recent posting at semanticweb.com:

Fundamentally, however, I think the problem comes down to the fact that the Semantic Web technology stack gets a lot of criticism for not hiding the fact that Reality is Hard. The kind of Big Enterprise software sales that get attention promise to hide the details, protect you from complexity, to sweep everything under the rug.

[lots more good stuff]

What is the alternative? If we abandon these ideas, what do we turn to? The answer after even the briefest consideration is that there is nothing else on the table. No other technology purports to attempt to solve the wide variety of problems that RDF, RDFS, OWL, SPARQL, RDFa, JSON-LD, etc. do.

I couldn’t agree more. You can be big enough to do all this work on your own. If you are Google or Facebook that might even make sense. For everyone else, go with the standards. Even Google recommends this.

I’m glad that Manu Sporny agreed to keep JSON-LD RDF compatible, as they solved a lot of interesting problems around JSON-LD like graph normalization and data signing. Maybe we need more people like him who “stop making the case for it and spend [their] time doing something more useful”. But at the same time we need the guys who want to bring us to the moon. I’m glad Tim Berners-Lee decided to do so more than 20 years ago when he wrote his ‘Vague, but exciting’ proposal.

6 thoughts on “Fly Me to the Moon”

  1. Hi Adrian,
    I am glad to see that you are very excited about the Semantic Web. So was I, and still am. Unfortunately, your blog post shows a few inaccuracies and misunderstandings that I wanted to point out to you:
    * JSON-LD is RDF. That is exactly the point of JSON-LD. You can still SPARQL it, you still have URIs, you still have a graph. I am not sure what the issue is, and why you would regard JSON-LD not to be a proper part of the Semantic Web.
    * Wikidata exports its data to RDF, and the semantics of the data model of Wikidata are based on OWL. Wikidata did not reinvent the wheel at all, but is built on top of the stack. I wonder why you believe that Wikidata is not a proper part of the Semantic Web.
    * RDF inside an application that is not built on top of the Web has proven in several cases to be harder than expected. Read up on the histories of Mozilla and RDF for its internals, or how Joost has used RDF, and Nepomuk, where RDF was baked into the file system. RDF is an awesome exchange format, but it is not necessarily the best internal data model for an application. Before one decries such a decision as short-sighted, it might be useful to study what has come before.
    Whereas I completely understand your frustration with the world and its seeming non-acceptance of RDF, I would like to ask you to take a moment and look a bit deeper into the examples you have selected. I do not think they provide the strongest case for the point you want to make.
    Cheers!

    1. Hi Denny,

      Thanks for the remarks!

      JSON-LD:
      I am fully aware of the RDF compatibility, I even praise it in my article :) What I criticize is that Manu disregards some aspects of the RDF stack because he has had no need for them so far. And if you read his article you can get the impression that RDF compatibility was on the edge during the standardization process. Not sure if this is really the case but I get that impression and I’m glad they changed their mind.

      Wikidata:
      I’m not sure if you listened to the podcast I mention. It becomes pretty clear that they started to rebuild something like RDF without using RDF. Like Linked Data they promote URIs for things, but as of today I cannot seem to get any RDF back from any Wikidata resource with content negotiation. I know that they are working on RDF dumps but it’s clearly not Linked Data.

      regards

      Adrian

      1. I listened to the Podcast. Wikidata did not reimplement RDF, it is built on top of RDF and OWL. Google for Wikidata OWL mapping, etc.
        With regard to content negotiation, it should be working – have you tried with the correct URIs? I.e. try to conneg http://www.wikidata.org/entity/Q42 – which is the actual URI for the entity. (wiki/Q42 is the HTML representation, and will not conneg). (Here’s a relevant blog post: http://notconfusing.com/3-ways-to-access-wikidata-data-until-it-can-be-done-properly/ )
        Again, Wikidata was planned and implemented as a proper part of the Semantic Web, from the beginning, and so it is today, using the relevant standards.

        1. Do you have any links related to the Wikidata statements? That’s really not what I understood and read so far but I would love to learn that they do it right :) Thanks for the pointer regarding getting RDF, it does seem to work but it is definitely not proper content negotiation yet as you have to add .ttl to the file it redirects me to. But I didn’t know that, this is a big step forward for Wikidata!

  2. You may find a very interesting and highly relevant debate of mine at http://goo.gl/KDvkTL. I also post regularly in the Semantic Web Research group on LinkedIn. In brief, I believe there is going to be a better solution than RDF and Linked Data. I call it R3DM and it has a solid foundation based on Aristotle’s sign theory….

    1. You may also find an interesting comparison of several Variable-Value-Pair data models, I call them VVP, at http://goo.gl/EWFmQZ and see what is a new Semantic data model foundation I propose based on a solid theoretical framework of Aristotle’s semiotic triangle.
