Category Archives: Semantic Web

Fly Me to the Moon

It’s interesting to see how people talk about Linked Data & RDF these days. Most of the time the discussions focus on one specific feature of the technology stack which either rocks or sucks, depending on which side the author is on.

Let’s start with what are for me the two best pages about RDF I’ve read since I started working with the technology five years ago. Irene Polikoff, in my opinion, summarizes perfectly what RDF is about:

The ability to combine multiple data sources into a whole that is greater than the sum of its parts can let the business glean new insights. So how can IT combine multiple data sources with different structures while retaining the flexibility to add new ones into the mix as you go along? How do you query the combined data? How do you look for patterns across subsets of data that came from different sources?

The article gives a very good idea of when you need which parts of the RDF stack to tackle these kinds of questions. The reason I started reading into RDF & Linked Data is that I think RDF can solve these kinds of questions in a time- and money-efficient way, up to the scale of global companies and governments. And this is the scale I’m really interested in.

And this brings us to the other end of what we need to make a technology mainstream: the average (web) developer. It’s still painfully hard to use the highly flexible data model you get with RDF to create user interfaces. I know this because some colleagues and I have been working on this for some time now, and it’s also the domain where we see a lot of (often negative) postings about Linked Data and RDF. Some examples:

What they have in common is that they only look at the Semantic Web stack from their particular, limited perspective. The things they criticize are mostly correct, within their own small world. What they fail to see is that the Semantic Web does not try to solve a problem that is easy but one that is pretty hard: find a way to split up the web of documents into a web of data and make sure that machines can help us interpret it and make our lives easier. I wasn’t aware of the real complexity of this before I started working with the RDF stack.

Now there are several options to handle this:

  • Ignore everything other than what you are trying to solve: JSON-LD is great, and it probably does make things easier for a lot of developers. Manu states that he never had the need for a quadstore and SPARQL in more than 7 years of working with the technology stack. Good for him, but then we obviously don’t solve the same kinds of problems. This is not a problem at all, but it’s important to keep in mind when we compare technologies.
  • Reinvent the wheel: Jens Ohlig first rants about the Semantic Web and then explains for 30 minutes why Wikidata is so much work: unique identifiers, relationships between data, ontologies, provenance, multiple languages etc. I understand that Wikidata decided against using RDF and went for what they know best, which is probably PHP & MySQL. But it doesn’t help your point if you show me that in the end you solve exactly the same kinds of problems RDF defines in W3C standards. You just build yet another data silo.
  • Not invented here: The Nepomuk project was funded by an EU FP7 research grant, and I guess that none of the guys who originally worked on the RDF code are still there. The new guys probably mainly know key/value stores and didn’t understand RDF or graphs. The normal reaction in this case is to throw things away and start from scratch, instead of learning something which looks unfamiliar at first.
  • Accept that the world is complicated and continue working on the missing parts of the stack.

Manu Sporny:

TL;DR: The desire for better Web APIs is what motivated the creation of JSON-LD, not the Semantic Web. If you want to make the Semantic Web a reality, stop making the case for it and spend your time doing something more useful, like actually making machines smarter or helping people publish data in a way that’s useful to them.

I fully agree, Manu, but again, there are more problems out there than the ones JSON-LD tries to address. I think Brian Sletten summarized this best in a recent posting at semanticweb.com:

Fundamentally, however, I think the problem comes down to the fact that the Semantic Web technology stack gets a lot of criticism for not hiding the fact that Reality is Hard. The kind of Big Enterprise software sales that get attention promise to hide the details, protect you from complexity, to sweep everything under the rug.

[lots more good stuff]

What is the alternative? If we abandon these ideas, what do we turn to? The answer after even the briefest consideration is that there is nothing else on the table. No other technology purports to attempt to solve the wide variety of problems that RDF, RDFS, OWL, SPARQL, RDFa, JSON-LD, etc. do.

I couldn’t agree more. You can be big enough to do all this work on your own. If you are Google or Facebook, that might even make sense. For everyone else: go with the standards. Even Google recommends this.

I’m glad that Manu Sporny agreed to keep JSON-LD RDF-compatible, as they solved a lot of interesting problems around JSON-LD like graph normalization and data signing. Maybe we need more people like him who “stop making the case for it and spend [their] time doing something useful”. But at the same time we need the guys who want to bring us to the moon. I’m glad Tim Berners-Lee decided to do so more than 20 years ago when he wrote his ‘Vague, but exciting’ proposal.

Data quality counts

When linking with other data sources, knowing the origin and the quality of external data is without a doubt important. After all, an important point of using semantic web technology is to use other data sources to enrich our own data and thereby enhance our own solutions.

People who are new to RDF, and who have so far never or rarely been in touch with the task of interlinking with foreign data sources, may now think that this is a problem that only comes with RDF. On the contrary! Any data retrieved from whatever other system may be right or wrong, and may be complete or incomplete. The RDF concept even states literally that RDF-based data is by definition neither correct nor complete – well, at least if you are not the one to ensure that. But this problem only gets bigger the more you access external data sources, and once you start using semantic web technology to do so, it will simply happen more and more often.

This raises questions like: how reliably do you know from what source a given set of data originates, how much do you know about the provider of that data source, how accurate is the data model, and how much can you trust the quality of data maintenance? Obviously the answers to these questions will have a direct impact on how much you can benefit from interlinking with the given data. If the quality of external data is not ensured, the quality of your solution will in turn suffer.
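One practical pattern for keeping track of where statements come from is to load every external source into its own named graph, so the provenance stays queryable right next to the data. Here is a minimal sketch with Python and rdflib; all URIs and graph names are made-up placeholders, not a description of any particular product:

```python
from rdflib import Dataset, URIRef, Literal, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/")                      # hypothetical namespace
DBPEDIA_GRAPH = URIRef("http://example.org/graph/dbpedia") # one named graph per source

ds = Dataset()

# Statements coming from the external source go into its own named graph ...
g = ds.graph(DBPEDIA_GRAPH)
g.add((EX.Zurich, RDFS.label, Literal("Zürich", lang="de")))

# ... and provenance about that graph itself goes into the default graph.
ds.add((DBPEDIA_GRAPH, EX.retrievedFrom, URIRef("http://dbpedia.org/sparql")))

# Later we can always ask which graph (i.e. which source) a statement came from.
query = """
SELECT ?g ?label WHERE {
  GRAPH ?g { ?city <http://www.w3.org/2000/01/rdf-schema#label> ?label }
}
"""
for row in ds.query(query):
    print(row.g, row.label)
```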

So if you link to external data yourself, you will have to pay close attention to the quality of the data sources you use, in order to avoid any bad impact on the quality of your own products and services.

Does RDF fit into my architecture?

When software engineers come into contact with the world of semantic data for the first time, it may be difficult for them to grasp all the benefits and consequences at once. When I heard of RDF for the first time, I thought this concept could possibly bring more risks than benefits. So how would RDF fit into the architecture of my solutions? Or better: why did I feel at first that this possibly might not be the case? Just in case you have or have had similar feelings, I would like to share my thoughts on that with you.

I think that back then I felt quite uncertain because I would have to let go of all the known ways of accessing structured data. At first I even considered RDF-based data to be stored with no real structure, but of course that is not true. Later, I felt that modeling data with literally any vocabulary would imply lots of problems and hinder applications from being able to deal with data from anybody else outside, but of course that is not true either.

After some time of dealing with RDF and working with it in practice, I understood more and more that these points are not a problem at all; instead, RDF shows its beauty and flexibility by letting me do things just differently.

First of all, it is not the case that RDF data is poorly structured. Instead, it is structured by semantics – and by semantics only!

In fact, the vocabularies used within the RDF concept provide the most flexible and at the same time a very precise and reliable way to describe and structure data in a machine-readable way. A vocabulary can be uniquely identified, so that the meaning of the described data can always be exactly determined. And with RDF you put the knowledge about the structure of data into the data itself, while with other concepts you mostly have to put it into (self-)developed software. This is why RDF is said to compare to other storage concepts like knowledge engineering compares to software engineering.
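To make this concrete, here is a tiny sketch in Python with rdflib (the person URIs are made-up placeholders): the property and class URIs are globally unique, so any consumer can look them up and understand the data without an application-specific schema, and serializing the graph keeps that meaning intact.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/people/")   # hypothetical identifiers

g = Graph()
g.add((EX.alice, RDF.type, FOAF.Person))        # foaf:Person is a globally defined class
g.add((EX.alice, FOAF.name, Literal("Alice")))  # foaf:name is a globally defined property
g.add((EX.alice, FOAF.knows, EX.bob))

# The "schema" travels with the data: the serialization carries the full meaning.
print(g.serialize(format="turtle"))
```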

Using semantics instead of conventional ways of structuring data, e.g. tables and hierarchies, has another important advantage: scalability in complexity. RDF data can become more complex without breaking any existing logic or requiring more effort to keep the more complex data model performing well. The same is not true for tables and hierarchies; they don’t scale well in complexity at all.

Second, it is not the case that one needs a restricted set of vocabularies to keep data usable for oneself or anybody else. Instead, RDF encourages the use of any existing vocabulary in order to increase the reusability of data. In fact, I can model a given set of data with any number of different vocabularies at the same time, and that without much overhead or redundancy (try to do that with SQL-based data sources…).
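For illustration, a hedged sketch of what modeling with several vocabularies at once can look like: the same resource carries both FOAF and schema.org properties side by side, and a consumer that only knows one of the two vocabularies can still use the data (the person URI is a placeholder):

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/people/")   # hypothetical
SCHEMA = Namespace("http://schema.org/")

g = Graph()
# One resource, two vocabularies, no duplication of the resource itself.
g.add((EX.alice, FOAF.name, Literal("Alice")))
g.add((EX.alice, SCHEMA.jobTitle, Literal("Data engineer")))

# A FOAF-only consumer simply ignores the schema.org triples, and vice versa.
for name in g.objects(EX.alice, FOAF.name):
    print(name)
```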

If you restricted your solutions to a fixed set of vocabularies, it would not give you any extra benefit. Instead, you would restrict your services to those RDF-based data sources that use the very same vocabularies that you do. And that would be much more of a restriction in the future than you can imagine today.

However, even if you use existing vocabularies as much as possible, you may still need to interlink your data with data from a source describing the same things but using a different vocabulary. Then you simply need a way to translate between different sets of vocabularies. Because vocabularies are themselves described semantically, this can be done in a generic manner. In the easiest case this translation would take place in the back end, that is in your triple store engine, making this step completely transparent to the services using the store. At the time of this writing, however, no database product seems to be able to do that. As an alternative, such a translation could be encapsulated in a framework through which your services would access RDF data sources, including our own. netlabs.org is currently working on a technique that allows interlinking between sets of vocabularies with a flexible way of defining such a translation, of course based on RDF data as well.
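To give an idea of how such a translation could look in the simplest case, here is a hedged sketch: a SPARQL CONSTRUCT query (run in-memory with rdflib) that maps one vocabulary’s property onto another’s. In a real setup the mapping itself could of course be expressed in RDF, e.g. with owl:equivalentProperty; the URIs below are placeholders:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/people/")   # hypothetical
SCHEMA = Namespace("http://schema.org/")

src = Graph()
src.add((EX.alice, SCHEMA.name, Literal("Alice")))   # the source uses schema.org

# Translate schema:name into foaf:name without touching the rest of the data.
mapping = """
PREFIX schema: <http://schema.org/>
PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
CONSTRUCT { ?s foaf:name ?n }
WHERE     { ?s schema:name ?n }
"""
translated = src.query(mapping).graph   # the result of a CONSTRUCT query is a graph
print(translated.serialize(format="turtle"))
```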

Inner nerd and Semantic Web: The gory details

What we just heard in the introduction means that the semantic web once and for all (at least for a while) solves the data modeling problem we face today. There is no application- or use-case-proprietary data anymore. The data describes itself, regardless of a specific application or use case. Can you imagine what this means for data re-use?

But why is re-use so important? Let me explain that with a post by Tim Berners-Lee from 2007:

The word Web we normally use as short for World Wide Web. The WWW increases the power we have as users again. The realization was “It isn’t the computers, but the documents which are interesting”. Now you could browse around a sea of documents without having to worry about which computer they were stored on. Simpler, more powerful. Obvious, really.

Looking back, this fits perfectly with one of the revelations I had myself one day. In the early 90s, before I started using the web, my computer was the center of my digital universe. Everything was on my OS/2 box and I was happy. Now I have multiple devices and data on each one of them and on various sites somewhere on the Internet. Is this better? I’m not so sure about it, as most of the devices and sites act as small islands somewhere in the wide ocean and there is no way to get from one island to the other. So let us quote Tim again:

[…] The Net links computers, the Web links documents. Now, people are making another mental move. There is realization now, “It’s not the documents, it is the things they are about which are important”. Obvious, really.

There are some important remarks in here: while we (or rather our brains) can make the link between things, the computer cannot. If you don’t believe me, google for something like Jaguar and get me only the sites related to the animal. Seems to be pretty hard for Google.

Biologists are interested in proteins, drugs, genes. Businesspeople are interested in customers, products, sales. We are all interested in friends, family, colleagues, and acquaintances. There is a lot of blogging about the strain, and total frustration that, while you have a set of friends, the Web is providing you with separate documents about your friends. One in Facebook, one on LinkedIn, one in LiveJournal, one on advogato, and so on. The frustration that, when you join a photo site or a movie site or a travel site, you name it, you have to tell it who your friends are all over again. The separate Web sites, separate documents, are in fact about the same thing — but the system doesn’t know it.

The other remark is related to what Tim calls separate documents. You can take sites like LinkedIn or Facebook as separate documents in that regard. Why? Simple: those sites pervert the original design idea of the web, as they create something like a giant document or black hole which sucks data in and only opens up a few things to the outside world over proprietary APIs. Sounds like what? Right, sooo 90s! Been there, done that, just with Microsoft products back then. Doing the same in the web browser as Web 2.0 doesn’t really make the whole thing better.

So how is the Semantic Web gonna make this better? Pretty simple: the data is the API! If you describe information the semantic web way, you will use RDF as the lingua franca of the web, and this, by definition, provides a universal, unambiguous way of accessing and querying it. No more lock-in, no more application- or use-case-proprietary data, but data re-use. And another thing I really love about it: as transport it uses the same foundation the web has been running on for 20 years: HTTP. Good times!
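A hedged sketch of what ‘the data is the API’ means in practice: fetch a resource over plain HTTP, parse whatever RDF comes back, and query it with SPARQL; no site-specific client library is involved. The example uses a DBpedia resource URI, but availability and the exact formats served may of course vary:

```python
from rdflib import Graph

g = Graph()
# A plain HTTP GET with content negotiation; the server answers with RDF.
g.parse("http://dbpedia.org/resource/Zurich")

# The same generic query language works against any RDF source.
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Zurich> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""
for row in g.query(query):
    print(row.label)
```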

If you want to get your hands dirty now you might check out our gentle introduction to the technology behind the semantic web.

How my inner nerd got hooked by the Semantic Web

This was supposed to be the first post for this blog, but I never published it until now. The second part is a pretty technical explanation of why I started to love the semantic web, which might also explain the subtitle of this blog ;) I have gotten way better at explaining it in the meantime, but I still think it makes sense to post it for historical and nerdy reasons, so here we go.

About four years ago a friend of mine and I were having dinner at my place and I tried to explain to him what we are aiming at. Our vision was still very abstract back then, but he told me that the stuff I was talking about sounded a lot like something that goes under the name Semantic Web. Some time later he gave a short presentation to our team; I remember sitting there and hearing about things like triples, SPARQL, the giant global graph and so on.

To be honest, I didn’t get it at all at first. But somehow it stuck in my head; I had the idea that this technology indeed might be a part of the puzzle we are trying to solve. A few months later it was summer and I was looking for a good excuse to sit at the lake in the sun instead of working on the computer. So I printed a bunch of papers from the W3C explaining the semantic web and its components and started reading.

I was amazed. I mean, I was seriously amazed. I have spent quite some time in the IT business and have done quite a bit of data modeling, programming and all that stuff, but what I read was sexy, super sexy (inner nerd speaking here). I still didn’t understand the whole thing yet, but it seemed like there was something out there with the potential to solve all the nasty technical problems I had sooner or later run into in the past.

So what is so sexy about it? Just as the Internet (or rather its protocol suite, TCP/IP) connected computers in a universal language in the 70s and the Web connected documents in the 90s, the Semantic Web connects things the same way. What things? Anything! Seriously!

If this sounds fair enough you can now stop reading. If you don’t believe me yet, read on.

Academics are rewarded for publishing papers

As mentioned in the last post, I participated in the information day for the FP7 call on big data a few weeks ago.

I’m not sure if netlabs.org will participate in this call; I see the need for big data, but our focus is on creating and interfacing graph-based information in smart ways, more about that in another post. However, I often use existing datasets to demo our technology, and I regularly run into issues which were mentioned by one of the FP7 coordinators as well:

Academics are rewarded for publishing papers, not for writing robust code.

Later, when they talked about making the research available as open source software, he mentioned another problem: sometimes the outcome of a project is code that compiles and works only on the computer of the PhD student who wrote it. And as we probably all agree, this is not really useful.

He also said that a good response time of a system from the user’s perspective is one that gives a result within at most half a second, followed by the statement that if it doesn’t provide an answer within 20ms, it won’t be part of the technology chain – which is a statement from Google (IIRC) that shows that even if you are within 0.5 seconds, you are not necessarily the only part of a chain :)

Unfortunately, most or at least many of the LOD resources I use on a regular basis for demo cases are a good example of #FAIL for the “robust code” and “half a second” remarks. For example, I often use linkedgeodata.org; unfortunately the response time for a non-cached query is still way too high on this site, it usually takes several seconds.
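You can measure this yourself with a few lines of Python over the standard SPARQL protocol; the endpoint URL below is my assumption of where linkedgeodata.org exposes its SPARQL service and may have changed, so treat it as a placeholder for any endpoint you want to test:

```python
import time
import urllib.parse
import urllib.request

ENDPOINT = "http://linkedgeodata.org/sparql"      # assumed endpoint, substitute your own
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"    # deliberately trivial query

url = ENDPOINT + "?" + urllib.parse.urlencode({"query": QUERY})
req = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})

start = time.perf_counter()
with urllib.request.urlopen(req, timeout=30) as resp:
    body = resp.read()
elapsed = time.perf_counter() - start

# Compare against the half-second and 20 ms marks mentioned above.
print(f"{len(body)} bytes in {elapsed * 1000:.0f} ms")
```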

Linkedgeodata is just one example; many projects do not realize that the service needs to be fast and reliable as well to make sure people will be able to use it in the real world. This leads to some sort of chicken-and-egg situation: as long as the experience is slow (aka sucks), people do not use a (semantic web powered) site. And as long as there are no real-world users, there is less motivation for the service provider to improve the response time.

Conclusion: if we want the semantic web to become a success, we definitely need to meet the 20ms rule! Every service which returns RDF is just one part of the technology chain and thus just one part of the final user experience!

Big Data and the Semantic Web

I wrote this post on the train on the way back from Luxembourg, where I participated in the ICT Call 8 Information and Networking Day: Intelligent Information Management, which is another EU FP7 call for research projects on big data.

The information day was pretty interesting, as I hadn’t really read into the big data topic yet. The summary was basically that big data is when the size of the data itself is a problem. Examples given included Google, which talks about 1 petabyte of data, Amazon S3 with 500 billion objects, or Walmart, which seems to process data sets of up to 100 million each day.

They did have quite a few RDF/semantic web related projects there, existing ones (lod2.eu) and proposals for new ones by groups looking for partners. I was a bit confused about RDF and LOD because although the total data size is impressive, each one of the databases like DBpedia is in itself not that big (DBpedia is only a few hundred gigabytes). And funnily enough, I had an article on my reading list about exactly this problem at semanticweb.com: Two kinds of big data.

Rob Gonzalez makes some really good remarks in there, like the statement that there are two kinds of big data: really big data sets which need to be processed on one box/instance (vertical big data), and the semantic web, which is in itself horizontal big data.

With Horizontal Big Data (maybe HBD will start catching on!), the problem isn’t how to crunch lots of data fast.  Instead, it’s how to rapidly define a working subset of information to help solve a specific need.

That’s a really good remark, and I am curious how we will be able to solve the problem of widely distributed data. So, semantic web community, listen up: there is some money available in this EU FP7 call; the deadline for proposals is 17 January 2012 at 17:00 (Brussels local time)!

Recommended readings (mentioned at the FP7 information day):

On the quest of explaining Semantic Web

A few weeks ago I gave my (so far) most successful presentation about the semantic web. And guess what: it worked so well because I almost did not mention it in the presentation.

An old friend of mine asked me to present the idea of the semantic web to the company she works for at an event they call Puzzle Lunch. The idea is to present a technology to everyone interested in the company and have lunch after that. The time limit was one hour, which I considered practically impossible. I have done quite a lot of talks about it in the past, both to programmers and to non-technical people, and I have always found it easier to explain it to someone with little or no technical background. That way I could skip the time-consuming details of RDF and related standards.

Inspired by a presentation from Bart and a long discussion with Christian the night before, I decided to drop the technical aspects completely and just try to explain how we and others use the technology. On the train to Bern I was not so sure anymore if this was really the way to go, but in the end I decided to give it a try. I went there only with an improvised spreadsheet with which I explain the issues with list- and table-like structures. I did not have a single slide prepared.

After a warm welcome, the room filled with quite a lot of people, most of them, as expected, programmers, plus a few customers of Puzzle. I started my presentation and explained the issues with implicit knowledge that we find wherever we use list- or table-like structures, like Excel files or databases. I showed them how much information gets lost that way and talked about why we need unique identifiers not just for the information itself but also for the headings, the annotations, and the relationships between entries in tables. Finally I showed them how this implicit knowledge can be visualized in a graph, which I explained with very simple examples on the whiteboard.
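The whiteboard examples were deliberately simple, and the same point can be made in a few lines of code. A spreadsheet row like Zurich | 400000 | Switzerland only means something if you already know what the columns stand for; turning it into triples with named, globally identified relationships makes that implicit knowledge explicit. A sketch with rdflib, all URIs made up for illustration:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")   # hypothetical identifiers

# Implicit: a row whose meaning lives only in the column headers (and in our heads).
row = ("Zurich", 400000, "Switzerland")

# Explicit: every cell becomes a statement with a named relationship.
g = Graph()
g.add((EX.Zurich, EX.population, Literal(row[1])))
g.add((EX.Zurich, EX.locatedIn, EX.Switzerland))

print(g.serialize(format="turtle"))
```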

I had already talked longer than planned for this part, so I switched to examples: stories from our customers and use cases I found on semanticweb.com or heard at SemTech. I explained every example based on three great points Christian and I had figured out the night before:

  • dissolve existing data silos
  • make implicit (or tacit) knowledge explicitly available
  • make it possible to store any relationship without the need to design an appropriate data model upfront

The most important key message, however, was that the semantic web does not offer entirely new things, but it allows us to solve common problems more quickly and, hence, at lower cost.

Of course I did briefly mention RDF as the data model, and I did talk about vocabularies and how you describe them in so-called ontologies, but I did not spend more time on it than absolutely necessary. Later we briefly talked about why we have to bring the benefits of the flexible data model in the backend to the user interface, which is exactly what we are working on at netlabs.org.

In the end I showed a few examples of our user interface technology and briefly talked about Linked Open Data and its potential. Sure enough I talked a bit longer than an hour, and when I decided to stop, I knew that I hadn’t talked about quite a lot of things which I find tremendously exciting about the semantic web.

The feedback was just great. We had great discussions at lunch, and later I got various text messages and emails from Puzzle employees and customers who told me they loved the idea of the semantic web and the way I presented it. They were sparkling with ideas of how they could use it in their own company or at customers, and some of them want to read into the technology and get their hands dirty.

But there was one ultimate remark which proved to me that presenting the semantic web without lots of technical details is definitely the way to go: at lunch they realized that someone else had presented RDF a few years back, but back then no one understood what it could be used for. So no matter whom you are explaining the semantic web to: show how it can solve existing problems, but don’t waste time on the technology itself. And by the way: my colleagues at netlabs.org and I would be more than happy to explain it to you as well :)

Why tables and hierarchies don’t scale

Maintaining data is a challenge, and there are several things that can make maintenance a real pain. Many of these problems arise out of the patterns we use to store our data, as they have a big impact on how easily we can retrieve stored data later, be it for viewing, cleaning up obsolete data, or other tasks.

Before the computer age, people stored data by simply writing it down. Lists and tables were invented to make retrieval of stored data much easier – just think of birth and wedding registries. Besides that, hierarchies were used, e.g. in the natural sciences, to display relations between dissimilar and similar things at the same time. Both methods implied a limit, as lists either needed to be short enough to remain usable, or there had to be a suitable scheme to split them up into smaller parts. Hierarchies likewise could not grow too big, otherwise it would have been impossible to display them on a single sheet of paper, or at least on a small number of sheets.

Surprisingly, even since the invention of computers, the concepts of tables and hierarchies still dominate the way we store and retrieve data. Tables are used in the majority of database management systems as well as in spreadsheets, and hierarchies are used in file systems and applications to store data. And although we might expect that with a computer we should not have a problem storing large amounts of data, somehow the limits of the pre-computer age still apply. This happens wherever people need to access data, or have to design tables or hierarchies to suit a specific use case. Many problems arise out of that, but mostly the storage patterns are either not seen as the underlying reason, or the problems are taken as irrevocable.

Data to be held in a table, be it in a spreadsheet or a database, still may not grow too big or complex, otherwise it cannot be stored in one table of reasonable size and/or complexity. In this case people cannot use spreadsheets or other kinds of simple table views anymore, but require use-case-specific applications for accessing and visualizing the data. A more important drawback, which applies to tables of all sizes, is that they are not very suitable for data exchange, since either the meaning or the formatting of data items can be misinterpreted.

Wherever users create hierarchies, e.g. in file systems or within applications, they face the problem that the hierarchy strongly depends on the logic put into it by one person or a group of people. With increasing complexity it gets more and more difficult, if not impossible, to extend it without breaking this logic.

Semantic web technology is a true game changer in that regard. Besides other advantages, it comes with the ability to link things with any number of other things, instead of interlinking documents like the world wide web does, or linking a set of things with another set of things, like tables do. Because the most fine-granular relation possible is used, relations do not have to fit into any other logical system which would have to be designed for a given use case. And semantic data is not required to be stored in a hierarchy, so there is no risk of implementing today a boundary for tomorrow. As a result, the storage pattern is completely nonspecific to any use case and can be scaled to any size and complexity.

Of course, retrieval of data stored like this is not bound to a use case either, as no knowledge about tables or hierarchies is required. Instead, data is retrieved by querying relations between things, which is far more intuitive. Interestingly, this storage and retrieval pattern matches exactly how we memorize things, namely simply by association between things! Or do you open a table or a directory structure in your mind to remember what you had for lunch yesterday?
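A small sketch of what querying relations between things looks like, with no table or hierarchy in sight; the query simply follows associations (Python with rdflib, all URIs made up):

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")   # hypothetical

g = Graph()
g.add((EX.lunch_yesterday, EX.dish, Literal("Risotto")))
g.add((EX.lunch_yesterday, EX.withPerson, EX.Anna))
g.add((EX.lunch_yesterday, EX.atPlace, EX.Bern))

# "What did I eat when I had lunch with Anna?" - pure association, no schema needed.
query = """
PREFIX ex: <http://example.org/>
SELECT ?dish WHERE {
  ?event ex:withPerson ex:Anna ;
         ex:dish ?dish .
}
"""
for row in g.query(query):
    print(row.dish)
```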

This does not mean, however, that tables and hierarchies are no longer required. Data needs to be displayed in order to create, view and modify it in the front end, and therefore we still need the tables or hierarchies we are used to – you can regard any form as a hierarchical way of displaying data. And whatever is required for that is already part of the data, because another most important feature of semantic web technology is that the description of the data is part of the data itself.

However, for that, applications need to apply the concept of tables and hierarchies only to a small part of the available data, so there is no scaling problem. At the same time, the storage of data logically scales like never before, not hindered by any schemes that are otherwise only required for the visualization of data.

Files are not the problem, finding them is

Recently Alex Bowyer blogged at O’Reilly Radar about why files need to die. He provides some good ideas about how we will store and find information in the future but he also misses the point a bit. Files are not really the problem, finding them is.

Over the past months I have spent a lot of time writing, on the quest for money. Due to the decentralized nature of our team at netlabs.org, we realized a long time ago that working in something like a Word/OpenOffice document is not the way to go. At the peak of the wiki era we set up our own MediaWiki and started working in there. That did and does work quite well for some things, but it is not something I am gonna do forever, for various reasons. So what are the benefits/drawbacks?

Benefits of working with a Wiki:

  • History. I cannot imagine working on a text anymore without a powerful history. I need to be able to see any change between version x and y at any time. If you work with several people on the same text, this is the only right way to do it in my opinion. A good history in a wiki is far more powerful than tracking changes in an office document.
  • Multiple users can work on it at the same time. Yes, there are more advanced ways of doing that nowadays; some suck (Google Docs: “someone else is editing this file”), some are a bit overkill for my taste (Etherpad). If we screw up the same paragraph we get hints and have to fix it by hand. Not great, but it works most of the time.
  • I just work on the text. I might add some headings and things like bold/emphasis etc. but that’s about it. I do not lose any time formatting it to look good on something ridiculous like an A4 page.
  • The text itself always has the same link and I can access it from everywhere. This is what made the Web so useful.

Drawbacks of a Wiki:

  • Finding wiki pages. This problem started with (digital) files in the late 70s, continued with emails in the 90s, and we still have it with links (technically URIs) nowadays. There have been plenty of ideas on how to solve it and none of them has worked for me yet.
  • Exporting that stuff. At some point I have to take a snapshot in time which I send out to someone as an email/PDF/office document etc. This is the point where I start to cry, because I have to launch an office-like app to do that and believe me, I really hate all of them (LaTeX included).

A few years ago, a former boss very proudly told me that he had rearranged the files on the file server. I had a look at it and was totally lost. Not that I liked the old structure, but I really couldn’t figure out which files were where anymore. It was logical to him, but my brain totally disagreed with that particular structure. He could not understand why it did not work for me.

This was one of the early lessons we learned in our team: have a look at someone’s desktop. It might scare the sh*t out of you, but it works for that person. For a long time I myself had the tendency to order everything in nested structures, like emails or files. Result: a few months later I absolutely could not remember where I had saved that information, and I either wasted a lot of time finding it or I let it go. And I noticed I let go of a lot.

So how would I want to handle knowledge in the future? I want all the benefits of a wiki, combined with a way of finding content by context. This means I don’t want to crawl through a fixed structure to locate a text I wrote, as I won’t agree with the structure or won’t remember the right way through it anymore.

My brain remembers the things around a specific event, even if it is only a simple text about a certain topic. I know it had something to do with funding, particularly for an EU FP7 project. I remember that I sent it to that guy from the University of Bern for a review. I know that my friend Barbara had a look at the English before we sent it out. I remember that I went to Bali the day after I finished the stuff. I know that one guy sent me an email about it and told me he loves the project. Note that I don’t remember the exact time of it, but the things around it.

If I want to find that text, that’s the way I want to search for and find it. This can be a file on my disk or a link on the Internet. In that regard a URI is nothing other than a modern form of a file, including all the problems we know. And this is part of what we are working on at netlabs.org these days.
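To make that idea of searching by context a bit more tangible, here is a hedged sketch: if each of those remembered associations were stored as a statement about the text, a single query over them would find it again, no matter where it lives. Everything below (URIs, properties, the file path) is a made-up placeholder:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")   # hypothetical

g = Graph()
doc = EX.proposal_draft
g.add((doc, EX.about, EX.FP7_funding))
g.add((doc, EX.reviewedBy, EX.Barbara))
g.add((doc, EX.sentTo, EX.contact_university_of_bern))
g.add((doc, EX.storedAt, Literal("file:///home/me/texts/proposal.txt")))

# Search by the things I remember around the text, not by where I filed it.
query = """
PREFIX ex: <http://example.org/>
SELECT ?where WHERE {
  ?doc ex:about ex:FP7_funding ;
       ex:reviewedBy ex:Barbara ;
       ex:storedAt ?where .
}
"""
for row in g.query(query):
    print(row.where)
```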