I’m not sure whether netlabs.org will participate in this call. I see the need for big data, but our focus is on creating and interfacing graph-based information in smart ways; more about that in another post. However, I often use existing datasets to demo our technology, and I regularly run into issues that one of the FP7 coordinators mentioned as well:
Academics are rewarded for publishing papers, not for writing robust code.
Later, when they talked about making the research available as open source software, he mentioned another problem: sometimes the outcome of a project is code that compiles and works only on the computer of the PhD student who wrote it. And as we probably all agree, this is not really useful.
He also said that, from the user’s perspective, a system with a good response time delivers a result within half a second at most. He followed that with the statement that a component which doesn’t provide an answer within 20ms won’t be part of the technology chain. That statement comes from Google (IIRC), and it shows that even if you stay within 0.5 seconds, you are not necessarily the only part of the chain :)
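To make the chain argument concrete, here is a small back-of-the-envelope sketch. The 500 ms and 20 ms figures are the ones quoted above; the component names and their timings are invented for illustration, and the sketch assumes the components are called one after another, so their latencies simply add up.

```python
# Budgets quoted in the talk; everything else below is made up for illustration.
USER_BUDGET_MS = 500          # "half a second" for the whole user-facing answer
PER_COMPONENT_BUDGET_MS = 20  # the "20ms rule" for a single link in the chain


def chain_latency_ms(component_latencies_ms):
    """Total latency of a chain whose components are called sequentially."""
    return sum(component_latencies_ms.values())


def slow_components(component_latencies_ms, limit_ms=PER_COMPONENT_BUDGET_MS):
    """Names of components that exceed the per-component limit, sorted."""
    return sorted(
        name for name, ms in component_latencies_ms.items() if ms > limit_ms
    )


# Three hypothetical components, each below half a second on its own.
chain = {"frontend": 15, "sparql_endpoint": 450, "ranking": 120}

total = chain_latency_ms(chain)
print(total, total <= USER_BUDGET_MS)              # total and whether it fits
print(slow_components(chain))                      # links that break the 20 ms rule
print(USER_BUDGET_MS // PER_COMPONENT_BUDGET_MS)   # 20 ms steps that fit in 500 ms
```

Each invented component stays under 0.5 seconds, yet the chain as a whole does not. Under the same sequential assumption, half a second only leaves room for 25 steps of 20 ms each.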
Unfortunately, many of the LOD resources I regularly use for demo cases are a good example of #FAIL for both the “robust code” and the “half a second” remarks. For example, I often use linkedgeodata.org, but the response time for a non-cached query on this site is still way too high: it usually takes several seconds.
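If you want to check a service against these numbers yourself, a rough timing helper is enough. The sketch below is only an assumption about how one might measure it: `measure_latency_ms` and `fake_slow_query` are hypothetical names, and the fake query just sleeps instead of talking to a real endpoint.

```python
import statistics
import time


def measure_latency_ms(request, runs=5):
    """Call `request` `runs` times and return the median wall-clock time in ms."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        request()
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


def fake_slow_query():
    # Stand-in for a real HTTP request to an endpoint; sleeps 50 ms instead.
    time.sleep(0.05)


median_ms = measure_latency_ms(fake_slow_query, runs=3)
print(f"median: {median_ms:.0f} ms, within 20 ms: {median_ms <= 20}")
```

For a real service you would pass a function that performs the actual HTTP request. Keep in mind that repeated runs of the same query may be answered from a cache, so the first run is the one that tells you about the non-cached case.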
Linkedgeodata is just one example. Many projects do not realize that the service also needs to be fast and reliable so that people can actually use it in the real world. This leads to a kind of chicken-and-egg situation: as long as the experience is slow (aka sucks), people do not use a (semantic web powered) site. And as long as there are no real-world users, the service provider has less motivation to improve the response time.
Conclusion: If we want the semantic web to become a success, we definitely need to meet the 20ms rule! Every service that returns RDF is just one part of the technology chain and thus just one part of the final user experience!