Perspective — RDF and the Semantic Web are getting serious attention, but they are ludicrous ideas. They may be about doing the right thing, but they are doing it the wrong way.
The Semantic Web — and RDF, for that matter — is about providing information on information… But take a few steps back: How much time would you — as an everyday user — spend providing information on information by inserting tags every other word you type?
Suppose, for instance, that you are writing something about your dog Fido, and that you decide to provide some information about Fido:
:Dog rdf:type rdfs:Class .
:Fido rdf:type :Dog .
:name rdf:type rdf:Property .
:Fido :name "Fido" .
:Dog rdfs:subClassOf :Animal .
The above is a sample of the so-called “simple datatyping model for RDF”, taken from Sean Palmer’s Introduction to the Semantic Web. It stands for: Dog is a class of information, Fido is of type Dog, name is a property, Fido’s name is “Fido”, and Dog is a subclass of Animal.
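To make concrete what a machine is supposed to do with these statements, here is a minimal sketch in plain Python. It assumes nothing beyond the five statements above: the tuple representation and the `types_of` helper are hypothetical illustrations, not part of any RDF library, and the only inference covered is following rdfs:subClassOf upward.

```python
# Hypothetical representation: each statement is a (subject, predicate, object) tuple.
TRIPLES = {
    (":Dog", "rdf:type", "rdfs:Class"),
    (":Fido", "rdf:type", ":Dog"),
    (":name", "rdf:type", "rdf:Property"),
    (":Fido", ":name", "Fido"),
    (":Dog", "rdfs:subClassOf", ":Animal"),
}


def types_of(resource, triples):
    """Return every class the resource belongs to, following rdfs:subClassOf."""
    # Start with the classes stated directly via rdf:type.
    found = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    pending = list(found)
    # Walk up the subclass chain until no new class turns up.
    while pending:
        cls = pending.pop()
        for s, p, o in triples:
            if s == cls and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                pending.append(o)
    return found


print(sorted(types_of(":Fido", TRIPLES)))  # [':Animal', ':Dog']
```

The machine can conclude that Fido is an Animal, but only because someone typed in all five statements first.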
While you’re at it, also provide information about yourself, about what you were doing with Fido, about the place where you were with Fido, about why you are talking about him, etc., in the relevant RDF schema.
Let’s get real here: Beyond the syntax, which is not very inspiring to say the least, there’s not a chance in hell any user will fill in this kind of information on information themselves. And no computer will fill in the information for you: The very idea behind the Semantic Web would then make no sense at all… Take another few steps back.
What do you expect the computer to fill in automatically?
The author? Great stuff: What if your assistant is managing the master document of your entire team’s report? Is she the author? Are you? Your colleagues? Did I mention she based the document on an older one?
The date? Sure: The last-modified date might be correct, but the creation date might not be, since the initial document was written a year ago. And what if you merge and/or split documents?
The topic? Laughable: Trying Yahoo!’s context search feature should make it clear we’re not even remotely close to extracting it properly.
Let’s suppose… assume… that the computer nevertheless manages to fill in this common meta information by itself. Say, 80% of the time. Will you spend the necessary time manually checking the meta information to cope with the 20% of atypical cases where context matters? Will you pay someone to do the job for you, for the sake of providing potentially relevant meta information on the documents you produce? I doubt it…
As for automatically inserting relevant tags within the document, take yet another few steps back…
The very purpose of the Semantic Web is to let you interconnect related information to and from your document. As such, unless you come up with an Artificial Intelligence that understands the meaning of your words — making the Semantic Web itself an irrelevant construct — you will need the Semantic Web or a HUGE hand-made Semantic Network to automatically provide you with information on the information you are manipulating in order to automatically fill the RDF data. Yes that’s right: You need the Semantic Web to be up and running in order to automatically generate the Semantic Web itself. Sounds wrong to you too? That’s because the idea itself makes no sense at all.
You’ve certainly run into a categorization problem in the past. As in, you have a document to categorize: Which category will it be? For instance, this document is a Column mainly related to Information Technology, Internet and Semantics; it might also interest someone looking for data on Computational Linguistics or on the Philosophy of Language. Hence any of these categories will do. It’s also about Strawberries and Polar Bears, because I just mentioned both. I’d categorize it as a document on Raspberries. And if you don’t agree, you are making my point.
The idea behind the Semantic Web assumes that some sort of universal language scheme cuts across individuals and cultures. Quoting Tim Berners-Lee himself:
Where for example a library of congress schema talks of an “author”, and a British Library talks of a “creator”, a small bit of RDF would be able to say that for any person x and any resource y, if x is the (LoC) author of y, then x is the (BL) creator of y. This is the sort of rule which solves the evolvability problems. Where would a processor find it?
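Mechanically, the rule in that quote is a mapping from one vocabulary’s term to another’s. A minimal sketch, assuming statements are stored as (subject, predicate, object) tuples; the identifiers `loc:author`, `bl:creator`, `person:x` and `book:y` are hypothetical stand-ins, not real schema terms:

```python
# Hypothetical rule table: a predicate in one schema implies a predicate in another.
RULES = {"loc:author": "bl:creator"}


def apply_rules(triples, rules):
    """Add (x, target, y) for every (x, source, y) whose predicate has a rule."""
    derived = {(s, rules[p], o) for s, p, o in triples if p in rules}
    return set(triples) | derived


catalogue = {("person:x", "loc:author", "book:y")}
expanded = apply_rules(catalogue, RULES)
print(("person:x", "bl:creator", "book:y") in expanded)  # True
```

The code is trivial. The assumption it carries is not: it takes for granted that “author” and “creator” mean the same thing to everyone who ever uses either word.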
If that doesn’t sound absurd to you, here are a few perspectives.
First off, you must be aware that semiotic signs and the concepts behind them are categories. When you speak and think of apples, your interest is in a class of items you categorize as apples.
Then, you need to know that it is mathematically impossible to come up with a categorization scheme that lets you consistently express everything a language will let you say using a consistent, finite set of categories. Unless you decide, as Ludwig Wittgenstein did in his Tractatus Logico-Philosophicus, that you are never manipulating a category, you will mechanically end up with inconsistencies, as in: “this false statement is true”.
Please note — Part of this post was lost. It was rewritten from this point onward.
Moreover, the way you categorize your environment differs from one culture to another, from one subculture to another, and from one individual to another. For instance, West Greenlandic has no fewer than 49 ways to say “snow” or “ice”. This lack of homogeneity means you’ll readily encounter subcultures and individuals within a culture or a subculture that give different meanings to a word. And this is by no means hierarchical, far from it, or even consistent.
As such, the very idea of the Semantic Web becomes inane: On one side, there is no absolute scheme to tag the web. On the other, no two individuals will interpret a tag the same way.
You can now safely laugh too: RDF and the Semantic Web are ludicrous ideas.