I believe we are making a huge and unconscious mistake in how we handle knowledge: how we capture, organise and distribute facts and information for assimilation. It might have a wide-ranging negative impact on all that we do, and I think we should do something about it.
Knowledge is the source of our wealth, well-being, and hope for the future.
Knowledge is facts, information and skills acquired by experience or education.
Thus the most important aspect of knowledge is how and in what form it is captured and distributed for the most efficient assimilation.
As we cannot have all knowledge in our personal RAM at all times we need systems and ways to have the right information and facts delivered at the right time, and in a form that we immediately understand - preferably without any previous and specific training.
That is what makes organisations work better, that increases global wealth and well-being, in short, that yields more efficient resource use.
And in practice it's about the single most important aspect for education, knowledge management, politics, global warming, enterprise software and almost anything else. Do not underestimate the importance of how knowledge is handled.
Let me keep it simple and divide the handling methods into two distinct ways:
1. Categories
When asked "what word is the odd one out among these three - cow, chicken, grass" and your answer is "grass" - then you lean towards organising life by Categories.
Also known as taxonomies, hierarchies, tags, classes, branches and similar.
That's when you want to acquire some knowledge about the honey bee and find that it has a Latin name - Apis mellifera - mellifera for "honey", of the genus "Apis" for bee, with no less than 10 more super-categories, and you really need to be a highly trained zoologist to assimilate the official knowledge.
Tag this post with "education", and someone looking for tuition fees might read it, not precisely what you meant.
Categories are nouns; they are boxed and limited, and require training, acceptance and belief that the definitions are "right".
Categories are dogmatic as in "accept it" and quite theoretical as in taxonomies based on the male reproductive organs.
Categories require distribution of common rules and understanding of what each category entails; without that knowledge categories are rendered useless. Or worse, they become a source of discord and destruction. This requirement was perhaps always one of the driving forces for the educational system, second only to the historic need for dogmatic religious training.
2. Relationships
When asked "what word is the odd one out among these three - cow, chicken, grass" and your answer is "chicken" - then you lean towards organising life by Relationships.
When we observe that "honey bees" fly, "honey bees" gather nectar, flowers produce nectar, nectar attracts bees, bees get covered by pollen, pollen is brought to other flowers by bees, nectar is converted to honey by the bee... we establish Relationships.
Relationships are endless collections, a web where relationships can easily be followed without training or much education, and that still could diffuse more precise knowledge about the bee and all that it touches directly or indirectly than any strictly logical taxonomy could do.
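As a rough sketch of the idea (an illustration only, not taken from any real system), the bee observations can be written down as plain subject, verb phrase, object sentences. The names `TRIPLES` and `about` are made up for this example.

```python
# Each observation is one full sentence: (subject, verb phrase, object).
# The data and the helper name `about` are illustrative assumptions.
TRIPLES = [
    ("honey bee", "gathers", "nectar"),
    ("flower", "produces", "nectar"),
    ("nectar", "attracts", "honey bee"),
    ("honey bee", "gets covered by", "pollen"),
    ("honey bee", "carries pollen to", "flower"),
    ("honey bee", "converts nectar to", "honey"),
]


def about(thing):
    """Return every sentence in which `thing` appears, as subject or as object."""
    return [t for t in TRIPLES if thing in (t[0], t[2])]


# Everything we know that touches nectar, read as ordinary sentences.
for subject, verb, obj in about("nectar"):
    print(subject, verb, obj)
```

No category is needed to ask the question: looking up "nectar" simply returns the sentences it takes part in, from either end.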
That's when you may do a (IT-based, of course) multilevel query of the full population of all IBM'ers (if you work there) for: "Everyone who knows C++, speaks Italian, and has friends who live in Rome, where those friends ride bikes and have a bike my size to lend out".
Relationships describe how a cup needs liquids and a mouth, and that is what makes it a cup.
That's how children learn: they observe and "get" the relationship between the objects in their vicinity. That's how our mind works: empirically, learning from observation.
Relationships are verb phrases, based on real activities. They are pragmatic, not theoretical, and have no boundaries, as relationships link everything in one way or another.
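A minimal sketch of such a multilevel query, assuming the facts are held as simple (subject, verb phrase, object) sentences. All people, predicates and function names below are hypothetical and only serve the example.

```python
# Hypothetical people and facts; every name and verb phrase here is made up.
FACTS = {
    ("Anna", "knows", "C++"),
    ("Anna", "speaks", "Italian"),
    ("Anna", "has friend", "Marco"),
    ("Marco", "lives in", "Rome"),
    ("Marco", "rides", "bikes"),
    ("Marco", "lends bike of size", "M"),
    ("Ben", "knows", "C++"),
    ("Ben", "speaks", "Italian"),
    ("Ben", "has friend", "Lucia"),
    ("Lucia", "lives in", "Milan"),
    ("Carla", "knows", "Java"),
    ("Carla", "speaks", "Italian"),
    ("Carla", "has friend", "Marco"),
}


def holds(subject, verb, obj):
    """True if this exact sentence has been recorded."""
    return (subject, verb, obj) in FACTS


def objects(subject, verb):
    """All objects linked to `subject` by `verb`."""
    return {o for s, v, o in FACTS if s == subject and v == verb}


def colleagues_with_bike(my_size):
    """People who know C++, speak Italian, and have a friend in Rome
    who rides bikes and has a bike of `my_size` to lend out."""
    matches = set()
    for person in {s for s, _, _ in FACTS}:
        if not (holds(person, "knows", "C++") and holds(person, "speaks", "Italian")):
            continue
        for friend in objects(person, "has friend"):
            if (holds(friend, "lives in", "Rome")
                    and holds(friend, "rides", "bikes")
                    and holds(friend, "lends bike of size", my_size)):
                matches.add(person)
    return matches


print(colleagues_with_bike("M"))
```

The query hops from person to friend to the friend's city and bike, which is the "multilevel" part; each hop is just another sentence lookup.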
Relationships are human in form while still useful for all other things, after all, all relates to the observer, the human being.
In your daily life you would say "Chanterelles are yellow, look like beakers and are really good when sautéed" instead of "Cantharellus cibarius of the Cantharellaceae family of the Basidiomycetes class". Relationships instead of Categories is what comes naturally, and the listener does not have to be a highly trained mycologist.
Not so at work with its category-based forms and questionnaires and hierarchical positions. Heck, even archiving, the high priesthood of categorising, is a proper profession!
So why do we still bother with the Category method if the Relationship method is better in all aspects?
Because of technological limits. Organised life required organised facts and information, and for that technology had to be employed.
Categories worked well in the two-dimensional reality of pen and paper where the multidimensional Relationships could not easily be represented. So frameworks suitable for paper were devised - taxonomy, organisational hierarchies, narrative reporting, accounting and even the latest kid on the block, tagging.
But now, yep, Relationships are "made for" modern information technology with its ability to represent multiple dimensions and query links for any number of steps with great speed.
Time to refocus on the single most important issue in all that we do: How we capture, distribute and assimilate facts and information - in short how we handle knowledge.
Make that better, then the rest follows - economic efficiency, better resource use and simplified and better educational methods.
Relationships, not Categories, will save the planet.
I'm wondering why you chose enterprise IT as your domain. If something like del.icio.us is available for relationships, then your claim of saving the planet can be easily validated.
Are there plans to open up thingamy for such anarchy? No need to open-source. Same hosted model, but available for playing around and sharing? Perhaps a monster like delicious will emerge out of it.
Posted by: Balaji Sowmyanarayan | January 01, 2008 at 11:58
Hi Balaji,
Happy New Year to you!
Allow me to put it like this:
Enterprise IT is supposed to be the framework that enables the best possible use of resources for all organised value creation.
To enable that, the best possible decisions have to be made as often as possible, again dependent on having the "right" facts and information at the right time and in a form that makes it possible to assimilate quickly and without glitches.
All of the above is a question about "knowledge handling" - get that "right" and the rest follows - lower resource use for value created = profits, wealth and in the long run a better planet...
That's why I like Enterprise IT :-D
And yes, Thingamy will be using (semantic) Relationships, almost exclusively, to add the knowledge to objects that you're tinkering with in a workflow to create value. The more "good" knowledge these objects hold, the better the flow will become.
Note just in case: Relationships are usually thought of as an inter-human thing, but they are essential for all objects - Plato defined knowledge (freely interpreted) as "how objects relate to other objects"!
Posted by: sig | January 01, 2008 at 12:17
Always interesting to see your thoughts in action, methinks '08 is the year of the Thingamy Revolution!
Posted by: Craig Cmehil | January 01, 2008 at 12:50
Thanks Craig!
Hehe, a bit of nip & tuck on the (semantic) code, a few down-to-earth pilot examples and we should at least be enabled! Then we'll see, *rolling up sleeves* :-D
Posted by: sig | January 01, 2008 at 13:04
Sig
I hope 2008 will be the year of customers, customers and customers.
thomas
Posted by: Thomas Otter | January 01, 2008 at 14:42
Thanks Thomas!
My thoughts precisely! Opening up...squeaky door...to the world... ;)
Posted by: sig | January 01, 2008 at 14:48
Maybe it is not being neglected, but is just unresolved.
http://www.asis.org/Bulletin/Jun-07/quintarelli_et_al.html
Doesn't faceted classification go a long way to adding relationships (context?) to flat tagging? Tag bashing (http://thingamy.typepad.com/sigs_blog/2007/09/no-more-tags.html) does not in itself suggest an improved method. I don't think it is a matter of tags being flawed; it is a matter of working out how to make tags more useful. They are useful (being used and adding value) now. Adding ways to create relations between tags, using methods as simple as tagging itself, will produce richer results.
As highlighted by Mike Wesch
http://www.youtube.com/watch?v=6gmP4nk0EOE
we need to rethink...
Posted by: Mike O | January 01, 2008 at 17:27
Mike,
as I said in the above "bashing tags" post (did I bash? Thought I just renounced! :-D) - tags are nouns (as are categories), half sentences, lacking verbs - and are thus quite imprecise on their own.
Relationships are full sentences (like N-triples) with subject, predicator and object and are thus precise.
Not saying that categories do not deliver relationships - they do - but only indirectly, as members of a class/category that have certain common traits. Thing is that placing an object in a category does not tell much unless you're trained/educated in what said category entails. Relationships on the other hand do give the traits directly. And that has a direct bearing on the assimilation of facts and information, an important aspect of knowledge.
Back to "how to make tags better" - a while ago I had a multiple-tags-browser up as a demo (thingamy 2.1 uses the same method) where you could highlight more than one tag and, thanks to the wonders of JavaScript, extend or narrow down the results. Using the tags as overlapping templates - that quickly increases the precision and indeed delivers easy assimilation of knowledge.
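The demo itself was JavaScript and is not shown here; purely as an illustration of "extend or narrow down" with several highlighted tags, here is a small Python sketch. The items, tags and the function names `narrow` and `extend` are assumptions made for the example.

```python
# Hypothetical tagged items; the tags and item names are made up.
TAGGED = {
    "post-1": {"education", "tags"},
    "post-2": {"education", "enterprise"},
    "post-3": {"enterprise", "workflow"},
}


def narrow(selected):
    """Items carrying ALL selected tags (tags used as overlapping templates)."""
    return {item for item, tags in TAGGED.items() if selected <= tags}


def extend(selected):
    """Items carrying ANY of the selected tags."""
    return {item for item, tags in TAGGED.items() if selected & tags}


highlighted = {"education", "enterprise"}
print(sorted(narrow(highlighted)))
print(sorted(extend(highlighted)))
```

Narrowing is a set intersection and extending is a union, so every extra highlighted tag either tightens or widens the result.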
But it's at the end of the day cumbersome as many tags would be required to deliver the precision of one single predicator.
And last - and in practice very important - argument for direct relationships versus any type of categories: You can follow the "threads" (relationships) as far as you want - yielding a rather amazing ability to query anything you'd want. That's not doable with the same efficiency, if at all, with categories.
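Following the threads for any number of steps can be sketched as a breadth-first walk over the links. This is a minimal illustration under assumed data; `LINKS` and `follow` are hypothetical names, and a real store would index the links rather than scan a list.

```python
from collections import deque

# Illustrative links only; the data and the function name are assumptions.
LINKS = [
    ("flower", "produces", "nectar"),
    ("nectar", "attracts", "honey bee"),
    ("honey bee", "makes", "honey"),
    ("honey", "is eaten by", "bear"),
]


def follow(start, max_steps):
    """Return {thing: steps} for everything reachable from `start` by
    following links from subject to object for at most `max_steps` steps."""
    reached = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if reached[current] == max_steps:
            continue  # do not walk further than asked
        for subject, _verb, obj in LINKS:
            if subject == current and obj not in reached:
                reached[obj] = reached[current] + 1
                queue.append(obj)
    return reached


print(follow("flower", 2))
```

Raising `max_steps` simply lets the walk continue along the same threads; nothing about the data has to change.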
Posted by: sig | January 01, 2008 at 20:33
Very enjoyable, original, and thought-provoking post as always, Sig. Thanks and Happy New Year. Keep the great posts coming in '08!
Cheers,
ewH
Posted by: ewH | January 01, 2008 at 23:37
Thanks Eddie, and a very Happy New Year to you too!
Posted by: sig | January 01, 2008 at 23:47
Wow! - what a great post.
The notion of relationships as a way of viewing things innately makes sense (at least in my head), while the categorization approach seems overly complicated, and requires specialist domain knowledge.
Is this just because of the 2D legacy of paper? Relationships were present then - The Hyperlink is kind of like the grandson of the citation... Perhaps people simply had different attitudes to work, and designed systems for specialization and complexity, rather than for effectiveness.
Posted by: Gordon | January 02, 2008 at 15:36
Thanks Gordon!
Hard to pinpoint exactly when and how categorising started (and thus why), but it seems Kant had his say that it was a "natural" method. With which I disagree; I see much more use of relationships in the daily space when no structure is given.
It certainly has been used extensively in closed cultures, when "all knew" what the categories entailed, but then as now the knowledge did not travel easily with that baggage.
And that we all can recognise - listen in on a closed group of MBAs, Marketing folks, Geeks... they all have their own taxonomy it seems, aka xxxx-speak.
"Globally" it seems the taxonomy schemes were the first ones (in parallel with organisational taxonomies, hierarchies as they're called). And at that point I would argue it was the 2D restrictions of paper that won the day. Typically, Carl von Linné beat out Comte de Buffon because the latter wanted multidimensional methods, and that did not pan out well on paper.
On an even less scientific level I ask the "cow, chicken, grass" question when I meet people - and so far I have this hunch that mostly "well organised, well educated, western males" are the ones mostly falling back to categorising :)
Posted by: sig | January 02, 2008 at 16:12
the distinction between tags and relationships reminds me of steven pinker's 'the language instinct'. why? nouns and relationships appear to have a profound place in our brains, pre-wired - hence 'instinct'. i see the need for and value of both; can't they be seen as interactive concepts? allowing us to hold onto things and relationships while recognising that a thing is intrinsically defined by its relationships, and perhaps its 'thingness' evolves as a result of those relationships. but also that relationships can only occur between things? do we have to see one or the other as better? or to choose one or the other in an application like this?
Posted by: blueskypoint | January 02, 2008 at 17:18
Bluesky (Huw is it?),
indeed, in daily use we mix the two - using language, adding a verb is easy even when using a category. As long as we know the listener knows what the category means. Try saying that "this and that software is CRM software" to your wife if she's far from Enterprise IT oriented. You would probably have gone with a few relationships; "used for", "similar to" etc.
Or try using something as iffy as "SOA", not even three Enterprise software geeks in the same company might hear the same thing ;)
For the enterprise system (useful for the organisation) that struggles with organising facts and information: if there is a mechanism for relationships (that includes verbs) then it'll have no problems with categories - "Lady is of type dog". And you could add other relations to that category that immediately give you knowledge about the category, as relations work both ways and are quite precise, even if the predicator always requires an inverse predicator (is parent of, has parent).
For the current systems that have no verbs, you might tag "Lady" with "Dog", and if you do not know Lady you might think "Lady" has a "Dog". I.e. inclusion of a verb phrase gives a clear relationship that delivers real knowledge in both directions (for subject and object).
Add that "things" (objects, subjects) do not necessarily have to be tangible objects in a relationship system. "Problem", "Medical condition" and so forth are all useful as subjects and objects in use with a predicator.
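To make the "Lady"/"Dog" point concrete, here is a small sketch of a predicator with its inverse, so that one statement can be read from both ends. The `INVERSE` table, the "has instance" wording and the helper names are assumptions for illustration only.

```python
# Hypothetical predicator pairs; each predicator is registered with its inverse.
INVERSE = {
    "is of type": "has instance",
    "is parent of": "has parent",
}
# Make the table symmetric so the inverse can be looked up from either side.
INVERSE.update({inv: pred for pred, inv in list(INVERSE.items())})

triples = set()


def relate(subject, predicator, obj):
    """Store the sentence and its inverse, so it reads from both ends."""
    triples.add((subject, predicator, obj))
    triples.add((obj, INVERSE[predicator], subject))


def describe(thing):
    """All sentences that have `thing` as their subject."""
    return sorted(f"{s} {p} {o}" for s, p, o in triples if s == thing)


# A bare tag pairs two nouns and leaves the verb to the reader's guess:
tag = ("Lady", "Dog")  # is Lady a dog, or does Lady have a dog?

# The full sentence removes the guess, in both directions.
relate("Lady", "is of type", "Dog")
print(describe("Lady"))
print(describe("Dog"))
```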
Posted by: sig | January 02, 2008 at 17:54
Actually I'm now quite confused. Remind me again what is it that you don't *like* about the traditional relational model?
After all, the good old fashioned RDBMS has been representing knowledge as "relations" for over 30 years.
You could say that you don't like the fact that it's optimized for many relations of the same format (a few tables with a lot of rows) rather than lots of relations of different formats.
That's sort of true, but there's nothing to *stop* you having thousands of relations (tables). I believe you can even get a Prolog front end for Oracle if you want to think of it that way.
But I say that your real issue here is that there's always going to be a trade-off between the cost of creating the database and the cost of querying it - that's inevitable.
Allowing the creators of the database to build up custom webs of relationships without imposing much up-front discipline on them looks initially attractive. (And maybe you can justify that as working with BRPs rather than ERPs.)
But it makes the knowledge very hard to query systematically (and by "systematic" I mean *reliably* so the researcher doesn't miss crucial bits of information)
As you try to scale up to more complex tasks in larger data-sets you are going to find that that trade-off comes back and bites you in the form of query complexity.
If three different users in your organization build up slightly different semantic-relational models for similar processes; and then at a later date you want to group them all together into a single report, you'll have a heck of a job.
So here's how this looks to me. If you like relations rather than hierarchical categorization, then RDBMSs are GREAT! Really, if you've forgotten, go back and check.
Unless, of course, you don't like imposing the discipline of up-front schema design on your users.
But if you don't want to do that, then 6 months down the line with a history of 10,000 purchase transactions represented with 10 slightly different schemas ... querying any kind of overall state of your enterprise is going to be hell whether you've used relational or network (object oriented, hierarchical, semweb) structure.
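A tiny illustration of that risk, with made-up records: three users describe the same kind of purchase with three different verb phrases, and a query written against only one of them silently misses the rest. All names here are hypothetical.

```python
# Hypothetical purchase records entered by three users, each with their own wording.
RECORDS = [
    ("order-1", "purchased by", "Acme"),
    ("order-2", "bought by", "Acme"),
    ("order-3", "customer", "Acme"),
]


def orders_for(customer, predicates):
    """Orders linked to `customer` through any of the given verb phrases."""
    return sorted(s for s, p, o in RECORDS if o == customer and p in predicates)


# A report that only knows one user's wording misses two of three orders.
naive = orders_for("Acme", {"purchased by"})

# Someone has to know, and maintain, the full list of equivalent wordings.
reconciled = orders_for("Acme", {"purchased by", "bought by", "customer"})

print(naive)
print(reconciled)
```

The storage structure makes no difference here; the cost sits in reconciling the vocabularies.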
What *might* have saved you, is if people were free to model their processes as they liked but they stuck to *very* simple "common-sensical" models of processes. The kind of situation where tags work because their meaning is embedded in everyday practice eg. all objects tagged "sale" are a sale etc.
But it now looks to me that you've bought into the idea of a SemWeb-style triple-store.
Yet the SemWeb has a prohibitive cost of up-front schema definition. (That's one of the reasons it hasn't worked out too well in the real world.) And you're trying to combine it with a free-style, low-cost-of-entry schema definition?
How can that work? Either Thingamy-using corporations will have to hire someone to do good network-shaped schema definition and you'll have to train all users to understand the schema (much harder than doing the same with relational models that a lot of people already have expertise in), or individual users will put data in in an inconsistent way and nobody will ever be able to pull data out.
As always, these criticisms are meant constructively, I'm a big fan of the idea of Thingamy and what you're doing - but I like to push you a bit. :-) I want to see a Thingamy that's suited to my (current) world of warehouse management, stock-control, purchase orders etc. all the usual ERP stuff.
Posted by: phil jones | January 02, 2008 at 20:35
Phil,
secretly, I have a prize for the first one to catch me in that relational DB no, relationship yes dichotomy! :)
OK, no big prize, but a beer or three if we ever cross paths! OK, you get double prize for pushing on quite technical issues too :-D
Now, "relational" in database terms means as you say "based on tables" - and here I meet the wall as tables are rigid, hierarchical and boxed, i.e. a "relation" in mathematical terms is not the same as a semantic relation.
The R in RDBMS is cheating! ;)
That said, precisely how you store the objects, be they "objects" or "predicators", has little direct effect on the front end for the user. Until queries. Have to admit that I have yet to know precisely how the different types of DBMS handle queries like the one I describe. Suspect triple stores beat OODBMS beat RDBMS...
So as you say, tradeoff-ing ahead, what we're doing in thingamy is a kind of hybrid, still. Sticking with OODBMS, using a hybrid of N-triples and real objects. But there is little need to keep the objects over time, just practical now. Trade off back-biting planned for, or at least the risk as we go step by step.
But I doubt we drift back to RDBMS, you said it, schema changes at any time is what we like. And if we really, really like that and want the speedy queries per relationships, well then we might see triples-stores instead, one day.
I agree with your point about pre-set schema, but take into account something I repeat over and over again: All is dependent (and not thingamy only, but efficient models) on singular objects, nothing more, nothing less than what can represent a real-world object - a person would have properties like first, middle, last name, social security number and DOB. Full stop, what else is never-changing?
Then mix such singular-objects and create whatever messed up, sorry mashed-up documents as we're used to.
In use for thingamy the schema comes as classes - defined by the builder. The exact way relationships come I can tell you soon, as we are experimenting with that currently - but so far the "builder" would define predicators and inverse predicators as well as "semantic objects" not necessarily used in any flow. Then you can add triples to the classes where the subject is the object in question, and the object is limited to a class, making the N-triple come forward as almost a "property" where you have to choose the specific object. I.e. for a person, besides names etc. as usual properties, there could be say subject, lives in, [country], so you have to choose from the given "country" objects.
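A sketch of how such a class-limited triple might behave, to be read as an illustration only: the class names, the `ALLOWED` table and the helpers `add_triple` and `choices` are assumptions, not thingamy's actual design.

```python
# Hypothetical objects and the class each belongs to.
OBJECTS = {
    "Norway": "country",
    "Italy": "country",
    "Rome": "city",
    "Alice": "person",
}

# Defined by the "builder": for each (subject class, predicator),
# the class the object must belong to.
ALLOWED = {
    ("person", "lives in"): "country",
}

triples = []


def choices(subject, predicator):
    """The objects the user may pick from, as for a drop-down "property"."""
    required = ALLOWED[(OBJECTS[subject], predicator)]
    return sorted(name for name, cls in OBJECTS.items() if cls == required)


def add_triple(subject, predicator, obj):
    """Add the triple only if the object belongs to the class the builder allowed."""
    subject_class = OBJECTS[subject]
    required = ALLOWED.get((subject_class, predicator))
    if required is None:
        raise ValueError(f"{subject_class!r} has no predicator {predicator!r}")
    if OBJECTS.get(obj) != required:
        raise ValueError(f"{obj!r} is not a {required}")
    triples.append((subject, predicator, obj))


print(choices("Alice", "lives in"))
add_triple("Alice", "lives in", "Italy")
```

Trying `add_triple("Alice", "lives in", "Rome")` would raise a `ValueError` in this sketch, since "Rome" is not one of the given "country" objects.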
Phil, time to fire up the Skype soon as we have planned!
Posted by: sig | January 02, 2008 at 21:44
Sig, yes "bashing" is too strongly worded. The contrarian view is appreciated, in fact all the thought here is. I would argue that tags are not limited to nouns, but would agree that verb tags are rarely used. Because of the complexities of language, I wonder if modeling descriptors using language as a model is a good idea. I understand little about how languages evolved, but at a fundamental level they are used to communicate and describe.
Paper, Yellow, Square, Sticky, Flat, all describe http://www.flickr.com/photos/wheatfields/1803288927/ without any relationship to one another or any hierarchy. It leaves more to be known (discovery) but, it rules out all the things that are http://www.flickr.com/photos/lenore-m/441565757/
The Cow, Chicken, Grass experiment from Richard Nisbett's book "The Geography of Thought: How Asians and Westerners Think Differently...and Why" suggests that there may be differences between the way Westerners and Easterners think. I believe there are differences between how each individual organizes and processes information. I don't believe there are absolutes. What works for you might not work for me. There seems to be success in offering information in a way that allows individuals to construct a method for organizing that is useful to them. The aggregate of each individual's organization produces new layers of information and value.
Thanks Sig, interesting perspectives.
Posted by: Mike O | January 03, 2008 at 02:56
Mike,
I agree that we all have our ways of thinking, and that we are "wired" differently - but is that not the very result of our training/education/culture membership?
To me it seems that direct observation of real world relationships, that includes activities and thus verbs, like a child - is less in need of training and thus less of a "communication" problem if two participants have different background?
I noted the use of the word "description" ;) - being cheeky I would use that in favour of my views: The word came from Latin and translates to "write down", i.e. technology limitation already present (pen and paper, no more than two dimensions). The only way to use verbs and thus include activities and relationships would be in the narrative form - a cumbersome form and not very efficient in distributing facts and information.
Thanks again Mike, have to admit that this kind of (flat-format) narrative discussions certainly gives much freedom in the discussions, and that is at least highly enjoyable! :-D
Posted by: sig | January 03, 2008 at 10:21
OK then ... here's a challenge ... let's see an n-triple data-model of a simple stock-control application :-)
( Yeah, and Sig, definitely want to skype, I see you online but always when I'm at work. :-(
But let's do it soon.)
Posted by: phil jones | January 03, 2008 at 16:04
Phil, hehe, OK, "stock-control application", easy... :-D
Actually, not so straightforward - I'd have to ask you first "what for?" as that certainly is only a step in a larger flow and a solution could thus end in many different ways...
That would be a good starting point for the Skype discussion!
Posted by: sig | January 03, 2008 at 16:23