


I think you misunderstood me in yesterday's post. I actually totally agree with your ideas on overlap and wasn't suggesting hierarchy at all.

The idea of "master tags" was the minimum set of tags whose intersection sets cover the most possible entries in the smallest possible groups.

I also don't believe that there needs to be a fixed set of master tags to be defined by a "standards body". I think they can be computationally derived from patterns in the way content is tagged.
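
One possible reading of "computationally derived" is a greedy set cover over the existing tag data: keep picking the tag that reaches the most entries not yet covered. The sketch below is only that, a sketch. The function name `derive_master_tags` and the sample entries are invented for illustration, and it handles just the coverage half of the definition above, not the "smallest possible groups" half.

```python
from collections import defaultdict


def derive_master_tags(entries, max_tags=None):
    """Greedy set cover: repeatedly pick the tag covering the most uncovered entries."""
    # Invert the entry -> tags mapping into tag -> set of entry ids.
    by_tag = defaultdict(set)
    for entry_id, tags in entries.items():
        for tag in tags:
            by_tag[tag].add(entry_id)

    # Entries without any tag can never be covered, so leave them out.
    uncovered = {entry_id for entry_id, tags in entries.items() if tags}
    chosen = []
    while uncovered and (max_tags is None or len(chosen) < max_tags):
        # Iterating in sorted order breaks ties alphabetically (deterministic result).
        best = max(sorted(by_tag), key=lambda t: len(by_tag[t] & uncovered))
        gained = by_tag[best] & uncovered
        if not gained:
            break
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical sample data, invented for illustration.
entries = {
    "post-1": {"audi", "cars", "germany"},
    "post-2": {"cars", "racing"},
    "post-3": {"tagging", "taxonomy"},
    "post-4": {"tagging", "search"},
    "post-5": {"audi", "tagging"},
}
master_tags = derive_master_tags(entries)
print(master_tags)
```

Greedy selection does not guarantee the true minimum set, but it needs no standards body: the "master" tags fall out of how people have already tagged.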



My fault - I got your point yesterday, but used "master tags" a bit freely there; the sound of it kind of fit the passage :)

You've got a point, actually, about the use of "master tags" for homogeneous groups where everyone is quite aligned in understanding the tags/keywords. Say among Audi employees (from yesterday's discussion). Makes things quicker then.

And then probably less "master" the more heterogeneous the user group is...



I, like Toby, feel misunderstood. My only concern with tags is that if people don't agree on at least a few tags for something, they will never even know about each other's stuff, no matter how much they search.

I completely agree that by filtering on tags, simply by adding more relevant tags to the filter one by one, you will get to the appropriate information quickly. The only caveat is that the tags you choose must already be on the information. Can you explain exactly how that will happen, without resorting to the 'you tag it with whatever you need' argument? That argument is of course a convenient way of bypassing my central concern:

How do you find it to tag it?
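
To make the caveat concrete, here is a minimal sketch of filtering by tag intersection. The item names, tags and the helper `filter_by_tags` are all hypothetical. It shows the filter narrowing nicely as tags are added, and also shows an item on the same topic that stays invisible because it was tagged from another mindset.

```python
def filter_by_tags(items, wanted):
    """Return ids of items that carry every tag in `wanted`."""
    wanted = set(wanted)
    return sorted(item_id for item_id, tags in items.items() if wanted <= tags)


# Hypothetical items and tags, invented for illustration.
items = {
    "memo-a": {"audi", "pricing", "2005"},
    "memo-b": {"audi", "recall"},
    "memo-c": {"vehicle", "cost"},  # same topic as memo-a, different vocabulary
}

step1 = filter_by_tags(items, ["audi"])             # first tag in the filter
step2 = filter_by_tags(items, ["audi", "pricing"])  # one more tag narrows it
missed = "memo-c" not in step1                      # never shows up to be tagged
print(step1, step2, missed)
```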


Alex, sorry, that was not the intent!
(I've been clumsy this time... shall sharpen my ways :)

And yes, I remember your comment over at Hugh's - I wanted to comment on that but got lost on another thread - you have a very good point there - how to find the object in the first place!

And having no really good specific answer (subconscious reason why I did not follow up?) I have to revert to discussion mode and hope that you may have some input there:

Obviously, it does put some extra burden on the author/object creator - he/she will have to do his/her utmost to cover the original set of tags, including tags that may not come naturally to him/her. As you say, the width and volume of the original set will increase the chances of even more and better tags from others later.

But why should it not be like that? Sticking to a text - any author will make an extra effort to get the title right in the sense that it could stir interest for a wide group, and he would often solicit the help of a third party to go over the text - a friend, editor - a fresh pair of eyes.
Why not apply that to the choice of tags too? Even more important, as the 'tag-set' inherently holds a lot of information, knowledge about the object in fact.

Tags-only-based data sorting is new and would require 'getting used to it', I suspect. The more we get used to it and learn from others doing it (what triggered me finding something, and so forth), the better we'll get at doing it.

In sum (and as a first effort):
- no simple 'method' comes to mind yet (and as with writing a text, there is no simple system, I think),
- the original 'tag-set' is highly important as it delivers real knowledge about your object,
- it places the object in the object-sphere,
- and if the goal is wide distribution and heightened value, then a real effort has to be made to include in the original tag-set tags that could cover other mindsets than your own.

It's a bit like the object itself: If the text is bad, if the ideas are of little interest then few will be interested. If the (original) tags are too narrow or too few, the same will happen.

Note the word 'value', I think that's an important one: Well chosen tags at the outset will deliver knowledge directly and ease of finding the object for a wider user-group who then could add even more knowledge to the object in the form of tags. All adding up to an increasing value for the object! Suspect this fact will drive the ability to tag well and at the end go a long way to solve your paradox.

What do you think?


Imprecision if you like ... I would call that a variety of perspectives and the impossibility of having a complete or single view of what the "subject of conversation" is.
I recently introduced on http://universimmedia.blogspot.com/ the notion of "hubjects", which might be relevant in this debate.
See an introduction at http://www.mondeca.com/lab/bernard/hubjects.pdf




interesting... I of course grabbed page 10 in your PDF: "Semantic diversity is life"!

And the points there: "No representation is exhaustive", "All contexts have limited expressivity" and "Anybody can represent anything anyway..." - exactly, I say :)

The hub image still gives me a bit of two-dimensionalness, a bit too rigid... Accepting that the world is in fact imprecise (as on your page 10 too) then spokes seem to me to be too clearly defined.

That's why I started using "iffy-blobs-intercepting" for the "delivery-method". Not unlike a hub at all but for the iffyness, the diversity and flexibility of the "ibi" :)


I had a crack at this on gapingvoid, but I'll try again. I think it is incumbent on the object-creator to start with AT LEAST what he or she would use to find the object. That effort will be improved with your suggestion Sig, that they seek some alternative input viewpoint at the time of creation. As I said on gapingvoid, subsequent finders of the item (either by accident, or via a sequential read of a long list that resulted from the initial, very imperfect search) should then add the tags THEY find more useful. This makes it easier for the next person, who adds their tags, which makes it easier for the next person, who adds ......

Now if it's something that only a few people find interesting, then it may take some time to get a set of tags that is more useful, or that is amenable to the sort of automated 'tagging of tags' that a couple of people have suggested. And you may find that some object-creators add popular but misleading tags to their item, which long-term will be self-defeating but distracting in the short-term.

If we subscribe to the 'longtail' idea, ultimately ANY item will gather sufficient tags to be found by someone with sufficient interest, and as long as THEY tag, the findability will always improve.
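
A toy simulation of that chain, purely as a sketch: each searcher has a personal vocabulary (all names and words here are made up), finds the item if any of their words is already on it, and then adds the rest of their vocabulary as tags.

```python
def reachable(item_tags, searchers):
    """Names of searchers who share at least one word with the item's tags."""
    return sorted(name for name, vocab in searchers.items() if item_tags & vocab)


# Hypothetical vocabularies, invented for illustration.
searchers = {
    "ann": {"folksonomy", "tagging"},
    "bob": {"tagging", "metadata"},
    "cy": {"metadata", "search"},
}
item_tags = {"folksonomy"}  # what the object-creator started with

before = reachable(item_tags, searchers)
rounds = 0
while True:
    finders = reachable(item_tags, searchers)
    # Everyone who can find the item contributes their own words.
    added = set().union(*(searchers[name] for name in finders)) - item_tags
    if not added:
        break
    item_tags |= added
    rounds += 1
after = reachable(item_tags, searchers)
print(before, after, rounds)
```

Note that the chain only starts because "ann" shares one word with the creator. With no overlap at all, nobody finds the item and nothing accumulates, which is exactly Alex's concern.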

Obviously the more 'interesting' an item is, the quicker it will accumulate an easy-to-find path. Placement will be important also - as Sig knows, a topic on Hugh's blog will more quickly attract attention despite originating here!



you're absolutely right, of course :-)

And as you say, placement is important... I bow henceforth to Master Hugh...

Interesting part though when using multiple tags originating from different sources (and thus understandings of the object) is that the "knowledge" increases, directly to be seen in the tags (equal ID/name at the end?). And with that, value!


Dennis Howlett

From a business point of view, there comes a point where you need a standardised (sorry) set of tags at some level. If that is industry wide (say banking, insurance, telco etc) then fine. But that's not the way of the world... is it? However, if you can get an agreed starting point - which is FAR from easy - then you're onto something.



"standardised" is only needed when you organise on the source level, so I go... nahh, you really do not need standards in this non-taxonomy way, business point of view or not : )
(quarrelsome I am..hehe)

When you structure on source level (as in taxonomy) you need standards, but here anything goes on the source level as long as it's relevant (and relevant it'll be in the eyes of the tagger).

This "free-tagging plus iffy-tags-intersection" method is the opposite of taxonomy on all counts.

Question is if somebody with a completely different mindset will find the object even if using 'multiple-tags-intersections'?
Well, that remains to be seen. He/she will in any case be better off than in a system based on standards if she/he's without the manual (or training)!

In other words - in a homogeneous environment (like a business or an industry) - the chances of being able to find the object would be even better than when the users are heterogeneous.

Thus less need for standards in business ;-)


To sig, about rigidity and flatness of the wheel metaphor

- Imagine the wheel in a space of arbitrary dimension N, and spokes as varieties of dimension N-P ...
- Spokes are rigid, yes. Each of them captures one perspective, which is very well structured, but partial. So it's a bit different from free open tagging.
- The wheel is bound to move to keep up with the subject it tries to capture. A static wheel is useless, as is static knowledge.

I copy below what I wrote this morning about this issue in another non-public conversation.

I think there are two ways to consider ambiguity:

Way 1. Subjects are ill-defined, everything is fuzzy, nothing can be asserted for sure
Way 2. Subjects are well defined, but in many ways, as so many views in/from different frameworks/perspectives

Way 1 is good for informal and cheerful conversation, like the one you used to have in forums, and now blogs, RSS, tagging and the like. But it is IMO pernicious: people either think they agree, though they speak past each other, or the other way round think they disagree because they have no way to figure out if they actually have different viewpoints, or if they speak about different things. Billions of examples available every day.

Way 2 is what TMRM (http://www.isotopicmaps.org/tmrm/) and hubjects are about: subjects are ambiguous, contradictory, fuzzy, moving targets, OK. But each view on a subject had better be well defined, and the rules for this definition explicit (perspective disclosed). You know your view is not exhausting the subject, you can explore different views, see if their logics are compatible, if they can play nicely with each other or are too orthogonal for that etc ... So you can agree that you agree or disagree on clear grounds, and go to war if needed, but with crystal clear reasons :))

Dennis Howlett

This is the thing I was hoping to avoid - the almost religious nature of discussions around this. There's nothing fundamentally wrong with tagging. I like the idea. I'm thinking of the commercial, business reality in shops like BP, Barclays, Ford etc (and you are aiming high).
Given what I've said on my post (http://www.bazaarz.com/archives/2005/07/tagging_-_searc.php) about problems with XBRL across Europe, what hope for tagging?

My sense is that to gain acceptance of the concept, you either need some very brave and high profile people to come to the party - like Henning Kagermann - or accept some level of compromise. Provided it doesn't impact the user experience, what's the problem with compromise?

I'm also concerned about the amount of oomph you might need to power searches conducted using tags in high volume, complex environments. Have you done any big volume testing? I'm thinking 10K transactions per second. Think banks...

Dennis Howlett

For some reason, the link doesn't work from the last post but it should appear in your trackbacks...doh!


ouch! Me being religious... double ouch :)
Will promise to shape up and restrain enthusiasm... :D

But back to business - this particular discussion and the "test site" mentioned earlier is purely for the heck of it, and a bit to test some underlying ideas... not aiming to convince BP or such (they'll find out by themselves if I'm only half right anyway ;) - got other stuff for them that can mimic current world-views until they find out.

Read your article, liked it a lot - and in there you wrote "...attempts to create taxonomies using XML as the basis, XBRL springs to mind...".

And therein lies the clue! What I'm talking about (not being too good about it though...) is in fact not taxonomy, it's quite the opposite! It's breaking with taxonomy, it dumps taxonomy.

That is perhaps why our arguments pass each other sometimes midway (my bad, sorry!)... so therefore, see my next post where I compare old-hat taxonomy with the opposite... based of course upon the works of Aristotle and Plato, but humbly breaking the mold with Carl von Linné! (now you can mention aiming high!) :D


[Note to self]: Two feeling misunderstood and one feeling discussion getting dogmatic, all under one post...hmm... must shape up form, restrain enthusiasm, behave etc.

Sorry guys! Will behave! Eh, try that is... bear with me :)


Sig... cool, seems like we are working on the same problem now then. We accept there are issues and are trying to figure out a solution.

Now we are colleagues, intrepid ones at that, boldly going where no man... (maybe that is stretching it a little!)

Ric has got it correct 'again' when he says the Object-Creator must at the very least tag it so that he can find the object again. Hopefully that is enough to help others find it too. I agree with Ric 'yet again' when he mentions the fact that the more people find and tag something, the easier it will be for others to find.

The problem I have with this I think is that in a business scenario (unlike the URL bookmarking scenario of del.icio.us) not many pieces of information will be used by lots of people. Lots of information will be used by one or two predominately and then 'maybe' required later by someone else. Now unlike the del.icio.us scenario, in business something can be very important even if only one person uses it, just think of how important one email can be to determining the outcome of anti-trust cases.

In this scenario I hope that the searcher can put themself into the 'mindset' of the object-creator.

When I think about this problem I can't help but think of it with my 'WinFS' or 'Base4.NET' hat on. Both of these are essentially extensible database file systems. The key word being 'extensible'.

If you can match your file (binary data) to a type which has custom properties (for example a Word document might be recognized as an RFI document which is known to have a client, author etc), and you can somehow train your file system to promote information from inside the binary data into the properties of the instance used to represent the file, then you will have a very structured way to find information (i.e. by Client, by Author, or even by Author's Department etc). That is, this is the standard 'SAP'-like view of information, albeit a little more flexible I think.

If on top of that you also support tagging, so any file can be tagged in any way by anybody, then you have what I think is the best of both worlds: People can tag things how they want, but if they don't do a good job the information can still be found in ways that make sense for that industry or business 'shared mindset', because this has automatically been promoted or indexed.
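
A rough sketch of that "best of both worlds" lookup, under loud assumptions: this is plain Python, not the WinFS or Base4.NET API, and the `StoredFile` class, the `find` helper, the property names and the sample files are all invented for illustration. Promoted properties and free tags sit side by side, and a search may use either or both.

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    name: str
    # Structured properties promoted from the file's content (e.g. client, author).
    properties: dict = field(default_factory=dict)
    # Free-form tags that anybody may add.
    tags: set = field(default_factory=set)


def find(files, tags=(), **properties):
    """Names of files carrying all given tags and all given property values."""
    wanted_tags = set(tags)
    return [
        f.name
        for f in files
        if wanted_tags <= f.tags
        and all(f.properties.get(key) == value for key, value in properties.items())
    ]


# Hypothetical files; property names and values are invented.
files = [
    StoredFile("rfi-acme.doc", {"type": "RFI", "client": "Acme", "author": "Alex"}, {"urgent"}),
    StoredFile("rfi-globex.doc", {"type": "RFI", "client": "Globex", "author": "Ric"}),
]

by_tag = find(files, tags=["urgent"])
by_property = find(files, client="Globex")  # found although nobody tagged it
both = find(files, tags=["urgent"], type="RFI")
print(by_tag, by_property, both)
```

The untagged file is still findable through its promoted properties, which is the safety net for items that only one or two people ever touch.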

Bernard, thanks for your 'hubjects' idea... I will be reviewing your PDF more over the next few days.


I have seen many projects that use "precise definitions of the tags, must have standards, must have master lists..." die a quiet death; simply because they weren't used. While on the other hand I see sites like flickr and del.icio.us thriving (and working!)

Why is that?

Simply because of the fact that while some people do not tag, or tag 'wrongly' (for a given perspective of 'wrong'), a lot of other people will come along and tag the same item 'right' (again, for a given view of 'right')?
