
Comments

Steve Cooper

Similar categorisation systems;

* Kant's 'forms of intuition' (bottom of http://www.philosophypages.com/hy/5f.htm)

* The six journalistic questions "who, what, where, when, why, and how"

Although a pure tagging system is very =neutral= in how it approaches things, here's a couple of ideas;

People answer specific questions more readily than open questions. For example, if you've ever told someone you know a foreign language, they'll inevitably say "say something in (russian|iranian|lojban) for me". And your mind goes blank. When they ask 'how do you say "my hovercraft is full of eels"', you're more likely to give an answer.

Similarly, polls get more responses than open comment boxes, and there's nothing more terrifying to an author than a blank sheet of paper.

So, if you want people to tag, focussed questions might help people to pull useful tags out of their brain. Imagine six textboxes for tags, labelled with the journalistic questions;

who? [_________]
what? [_________]
when? [_________]
where? [_________]
how? [_________]
why? [_________]

I am more likely to contribute those pieces of information if directly prompted for them.

Second, tag prompts let you store tags with their categories, which allows more sensible searching, eg:

{place=turkey condition=roasting} =>
"It's 40deg C here in Marmaris today"

{substance=turkey action=roasting} =>
"Delia's Christmas Feast"

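A minimal sketch of that category-aware search, assuming each item stores its tags as category/tag pairs. The `ITEMS` dictionary and the `search` and `search_flat` functions are made up for illustration, not part of any existing tagging system:

```python
# Hypothetical store: each item keeps its tags keyed by category.
ITEMS = {
    "It's 40deg C here in Marmaris today": {"place": "turkey", "condition": "roasting"},
    "Delia's Christmas Feast": {"substance": "turkey", "action": "roasting"},
}


def search(query, items=ITEMS):
    """Return titles whose tags include every category=tag pair in the query."""
    return [
        title
        for title, tags in items.items()
        if all(tags.get(category) == tag for category, tag in query.items())
    ]


def search_flat(words, items=ITEMS):
    """Category-blind search for comparison: match on the tag words only."""
    return [
        title
        for title, tags in items.items()
        if set(words) <= set(tags.values())
    ]
```

With these two items, `search({"place": "turkey", "condition": "roasting"})` finds only the Marmaris post, while `search_flat(["turkey", "roasting"])` cannot tell the two apart and returns both.
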
Lastly, if a tag has a category, all the tags in that category can be processed separately. Let's say, for example, that everything has a 'where' category. That will define a rough tree, with 'where' tags pointing upwards towards 'the universe'

eg:

york: where=yorkshire
yorkshire: where=uk
uk: where=europe

That creates a path, and the complete set of paths is, as I say, roughly a tree. So you get rough taxonomies developing naturally. Not trees, but more like kudzu. ;)
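
Walking such a path could look roughly like this. It's only a sketch: the `WHERE` mapping and `path_up` are hypothetical names, and it assumes each tag has at most one 'where' parent (with several parents you'd get the kudzu rather than a tree):

```python
# The 'where' links from the example; "europe" has no parent recorded here.
WHERE = {
    "york": "yorkshire",
    "yorkshire": "uk",
    "uk": "europe",
}


def path_up(tag, where=WHERE):
    """Follow 'where' links upwards from tag, stopping at a root or a loop."""
    path = [tag]
    seen = {tag}
    while path[-1] in where:
        parent = where[path[-1]]
        if parent in seen:  # free tagging can create loops, so bail out
            break
        path.append(parent)
        seen.add(parent)
    return path
```

Calling `path_up("york")` climbs york, yorkshire, uk, europe and stops there because nothing says where europe is.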

sig

Steve,

very good point indeed - and yes, being prompted makes life easier... and thus the method easier to get going (perhaps until we leave some of the trained-tree-thinking behind?)

But, as I like to stick to the basic principles as long as possible (then adjust of course if required :) :

It would apply logic constraints, at least a set of logical pointers. Would "umbrella" for "Mary Poppins" pop up using journalistic categories? Would "nano technology" for "Ms Spears" be ventured from Brixton when Aristotle categories were applied?
Probably not... and I still think such "seemingly irrelevant" tags would increase the knowledge of the object, and make it easier to find.

That said it certainly is an idea to be seriously considered, see if it can be combined with the basic principles... hmm...

craigo

Sigs, your example there assumes multiple taggers on a single item in order to build that namespace. On a social bookmarking level this sounds lovely, and del.icio.us does this now.
eg.

http://del.icio.us/url/175540a254e846e8b30ad923bf392113

the only item missing from your list is why they tagged it... additionally, filtering exists where no common tag is found.

However, when it comes to individually tagged items, additional user-defining input simply isn't available. How, then, is this content to be found via tags I believe to be relevant when others think differently when tagging? Facilitating additional user definition is what I see as vital.

Steve's thoughts there are tantamount to categorisation. And like you, I agree that purely using this method of organising filters out otherwise specific information leading to an item.

Using a system like anataxonomy, the only information available for creating paths to those items, other than their tags, is in how an item's tags relate to other tags around it. Additionally, should that item be text, then its keywords. Got yourself a map or image or audio though and you're screwed unless you have OCR and pattern recognition.

"Anataxonomy - efficient organiser of objects and subjects, direct delivery of comprehensive knowledge and a true identifier - all in one swat - and all better than existing and separate methods?"

anataxonomy combined with relational keyword searching maybe.

Anyway, on the search side, when sifting through other people's poorly tagged content, tagged search and the resultant related tags, in combination with keyword/phrase analysis of that content where available, seems the way to go.

Finding content related to what you're looking for comes back to when it's created and the ease of defining it.
Say on the content creation end we do this to make it easier: free-tagging in addition to an automated content keyword analysis, with the system suggesting user-selectable primary keywords and phrases to complement free tagging. When free-tagging starts, an Ajax drop-down textbox or suggestion box could do the suggesting. Drupal is taking the first steps towards implementing something like this now: suggested tags based on what you type from previous tagging, much like Google Suggest. Simple autocomplete. In addition to this, I'd find an expanding box area with other tag suggestions based on keyword analysis of your content useful too. Even analysis of people's free-tags, with suggestions of more as they type them.
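
As a rough sketch of those two kinds of suggestion (this is not how Drupal or Google Suggest actually do it): `autocomplete` completes a prefix from previously used tags, and `keyword_suggestions` proposes tags from word frequency in the content. Both function names and the tiny `STOP_WORDS` list are assumptions for illustration only:

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "it", "for", "on", "with"}


def autocomplete(prefix, previous_tags, limit=5):
    """Suggest previously used tags starting with prefix, most used first."""
    counts = Counter(tag.lower() for tag in previous_tags)
    matches = [tag for tag in counts if tag.startswith(prefix.lower())]
    matches.sort(key=lambda tag: (-counts[tag], tag))
    return matches[:limit]


def keyword_suggestions(text, limit=5):
    """Suggest tags from the most frequent non-stop-words in the content."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:limit]
```

The drop-down would call `autocomplete` on each keystroke, and the expanding box would be filled once from `keyword_suggestions` when the content is saved or pasted in.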

Ultimately though when it comes down to finding a needle in a haystack, it's keyword/keyphrase search that wins over. Combine this with some smart anataxonomy and I see a winner. It's all in the UI. ;)

Steve Cooper

Sig;

> Would "umbrella" for "Mary Poppins" pop up using journalistic categories?

Perhaps not. And a 7th 'other' box will probably get less attention than a single tag box. The questions risk 'peripheral' connections being ignored.

Seems to me that decisions like this will have to be made based on the particular application in question.

For example, in a database of diseases, boxes saying

symptoms [________]
observed_in_towns [________]

might be supremely relevant and useful; prompting individual doctors to say where they have seen a particular disease and what symptoms they observed is a useful set of data in itself. For example; searching for {huge_boils coughing London} might pop up 'black death'.
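
A small sketch of how such a search might rank results, assuming each disease record keeps one set of tags per prompt box. Apart from 'black death' and the three query tags, the records in `DISEASES` are invented for illustration, and `rank` is a hypothetical helper:

```python
# Hypothetical records; only 'black death' and the query tags come from the example.
DISEASES = {
    "black death": {
        "symptoms": {"huge_boils", "coughing", "fever"},
        "observed_in_towns": {"London", "Florence"},
    },
    "common cold": {
        "symptoms": {"coughing", "sneezing"},
        "observed_in_towns": {"London", "York"},
    },
}


def rank(query_tags, records=DISEASES):
    """Rank records by how many query tags appear in any of their boxes."""
    query = set(query_tags)
    scored = []
    for name, boxes in records.items():
        all_tags = set().union(*boxes.values())
        score = len(query & all_tags)
        if score:
            scored.append((score, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]
```

With this toy data, `rank({"huge_boils", "coughing", "London"})` puts 'black death' first because it matches all three tags, and 'common cold' second with two.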

But, for marking up a huge unrelated set of articles (eg, the whole web), the question 'what's it about?' is far too vague to be useful, and in fact probably better done by searching the data (words) rather than tags.

Craigo -

> Steves thoughts there are tantamount to categorisation.

Yep, happily agree with that. Well, I'd prefer to say that asking certain types of questions gets certain types of answers. But also, narrower questions get you lots of narrow answers, and open questions get you fewer, broader answers.

sig

Guys, you're raising some interesting issues here!

Actually, I think Steve sums it up in "..decisions like this will have to be made based on the particular application in question".

Nevertheless, the broader and less specific (or standardised/categorised) the tags, the easier it would be for "outsiders" to use such a data repository. And less training and fewer manuals would be needed.

Which again leaves the question until the "application" is built :)

A health database purely for professionals - keep it closer to the "industry specific" terms I guess. But the moment other untrained individuals shall/will use it, then broader terms/tags could be allowed. Note that these un-healthy (sic) ones would not ruin things for the terms/keywords/tags typically useful for the health profession.

The example of the "A4, 1999, back, light, rear, sedan,..." would not be less useful for the trained mechanics than "TXG-7688-fw-900", while at the same time making the manuals and the training a thing of the past.
And I'm sure it would leave fewer messages of "we got the wrong part delivered, it will take another week for your vehicle to be delivered" to be heard. And I do not think I'm the only one who's ever heard that one!

Perhaps, as Steve suggests, let the "other" become the "anything that comes naturally to you" and leave that in...

Ric

Sig - I know this comes a long time after the fact, but just found the following blog post:
http://www.betaversion.org/~stefano/linotype/news/85/
where Stefano talks about marking tags uniquely - and his idea talks about resolving situations where syntactically similar tags are semantically different, and where they are syntactically different but semantically similar (whew - that's a mouthful!). It strikes me that his idea could also perhaps be extended to provide information about the originator, as discussed in some of these posts ...

Ric

PS - that's Stefano Mazzocchi at MIT, who is working on a project called Simile, which looks quite interesting - I'm particularly checking out a thing called "Piggy Bank"
http://simile.mit.edu/piggy-bank/

sig

Ric,

interesting - after a bit of head scratching - I do get an "aha!" in the sense that I think multiple-tags-interception could do it as well, and scale better... let's see:

"syntactically similar tags are semantically different": If we both use the same word to describe/tag an object, but have a different understanding of it (as things always go), then you would have to add a few words, and I would have to add a few words, to clarify each other's understanding of the word.

And that is precisely using multiple "tags" (words) and overlapping those. With "enough" extra tags, then we both would find the same "understanding" :)

"syntactically different but semantically similar": Just like above, you use one word to describe an object, I may use another word for exactly the same meaning. Again we would need some "extra" words to clarify for each other - multiple-tags-interception again!

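A toy sketch of the overlapping-tags idea, under the assumption that every object simply carries a set of free tags. The objects, the tags and the helper names `find` and `shared_companions` are all invented for illustration:

```python
# Hypothetical objects and tags, invented for illustration.
TAGGED = {
    "big cat photo": {"jaguar", "animal", "jungle"},
    "sports car advert": {"jaguar", "car", "engine"},
    "car repair guide": {"automobile", "engine", "repair"},
}


def find(tags, tagged=TAGGED):
    """Return objects whose tag set contains every tag in the query."""
    wanted = set(tags)
    return sorted(name for name, have in tagged.items() if wanted <= have)


def companions(tag, tagged=TAGGED):
    """Collect every other tag that appears alongside the given tag."""
    return set().union(*(have for have in tagged.values() if tag in have)) - {tag}


def shared_companions(tag_a, tag_b, tagged=TAGGED):
    """Different words that keep the same company may well mean the same thing."""
    return companions(tag_a, tagged) & companions(tag_b, tagged)
```

Same word, different meaning: `find(["jaguar"])` returns both the photo and the advert, and adding one extra tag, `find(["jaguar", "engine"])`, leaves only the advert. Different words, similar meaning: `shared_companions("car", "automobile")` shows the two tags overlap on "engine", which hints they are describing the same sort of object.
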
"Terrorist" or "freedom fighter" - syntactically different but can be semantically similar "tags" for the same object for two different readers (sometimes). Knowing the "tagger's" background would help a lot, and add much "knowledge" at the same time...

The issue is of course "what amount" of information one could pack into a single "tag/descriptor"?

I'm still leaning towards the need for transparency as in access to a "tagger's" blog say, or CV or something more informative than a name. Seems to me that Stefano's solution (I may have that wrong though!) would work nicely as unique identifiers, but would not deliver much knowledge about the "tagger". Is the tagger Israeli or Palestinian, is he a member of Likud or Hamas... stuff that could give above syntactically different "tags" some meaning.

Yep, there is still a way to go :)

The comments to this entry are closed.
