Dave Johnson’s Blog

Archive for the 'Tagging' Category

More on Wink and Tag Search | January 24th, 2006

I read an interesting post by Jeff Clavier the other day and have been wondering about how an implicit search context, such as that used by Wink, could work for or against you. Btw I also still get a JavaScript error on the Wink homepage when I try to click on the search box.

I have posted on various issues regarding tag-based search before and there was good discussion on a recent(ish) post by Om Malik entitled People Power vs Google. The new problem I envision is that when you are searching for something that is syntactically the same as, but semantically different from, concepts which you or other people have tagged, the results will be skewed in the wrong direction. It is a very good idea on Wink’s part to put Google search results on the same page.

Of course this problem can be overcome with a little work by the searcher, who can make a more exact search string. However, one could then argue that if you have to make a more exact search string to find things outside of your tagosphere, why bother? It is likely that Google searching (i.e. not using tags) in your area of interest will generally return the results you want, with or without tags. The same is generally true of del.icio.us, in that it is faster to go and search on Google than to find what you are looking for on del.icio.us.

It is interesting to think about the problem in terms of information theory. When you encode the western alphabet for transmission, as with Morse code, you usually want to devote as few symbols (or bits) as possible to letters like “e” and “s” because they occur so frequently. Tag-supported search is similar, in that it reduces the number of search terms needed to find frequently accessed information (like reducing the number of bits that represent the letter “e”) by leveraging the work that people have put into tagging pages. This can also backfire of course, for example when you are looking for AJAX the football club rather than AJAX the wicked-awesome programming technique and most of the pages you have tagged with AJAX relate to the technology. The user essentially has to climb out of this “context pit” created by their tagging habits by specifying “AJAX amsterdam” or “AJAX football”. Really it all depends on your search habits.
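To make the encoding analogy concrete, here is a minimal sketch of Huffman coding, which (like Morse code, but with bits) gives frequent symbols shorter codes. The function name `huffman_code_lengths` and the sample text are my own, purely for illustration; nothing here is how Wink or any tag search actually works.

```python
from __future__ import annotations

import heapq
from collections import Counter


def huffman_code_lengths(text: str) -> dict[str, int]:
    """Return the Huffman code length in bits for each symbol in text."""
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit to be transmitted.
        return {sym: 1 for sym in freq}

    # Heap entries are (weight, tie_breaker, {symbol: depth_in_tree}).
    # The unique tie_breaker means the dicts are never compared.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them moves
        # one level deeper, i.e. its code grows by one bit.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


# Hypothetical sample: "e" is common, "q" and "z" are rare.
lengths = huffman_code_lengths("eeeeeeeessssqz")
```

In this sample the common “e” ends up with a shorter code than the rare “q”, which is the same trade a tag-biased search makes: cheap access to the things you look up often, at the cost of extra effort for everything else.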

I am not sure we can prevent this problem when searching for obscure topics or for terms that look the same but mean something different. While this might be a slightly larger problem with tag-based searching, it can also be a problem with Google - the main difference being that Google bases its results on actual HTML links between pages, which, in my opinion, should generally result in a more robust and less biased result set. Will this problem become even worse once we start using things like the Semantic Web?

Posted in Search, Semantic Web, Tagging, Web2.0

Social Annotation | December 28th, 2005

I have just read about a company currently in private beta called Diigo, which is in the business of social annotation (SA).

Apparently SA is a superset of social bookmarking or tagging, which is of course the pièce de résistance of ‘Web 2.0’. The question is, can SA be an even better route to getting acquired by MAGY? Don’t quote me on ‘MAGY’ though, since I am not sure what order those names should go in…

I had been thinking about SA for some time but did not have the time / resources to get anything together for public showing - but this might be a good reason to do so. Of course, given my record with getting code up on my blog, I won’t have a sample until this time next year. Anyhow, the possibilities for SA are much more attractive than social bookmarking in my mind. With social annotations (at least what I consider them to be) I can surf to any web page and place tagged sticky notes (private or public) in a browser-agnostic fashion that will contain my comments and refer to a certain block in the web page DOM. Then I can go to some central place to view / organize my comments, and can also subscribe to an RSS feed of other people’s comments on those pages or of comments from particular people. The main problem that I have with Diigo (from the looks of their Flash demo) is that I need to install their toolbar - yuck!
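Here is a rough sketch of what one of those sticky notes might look like as data, assuming each note stores the page URL plus some locator for the DOM block it is attached to. Every name here (`Annotation`, `dom_path`, `feed`) is hypothetical and mine alone; I have no idea how Diigo actually models this.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    author: str
    page_url: str
    dom_path: str  # assumed locator for the annotated block, e.g. a CSS selector
    comment: str
    tags: frozenset = frozenset()
    public: bool = True


def feed(annotations, *, page_url=None, author=None):
    """Return public annotations, optionally filtered by page and/or author.

    This stands in for the "subscribe to comments on a page or from a
    person" idea; turning the result into real RSS is left out.
    """
    return [
        a
        for a in annotations
        if a.public
        and (page_url is None or a.page_url == page_url)
        and (author is None or a.author == author)
    ]
```

Private notes never show up in anyone's feed, and the same function covers both kinds of subscription by leaving one of the two filters unset.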

The useful part of these systems for end users is that they can tag particular bits of content on a page and find exactly what they were referring to with a tag. Then if you combine this idea with microformats and the Semantic Web you might really be cooking with something combustible like methane.

This brings us to the all-important (both dreaded and revered at the same time) question of ‘monetization’ - I guess I have to eat somehow, but that is why I have a day job. In a perfect world I imagine the toolbar from Diigo being essentially a web toolbar (as opposed to browser integrated) that floats over the current page and is inserted by using a bookmarklet in true AJaX fashion. Relevant ads could appear alongside the toolbar, and there could also be relevant advertisements on the notes themselves. But hey, who needs money when you have a few hundred thousand users and ‘social tagging/sharing/annotation’ hype to help you implement your ‘Web 2.0’ exit strategy?

Posted in Search, Semantic Web, Tagging, Web2.0

Fuel for the Tag Embers | December 23rd, 2005

Om Malik posted about the increasing interest in people power vs the power of Google [1]. I think that tags will lose out to automated clustering (such as Vivisimo) in the short term, but that doesn’t mean we will not see more players like Wink trying to get a piece of the tagging pie. Not that I won’t give Wink a chance. Don’t get me wrong, I do think that services like Wink have a place in the blogosphere today, but we already have the likes of Technorati and my new favourite, Google Blog Search.

The topic of tag utility has been covered quite a bit in the past by the likes of Tim Bray [2] and Stephen Green [3] (both good Canucks) and I am sure it will be discussed well into the future! On the whole I have to agree with Tim, and Stephen brings up some very interesting points from his research that should be considered. I will discuss that in a moment.

But first, there are a few issues that I can see with the new emphasis on the old idea of tagging …

People are lazy. Who wants to waste their time rating pages when Google does a _pretty good_ job on its own?
People who are not lazy (like geeks maybe) cause tagged content to be very skewed to their interest group and therefore it becomes inaccessible to the majority of people.
There is lots of meta-data (some may even call it “tags”) available to search engines based on page content - so why do more work?
If I tag a page as “interesting” that is only in the context of what I am thinking at that moment in time. Tags can have temporal/geographic/personal dependence which is something that is not easy to manage with tags today.

For example, a current topic that I am very interested in is the science (or maybe art?) of data binding - i.e. how to create a binding language that provides rich mechanisms for indirection and how to express it using a declarative / mark-up approach. This is something that is quite difficult to find information about using Google or Yahoo!. Could tagging of content help me find some obscure piece of very relevant and useful information on this topic? If someone has found it before me and tagged it with the precise tags that I would use for the topic, then maybe. However, I’m not convinced [4] and it seems that John Battelle is not either [5].

Here is the thing: people need to look beyond the tag - it is a stop-gap that has been tried many times before (web page keywords?). Places where tags have had some success, as Stephen mentions, are instances where you have defined vocabularies or taxonomies. Content is tagged by domain experts and integrated into a taxonomy at great expense but with great reward (this seems to be a recurring theme to me). I am not sure that people using the web want to be constrained like this - yet it is the best way to get value from tagging, so that everyone “talks the same language”.

This brings me to a point that I have brought up before [4]. Forget tags. Think semantics. Think Semantic Web [6]. The discussion should not be about the value of tags but about moving towards a richer Web. More on that soon.

References
[1] People Power vs Google - Om Malik, Dec 22, 2005
[2] Do Tags Work? - Tim Bray, Mar 4, 2005
[3] Tags, keywords, and inconsistency - Stephen Green, May 13, 2005
[4] More Tags - Dave Johnson, Dec 14, 2005
[5] Will Tagging Work - John Battelle, Dec 4, 2005
[6] Tagging Tags - Dave Johnson, Dec 1, 2005

Posted in Search, Semantic Web, Tagging, Web2.0

More Tags | December 14th, 2005

I just stumbled upon a short post by John Battelle where he asks whether tags are going to work in the long run [1].

From my point of view the only good application of tags is for data that has no computer-readable meta-data - i.e. they are a stop-gap. Photos, movies, songs, even smells (one day) are the types of information that are hard to find using a search engine. Though sooner rather than later we should be able to search for “sunset” and Flickr will return a picture like this. However, when it comes to web pages there is plenty of information for search engines to work with. Why use a limited set of usually homogeneous tags to define a web page on del.icio.us when you can likely find it just as fast, or faster, using a search engine instead?

Furthermore, I’m lazy. I don’t like to think up new tags for resources that I find, and for the most part I end up tagging almost everything with my homogeneous tag set of XML, JavaScript, blog and AJAX … go figure. So in the end tagging is only slightly better than using the favourites in my web browser.

Having said that, there is one place that tagging might actually be useful, but only to a slightly larger degree, and that is with news. Having the del.icio.us RSS feed for AJAX is great since it is essentially a human aggregated feed for AJAX news. Still, in the future I anticipate that I will likely just ask Technorati or equivalent instead.

All in all I have quickly fallen out of love with tags and the limited use they have [2].

As for the companies that are building businesses based on tagging - it seems to be a pretty good idea.

Update: found a great post about tags here.

[1] Will Tagging Work - John Battelle, Dec 4, 2005
[2] Tagging Tags - Dave Johnson, Dec 1, 2005

Posted in AJAX, Business, Semantic Web, Tagging, Web2.0

Tagging Tags | December 1st, 2005

I found it quite interesting some months ago when somebody posted a comment on one of my photos in Flickr asking why I had tagged it with the word “photovoltaic”. It appears that I have since taken down that photo, but just take a look at this one and most people can likely see the confusion.

I am sorry, but how can we expect a couple of words to describe everything about some picture to someone who doesn’t know me or know anything about the photo? At best they could say something like

“this photo is tagged with barcelona, 2005 and photovoltaic. if I Google those I find the first result is a photovoltaic conference in Barcelona in 2005 so he was probably there. but what the hell do cargo containers have to do with anything”

But when I look at that photo I think

“oh yeah that was at the photovoltaic conference in Barcelona in 2005 where I gave my talk on photon recycling and we were living in London and Ian and Annabelle came from Vancouver to visit and I felt really horrible about all those CO2 emissions from their airplane and we went to that castle in Barcelona where there was a good view of the harbour and I thought that those shipping crates looked kind of cool so I snapped this photo - I wonder what relationship this photo has with the next and previous one other than time and group? oh shit did I leave the stove on? what are the implications of cargo containers on AJAX in Spain?”

Of course this sort of thing even happens when we are talking or reading other people’s writing. Just the other day (and this is what actually spurred me to write this post) I posted a response to an AJAX question in a group on Google, and for some reason a really picky guy replied to my answer complaining about my saying “data transport encoding”. He suggested that I meant to say “data transport formatting” because encoding _really_ means ASCII, UTF-16 etc. Meanwhile dictionary.com says that to encode is “To format (electronic data) according to a standard format” - ok, so it’s just data formatting, like he said. My point here is that if I said “data encoding” and that is all, then you could think that I meant DVD encoding or Huffman coding or ASCII encoding or XML encoding. Only when you take into account the _entire_ context of a statement can you ascertain the _real_ meaning. You would have to take into account that I was just reading about phase modulation for wireless communication and so used the word encoding, or maybe I had a bad lunch, or maybe I was actually thinking of a completely different word but just wrote that one instead. Looking at the problem with writing is obviously a bit far-fetched but no less interesting to think about. Incidentally, the poster also objected to my use of the term “array” to refer to a group of objects - he insisted it was a data structure; I can certainly see his concern if he had just had his head in some code for a day.

And my point is what? My point is that tags, and even writing, are just not good enough. There is too much context to provide to give the tags their proper meaning. I may use the word “photovoltaic” to refer to the fact that a picture was taken while I was in a city attending a photovoltaic conference, but I may also use it to describe an actual picture of a PV panel, all at the same time.

Tags need tags.
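One way to picture “tags need tags” is a tag that carries a qualifier saying how the word relates to the photo. This is only a sketch under my own assumptions: the `QualifiedTag` type and the role names “depicts” and “taken-during” are made up for illustration and are not something Flickr offers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedTag:
    word: str
    role: str  # hypothetical qualifier, e.g. "depicts" or "taken-during"


def photos_matching(photos: dict[str, set[QualifiedTag]], word: str, role: str) -> list[str]:
    """Return the names of photos carrying the tag `word` in the given role."""
    wanted = QualifiedTag(word, role)
    return sorted(name for name, tags in photos.items() if wanted in tags)


# Hypothetical photo collection: the same word, used in two different senses.
photos = {
    "containers.jpg": {
        QualifiedTag("photovoltaic", "taken-during"),
        QualifiedTag("barcelona", "location"),
    },
    "panel.jpg": {QualifiedTag("photovoltaic", "depicts")},
}
```

Searching for photos that depict something photovoltaic now finds only the panel, while the cargo containers only turn up when you ask what was taken during the conference. Of course the qualifier is itself just another word that needs context, which is rather the point.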

What do other people think?

Posted in Semantic Web, Tagging, Web2.0

Search Context/Clustering | June 30th, 2005

Michael Mahemoff takes an interesting view on Ajaxifying search results in Ajaxifying the Address Bar Interface.

I certainly agree to a certain extent - particularly that it makes sense in the intranet environment. As far as public search goes, I am becoming more partial to Vivisimo. It automatically builds a cluster tree of your search results (as well as just the straight goods) such that you can then browse the results by context. Searching “goog” as Michael talks about brings up Stock Quotes, Investing, Blog, Pictures etc - pretty cool I think. Now all it needs is to Ajaxify the cluster tree so that you can browse it in “real time”.

Maybe they will use our Ajax tree control when we release it.
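To give a feel for browsing results by context, here is a deliberately naive sketch that groups result titles under the term they share with the most other results. It is a single flat level rather than a tree, it is not Vivisimo's algorithm (which I know nothing about), and the function name, stopword list and sample titles are all my own assumptions.

```python
from __future__ import annotations

from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "of", "and", "for", "to", "in"}


def cluster_results(query: str, titles: list[str]) -> dict[str, list[str]]:
    """Group result titles under their most widely shared non-query term."""
    query_terms = set(query.lower().split())

    def terms(title: str) -> list[str]:
        return [
            w
            for w in title.lower().split()
            if w not in STOPWORDS and w not in query_terms
        ]

    # Count how many results each term appears in (document frequency).
    df = Counter(t for title in titles for t in set(terms(title)))

    clusters: dict[str, list[str]] = defaultdict(list)
    for title in titles:
        candidates = terms(title)
        # Label = the term shared by the most results; ties go alphabetically.
        label = min(candidates, key=lambda t: (-df[t], t)) if candidates else "other"
        clusters[label].append(title)
    return dict(clusters)


# Hypothetical result titles for the query "goog".
sample = ["GOOG stock quotes", "GOOG stock chart", "GOOG blog", "Official GOOG blog posts"]
clusters = cluster_results("goog", sample)
```

With these made-up titles the results fall into a “stock” group and a “blog” group, which is the flavour of what the cluster tree gives you, minus the hierarchy and any real linguistic smarts.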

Posted in Search, Tagging, Web
