Dave Johnson’s Blog

Archive for the 'Search' Category

More on Wink and Tag Search | January 24th, 2006

I read an interesting post by Jeff Clavier the other day and have been wondering about how an implicit search context, such as that used by Wink, could work for or against you. Btw, I also still get a JavaScript error on the Wink homepage when I try to click on the search box.

I have posted on various issues regarding tag-based search before, and there was good discussion on a recent(ish) post by Om Malik entitled People Power vs Google. The new problem I envision is that when you search for something that is syntactically the same as, but semantically different from, concepts which you or other people have tagged, the results will be skewed in the wrong direction. It is a very good idea on Wink’s part to put Google search results on the same page.

Of course this problem can be overcome with a little work by the searcher, who can make a more exact search string. However, one could then argue that if you have to make a more exact search string to find things outside of your tagosphere, then why bother? It is likely that a plain Google search (i.e. one that does not use tags) in your area of interest will return the results you want anyway. The same is generally true of del.icio.us: it is faster to search on Google than to find what you are looking for on del.icio.us.

It is interesting to think about the problem in terms of information theory. When you encode the alphabet for transmission, as Morse code does, you want to devote as few symbols as possible to letters such as “e” and “t” because they occur so frequently. Tag-supported search is similar, in that it reduces the number of tags needed to find frequently accessed information (like reducing the number of bits that represent the letter “e”) by leveraging the work that people have put into tagging pages. This can also backfire, of course, when you are looking for Ajax the football club rather than AJAX the wicked-awesome programming technique and most of the pages you have tagged with AJAX relate to the technology. The user essentially has to climb out of this “context pit” created by their tagging habits by specifying “AJAX amsterdam” or “AJAX football”. Really it all depends on your search habits.
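To make the “context pit” concrete, here is a minimal sketch of a search that boosts pages whose tags overlap with a user's own tagging history. Everything in it is an assumption for illustration: the page names, the tag counts and the scoring rule are made up, and this is not how Wink actually ranks results.

```python
from collections import Counter

# Hypothetical pages, each with the tags people have given it.
PAGES = {
    "ajax-tutorial": ["ajax", "javascript", "programming"],
    "xmlhttprequest-guide": ["ajax", "javascript", "web"],
    "ajax-amsterdam-fixtures": ["ajax", "football", "amsterdam"],
}

# Hypothetical tagging history of one user: how often each tag was used.
MY_TAGS = Counter({"ajax": 40, "javascript": 25, "programming": 10, "football": 1})


def score(page_tags, query_terms, my_tags):
    """Return 0 unless every query term is a tag on the page; otherwise
    1 plus the share of the user's tagging that the page's tags cover."""
    if not all(term in page_tags for term in query_terms):
        return 0.0
    total = sum(my_tags.values())
    # Counter returns 0 for tags the user has never used.
    context = sum(my_tags[tag] for tag in page_tags) / total
    return 1.0 + context


def search(query, pages=PAGES, my_tags=MY_TAGS):
    """Return matching page names, best score first."""
    terms = query.lower().split()
    scored = [(score(tags, terms, my_tags), name) for name, tags in pages.items()]
    return [name for s, name in sorted(scored, reverse=True) if s > 0]


print(search("ajax"))
print(search("AJAX football"))
```

With this made-up history, the one-word query puts the technology pages first and the football club last, while adding “football” filters everything else out. That is the trade-off in the encoding analogy: the frequent meaning gets the short query, and the rare meaning costs an extra term.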

I am not sure we can prevent this problem when searching for obscure topics or for terms that look the same but mean something different. While this might be a slightly larger problem with tag-based searching, it can also be a problem with Google. The main difference is that Google bases its results on actual HTML links between pages, which, in my opinion, should generally result in a more robust and less biased result set. Will this problem become even worse once we start using things like the Semantic Web?

Posted in Search, Semantic Web, Tagging, Web2.0

Social Annotation | December 28th, 2005

I have just read about a company currently in private beta called Diigo, which is in the business of social annotation (SA).

Apparently SA is a superset of social bookmarking or tagging, which is of course the pièce de résistance of ‘Web 2.0’. The question is, can SA be an even better route to getting acquired by MAGY? Don’t quote me on ‘MAGY’ though, since I am not sure what order those names should go in…

I had been thinking about SA for some time but did not have the time / resources to get anything together for public showing - but this might be a good reason to do so. Of course, given my record with getting code up on my blog, I won’t have a sample till this time next year. Anyhow, the possibilities for SA are much more attractive than social bookmarking in my mind. With social annotations (at least what I consider them to be) I can surf to any web page and place tagged sticky notes (private or public) in a browser-agnostic fashion that will contain my comments and refer to a certain block in the web page DOM. Then I can go to some central place to view / organize my comments, and can also subscribe to RSS feeds of other people’s comments on those pages or of comments from particular people. The main problem that I have with Diigo (from the looks of their Flash demo) is that I need to install their toolbar - yuck!
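The kind of sticky note described above could be modelled roughly as follows. This is only a sketch of my own reading of the idea, not Diigo's data model: the class names, the fields and the use of a path string to point at a DOM block are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One tagged sticky note attached to a block of a web page (hypothetical model)."""
    author: str
    url: str        # page the note is attached to
    dom_path: str   # assumed locator for the annotated block, e.g. an XPath-like string
    comment: str
    tags: list = field(default_factory=list)
    public: bool = True


class AnnotationStore:
    """The 'central place' where notes are collected and queried."""

    def __init__(self):
        self._notes = []

    def add(self, note):
        self._notes.append(note)

    def visible_to(self, viewer):
        # Private notes are only shown to their author.
        return [n for n in self._notes if n.public or n.author == viewer]

    def for_page(self, url, viewer):
        return [n for n in self.visible_to(viewer) if n.url == url]

    def by_tag(self, tag, viewer):
        return [n for n in self.visible_to(viewer) if tag in n.tags]

    def by_author(self, author, viewer):
        return [n for n in self.visible_to(viewer) if n.author == author]


# Made-up example data.
store = AnnotationStore()
store.add(Annotation("dave", "http://example.com/post", "/html/body/div[2]/p[3]",
                     "Nice point about tagging", tags=["tagging"]))
store.add(Annotation("dave", "http://example.com/post", "/html/body/div[2]/p[5]",
                     "Check this claim later", tags=["todo"], public=False))
store.add(Annotation("alice", "http://example.com/post", "/html/body/div[2]/p[3]",
                     "Agreed", tags=["tagging"]))
```

The `for_page` and `by_author` queries are the natural sources for the two kinds of feed mentioned above: comments on a page and comments from a particular person. Rendering them as RSS is left out of the sketch.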

The useful part of these systems for end users is that they can tag particular bits of content on a page and find exactly what they were referring to with a tag. Then if you combine this idea with microformats and the Semantic Web you might really be cooking with something combustible like methane.

This brings us to the all-important (both dreaded and revered at the same time) question of ‘monetization’ - I guess I have to eat somehow, but that is why I have a day job. In a perfect world I imagine the toolbar from Diigo being essentially a web toolbar (as opposed to browser integrated) that floats over the current page and is inserted by using a bookmarklet in true AJaX fashion. Relevant ads could appear with the toolbar, and there could also be relevant advertising on the notes themselves. But hey, who needs money when you have a few hundred thousand users and ‘social tagging/sharing/annotation’ hype to help you implement your ‘Web 2.0’ exit strategy?

Posted in Search, Semantic Web, Tagging, Web2.0

Fuel for the Tag Embers | December 23rd, 2005

Om Malik posted about the increasing interest in people power vs the power of Google [1]. I think that tags will lose out to automated clustering (such as Vivisimo) in the short term, but that doesn’t mean we will not see more players like Wink trying to get a piece of the tagging pie. Don’t get me wrong: I will give Wink a chance, and I do think that services like Wink have a place in the blogosphere today, but we already have the likes of Technorati and my new favourite, Google Blog Search.

The topic of tag utility has been covered quite a bit in the past by the likes of Tim Bray [2] and Stephen Green [3] (both good Canucks), and I am sure it will be discussed well into the future! On the whole I have to agree with Tim, and Stephen brings up some very interesting points from his research that should be considered. I will discuss those in a moment.

But first, there are a few issues that I can see with the new emphasis on the old idea of tagging …

People are lazy. Who wants to waste their time rating pages when Google does a _pretty good_ job on its own?
People who are not lazy (like geeks maybe) cause tagged content to be very skewed towards their interest group, and therefore it becomes inaccessible to the majority of people.
There is lots of meta-data (some may even call it “tags”) available to search engines based on page content - so why do more work?
If I tag a page as “interesting”, that is only in the context of what I am thinking at that moment in time. Tags can have temporal/geographic/personal dependence, which is something that is not easy to manage with tags today.

For example, a current topic that I am very interested in is the science (or maybe art?) of data binding - i.e. how to create a binding language that provides rich mechanisms for indirection and how to express it using a declarative / mark-up approach. This is something that is quite difficult to find information about using Google or Yahoo!. Could tagging of content help me find some obscure piece of very relevant and useful information on this topic? If someone has found it before me and tagged it with the precise tags that I would use for the topic, then maybe. However, I’m not convinced [4], and it seems that John Battelle is not either [5].

Here is the thing: people need to look beyond the tag - it is a stop-gap that has been tried many times before (web page keywords?). Places where tags have had some success, as Stephen mentions, are instances where you have defined vocabularies or taxonomies. Content is tagged by domain experts and integrated into a taxonomy at great expense but with great reward (this seems to be a recurring theme to me). I am not sure that people using the web want to be constrained like this - yet it is the best way to get value from tagging, so that everyone “talks the same language”.
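As a rough illustration of what a defined vocabulary buys you, the sketch below maps free-form tags onto a small set of preferred terms and rejects anything it does not recognise. The vocabulary and its variants are invented for the example; a real taxonomy would be built and maintained by domain experts.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCABULARY = {
    "ajax": {"ajax", "xmlhttprequest", "xhr"},
    "semantic-web": {"semantic-web", "semanticweb", "semweb"},
    "tagging": {"tagging", "tags", "folksonomy"},
}

# Reverse lookup from any accepted variant to its preferred term.
VARIANT_TO_TERM = {
    variant: term for term, variants in VOCABULARY.items() for variant in variants
}


def normalize(raw_tags):
    """Split free-form tags into (preferred terms, tags outside the vocabulary)."""
    accepted, rejected = [], []
    for raw in raw_tags:
        key = raw.strip().lower().replace(" ", "-")
        term = VARIANT_TO_TERM.get(key)
        if term is None:
            rejected.append(raw)
        elif term not in accepted:
            accepted.append(term)
    return accepted, rejected


print(normalize(["Semantic Web", "SemWeb", "Tags", "interesting"]))
```

Here “Semantic Web” and “SemWeb” collapse into one term, while a personal tag like “interesting” falls outside the vocabulary. That is both the value and the constraint discussed above.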

This brings me to a point that I have brought up before [4]. Forget tags. Think semantics. Think Semantic Web [6]. The discussion should not be about the value of tags but about moving towards a richer Web. More on that soon.

References
[1] People Power vs Google - Om Malik, Dec 22, 2005
[2] Do Tags Work? - Tim Bray, Mar 4, 2005
[3] Tags, keywords, and inconsistency - Stephen Green, May 13, 2005
[4] More Tags - Dave Johnson, Dec 14, 2005
[5] Will Tagging Work - John Battelle, Dec 4, 2005
[6] Tagging Tags - Dave Johnson, Dec 1, 2005

Posted in Search, Semantic Web, Tagging, Web2.0

Search Context/Clustering | June 30th, 2005

Michael Mahemoff takes an interesting view on Ajaxifying search results in his post Ajaxifying the Address Bar Interface.

I agree to a certain extent - particularly that it makes sense in the intranet environment. As far as public search goes, I am becoming more partial to Vivisimo. It automatically builds a cluster tree of your search results (as well as just the straight goods) such that you can then browse the results by context. Searching “goog”, as Michael talks about, brings up Stock Quotes, Investing, Blog, Pictures etc - pretty cool I think. Now all it needs is to Ajaxify the cluster tree so that you can browse it in “real time”.
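For a feel of what grouping results by context means, here is a toy sketch that builds one level of such a tree by grouping result titles on their most widely shared keyword. It is not Vivisimo's algorithm, which I do not know; the titles, the stopword list and the grouping rule are all assumptions.

```python
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "of", "for", "and", "on", "in"}

# Made-up result titles for the query "goog".
TITLES = [
    "GOOG stock quotes",
    "Real-time stock quotes for GOOG",
    "Investing in GOOG",
    "GOOG investing basics",
    "Official Google blog",
]


def keywords(title, query):
    """Lower-cased words of a title, minus stopwords and the query itself."""
    words = [w.strip(".,:!?").lower() for w in title.split()]
    return [w for w in words if w and w not in STOPWORDS and w != query.lower()]


def cluster(titles, query):
    """Group titles under the keyword they share with the most other titles."""
    # Count in how many titles each keyword appears.
    counts = Counter(w for t in titles for w in set(keywords(t, query)))
    clusters = defaultdict(list)
    for title in titles:
        words = keywords(title, query)
        if not words:
            clusters["other"].append(title)
            continue
        # Most widely shared keyword wins; ties are broken alphabetically.
        label = max(words, key=lambda w: (counts[w], w))
        # A keyword that appears in only one title does not make a cluster.
        clusters[label if counts[label] > 1 else "other"].append(title)
    return dict(clusters)


for label, members in cluster(TITLES, "goog").items():
    print(label, members)
```

A real clustering engine would work on full snippets, pick readable multi-word labels and nest the clusters, but the browsing idea is the same: pick a context first, then look at the results inside it.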

Maybe they will use our Ajax tree control when we release it.

Posted in Search, Tagging, Web
