What are MS SmartTags?

Microsoft Smart Tags is a feature that lets the Internet Explorer browser introduce additional links into any web page. Until now, it has been the province of web page authors to decide what text should appear as hyperlinks in their pages.

With Smart Tags, Microsoft has added the ability for the browser to scan the text of each incoming web page, and insert extra tags by matching words or phrases to a collection of terms created by the Smart Tag's author.

In practice, this means that a user might go to their favorite sports site and find that the names of their favorite players are now linked to Microsoft-owned sports web sites. Many users will likely not realize that these links were surreptitiously added to the web page without the user's, or the web page author's, knowledge.

Why are Smart Tags bad?

Smart Tags have been criticized because they alter the appearance and function of web pages without the web page author's consent. Unsophisticated users may not understand that these links were not created by the web page's author. This creates an impression in the user's mind that a Smart Tag-linked site is endorsed or approved by the original web page's author.

For example, the reader of a web page encouraging responsible drinking may find the word "vodka" linked to a web page for a particular brand of liquor. This would hardly be the mental connection the original page's authors intended.

What's good about Smart Tags?

In reading all the coverage about Smart Tags, my programmer's mind started thinking about the positive aspects of these tags. Easy access to information seems, on the face of it, to be a good thing. The disturbing aspects seem to be centered around the rights of web page authors not to have their work tampered with.

When I first read about Smart Tags, it made me think of the advertisements I'd seen for the QuickClick browser add-on from NBCi that boasted that you could click on any word on your computer and get more information about that word. This seemed similar in intent to Microsoft's Smart Tags.

The difference, and the reason why NBCi's software created no furor, is that the NBCi software required the user to take an active role in "where they wanted to go".

Using OSS to create a Linux version of SmartTags

In researching Smart Tags at Microsoft's web site, I found an article that described one method for creating Smart Tags. Simple Smart Tags can be created by building a properly-formatted XML file describing the Smart Tag terms (the words to be "linked") and the action(s) taken when those links were selected.

I began to think how something like Smart Tags could be implemented in Linux.

From a programming perspective, Smart Tags work like this: when a web page is received by the browser, its text is scanned and words are matched to a list of terms the Smart Tag program recognizes. These matched terms are then displayed as "links" when the page is displayed.
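The scanning step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Microsoft's implementation: the function name find_tags, the sample sentence, and the term list are all invented for the example, and it assumes terms begin and end with ordinary word characters so that word-boundary matching behaves sensibly.

```python
import re


def find_tags(text, terms):
    """Return (start, end, term) for each recognized term found in text."""
    if not terms:
        return []
    # Try longer terms first so "New York Yankees" wins over "New York".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Map lower-cased matches back to the spelling used in the term list.
    canonical = {t.lower(): t for t in terms}
    return [
        (m.start(), m.end(), canonical.get(m.group(0).lower(), m.group(0)))
        for m in pattern.finditer(text)
    ]


# Hypothetical page text and term list, for illustration only.
page_text = "Derek Jeter hit a home run for the New York Yankees."
terms = ["Derek Jeter", "New York", "New York Yankees"]
matches = find_tags(page_text, terms)
```

Each returned span marks text that a browser could render as a "link". Sorting the terms longest-first is one simple way to keep a short term from shadowing a longer one that contains it.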

When the user selects a link, a pop-up menu appears, and the user is given a choice of one or more actions they may take, such as navigating to a particular website, or adding the linked name to their address book.

Finally, when the user selects an option, a command associated with that option is carried out. For example, a browser window may be opened and a specific web page displayed.
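As a rough sketch of this last step, the menu can be modeled as a table that maps each caption to a command template. Everything here is an assumption made for illustration: the captions, the "{term}" placeholder, the example.com URLs, and the choice of "mozilla" as the program to launch.

```python
from urllib.parse import quote_plus

# Hypothetical menu: caption -> command template. "{term}" is replaced by
# the selected text. The program name and URLs are placeholders.
ACTIONS = {
    "Search the web": ["mozilla", "https://example.com/search?q={term}"],
    "Find related news": ["mozilla", "https://example.com/news?q={term}"],
}


def build_command(caption, term, actions=ACTIONS):
    """Return the argument list for the action the user picked."""
    template = actions[caption]
    # URL-encode the term so spaces and punctuation survive in the query.
    safe = quote_plus(term)
    return [arg.replace("{term}", safe) for arg in template]


argv = build_command("Search the web", "Derek Jeter")
# argv could then be handed to subprocess.Popen(argv) to launch the program.
```

Building an argument list, rather than a single shell string, means the selected text is never interpreted by a shell.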

To add functions like this into a browser like Mozilla would be a daunting task, and I was completely unfamiliar with the Mozilla code base. Also, I disliked the idea of these tags insinuating themselves into other people's pages.

So in the tradition of the Kobayashi Maru test, I decided to change the rules of the game so I could win. Rather than have the Tags code integrated into Mozilla and other applications, I decided to make it a separate, non-invasive program.

I decided to write an application that would "spy" on text any time the user selected it, which places it in the X Window clipboard. If the text matched a tag defined in an XML file, the application would pop up and provide the user with options on what to do with the selected word.

By changing the rules of the game, I decoupled the tags from other applications, while allowing them to work with virtually any program. The downside, if it is one, is that the user gets no "visual cues" as to what matches a tag and what doesn't.

Design Decisions

The design of the program, uninspiringly dubbed "gTags", quickly fell into place. When the user started and enabled the gTags app, it would monitor the X Window clipboard. Whenever text was selected, it would compare it with a list of terms. If a term matched, the gTags window would pop up near the selected text, and a pull-down menu would let the user select an action for that text.
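The monitoring loop in this design can be sketched as follows. This is a simplified Python illustration, not the gTags source: it polls a reader function instead of using the GTK+ selection layer, and the names watch_selection, read_selection, and on_match are invented for the example. On a real X desktop the reader might, as one assumption, run a tool such as `xclip -o -selection primary`; here a fake reader stands in so the sketch runs anywhere.

```python
import time


def watch_selection(read_selection, terms, on_match, polls, interval=0.0):
    """Poll read_selection() and call on_match(term) when new text matches."""
    lookup = {t.lower(): t for t in terms}
    last = None
    for _ in range(polls):
        text = read_selection().strip()
        if text != last:  # react only when the selection changes
            last = text
            term = lookup.get(text.lower())
            if term is not None:
                on_match(term)  # the real program would pop up its window
        time.sleep(interval)


# Stand-in for the X selection: a fixed sequence of "selected" strings.
fake_selections = iter(["", "hello", "Vodka", "Vodka", "hello"])
seen = []
watch_selection(lambda: next(fake_selections), ["vodka"], seen.append, polls=5)
```

Remembering the last selection keeps the window from popping up repeatedly while the same text stays selected.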

With a design sketched out, I selected a collection of "off-the-shelf" Open Source components. The user interface was designed using Glade, a Rapid Application Development environment for Gnome/GTK+ programming. I studied the X Window text selection mechanism and the GTK+ layer that wraps around it. And I used the libxml library to parse the XML files. For the gtags.xml file, which defines the tags themselves, I followed the XML schema used by Microsoft Smart Tags. For the actions.xml file, which creates an association between a tag and an action to take (or more correctly, a program to invoke), I used an XML schema of my own devising.
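To make the two-file arrangement concrete, here is a minimal Python sketch that loads a tag list and an action list and joins them by tag type. Both XML layouts below are invented for illustration; they are neither the Microsoft Smart Tags schema nor the actual gTags actions.xml schema, and the element names, attribute names, the "%s" placeholder, and the URLs are all assumptions. Python's standard ElementTree module stands in for libxml.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag list: each <tag> groups terms under a type.
TAGS_XML = """
<taglist>
  <tag type="player">
    <term>Derek Jeter</term>
    <term>Mike Piazza</term>
  </tag>
  <tag type="drink">
    <term>vodka</term>
  </tag>
</taglist>
"""

# Hypothetical action list: each <action> names a program to invoke.
ACTIONS_XML = """
<actions>
  <action type="player" caption="Look up statistics">
    <command>mozilla https://example.com/stats?q=%s</command>
  </action>
  <action type="player" caption="Search news">
    <command>mozilla https://example.com/news?q=%s</command>
  </action>
</actions>
"""


def load_terms(xml_text):
    """Map each lower-cased term to its tag type."""
    root = ET.fromstring(xml_text.strip())
    return {
        term.text.strip().lower(): tag.get("type")
        for tag in root.iter("tag")
        for term in tag.iter("term")
    }


def load_actions(xml_text):
    """Map each tag type to a list of (caption, command) pairs."""
    root = ET.fromstring(xml_text.strip())
    actions = {}
    for action in root.iter("action"):
        command = action.findtext("command", default="").strip()
        pair = (action.get("caption"), command)
        actions.setdefault(action.get("type"), []).append(pair)
    return actions


terms = load_terms(TAGS_XML)
actions = load_actions(ACTIONS_XML)
# Menu entries for a selected term; empty if its type has no actions.
menu = actions.get(terms.get("derek jeter"), [])
```

Keeping terms and actions in separate files means new actions can be attached to an existing tag type without touching the term list.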

Working in my spare time, I developed the gTags application in about a week. It may have some rough edges, and I'll not be surprised if it has a few bugs. But it works, and it helped me focus on why certain technologies are implemented the way they are.

Intrusive Technologies

Microsoft could have implemented Smart Tags the way NBCi's QuickClick did, as a separate application not intertwined with Internet Explorer or its other applications. Why didn't they?

By "integrating" Smart Tags into their apps, Microsoft, intentionally or not, accomplished the following goals:

  • They bound the SmartTag features to just their products.
  • Within Internet Explorer, they blurred the distinction between HTML hyperlinks and SmartTag hyperlinks.
  • They (potentially) funnelled users from non-Microsoft web properties to those operated or endorsed by Microsoft.

None of these goals is technically motivated, nor is any of them in the best interests of the user.

Evaluation

Creating a Smart-Tag-like application for Linux was not difficult, using available Open Source tools. But the act of developing it brought certain non-technical choices into sharper focus.

Microsoft owns a wide range of applications (Internet Explorer, Word, Excel, and others). This makes it possible for them to tightly integrate a new feature across a wide product line. In the Linux world, where most applications are each owned by their own core group of volunteer developers, it's harder to coordinate the adoption of a common feature. In my gTags example, I could have designed it as a Gnome library, and lobbied for its inclusion into Gnome-based applications. That would have taken considerable time and effort, and would have had only a limited expectation of success.

Microsoft has no such barriers when adding features.

When I realized there was no practical way to integrate Smart-Tag-like features into individual applications, I hit upon the idea of using the X Window selection mechanism (the "clipboard"). The decision had the happy side-effect of making the gTags program work with virtually any X Window-based program that allowed selectable text. This makes gTags much less limited than Microsoft's Smart Tag feature.

Why would Microsoft decide to add a limited, tightly integrated feature when a more general and more powerful feature would have been as easy to implement? There are two possibilities.

The first is ease of use. The great "failing" of my approach is that it requires a conscious action from the user. The user must choose and select the text they want gTags to recognize. By contrast, Microsoft's Smart Tags feature is passive. Words are preselected for the user, and a visual cue (a hyperlink) is provided. In the first approach, the user tells the computer to look for more information; in the second, the computer volunteers that more information is available.

It can be argued that the latter approach is more user friendly.

The second reason why Microsoft may have opted for a more limited and tightly integrated approach is that this mechanism allows them to add more pathways back to Microsoft-owned or -sanctioned web properties.

I've heard Smart Tags described as a "powerful innovation", but the "technology" behind it is fairly thin and easy to replicate. I've found the feature to be somewhat useful in its non-intrusive, gTags incarnation. I would be interested in understanding why Microsoft implemented their version in the intrusive manner that they did.

[Note: Smart Tags and QuickClick are almost certainly trademarks of Microsoft and NBCi, respectively.]

References

Wall Street Journal: "New Windows XP Feature Can Re-Edit Others' Sites"

Slashdot: "Where Does Microsoft Want You to Go Today?"

Slashdot: "No XP-Smarttags in Europe"

Microsoft: "Developing Simple Smart Tags"

Microsoft: "Developing Smart Tag DLLs"