A couple of days ago, Michael Hausenblas suggested I look at something called Ubiquity, and sent me a link. Because it came in the middle of editing the current version of Nodalities Magazine, I did what I often do with interesting concepts: I opened a new Firefox tab and left it there for two days, hoping I would notice it before Firefox crashed with all tabs on board. Well, since then, there has certainly been a lot of discussion about Ubiquity, both around the office and on the web. To introduce it, I can’t do much better than point at their video and introduction page, and just say that it’s a Mozilla Labs project and a Firefox plug-in.
However, what makes it interesting to me is that it possibly introduces a new metaphor for interacting with web content and, by extension, linked data. The thought process behind it is that whenever we want to “do something” online, we are generally forced into roundabout processes. Say, for example, I want to email a friend to tell him about a new restaurant I went to, maybe even invite him to meet me there for lunch. To accomplish this task, I’d typically open up three or four tabs in Firefox, and maybe open iCal and Mail application windows too: I’d google the restaurant; find its phone number and address on yell.com; map its address using maps.google.com or similar; check which date I’m free; and finally email him the info, copying and pasting links and map images between multiple tabs and, if I’m not using Gmail’s web interface, into other applications as well. If you followed that last sentence, you’re doing well: it’s long and complex (technically compound-complex, but we won’t get pedantic here), and it reflects the process.
Although Ubiquity is a beta, and many of its functions are far from polished, it offers a glimpse of a possible future for interaction, with drastically simpler processes for completing tasks. What it creates is the ability to interact with content more directly: you can select some content and start telling the application to DO stuff TO the content, by typing. So, I can select a physical address and type “map this” into Ubiquity, and it’ll pull up Google Maps for that address (at the moment, it’s having trouble with some UK addresses because it uses google.com rather than the .co.uk site, which works better for addresses here). I can then use that information on the same screen. I can “yell florists in birmingham” and get a list of flower vendors in Birmingham from Yell.com (a yellow pages service), which I can then drop into an email or whatever.
Very quickly, however, I ran into a conceptual problem with Ubiquity’s idea of natural-language interaction. Their strapline is: “An experiment into connecting the Web with language.” The idea is that you can “tell” the computer to “do something with/to this information” or “command” something to happen, changing the basic interaction metaphor from a visual click/drag/drop/open-window process to a linguistic “I’m telling the computer what I want and it happens” framework. My immediate reaction was: “This isn’t linguistic, it’s command-line”, and I was instantly transported back to trying to learn Linux without a technical background, with all the frustration of a non-technical user trying to interact with software using a command line.
You see, from my perspective as a linguist, I often feel frustrated with the computing community’s view of what language actually is. Without exploring propositionality, the conceptual metaphor framework or anything else, it’s sufficient to say that language is both simpler and more complex than anything we’ve got software to emulate yet. What Ubiquity actually is, is a very simplified command line that is “aware” of the information you’re already interacting with. From that perspective, it seems to work very well, with a more streamlined set of commands and a more “natural language” feel to the words you actually type.
The upshot of this is that users have to learn a set of commands to interact with their applications, but these commands are intended to be transparent in meaning. So, you “map this” or “help” or “add 1PM lunch with Dave”. After reading some of the reasoning behind this from one of the designers, Aza Raskin, I started to appreciate it more and more. The current contrasting model to this “Linguistic Command Line” is menus and windows. Menus and windows are inefficient, if you think about it. You have to select text, or images or whatever, and physically move your cursor to a menu somewhere at the far edge of a window on your screen, then find and select the command from a drop-down list, for which you need to remember the path to each command. The problem gets worse when you bring multiple windows and applications into this. So if I were to incorporate some text from one window into another, linking to the original, and maybe dropping in a customised image too, I’d have to open multiple windows, executing menu commands or application-specific keyboard “short-cuts” at each stage.
But I already know what I want to do with the stuff, right? Why not just activate a single keyboard shortcut and begin typing your instructions to the system: send link to <email>? Ubiquity allows this. In this framework, Firefox becomes a bit of a microcosm of the operating system (with tabs being windows, and sites and web-apps being desktop applications). As you type, it short-lists commands, so you don’t even have to type the full thing: typing “t r a n” brings up the translate command, so you can skip the rest of the word and begin typing “to eng”, and it will offer you “translate text to English”.
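To make the short-listing idea concrete, here is a minimal sketch in Python of matching a partially typed verb against a command list and substituting the current selection for the word “this”. It is only an illustration: the command list, the function names `suggest` and `complete`, and the matching rule (plain prefix matching) are my assumptions, not how Ubiquity itself is implemented.

```python
# Hypothetical command list: these names only echo the examples in this post
# and are not Ubiquity's real command set.
COMMANDS = ["map", "email", "translate", "yell", "add", "help"]


def suggest(typed, commands=COMMANDS):
    """Return the commands whose names start with the first typed word."""
    words = typed.strip().lower().split()
    if not words:
        return []
    verb = words[0]
    return [name for name in commands if name.startswith(verb)]


def complete(typed, selection, commands=COMMANDS):
    """Expand a partial verb and replace the word 'this' with the selection."""
    matches = suggest(typed, commands)
    if len(matches) != 1:
        return None  # ambiguous or unknown: the user has to keep typing
    rest = typed.strip().split()[1:]
    args = [selection if word.lower() == "this" else word for word in rest]
    return " ".join([matches[0]] + args)
```

With this sketch, `suggest("tran")` short-lists only `translate`, and `complete("tran this to eng", "bonjour")` expands to `translate bonjour to eng`. A real system would presumably rank fuzzy matches rather than insist on a single exact prefix.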
Now, imagine having this ability with any form of Linked Data. Imagine if that bit of text were automatically recognised as a date, or co-ordinates, or a person. Imagine selecting a picture of a restaurant and typing: “invite fred for lunch at 3PM on monday, enter”. The system could automatically know that the picture was of a restaurant (whose profile could include co-ordinates, contact info, and even a hypothetical automatic table-reservation system for invites from the web), that fred is your colleague (whose FOAF profile includes email or instant messenger preferences), that lunch is an email subject and a social event, and that 3PM on Monday is a date (in your calendar and in Fred’s calendar once the message is sent) which corresponds with your name + su. All of that information is being used in several processes (copy/paste, look up restaurant profile, map location, look up email or IM, create iCal event, create email or IM message, send) but all you’re really doing is: “inviting fred for lunch at 3PM on Monday.”
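As a rough illustration of how one typed sentence could fan out into those separate pieces of information, here is a small Python sketch. Everything in it is hypothetical: the `CONTACTS` dictionary stands in for data a FOAF profile might supply, the regular expression only understands this one sentence shape, and `plan_invite` is a name I made up. No real Linked Data lookup is happening here.

```python
import re

# Hypothetical stand-in for what a colleague's FOAF profile might provide.
CONTACTS = {"fred": {"email": "fred@example.org"}}

# Only understands sentences shaped like "invite <who> for <what> at <time> on <day>".
INVITE = re.compile(
    r"invite (?P<who>\w+) for (?P<what>\w+) "
    r"at (?P<time>\d{1,2}(?::\d{2})? ?[AP]M) on (?P<day>\w+)",
    re.IGNORECASE,
)


def plan_invite(command, selection, contacts=CONTACTS):
    """Split an invite command into the pieces the later steps would need."""
    match = INVITE.search(command)
    if match is None:
        return None  # not an invite sentence this sketch understands
    contact = contacts.get(match.group("who").lower())
    if contact is None:
        return None  # nobody by that name in the (hypothetical) profile data
    return {
        "to": contact["email"],
        "subject": match.group("what").lower(),
        "time": match.group("time").upper().replace(" ", ""),
        "day": match.group("day").capitalize(),
        "place": selection,  # whatever was selected, e.g. the restaurant
    }
```

Calling `plan_invite("invite fred for lunch at 3PM on monday", "the restaurant")` returns a dictionary with the recipient’s address, the subject “lunch”, the time “3PM”, the day “Monday” and the selected place. Each of those could then feed the email, calendar and map steps without my having to visit them one by one.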
This is incredibly intriguing, because it begins to show how some systems can begin to scale up to the immensity of the Web. We, as people, know what we want to accomplish, and if we could just tell our computers that, we’d be much happier. I think this could be a first step, and while I’m not completely convinced by the command-line metaphor, I can see it as a definite step forward and a different perspective. My new copy of Aza’s father’s book, The Humane Interface, arrived this morning to supplement this, and I’ll be blogging more about that, if it’s ever returned to my desk.