Futility in alternate pasts and futures in human augmentation
While it’s great to draw inspiration and ideas from the past, recreating the past in the hope that it becomes the future seems like a futile idea. Does anyone really want to return to a command-line interface to manipulate documents? It’s designing for a past that never happened, one where we all became computer scientists and enjoyed manipulating documents via arcane commands....
A better, more productive, use of time would have been to say, what inspiration can still be gained from Engelbart’s ideas? There’s still a lot to be gleaned from his 1962 (!) paper Augmenting Human Intellect. How might some of his thoughts on collaborative intelligence be implemented in our world now, in 2006, within the technology we have now? That’s the question waiting to be solved.
Allow me to engage in some Devil's Advocacy here - although I really am an Engelbart sympathizer:
Consider a program like Microsoft Word, with all its ribbons and toolbars and menus and animated assistance. When you first started working with it, you probably spent time navigating these visual and guided parts of the user interface to get your job done. But, after a while, you probably discovered keyboard shortcuts and accelerators - CTRL-s to save, for instance. These have likely been invaluable in speeding up your work and helping the application get out of your way. So, having reached this point, do you ever really have a use for the "user friendly" bits anymore? Or, have you graduated to "manipulating documents via arcane commands"?
What if this application had never sugar-coated things and had instead optimized for efficiency and ergonomics in daily expert operation, trading an "intuitive interface" for an offer to incrementally train on its necessarily complex functionality? After a while, you'll have it all down, and be ready to shed the training wheels.
What if - instead of a maze of menus and toolbar icons - your mouse just had dozens of easily-accessible buttons? You're used to only having a left and a right click from which to choose. If you've splurged, you might have the more expansive choices offered by a fancier pointing device. But, what if you had a chording keyboard under your off-mouse hand, offering an order of magnitude more mouse pointer actions?
For example, how about a "delete word" mouse button? Or a "copy sentence" button? Or maybe even a "jump to the selected link with a custom view filter" button? The important part is that these commands act immediately, just like a mouse button, upon whatever's under the pointer. There's no left-click then CTRL-x to Cut - no, you just point at a word, and say "cut that". There's a lot of power and efficiency here.
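To make the point-and-act idea concrete, here is a minimal sketch in Python. It is not how Augment worked internally; the `Buffer` class, the `CHORDS` table and the key numbers are all hypothetical, chosen only to show a chord dispatching a command straight onto whatever sits under the pointer, with no separate selection step.

```python
import re


def span_at(pattern, text, pos):
    """Return (start, end) of the regex match covering index pos, or None."""
    for match in re.finditer(pattern, text):
        if match.start() <= pos < match.end():
            return match.start(), match.end()
    return None


class Buffer:
    """A toy document: plain text plus a clipboard."""

    def __init__(self, text):
        self.text = text
        self.clipboard = ""

    def delete_word(self, pos):
        # Remove the word under the pointer; no prior selection needed.
        span = span_at(r"\w+", self.text, pos)
        if span:
            start, end = span
            self.text = self.text[:start] + self.text[end:]

    def copy_sentence(self, pos):
        # Copy the sentence under the pointer to the clipboard.
        span = span_at(r"[^.!?]+[.!?]?", self.text, pos)
        if span:
            start, end = span
            self.clipboard = self.text[start:end].strip()


# Hypothetical chord table: the set of keys held down -> command.
CHORDS = {
    frozenset({1}): Buffer.delete_word,
    frozenset({1, 2}): Buffer.copy_sentence,
}


def press(buffer, keys, pointer_pos):
    """Run the command bound to this chord on whatever is under the pointer."""
    command = CHORDS.get(frozenset(keys))
    if command is None:
        return False
    command(buffer, pointer_pos)
    return True


doc = Buffer("Cut that word. Keep this one.")
press(doc, {1, 2}, 20)   # pointer is inside the second sentence
press(doc, {1}, 5)       # pointer is inside "that"
```

With only five keys under the off-mouse hand, the chord table already has room for 31 distinct non-empty combinations, which is where the "order of magnitude" comes from.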
These "what if" scenarios are not just wishful thinking, though. They're what Doug Engelbart and crew implemented. These are things I picked up after being invited to try a hands-on self-guided tour of Augment. I only wish I'd had a chording keyboard to get the full experience. The interface was like a mouse-heavy VIM, with verb-object command patterns and structured document interactions. (Or, rather, VIM is like a mouse- and outline-deficient derivation of Augment.)
The basic core of this facet of Augment is this: Computers are powerful tools with great potential to augment human intellect. As such, they offer a lot of complex functionality. But, human beings are trainable, and can assimilate this functionality. Once assimilated, it's best to squeeze out all the performance you can. You don't see today's degree of computer "user friendliness" in chainsaws, tanks, jack-hammers, semi-trucks, or fighter jets. These things are necessarily complex and require training. Why should the most powerful of intelligence enhancement tools offered by computers be any different? Of course, you generally won't lose a limb to a computer, but you might be mentally impaired or lose valuable work in the process.
This is, I think, one of the still-relevant central facets of Doug Engelbart's ideas that could use some re-examination today. It could just be because I'm an übernerd who thinks it's fun to self-train on things like VIM and Augment, but I also think that there's a lot of potential to be unlocked once you clear away expectations of "intuitive interfaces" that are decidedly not nipples.
And, since I've admitted my recently acquired semi-addiction, consider World of Warcraft as an expert application. Advanced players could never succeed by navigating a complex yet "friendly" UI to invoke various spells and skills and in-game actions. Just take a look at some of the customizations and UI revisions being offered at this site. Some configurations of this game strike me as eerily similar to the principles of Augment. In fact, just this weekend, I was considering blowing the dust off this keyset controller I used to use with Everquest, years ago.
Then again, maybe it's a matter of intensity. Coordinating with a 40-player guild to slay something from the molten bowels of the earth is a slightly different activity than composing a memo or even a few-dozen-page report. On the other hand, I really would've liked to strip away most of the Word interface while writing my two books. And someday, who's to say that online interpersonal collaboration in the general case won't more closely resemble a World of Warcraft raid? I've just read Vernor Vinge's latest book, Rainbows End, and he makes a lot of intelligence augmentation and collaboration tasks look just like WoW.
 
            
Archived Comments
The fighter-jet metaphor is interesting. Obviously a fighter has a narrower scope/focus than the general computer. But perhaps there's a narrower technique/practice of intelligence augmentation that warrants a more specialized/locally-optimized interface design.
But then maybe the Engelbart work isn't focused enough on a particular context and associated method-of-use?
To relate this to a similar software offering, how important are the specific Compendium features compared to the process of IBIS?
And a big factor in the ChangeFunction is how critical the problem/pain being solved by the new offering is. Can you convince people that there will be a pay-back for learning to use HyperScope that compensates for the investment, compared to other uses of your time?
Let's put it this way: if you were picking between 2 start-ups to invest in, how much weight would you associate with 1 of the teams using HyperScope? How does this compare to betting on a dogfight where 1 party has an F-15 and the other a Cessna?
Very nice post.
And Bill's comments too. They remind me of the LEO editor for Python, which was always touted as having great productivity benefits if only your team would undergo the three-month training required to use it properly. Not sure if it ever took off or could. But there's something nice about the idea that LEO-empowered programmers could outperform the norm.
I'm convinced that there are certainly productivity improvements available to power-users, beyond anything currently dreamed of, once we step away from the assumption that "ease of use" equals "1-to-1 correspondence between functionality and UI objects".
As every nerd knows, real (interesting) productivity comes from higher levels of abstraction. And maybe what's really important about the outliner tradition (from HyperScope to MORE / UserLand / OPML to LEO) is that it remains loyal to this notion. When you collapse a block of text and ideas down to a single line, you are essentially abstracting away from that detail and working with the higher-level description.
OTOH, the Xerox Parc tradition of the GUI and direct manipulation, lost this core ideal. (At least as it was spread via Apple and Microsoft, although obviously you can probably do all sorts of powerful abstractions via a Smalltalk interface)
I'm pretty sure that this insight is general. The really interesting innovations beyond HyperScope are going to be new ways of giving the power-users yet more abstract ways of manipulating their information. Either by folding more of it together as complex aggregates, or allowing large-scale cross-cutting processing. (Maybe style-sheets in Word are the only other surviving popular example.)
Y'know - this is what it is like to be an Emacs user. I've come to the opinion that the set of document formats you can work with, and the set of commands you can perform on them, should be somewhat-to-completely separate from the UI.
That way, you can have a learners/beginners UI to get people up to speed, then they graduate to the intermediate UI that assumes knowledge of things like C-s, C-o, C-q etc. And of course, if an application follows strong UI design guidelines, experienced computer users might be able to start a new program in the intermediate UI. Gosh! C-o opens a file in Excel too!
Then, I think there should be a choice of expert-level UIs. For example, VIM and Emacs have both grown together (VIM started small, light and fast, and Emacs started with everything AND the kitchen sink), so that they both represent reasonable choices for a power-user's text editor.
This is also one of the problems that web developers are working on (or working around). Google and others have started introducing intermediate level UI features in web apps (like shortcut keys), but try building a site that looks and feels like WoW...
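The separation suggested in this comment, one set of commands with several UIs layered on top, can be sketched in a few lines of Python. This is only an illustration under assumed names: `COMMANDS`, `BEGINNER_MENU`, `INTERMEDIATE_KEYS` and the toy `doc` dictionary are all hypothetical and do not come from any real editor.

```python
# One registry of commands, independent of any particular UI.
COMMANDS = {}


def command(name):
    """Decorator that registers a function under a command name."""
    def register(fn):
        COMMANDS[name] = fn
        return fn
    return register


@command("save")
def save(doc):
    doc["saved"] = True
    return "saved"


@command("open")
def open_file(doc):
    doc["open"] = True
    return "opened"


# Two UI layers over the same commands (hypothetical bindings).
BEGINNER_MENU = {"File > Save": "save", "File > Open": "open"}
INTERMEDIATE_KEYS = {"C-s": "save", "C-o": "open"}


def invoke(ui_layer, gesture, doc):
    """Translate a UI gesture into a command name, then run the command."""
    return COMMANDS[ui_layer[gesture]](doc)


doc = {}
invoke(BEGINNER_MENU, "File > Open", doc)   # the learner's route
invoke(INTERMEDIATE_KEYS, "C-s", doc)       # the graduate's route
```

Because both layers resolve to the same command names, an expert UI becomes just another mapping added alongside these two, without touching the commands themselves.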
What still catches me with systems like Augment (I would call Neuberg's new incarnation "hyperScope".) is precisely the mousing.
The war is long lost but I recall with fond pleasure how I blew a Word user away by using WP5.2 ... ^F6-P boom. And when I rolled out the functions I'd cobbled together with WP's lovely macro language? Sonic boom. The key in that situation was that I had a large number of unique tasks and a very small number of tasks carrying a huge workload (MILSPEC change management). So it was ideal for hot-keyed macros: like shooting fish in a barrel.
For one thing, unless I'm reading passively or doing some flavour of CAD my hands are nowhere near the mouse. Or, to invert that, when I'm keyboarding I have to routinely suppress my resentment with reach, swivel, click, drag, select, click, select ... interminable menus and options bla-bla-blah, and nowhere does muscle memory come into play. But even with that aside, to have to right click and then select Delete from a menu /after/ having dragged to select a block ... I can outshoot that action stream using keystrokes any day, if the app allowed me to.
I don't disagree with the fundamental insight ... far from it. But we've just barely begun to implement the foundational cognitive ergonomics. (I was gratified to see in one thread that Brad explicated his having moved Help to the upper right ... cuz that's where it is most often. When it works it works cuz it works. Tradition is sometimes/often arbitrary; life's like that and we should sometimes just suck it up.)
Harold's point about expert users is, I think, key. It's merely foolish to impose a system that makes good use of habituation onto a newly arrived visitor. I'm quite sure that attentive study shows a clustering or quantum of user intention and expertise ... until and unless we contrive some seamless continuum (a terrible distraction inspired by naive perfectionism) we should focus on differentiating experts from n00bs (no diss) and serve both well. "Intermediate level" sounds quite appropriate ... so long as this isn't just a maelstrom of fish/fowl goat/sheep confounds.
Alternatively we can always fall back on the old TRW concept of making people think more like machines. There might be funding for that. ;-P
Muscle memory just jumped up and reminded me of this: in a situation where I was doing Print Preview a gazillion times a day Shift-F7 6 was as effortless as breathing. snap
I can speak as a former Emacs user and coder that the only reason I gave it up for vi was that it hurt my hands too much to make all of those funky keyboard chords, and it started to hurt my head to remember all of the time-saving things I had built.
Classically there's a tradeoff between the ease of typing something and the amount of think time you have to put into remembering what to type.
One thing I am annoyed by on too many blogs is the inability to tab from the comment field to the "submit" button, which forces a mouse event and a scroll event.