Desktop Task Switching Could Be Improved
by havoc
In honor of GUADEC 2012, a post about desktop UI. (On Linux, though I think some of these points could apply to Windows and OS X.)
When I’m working, I have to stop and think when I flip between two tabs or windows. If I don’t stop and think, I flip to the wrong destination a high percentage of the time. I see this clunkiness every minute or two.
For me to do the most common action (flip between documents/terminals/websites) I may need to use my workspace switch hotkey (Alt+number), app switch (Alt+`), window switch (Alt+Tab), tab switch (Alt+PgUp, Alt+PgDn, C-x-b), or possibly a sequence of these (like change workspace then change window or change window then change tab).
I believe it could be reduced to ONE key which always works.
The key means “back to what I was doing last” and it works whether you were last on a tab, a window, or another workspace. There’s a big drop-off in goodness between:
- one key that always works
- two keys to choose from
Once you have two, you have the potential to get it wrong and you have to slow down to think.
Adding more than two (such as the current half-dozen, including sequences) makes it worse. But the big cliff is from one to two.
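To make the "one key" idea concrete, here is a rough sketch of a single most-recently-used history that treats tabs, windows, and workspaces as the same kind of thing. The names `Screen`, `FocusHistory`, and `back` are made up for illustration; no desktop exposes this API, and getting applications to report tab focus is exactly the part this sketch assumes away.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Screen:
    """One thing the user can look at: a tab, a window, or an Emacs buffer.

    Hypothetical type: real desktops do not expose tabs this uniformly.
    """
    workspace: int
    window: str
    tab: str = ""


class FocusHistory:
    """Tracks focus in most-recently-used order, ignoring container type."""

    def __init__(self):
        self._mru = []  # most recent first

    def focus(self, screen):
        """Record that the user is now looking at `screen`."""
        if screen in self._mru:
            self._mru.remove(screen)
        self._mru.insert(0, screen)

    def current(self):
        return self._mru[0] if self._mru else None

    def back(self):
        """The one key: return to whatever was focused just before."""
        if len(self._mru) < 2:
            return self.current()
        previous = self._mru[1]
        self.focus(previous)
        return previous
```

Pressing the key twice lands you back where you started, whether the other "screen" was a tab in the same window or a window on another workspace; the caller never has to say which kind of switch it is.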
User model vs. implementation model
Can’t speak for others, but I may have two layers of hierarchy in my head:
- A project: some real-world task like “file expense report” or “write blog post” or “develop feature xyz”
- A screen: a window/tab/buffer within the project, representing some document I need to refer to or document I’m creating
The most common action for me is to switch windows/tabs/buffers within a project, for example between the document I’m copying from and the one I’m pasting to, or the docs I’m referring to and the code I’m writing, or whatever it is.
The second most common action for me is to move among projects or start a new project.
Desktop environments give me all sorts of hierarchy unrelated to the model in my head:
- Workspace
- Application
- Window
- Tab (including idiosyncratic “tabs” like Emacs buffers)
- Monitor (multihead)
None of these correspond to “projects” or “screens.” You can kind of build a “projects” concept from these building blocks, but I’m not sure the desktop is helping me do so. There’s no way to get a unified view of “screens.”
I don’t know what model other people have in their head, but I doubt it’s as complex as the one the desktop implements.
Not a new problem
I’m using GNOME 3 on Fedora 17 today, but this is a long-standing issue. Back when I was working on Metacity for GNOME 2, we tried to get somewhere on this, but we accepted the existing setup as a constraint (apps, windows, workspaces, etc.) and therefore failed. At litl we spent a long time wrestling with the answer and found something pretty good though perhaps not directly applicable to a regular desktop. I wish I had a good video or link to show for litl’s solution (essentially a zoomable grid of maximized windows, but lots of details matter).
iPhone has simplified things here as well. They combine windows and applications into one. But part of the simplification on iPhone is that it’s difficult to do things that involve more than one “screen” at a time. On a desktop, it wouldn’t be OK to make that difficult.
In GNOME 3, I also use the Windows key to open the overview and pick a window by thumbnail. Some issues with this:
- It does not include tabs, only windows.
- In practice, I have to scan all the thumbnails every time to find the one I want.
These were addressed in the litl design:
- Tabs and windows were the same thing.
- Windows remained in a stable, predictable location in the overview.
- The overview was spatially related to the window, that is you were actually zooming in and out, which meant during the animation you got an indication of where you were.
- I believe you could even click on a window before the zoom in/out animation was complete, though I could be wrong. In any case you could be moving toward it while it was coming onto the screen.
As a result, the litl design was much faster for task switching via overview key plus mouse. If you were repeatedly flipping between two tasks, you could memorize their location in space and find them quickly based on that. If other windows were opened and closed, the remaining ones might slide over, but they’d never reshuffle entirely.
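The "slide over, never reshuffle" property can be sketched in a few lines. This is not litl's actual layout code, which I can't reproduce here; it is a minimal illustration assuming a fixed number of columns and a hypothetical `StableOverview` class, where a window's grid position depends only on the order in which the surviving windows were opened.

```python
class StableOverview:
    """Grid overview where windows keep their relative order forever.

    Illustrative only: the column count and method names are assumptions.
    """

    def __init__(self, columns=3):
        self.columns = columns
        self._order = []  # windows in the order they were opened

    def open(self, window):
        # New windows always go at the end, so nothing existing moves.
        self._order.append(window)

    def close(self, window):
        # Later windows slide one slot toward the front; order is preserved.
        self._order.remove(window)

    def position(self, window):
        """Return the (row, column) grid slot of `window`."""
        return divmod(self._order.index(window), self.columns)

    def layout(self):
        return {w: self.position(w) for w in self._order}
```

Contrast this with sorting thumbnails by recency or by current on-screen geometry: there, every focus change can move every thumbnail, which is what forces a full visual scan each time.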
I think GNOME tries to “shrink the windows in their current location” rather than “zoom out”, so it’s trying to have a spatial relationship. A problem is that I have everything maximized (or halfscreen-maximized). “Shrink to current location” ends up as “appears random” when windows don’t have any meaningful relationships on the x/y axes (they’re just in a z-axis stack). (Direction for thought: is there some way maximized windows could be presented as adjacent rather than stacked?)
Overall I vastly prefer Fedora 17 to my previous GNOME 2 setup and I think it’s a step on the path to cleaning this up for good. In the short term, a couple things seem to make the problem worse:
- The “application” layer of hierarchy (Alt+Tab vs. Alt+`) adds one more way to switch “screens,” though for me this just made an existing problem slightly worse (the bulk of the problem is longstanding and we were already far from one key).
- The window list on the panel had a fixed order and was always onscreen, so it was faster than the thumbnail overview. I believe the thumbnail overview approach could be fixed; on the litl, for me zoom-out-to-thumbnails was as fast as the window list. The old window list was an ugly kluge (it creates an abstraction/indirection where you have to match up two objects, a button and a window; direct manipulation would be so much better). But its fixed spatial layout made it fast.
GNOME 3 opens the door to improving matters; GNOME 2’s technology (e.g. without animation and compositing) made it hard to implement ideas that might help. GNOME 3 directions like encouraging maximized apps, automatic workspace management, the overview button, etc. may be on the path to the solution.
Can it be improved?
I’ll limit this post to framing the problem and hinting at a couple of directions. I don’t know the right design answer. I’m definitely going to omit speculation on how to implement (for example, getting tabs into the rotation would be possible, but require some implementation heroics).
I know everything is the way it is now for good historical reasons, valid technical and practical constraints, and so on. But I bet there’s a way to get past those with enough effort.
Oh Lord, you’ve been reading my mind. I’ve had the very same problem. But I think the situation was a bit better in GNOME 2 in a way. Alt+Tab, switching to another window and then Alt+Tab would bring you back to where you were. This doesn’t work anymore if the app you’ve switched to is on another workspace…
This happens to me when I want to check my email, which is on workspace 1, and then get back to what I was doing on another workspace. The only workaround I found was to pin my email client so it is always on visible workspace, using mutter. This way I can check emails without changing workspace, but this has other problems. It pollutes the overview, and shows up in the empty workspace that exists by default.
As for spatial location, I completely agree with you. I don’t know how litl solved that problem, but what you described reminded me of Métisse, a French project that Mandriva Linux shipped back in 2007:
The current GNOME overview could take a few lessons from what Métisse used in their “bird eye view”.
So the hierarchy I see is:
“back in app” (alt-left (browser mainly))
“back to previous tab/buffer” (na)
“back to previous window” (see below)
“back to previous app” (alt-tab)
“back to previous workspace” (na)
A thing that’s always niggled me is that apps are implementing tabs/buffers at all. It would be great to let the system deal with that and so avoid that level of the hierarchy altogether. Anyway…
90% of the time I want “back to previous window”.
That function is IMHO the biggest missing feature/regression of Gnome 3.
I realise you can bind alt-tab to “switch windows directly”. That really should be the default.
Now I had problems (I can’t remember exactly) with “switch windows directly”, so I use gnome-shell-extension-alternate-tab which works well as a “back to previous window”.
It even works across workspaces.
As for the other functions overloaded on the same key, maybe. But “back to previous window” is nicely in the center of the hierarchy and handles 90% of my needs at least.
“back to previous window” describes the Android interaction model quite nicely. However, I don’t think that model would work well on an interface that doesn’t use direct manipulation. I do like the cross-app back button though, which usually does the right thing.
I seem to remember that the Epiphany guys were planning to include the browser tabs in the overlay
For me, this is one of the biggest problems in gnome3…
In gnome2, “task” switching worked most of the time, with the exception of multiple tabs in chromium (browser) or gedit.
But now, in gnome3, this almost never works. If I have two windows opened in LibreOffice, I want the *other* LibreOffice window, not some unrelated windows as I always get now… I learned the hard way that in gnome3, application switching never works.
Hi,
I think a couple of simple fixes could enhance the experience considerably.
We need to use gnome-shell workspaces as a conceptual equivalent to working tasks: each workspace holds a number of documents / apps relative to one task.
For this to work, the desktop has to feel isolated, which means that only the windows useful to this task should be visible in the overview, on the dash and in the window switcher.
Another important thing is to prevent the shell from moving you from one workspace to another automatically: it’s your job to manage tasks. For example, if a terminal is open on workspace #1, and you’re working on workspace #2, the terminal icon in the dash should *not* be highlighted (as if no terminal was already running), and clicking on it should open a new terminal window on workspace #2. Now, if you go to the dash again, the terminal icon should be highlighted (because it’s running on the current workspace), and clicking on it should bring you to the running terminal window on your current workspace.
https://bugzilla.gnome.org/show_bug.cgi?id=650030
https://bugzilla.gnome.org/show_bug.cgi?id=621287
https://bugzilla.gnome.org/show_bug.cgi?id=662835
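The dash behaviour proposed in this comment can be written down as a small rule. This is only a sketch of the commenter's proposal, not how gnome-shell works; `Window`, `dash_highlighted`, and `dash_click` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Window:
    app: str
    workspace: int


def dash_highlighted(app, current_workspace, windows):
    """An icon lights up only if the app has a window on this workspace."""
    return any(w.app == app and w.workspace == current_workspace for w in windows)


def dash_click(app, current_workspace, windows):
    """Focus a window on this workspace, or launch one here.

    Never moves the user to another workspace.
    """
    for w in windows:
        if w.app == app and w.workspace == current_workspace:
            return "focus", w
    new_window = Window(app, current_workspace)
    windows.append(new_window)
    return "launch", new_window
```

With a terminal already open on workspace 1, clicking the terminal icon while on workspace 2 launches a second terminal there; clicking it again focuses that second terminal rather than jumping back to workspace 1.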
I keep running into this all the time; I think in GNOME it would be a huge improvement even if Alt+Tab simply navigated *windows* in MRU order, or if Alt+` would traverse across application boundaries. (Do normal people ever navigate through applications these days?)
I found the same thing you did, with the exception that I don’t find the mouse sufficiently fast for switching windows; I rely on the keyboard for that. As a result, I found GNOME 3 both better and worse in ways. On the one hand, Alt-Tab now switches between apps and raises *all* the windows of the switched-to app, which works nicely with things like Pidgin and its buddy list. On the other hand, multi-window apps are much rarer than “single-window with occasional non-dialog other window” apps, such as Firefox with its download manager or occasional popup window, or gnome-terminal with an occasional second window open for a programmatically launched mutt or vim. And for those, the need to distinguish between “switch app” and “switch window” sucks mightily.
I agree with you that we have too many levels of hierarchy. A unifying approach seems preferable.
On the direct-manipulation N900, I found window switching quite usable, and the native browser on that platform would open additional windows rather than tabs, which integrated nicely. (I don’t know how well that interface would work with the number of tabs I open on a laptop rather than a phone, though.)
I also like the idea of making *everything* a tab: suck all the tabs out of each window and put them all at the top level, not as windows but as tabs. I’d love to switch between terminals and browser tabs in the same “window”.
However, the idea of leaving the window/tab distinction but putting tabs in the overview seems quite confusing to me.
Following up: personally, I see a few ways to improve this problem. Firefox plans to integrate downloads in the main window, as both a dedicated tab and a quick overview; preferences seem likely to move into a tab as well, which would make Firefox pretty much a single-window app. If gnome-terminal knew how to launch new tabs programmatically from the command line, programs could open a new tab for mutt, vim, or other command-line apps, rather than opening a new window. Even GIMP has gone single-window. With those changes in place, Alt-` will become more and more obscure.
However, I’d still like to see a good unifying model.
The WebOS cards method of grouping new windows of a single app into a ‘stack’ is also relevant. This replaced browser tabs, kept a block of ‘tabs’ together, but still allowed reorganization along with other cards/windows.
one potential issue with the idea of showing the tabs (discounting, for sake of argument, the issue of having an API in the shell to query the contents of a tab out of an application) inside the overview and/or the Alt+Tab/Alt+` selectors, is that people may have inhuman amounts of tabs opened; suddenly, instead of five/ten windows in the overview you have 50/100 tabs.
with high-DPI/high-resolution displays it may not be a huge issue – and who doesn’t own a MacBook Pro Retina, right? 😉 – but I think it’s going to be an issue nonetheless.
I think with this kind of issue the solution is usually to step back and change some other assumption or constraint to fix the problem. For example maybe tabs are shown in a stack, sort of like iPhone Chrome (which may also be what Steven is describing for WebOS earlier, I haven’t used WebOS though). Or maybe the overview needs to be different in some other way so it can handle more stuff. I don’t know, maybe it scrolls for example. Or maybe windows/tabs never overlap but are conceptually in one huge plane, and the overview is a viewport onto that with variable size and as you move further from your original window it zooms out more.
Just making stuff up, probably the first few ideas suck but the point is that it’s worth trying to get out of the local maximum. Maybe each small change from state A would make things worse but some collection of coordinated changes together would make things better and get to superior state B.
The thing with browser tabs is that the best possible UI for them depends a lot on the form factor and the user, and might not lend itself to working nicely with the other windows/content in the desktop.
I personally use Firefox with tree-style tabs on the side and session saving, with a little tweak so that there is only one tree of tabs expanded at a given time.
Tabs stay in the same place during a session and between sessions, so I can use spatial memory to find them. Links open in a subtree from the current tab, so there is a very clear link between my navigation history and the tabs that are open as a result. I don’t even need to use “read later” tools: there are half a dozen articles hanging from my Google Reader and Twitter tabs, saved between sessions and hidden until I move to that particular tree.
I have 30-40 tabs open at any given time, and it is never much trouble to find a particular one because the UI suits my behaviour and memory much better than the alternatives.
My point is, different problems might need different solutions.
there’s definite asymmetry on my desktop where browsers and terminals vastly dominate other kinds of window (and also differ in that they “contain” apps recursively, most other apps do a specific thing rather than hosting sub-apps).
So it feels somewhat mandatory to me to special-case browser and terminal.
Or else design for them and then shoehorn the other apps in. Anyway, definitely can’t just expect “regular” apps and the browser to be equivalent.
iPhone makes the browser have its own special navigation universe, while litl required all apps to basically look like web tabs even if they weren’t. Those are just two examples of ways to tackle it.
browsers and terminals (and document viewer) are where the alt+` app-based nav feels most “off” because the category “browser windows” is near-useless while categories like “windows related to the patch I’m working on” or “windows related to researching roofing contractors” could be very useful.
having a shortcut to navigate MRU of workspaces and another to do MRU of windows within current workspace could work well for people who are careful to organize their workspaces by project.
you could also imagine features like only allowing interruptions such as IM notifications while on certain workspaces, time tracking based on workspace, etc.
anyway, when one’s major apps are browser and terminals the app-based organization just doesn’t seem to quite work. “it’s gmail” or “it’s failblog” is so much more relevant than “it’s a browser”
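A rough sketch of that two-shortcut idea, assuming workspaces stand in for projects: one MRU list of workspaces, plus a separate MRU list of windows inside each workspace. `WorkspaceMRU` and its methods are invented for illustration and do not correspond to any existing window manager interface.

```python
from collections import defaultdict


class WorkspaceMRU:
    """Two-level history: workspaces, and windows within each workspace."""

    def __init__(self):
        self._workspaces = []              # most recent first
        self._windows = defaultdict(list)  # per workspace, most recent first

    @staticmethod
    def _touch(mru, item):
        """Move `item` to the front of `mru`, adding it if needed."""
        if item in mru:
            mru.remove(item)
        mru.insert(0, item)

    def focus(self, workspace, window):
        self._touch(self._workspaces, workspace)
        self._touch(self._windows[workspace], window)

    def previous_workspace(self):
        """Target of the 'switch project' shortcut, or None."""
        return self._workspaces[1] if len(self._workspaces) > 1 else None

    def previous_window(self):
        """Target of the 'switch within project' shortcut, or None."""
        if not self._workspaces:
            return None
        windows = self._windows[self._workspaces[0]]
        return windows[1] if len(windows) > 1 else None
```

The window shortcut never leaves the current workspace, so this is still two keys rather than one; the bet is that "project" versus "screen within project" is a distinction people already carry in their heads.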
Quick thought in passing: Whenever tabs are used to group windows, the windows not visible are represented by only icons, names, or both. They are in that sense iconified. This might be a useful way to think about tabbed MDIs for an API or a window manager protocol.
Task switching: if we mean “application switching”, the solution is simply to have no tabs and no workspaces.
Maybe what we are looking for is “document switching”, which is far more complicated. How do you switch from one document to another?
Applications can display several documents.
One set of documents can be worked on in several applications.
For me, task switching worked extremely well in GNOME 2, since the desktop was 2-dimensional. I could just have one task/virtual desktop and rely on muscle memory. But with the one-dimensional desktop in GNOME 3 it might take me 6-8 keypresses (depending on whether I’m at work, where I use a 2×3 desktop — system still running GNOME 2, or at home where I used to use a 3×3 desktop before GNOME 3 took away that option).
I don’t really mind the new “switch between applications” alt-tab behaviour, but when I’ve got 2-3 different tasks, all using consoles, it’s pretty much impossible to judge at a glance if that iconified version of it is the source code for project A or project B.
The desktop applet of gnome 2 was really great because it was always visible and it used icons to show the contents of each desktop. So before I even started moving my hands I already knew exactly how many times I had to press ctrl-alt-right or ctrl-alt-left.
I still had to stop and think, but the entire thought process was over before I started moving my hands.