3D Interaction

8 minute read.

UX (user experience) design is hard - for those not in the industry, it's basically a slightly broader term for UI (user interface) or GUI (graphical user interface), which is to say how we as engineers let you the user interact with our software. I'll preface this by saying I have no formal training in UX, but then again I don't have formal training in sculpture or software engineering either, and I like to think that hasn't stopped me. 2D UX is plenty tricky, but 3D is a real bear. I've ended up designing and building a fair amount of 3D UX simply because there was no one else (a common deficiency of large software companies is a dearth of UX experts). I'm pretty proud of what I've built and I'll describe it in the hopes that it's copied, because I have spent a lot of time being frustrated by awful 3D interfaces.

What makes 3D so hard? The biggest issue is that the interface (screen, mouse, touchscreen) is all 2D. Worse yet, it turns out you're not just one dimension short: you the observer, often called the camera, cannot represent your view of the space with just three dimensions, since you have three for orientation (yaw, pitch, roll) as well as three for position (X, Y, Z). Often forgotten is a seventh dimension: field of view, or focal length. In the real world, we're perfectly comfortable controlling all these dimensions with our head (you'd need to be looking through a camera zoom lens to adjust your focal length), but mapping this motion onto a mouse or touchscreen is a real stretch for intuition.

This is a selling point of VR, where the interface does in fact give your head natural control of all six dimensions. It has potential, but the VR interfaces I've tried suffer from even bigger problems: selecting things with a laser pointer makes me feel like I have cerebral palsy, my arms get tired from constantly having to reach for core parts of the UX, and typing anything is utterly maddening. Not to mention the lack of haptic feedback which makes interacting with virtual objects surprisingly difficult. What we really need is that hard light hologram from Red Dwarf. Regardless, most people are not interfacing with software through a VR headset, so the point is moot. 

3D UX became a sore point for me early on because as an aerospace engineer in school I had to learn a variety of CAD packages for design. I loathed them all. And it's not because I disliked 3D design (see above)! It's because they are terribly complex with UX clearly designed by and for engineers, which is to say not designed at all. Now I occasionally need to use Blender or some other 3D animation software, which might be even worse, if that's possible. Of course many people have productive careers dedicated to the use of these programs - they can certainly be learned. But since I've only ever been a beginner over and over, all I see is how awful the learning curve is for each and every one. 

I could go on about the dizzying nests of toolbars inside of menus inside of toolbars, but that's all 2D UX. Even worse is the 3D side. I'm not going to go into detail about selection in 3D, because that is a gnarly problem and one which I have thankfully gotten to avoid so far in my projects. Here I'll focus just on viewing, which is to say moving the camera (yourself) around, or the object if you'd prefer (in a virtual environment without a reference, they are equivalent). I'm also going to focus on single-object viewing, where the goal is to inspect every bit of an item, rather than exploring a large scene, where game design is probably a better UX guide. 

The basis of nearly any 3D object viewer is an orbit camera: rather than controlling the camera's position directly, you control the position of a target point a fixed distance in front of the camera. The position of the camera is then determined by its orientation relative to this target point, akin to a satellite orbiting and observing the Earth. This matches two familiar intuitions: walking around an object, and turning an object in your hand in front of you. Now all your UX needs is a connection between your 2D inputs and these orbit parameters.
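To make the orbit parameters concrete, here is a minimal sketch of how a camera position can be derived from them. The type names, the Y-up convention, and the choice of measuring azimuth from the +Z axis are all assumptions made for illustration, not taken from any particular library.

```typescript
// Hypothetical types for illustration only.
type Vec3 = [number, number, number];

interface OrbitState {
  target: Vec3;      // point the camera looks at and orbits around
  azimuth: number;   // radians around the up (Y) axis, measured from +Z
  elevation: number; // radians above the horizontal plane
  radius: number;    // distance from the target to the camera
}

// Convert orbit parameters to a camera position, assuming Y is up.
// The camera always sits exactly `radius` away from the target.
function orbitCameraPosition(o: OrbitState): Vec3 {
  const horizontal = o.radius * Math.cos(o.elevation);
  return [
    o.target[0] + horizontal * Math.sin(o.azimuth),
    o.target[1] + o.radius * Math.sin(o.elevation),
    o.target[2] + horizontal * Math.cos(o.azimuth),
  ];
}
```

The camera then simply looks at the target, so the seven view dimensions reduce to a target, two angles, a radius, and a field of view.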

It's pretty standard practice to separate the interactions into three: orbit (orientation), zoom (radius), and pan (target position), which are often controlled one-at-a-time with different modifiers of your 2D input (right-click, hold ctrl, use two fingers, etc). For quite some time the trackball interface was popular for orbiting: you map X (sideways) and Y (up-down) input to the local motion of the camera orbit's sphere. It does have the nice property that when you're zoomed in close, the surface moves along with your mouse the same no matter how you're oriented.

Friends don't let friends use trackball orbit controls.

Please don't ever use trackball orbiting! If there's one thing humans are used to, it's the concept of up, which a trackball does not have. Trackballs also have a funny side effect: try dragging your mouse/finger in a clockwise circle on the above example around and around. Do you notice how the model doesn't come back to where it started, but slowly rolls counterclockwise? In controls this is called a Lie bracket, but in UX it's called WTF.
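The drift comes from the fact that rotations about two different axes do not commute. Here is a small numerical sketch of it, assuming each drag rotates the model about the fixed screen X or Y axis; the matrix helpers and the 0.1 radian drag size are made up for illustration. A closed square of drags (sideways, vertical, back sideways, back vertical) leaves a residual roll about the view axis of roughly the drag angle squared.

```typescript
type Mat3 = number[][];

// Rotation about the screen's horizontal (X) axis.
function rotX(a: number): Mat3 {
  const c = Math.cos(a), s = Math.sin(a);
  return [[1, 0, 0], [0, c, -s], [0, s, c]];
}

// Rotation about the screen's vertical (Y) axis.
function rotY(a: number): Mat3 {
  const c = Math.cos(a), s = Math.sin(a);
  return [[c, 0, s], [0, 1, 0], [-s, 0, c]];
}

// Plain 3x3 matrix product a * b.
function mul(a: Mat3, b: Mat3): Mat3 {
  return a.map((row) =>
    [0, 1, 2].map((j) => row[0] * b[0][j] + row[1] * b[1][j] + row[2] * b[2][j])
  );
}

// Apply four drags that trace a closed square on screen and return
// the leftover roll (rotation about the view axis) in radians.
function residualRoll(a: number): number {
  // The first drag applied is the rightmost factor in the product.
  const m = mul(rotX(-a), mul(rotY(-a), mul(rotX(a), rotY(a))));
  return Math.atan2(m[1][0], m[0][0]);
}
```

For a drag of 0.1 radians per side, the leftover roll has a magnitude close to 0.01 radians, and it accumulates with every loop. The sign of the roll depends on the axis and handedness conventions chosen here.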

A simpler and much more intuitive interface is azimuth-elevation, where sideways input orbits the camera in the horizontal plane while vertical input moves the camera towards or away from the top pole. In this system the third orientation angle (roll) is fixed relative to the up direction. This is beneficial because now your 2D input is only mapped to two dimensions, and humans prefer to keep their heads upright anyway. If you've ever used a globe on a gimbal, it's like that.

Gimbal orbit controls are more intuitive by being more constrained.
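A minimal sketch of the azimuth-elevation mapping follows. The sensitivity value, the sign conventions, and the small margin that keeps the camera just short of the poles are assumptions chosen for illustration.

```typescript
interface OrbitAngles {
  azimuth: number;   // radians around the up axis
  elevation: number; // radians above the horizontal plane
}

// Map a 2D drag (in pixels) onto the two orbit angles.
// Assumed conventions: dragging right turns the model to the right,
// and dragging down (positive dy) raises the camera toward the top pole.
function applyOrbitDrag(
  angles: OrbitAngles,
  dxPixels: number,
  dyPixels: number,
  radiansPerPixel: number = 0.01
): OrbitAngles {
  // Stop just short of the poles so "up" stays well defined.
  const limit = Math.PI / 2 - 0.001;
  const elevation = Math.min(
    limit,
    Math.max(-limit, angles.elevation + dyPixels * radiansPerPixel)
  );
  return { azimuth: angles.azimuth - dxPixels * radiansPerPixel, elevation };
}
```

Because each input axis drives exactly one angle and roll is never touched, a closed loop of drags brings the model back to where it started, unlike the trackball.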

Next we have zoom, which ought to be simpler since it's one-dimensional. However, with zoom we run into some of the most vexing UX problems. The first is that often when you zoom in too close your camera enters the object. 3D power users are so used to this they don't give it a second thought, and there might even be an occasional use case for it, but think about how baffling this looks to someone who is only familiar with the physical world. Why on earth am I suddenly seeing my supposedly solid object as just the inside of a shell? If you find yourself trying to describe boundary representations to lay users, your UX has already dramatically failed. 

Flying through the surface of an object is baffling.

The second problem with zoom is that in general your radius can only get as small as zero, but if you're inspecting something behind the target point, that may not be zoomed-in enough. Likewise, if the target is on the inspected surface, that last little bit of lost radius is going to look like an almost infinite zoom. How do we get a long zoom range with constant sensitivity and keep our camera out of the object? The key is to use field of view (focal length is equivalent, but I prefer unitless measurements) in addition to radius. 

Orbit radius and field of view both zoom, but they do so in different ways. Radius changes the perspective (you're actually getting nearer) whereas field of view simply magnifies the image (like the zoom on a camera). You can guarantee your camera never enters your object by setting the target at its center and keeping the radius greater than the object's bounding radius, or constraining the target to be in the bounding sphere and keeping the radius greater than the object's bounding diameter. I first applied this technique when I was a PM helping design 3D Builder: in order to get more zoom when the radius hit its lower bound, we started narrowing the field of view instead. 
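Here is a rough sketch of that hand-off, assuming zoom is expressed as a multiplicative factor and the field of view is in degrees. The names are hypothetical, and it only covers the zoom-in direction; it is not the 3D Builder code.

```typescript
interface ZoomCamera {
  radius: number; // orbit radius
  fovDeg: number; // field of view in degrees
}

// Smallest orbit radius that keeps the camera outside the bounding sphere:
// the bounding radius if the target is pinned to the center, otherwise the
// bounding diameter (the target may be anywhere inside the sphere).
function minSafeRadius(boundingRadius: number, targetAtCenter: boolean): number {
  return targetAtCenter ? boundingRadius : 2 * boundingRadius;
}

// Zoom in by `factor` (between 0 and 1). Shrink the radius first; once it
// hits its lower bound, spend whatever is left of the factor on the FoV.
function zoomByFactor(cam: ZoomCamera, factor: number, minRadius: number): ZoomCamera {
  const wanted = cam.radius * factor;
  if (wanted >= minRadius) {
    return { radius: wanted, fovDeg: cam.fovDeg };
  }
  const leftover = wanted / minRadius; // portion of the zoom the radius could not absorb
  return { radius: minRadius, fovDeg: cam.fovDeg * leftover };
}
```

Scaling the field of view by the leftover factor gives roughly the same magnification as the missing radius change would have, without the camera ever crossing the bounding sphere.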

When I rewrote the camera interaction for <model-viewer>, I made it smoother by changing the radius and field of view in sync. However, field of view is tricky because a qualitatively equal amount of zoom is attained by cutting the angle by the same fraction, so subtracting constant angle increments is highly nonlinear. Instead I apply the zoom increment to log(FoV), which feels nice and consistent. And the great thing is there is no mathematical limit to the zoom: as the field of view approaches zero you can see as much detail as your machine's precision will allow.
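A minimal sketch of stepping the field of view in log space, assuming degrees and an arbitrary upper limit of 60 (both are illustrative choices, not <model-viewer> defaults):

```typescript
// Apply a zoom increment to log(FoV). Positive increments zoom in.
// Equal increments always cut the angle by the same fraction.
function stepLogFov(
  fovDeg: number,
  zoomIncrement: number,
  maxFovDeg: number = 60 // hypothetical widest allowed field of view
): number {
  const logFov = Math.min(Math.log(maxFovDeg), Math.log(fovDeg) - zoomIncrement);
  return Math.exp(logFov);
}
```

With an increment of ln 2, the field of view goes 40, 20, 10, 5 and so on, each step magnifying by the same amount. Subtracting a constant 10 degrees instead would go 40, 30, 20, 10, with each step magnifying more than the last before hitting zero.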

Finally we have pan, which is by far the trickiest, largely because you're trying to control three dimensions (X, Y, and Z of the target position) with your 2D input. The natural thing is to move the target only along the image plane, but the difficulty is that the target point will not tend to end up on a surface. Often after you've zoomed in to look at a detail, when you rotate you suddenly find that you're orbiting a random point in space and the detail you were interested in has been whisked away. 

The solution I finally found and implemented for <model-viewer> was to reset the radius of the orbit camera at the end of the pan gesture to place the target on the surface without moving the camera itself. This way you can immediately start orbiting around the detail you're looking at. I even added a small central bullseye to help indicate the focus point. 
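The retargeting step can be sketched as follows. It assumes the renderer can report the distance from the camera to the first surface along the view direction; that raycast is not shown, and `hitDistance` and the type names are hypothetical.

```typescript
type Vec3 = [number, number, number];

interface OrbitTarget {
  target: Vec3;   // point being orbited
  radius: number; // distance from the camera to the target (must be > 0)
}

// Slide the target along the existing view ray until it sits on the surface.
// The camera position and orientation are unchanged; only the target and
// radius move. If nothing was hit, keep the current target.
function retargetOnSurface(
  cameraPos: Vec3,
  current: OrbitTarget,
  hitDistance: number | null
): OrbitTarget {
  if (hitDistance === null || hitDistance <= 0) {
    return current;
  }
  const scale = hitDistance / current.radius;
  const target: Vec3 = [
    cameraPos[0] + (current.target[0] - cameraPos[0]) * scale,
    cameraPos[1] + (current.target[1] - cameraPos[1]) * scale,
    cameraPos[2] + (current.target[2] - cameraPos[2]) * scale,
  ];
  return { target, radius: hitDistance };
}
```

Since the new target lies on the same ray at the new radius, the rendered image does not change at all when this runs; only the pivot for the next orbit gesture does.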

Of course by changing the radius but not the field of view, your zoom level is now in a state you would not normally reach. This can be smoothly handled by making further zooming change these two variables in proportion to their distance to their respective stops, so they arrive at the same time. And since your radius now tends to represent your distance to the inspected surface, you can normalize the pan sensitivity by this radius to make the surface feel attached to your fingers.
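Both ideas can be sketched in a few lines. This is one possible reading, with assumed names: the zoom step is expressed in log(FoV) units, the radius step is scaled so both reach their stops together, and the field of view is taken to be vertical.

```typescript
interface ZoomState {
  radius: number;
  logFov: number; // natural log of the field of view
}

// Zoom in toward the stops. Each variable covers the same fraction of its
// remaining distance, so both reach their stops on the same step.
function zoomTowardStops(s: ZoomState, stops: ZoomState, logFovStep: number): ZoomState {
  const fovLeft = s.logFov - stops.logFov;
  // Fraction of the remaining travel used by this step (assumed: if the
  // field of view is already at its stop, finish the radius too).
  const k = fovLeft > 0 ? Math.min(1, Math.max(0, logFovStep / fovLeft)) : 1;
  if (k >= 1) {
    return { radius: stops.radius, logFov: stops.logFov };
  }
  return {
    radius: s.radius - k * (s.radius - stops.radius),
    logFov: s.logFov - k * fovLeft,
  };
}

// World-space distance at the target depth covered by one screen pixel,
// for a vertical field of view. Multiply a drag in pixels by this to pan.
function panUnitsPerPixel(radius: number, fovRadians: number, viewportHeightPx: number): number {
  return (2 * radius * Math.tan(fovRadians / 2)) / viewportHeightPx;
}
```

Moving the target by the drag distance times `panUnitsPerPixel` keeps whatever is at the target depth under the pointer, regardless of how far you have zoomed.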

This is what we've been building up to.

Setting the target on the surface has the drawback that you can never return to the initial condition: the target at the center of the object. For this reason I find it important to have a reset feature, which I enable with a tap outside of the object. A double-tap might be a nice backup, in case of being zoomed in too much to have any background visible. I also zoom all the way back out to complete the reset. When tapping on the object, I shift the target to the tapped point to allow for instant detail inspection.

The other difficulty with pan is discoverability, particularly for mouse users. With touch it can be combined with zoom using two fingers, which is pretty natural. For mouse, you need some kind of drag modifier, generally a right-click or some modifier key. Inevitably every 3D tool chooses different ones, and there the situation is often far worse because they need to multiplex even more for various selection modes. For our simpler use case, I simply say every modifier (including right-click) activates pan so the user will always guess right. 
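As a small sketch of that rule, the drag mode can be chosen like this. The field names mirror the standard DOM `MouseEvent` properties, so a real event can be passed in directly; the `DragModifiers` and `DragMode` names are made up for illustration.

```typescript
// Subset of the DOM MouseEvent fields needed to pick a drag mode.
interface DragModifiers {
  button: number; // 0 is the main (usually left) button
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
  metaKey: boolean;
}

type DragMode = "orbit" | "pan";

// Any non-primary button or any modifier key means pan, so whichever
// modifier the user guesses will work.
function dragModeFor(m: DragModifiers): DragMode {
  const modified = m.button !== 0 || m.ctrlKey || m.shiftKey || m.altKey || m.metaKey;
  return modified ? "pan" : "orbit";
}
```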

That's about all I have on 3D camera control, and good thing because this has gotten pretty long. One last thing, though: it feels a lot nicer if your inputs are smoothed a bit, for the same reason we prefer UI changes to have transitions. Most UX folk think of transitions as fixed curves, which is fine if you just go from one finite state to another, but they are hard to apply sensibly when following arbitrary user input. Here my controls background says you can't do much better than a critically-damped second-order filter, where you accelerate toward the user's input like a mass dragged by a spring.

A critically-damped filter has only a single parameter: a decay time constant. This makes it easy to tune and it can be applied independently to any number of variables that need smoothing. The only trick is that it is asymptotic, which means when the user stops dragging, the model will never quite stop moving. Especially since re-rendering drains the battery, it is important to put a stop to this motion by choosing a small "close enough" distance at which the variable jumps exactly to its destination without being noticeable. 
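Here is a minimal sketch of such a filter, using the closed-form solution of a critically damped spring so that it stays stable for any frame time. The names are hypothetical, and the extra velocity check before snapping is an added assumption, so the value does not freeze while it is still moving quickly through the goal.

```typescript
interface DampedValue {
  value: number;
  velocity: number;
}

// Advance a critically damped second-order filter by dtSeconds toward goal.
// decaySeconds is the single tuning parameter (the decay time constant).
function dampToward(
  s: DampedValue,
  goal: number,
  dtSeconds: number,
  decaySeconds: number,
  closeEnough: number
): DampedValue {
  const omega = 1 / decaySeconds; // natural frequency of the spring
  const delta = s.value - goal;   // current offset from the goal
  const c2 = s.velocity + omega * delta;
  const decay = Math.exp(-omega * dtSeconds);
  // Exact solution: offset(t) = (delta + c2 * t) * exp(-omega * t)
  const value = goal + (delta + c2 * dtSeconds) * decay;
  const velocity = (s.velocity - omega * c2 * dtSeconds) * decay;
  // The motion is asymptotic, so snap to the goal once the difference is
  // too small to notice. This lets rendering stop.
  if (Math.abs(value - goal) < closeEnough && Math.abs(velocity) < closeEnough * omega) {
    return { value: goal, velocity: 0 };
  }
  return { value, velocity };
}
```

One `DampedValue` can be kept for each smoothed variable (azimuth, elevation, radius, log(FoV), and each target coordinate), all sharing the same time constant. Once every one of them reports its goal exactly, the render loop can go idle until the next input.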

I wrote this post because I want interactive 3D to become ubiquitous, and for that it must have excellent user experience. For this reason I want other 3D developers to copy my work, build upon it, and improve it. Everything I have described here is open-source, so feel free to inspect and remix. There are many applications that will have other requirements, but hopefully this thought process will still come in handy.

