Could eyetrackers replace the computer keyboard?
by Dr David Souto
16 Jun 2020
Eyetracking, the use of a device to monitor eye movements, has a rather longer history than one might think. At the beginning of the 20th century, the first eyetrackers confirmed that we read by fixating individual words, not by scanning the page smoothly as we subjectively feel. Still, we had to wait until the 1940s to see the first eye-movement recordings. Even then, eyetrackers were expensive and cumbersome, confining their use to specialised research for a very long time. This research provided invaluable insights into how we process information and how we pay attention – we are much more selective than one would think – yet widespread applications of eyetracking had to wait for the cheaper and easier-to-use devices that appeared in the noughties.
Practical applications of tracking eye movement
The multiple uses of such an exciting tool were soon recognised. Knowing where people pay attention has an obvious application in marketing, for instance, allowing an implicit analysis of the impact of product placement. Another type of application is gaze-control: using gaze measurements online to operate a device. Fighter-jet pilots have long been able to acquire targets by moving the head; helmets for F-35 pilots integrate a diminutive eyetracker so that the pilot can aim at a target with a simple look. Gamers can now do the same in first-person shooters in immersive virtual reality, thanks to the newest head-mounted displays such as the HTC Vive Eye. These consumer-grade devices not only permit interaction with gaze, but will also, in the near future, allow foveated rendering: a simulation of the world that depends on where you are looking. The fovea is the central part of your retina (the layer of photosensitive cells that lines the back of your eye), thanks to which you are able to process fine detail, such as the tiny symbols that make up this text. Visual elements in the periphery of your visual field project farther away from the fovea and are therefore seen in much less detail. This means we can get away with a much coarser simulation of those elements: Ultra-HD in the centre and Nintendo NES pixelation in the periphery, so to speak. Foveated rendering is here to stay, as it saves computing power while enhancing the visual realism of the simulation.
Arguably, the most life-changing application of gaze-control is in assistive technology: helping people communicate with others and interact with computers. UK charities, such as Mick Donegan’s SpecialEffect, have been at the forefront of bringing these technologies to the people who need them, participating in the development of the first gaze-control devices in 2003 with the Swedish company Tobii, which has since branched out into virtual reality (HTC Vive Eye). This line of work recently culminated in the possibility of using gaze-control in Windows 10.
People with amyotrophic lateral sclerosis (ALS), the incurable neurodegenerative condition that affected Stephen Hawking, experience a progressive weakening of their muscles to the point where breathing and speaking become difficult. Gaze-control can bring independence to people with ALS, such as the French music producer Pone, who composes music with sophisticated software using only his eyes. His website “ALS for dummies” offers a window on his day-to-day experience.
How does gaze-control technically work?
In broad strokes, a high-speed camera records the eye as illuminated by infrared light, which avoids interference from ambient light. The centre of the pupil is extracted as the main indicator of gaze direction, up to hundreds of times per second. In a calibration phase, the user looks at a few known locations on a screen. After a simple calculation, the eyetracking system then knows where the user is looking, as long as the gaze stays within the screen plane. With gaze-control technology, those calculations are so quick that we can move a cursor or interact with a computer by shifting our gaze in what feels like real time. A multitude of factors affect the precision of those calculations. A major one is the number of pixels representing the eye. As this keeps increasing in consumer-grade cameras, we can imagine a future where eyetracking is pervasive and reaches the precision of today’s research-grade eyetrackers.
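As a rough illustration of that calibration step, one can think of it as fitting a simple mapping from pupil-centre coordinates (camera pixels) to screen coordinates. The sketch below fits an affine mapping by least squares; the numbers are invented for the example, and real eyetrackers use more sophisticated models (e.g. polynomial mappings and corneal-reflection features).

```python
import numpy as np

# Hypothetical calibration data: pupil-centre coordinates (camera pixels)
# recorded while the user fixated five known on-screen targets (screen pixels).
pupil = np.array([[312, 240], [420, 238], [205, 242], [315, 150], [310, 330]], dtype=float)
screen = np.array([[960, 540], [1716, 530], [211, 550], [981, 90], [946, 990]], dtype=float)

def fit_calibration(pupil_xy, screen_xy):
    """Least-squares affine mapping from pupil to screen coordinates."""
    # Augment with a constant term: screen = [pupil_x, pupil_y, 1] @ A
    X = np.hstack([pupil_xy, np.ones((len(pupil_xy), 1))])
    A, *_ = np.linalg.lstsq(X, screen_xy, rcond=None)
    return A

def gaze_to_screen(A, pupil_xy):
    """Map a new pupil-centre measurement to an on-screen gaze point."""
    x, y = pupil_xy
    return np.array([x, y, 1.0]) @ A

A = fit_calibration(pupil, screen)
print(gaze_to_screen(A, (312, 240)))  # maps back to the first target, (960, 540)
```

In a running system this mapping would be applied to each new camera frame, which is why the cursor appears to follow the gaze in real time.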
On paper, gaze-control technology can offer an alternative to the keyboard. Writing speed is an obvious issue: we have a pair of eyes that move in concert, whereas we have up to 10 fingers that can move independently. Still, the same principle at play in predictive texting can be applied to eye movements, increasing the words per minute achievable with gaze-control. And although slower, gaze-control offers undeniable advantages. Eye muscles do not tire. We may complain that our eyes are tired, but that fatigue is not caused by the movements themselves: we may tire of looking at a screen, yet our eye muscles are active all day long, scanning the world for information about three to four times a second.
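To illustrate the predictive-texting principle mentioned above, here is a minimal sketch of prefix-based word completion: after each gaze-selected letter, the system offers likely whole words, so fewer letters need to be fixated. The vocabulary and frequency counts are invented for the example; real systems draw on large corpora or language models.

```python
from collections import Counter

# Illustrative word-frequency table (counts are made up for this sketch).
vocabulary = Counter({
    "the": 500, "they": 150, "there": 120, "their": 110,
    "key": 60, "keyboard": 40, "gaze": 30,
})

def suggest(prefix, k=3):
    """Return the k most frequent words starting with the typed prefix."""
    matches = [(word, count) for word, count in vocabulary.items()
               if word.startswith(prefix)]
    matches.sort(key=lambda pair: -pair[1])
    return [word for word, _ in matches[:k]]

print(suggest("th"))  # → ['the', 'they', 'there']
```

Selecting a suggestion after two fixations instead of spelling out five letters is where the words-per-minute gain comes from.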
Distinguishing between looking and seeing
A major issue with gaze-control is the ‘Midas touch’ problem. King Midas’ superpower was to turn everything he touched into gold; this was a blessing and a curse. Similarly, how can we tell apart gaze for the sake of seeing and gaze for the sake of, say, icon selection? Gaze has a different purpose depending on the observer’s goal: sometimes we look to select, sometimes we look to process global information, and sometimes we think and don’t look, which we colloquially call a blank stare. Although only we are privy to those goals, there are also external manifestations of those inner states. For instance, we fixate for longer when we want to process detailed information (is that a C?) than when we pay attention to global information. The pupil also betrays the focus of one’s attention: when focusing on a distant object, the pupil dilates, taking in more light. Pupil size can then be used as a communication tool for patients with locked-in syndrome.
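One widely used remedy for the Midas touch problem (a standard technique in gaze-control interfaces, though not detailed in the text above) is dwell-time selection: a look only counts as a ‘click’ when gaze rests on the same target for long enough. A minimal sketch, assuming gaze samples arrive as (timestamp, target) pairs:

```python
DWELL_THRESHOLD = 0.6  # seconds; illustrative value, real systems tune this per user

def detect_selections(samples):
    """Return targets fixated continuously for at least DWELL_THRESHOLD seconds.

    samples: list of (timestamp_in_seconds, target_id) pairs, in time order.
    """
    selections = []
    start_t, current = None, None
    for t, target in samples:
        if target != current:
            # Gaze moved to a new target: restart the dwell timer.
            start_t, current = t, target
        elif current is not None and t - start_t >= DWELL_THRESHOLD:
            # Gaze has rested on the same target long enough: select it.
            selections.append(current)
            start_t, current = None, None  # reset after selecting

    return selections

samples = [(0.0, "A"), (0.2, "A"), (0.4, "B"), (0.7, "B"), (1.1, "B")]
print(detect_selections(samples))  # → ['B']  ('B' was dwelled on for 0.7 s, 'A' only 0.2 s)
```

The threshold embodies the trade-off the article describes: long enough to filter out looks that are merely for seeing, short enough to keep selection times acceptable.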
Our own work has focused on understanding how the use of gaze-control differs from hand pointing (using the computer mouse) and how one can learn to use it. Among other challenges, it seems harder to plan sequences of eye movements. While the hand jumps swiftly from letter to letter, the eyes pause as if to figure out where to go next. We also noted that it is much harder to reach, without error, the very short selection times reached with the hand. This highlights a fundamental limitation of gaze-control: gaze normally serves movement. We look where the hand is going, helping our brain compute the movement that will get us there. It is part of what we call eye-hand coordination. When gazing is itself the goal, we may be missing this preparation time, and we need to slow things down accordingly. On the upside, we found that people learn to gaze-type quickly, within only a few 30-minute sessions, and become better at suppressing involuntary eye movements in general after gaze-typing.
One day those psychological insights will hopefully help improve the communication capabilities of people with ALS and other disabilities, by informing how gaze is best used in replacing the keyboard and computer mouse. In the mainstream, eyetracking may well become a favourite mode of control in immersive VR, with the increasing precision of the future generations of eyetracking devices. Companies all over the world are looking for ways to exploit this new technology and commercialise a killer app that justifies the extra cost. I wouldn’t be surprised if gaze-control were at the centre of this quest, given its potential to enhance both display and interaction.
Dr David Souto is a Lecturer in the Department of Neuroscience, Psychology and Behaviour at the University of Leicester. He received a BA / Leverhulme Small Research Grant in 2018-2020. His research into gaze control and visual perception featured in the British Academy Virtual Summer Showcase.