The Avatar Mirror: Exploring the Parameters of Proprioceptive Adoption in Virtual Environments
With a title like that I’d better use some science to back things up, eh? So what’s this all about then? Short answer, it’s finding out the parameters and boundaries for how “you” a digital “you” can be when working with virtual reality. The rest of this article is the long answer with the sciency stuff.
Background
Proprioception is the psychology term for the sense of self and habitation of the body, how one “owns” and identifies oneself in a physical way. The brain is a wonderfully adaptive organ, pulling off truly astounding feats in assembling an apparently contiguous reality from its disparate sensory inputs, but it also takes some significant shortcuts in doing so (mostly in the name of efficiency for speed and energy management, since the brain is also the most metabolically expensive matter in the body). This is why optical illusions work – they exploit some of those shortcuts by interfering with the assumptions on which they’re based, inserting errors or otherwise tripping them up. Con men and magicians have been doing this forever, not just to the visual system but to the whole collection, and eventually the sciences became formally involved in their own right: specialized regions of the brain were studied, frequently through their malfunction or maladaptation, by observing how disease or injury (naturally occurring, though there may have been a mad scientist or two along the way; lobotomies came from somewhere…) correlated with behavior. Fortunately, more modern techniques allow for neurological exploration without having to crack the skull open all the time.
Anyway, proprioception has its own faults and shortcuts; it’s very good, but it’s possible to step into the middle of the process and interfere with its assumptions and preconditions. I’ll cover three quick examples that demonstrate how this works in principle (and otherwise generally refer you to the awesome work of Dr. Vilayanur Ramachandran, the Neil deGrasse Tyson of neuroscience, if you want to know more). First up is the “rubber hand” experiment, so called because it actually involves a rubber hand (imaginative, no?). The dominant or first-order proprioceptive cues come from a combination of tactile and visual stimuli: feeling something, and seeing where you’re feeling it at the same time (that synchronization is important), combine to create the perceived effect of inhabiting a limb. The rubber hand demonstration interferes with the visual side of that equation: the setup consists of a subject placing their hand out of sight behind a curtain or partition, with the fake hand placed in view in a similar attitude and arrangement. The experimenter then provides synchronous stimuli to both the real (unseen) and fake (in plain sight) hands: using a stylus or paintbrush, they touch both hands in the same place and way (e.g., poking or stroking the same fingers or points on the back of the hand with identical pressure, direction, etc.) at exactly the same time. Since the fake hand is where the subject sees the stimulus taking place, the brain puts that together with what they’re feeling to say “that’s where I’m feeling it from,” essentially adopting the rubber hand as their own in place of their physical hand. This works at the subconscious level: once acclimated, people will gesture toward the artificial hand when making references, or reach over to scratch it if they have an itch; anecdotes aren’t especially effective data, though (unless you have a persistent thread of commonality and a rating scale across multiple subjects to create some statistical significance). The really telling part is what happens when the fake hand is threatened: even consciously knowing that the fake hand is fake (it can be bright pink, for all that matters) and definitely not attached to your body, a threat presented to it will cause a spike in skin conductivity (as measured with galvanic skin sensors – people sweat almost constantly, and the degree to which sweating increases under stress can be measured through its effect on conductivity, or “galvanism”). Showing a knife does a decent job; smashing the hand with a hammer can also induce protective reflexes (jerking away) and a strong startle response. This is a really fun one to do to your friends.
That’s a good introductory demonstration of the proprioceptive effect and illusion: it’s usually an illusion that corresponds to reality, but an illusion nonetheless, since it takes place in our heads in response to our environment. The second demonstration turns up the intensity and induces “out of body” experiences – or, more accurately, “disembodied” ones. In this case, the seated subject wears a head-mounted display showing the output of a camera (or, better yet, multiple cameras with a stereographic offset for a 3D effect). The camera or cameras are placed some distance behind the subject and pointed directly at them, and the experimenters provide synchronous stimulation: lightly tapping or poking the subject’s torso, whilst making a poking motion toward where a torso would be if the camera corresponded to the subject’s eyes. The subject feels the prod from the front but sees the motion from the camera, and since the timing and the sensation agree, once again the brain’s sensory-fusion approach adopts the camera’s perspective as “where I am.” Doing this with their physical body in view of the camera causes a certain amount of cognitive dissonance, which resolves in favor of the visual stimulus from the camera: it feels weird, but the camera’s location becomes the psychological location of the subject. That dissonance can also be amplified to interesting effect by invalidating the preconception about the shape and nature of the psychological body the subject instinctively assumes is sitting below their perspective (“if this is where I am, then this is where my body is”): after achieving acclimation, take the trusty hammer (how cool is psychology, that they get to mess with people’s heads using hammers?) and, in full view of the camera/proprioceptive vantage point, swing it through that space. An anticipated stimulus (expectation and stress again measured galvanically) is not experienced, and suddenly the body concept is at odds with the sensory information – at this point the subject can become ungrounded, essentially having an out-of-body experience.
My favorite work so far along these lines comes from Sweden in 2008 (mentioned in my last VR article), an experiment I’d wanted to try since hearing about the first one and the plans for the second: body swapping, again using head-mounted displays and cameras, but this time attaching the cameras either to mannequins or to another person. In this scenario, the experimenters are able to inspire proprioceptive adoption not only of the camera’s viewpoint, but also of the body to which it belongs, be it a dummy or a living person (even one of a different body shape or gender). So long as the tactile and visual stimuli match up both topographically (happening to the same area of the respective bodies) and temporally (at the same time), the cues are accepted and the new perspective becomes that of the psychological “I”. Movement of the subject and the surrogate must be carefully choreographed to preserve the illusion, so a subject reaching out a hand should correspond well with the camera-wearing experimenter reaching out his or her hand with a similar attitude and timing. There’s obviously likely to be some manner of gap here, but the body concept is surprisingly malleable as long as things match up within a reasonable margin of error: the brain maintains a remembered map of the body, which can be deeply ingrained, but there’s always a certain amount of dynamic assessment (necessary because muscle movement is not absolute and requires negative-feedback looping – particularly from visual input – to ensure proper limb placement; variability comes primarily from metabolic load, available fuel sources, fatigue, injury, etc.). The effects here are pretty amazing: subjects were able to shake hands with themselves while maintaining the illusion.
A note here about mental health: this isn’t all just for fun and curiosity (though, to the sciences, “curiosity” is reason enough, and I happen to agree – it’s always good to probe the boundaries and know more rather than less; even utility and profit, if those are your objectives, need knowledge as a raw material). Many of the discoveries in psychology, in this area and others, come from studying brain injury and maladaptation that dramatically affect sufferers’ quality of life. Body dysmorphic disorder, body integrity identity disorder, dissociative identity disorders, phantom limb syndrome, even eating disorders, are related to or influenced by how mind and body do or do not agree. Increased understanding of the underlying mechanics means better treatment, possible preventative care, and a positive total contribution to quality of life (and this is just on the proprioceptive side, which is decidedly niche – there’s a lot more on the general behavioral side that similar scientific undertakings can and will be helping). Good ethical guidelines are also very necessary, because some of what’s done here could cause harm: trying on different body types, shapes, or genders may exacerbate underlying conditions or introduce identity-dissociation effects.
The Premise
Moving into the virtual world, several research techniques are available now which weren’t before. Simulation of environment and avatar through augmented reality or complete computer generation provides a great deal of freedom, and the ability to extract measurements from the process means a much richer set of data for assessing efficacy (also known as “finding out how well it works”, when I’m feeling less sesquipedalian). For example, instead of attaching physical cameras to a physical embodiment (biological or artificial), they can be placed literally anywhere within a virtual environment. Instead of requiring choreography and synchronization between disparate participants, information from the subject’s own movements can be used to control an avatar with higher precision and lower latency, and the corresponding behavior measured using high-resolution processing on consumer-grade hardware (sub-millimeter accuracy at 120 samples a second for all fingers on both hands in the operating space – that’s just amazing, and is even on the verge of being out of date).
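To give a flavor of what that looks like in practice, here’s a minimal sketch of the kind of capture loop I have in mind: poll the tracker, push hand poses onto an avatar, and log per-sample timestamps so the effective sample rate (and any dropped frames) can be checked against the nominal spec. The `tracker` and `avatar` objects are hypothetical stand-ins, not the actual Leap Motion or Rift SDKs.

```python
import time

def run_capture(tracker, avatar, log, duration_s=10.0):
    """Poll a hand tracker, mirror the data onto an avatar, and log timing.

    `tracker` and `avatar` are hypothetical stand-ins for whatever SDK is
    actually used (e.g., the Leap Motion service plus a rendering layer).
    """
    start = time.monotonic()
    last = start
    while time.monotonic() - start < duration_s:
        frame = tracker.poll()              # assumed: returns the latest hand frame
        now = time.monotonic()
        for hand in frame.hands:            # assumed: list of tracked hands
            avatar.set_hand_pose(hand.side, hand.joint_positions)
        # Per-sample interval lets us verify the effective sample rate
        # (the nominal 120 Hz figure) and spot dropped frames.
        log.append({"t": now, "dt": now - last, "n_hands": len(frame.hands)})
        last = now

def effective_rate(log):
    """Average samples per second over the logged capture."""
    if len(log) < 2:
        return 0.0
    span = log[-1]["t"] - log[0]["t"]
    return (len(log) - 1) / span if span > 0 else 0.0
```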
Using the newly available technology (Oculus Rift DK2 and Leap Motion), I plan on creating a foundation for engaging with and assessing proprioceptive cues and influences: to see what works effectively, where the boundaries are, and which principles other developers or researchers should follow (both to strive toward and to specifically avoid). Questions will be along the lines of: how synchronous do the stimuli need to be in time and/or space to be effective, and with what kind of distribution across different subjects and ages? What resolution is required for adoption – earlier experiments established that a humanoid form was required (getting the subject to empathize with a chair was a non-starter, for instance), but how “humanoid” does it have to be, and how well defined? For example, can a person adopt a blocky identity, a skeletal one, a wireframe, and so on? How isomorphic does it have to be? If we simulate three fingers on each hand, does that affect how someone moves, or how they reconcile themselves to their environment? Does the use of a virtual mirror or mirrors enhance the effect or diminish it, and what happens if they don’t agree? (A common effect in suspense or horror films, and probably for good reason.) What about other environmental cues? (There’s a lot of discussion amongst VR developers about “presence”, which is as much about the adoption of the environment as it is about the avatar; clearly identifying those parameters would also be extremely helpful.)
What happens if a consistent error rate is introduced? This is already under research in locomotion as “redirected walking”, allowing people to feel as though they’re walking around an endless space while really confined to 25 m². Can you maintain the illusion if you constrain movement to certain ranges or speeds? (Which would be helpful for controlling robotic or telepresence equipment not capable of recreating the full speed or range of human movement.) Can you train surgeons, engineers, or musicians to use finer motor control by acclimating them to a simulation which grossly exaggerates their movements while performing various tasks? (This ends up happening in these and other professions anyway, where a whole world of movement and perception can happen in the space of a fingertip; enhancing the effect or speeding up the training would be a win.)
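Here’s a rough sketch of that movement-exaggeration idea, assuming tracked positions arrive as simple 3-vectors in some tracker space; the anchor point and gain value are illustrative placeholders, not tuned figures.

```python
import numpy as np

def remap_motion(raw_pos, anchor, gain=3.0):
    """Exaggerate (gain > 1) or damp (gain < 1) tracked motion around an anchor.

    raw_pos and anchor are 3-vectors in tracker space (e.g., a fingertip and a
    task-specific reference point). With gain = 3.0, a 1 mm physical motion is
    rendered as 3 mm of virtual motion, which is the kind of exaggeration the
    fine-motor-training idea above would rely on.
    """
    raw_pos = np.asarray(raw_pos, dtype=float)
    anchor = np.asarray(anchor, dtype=float)
    return anchor + gain * (raw_pos - anchor)

# Example: a 0.5 mm fingertip movement rendered as 1.5 mm in the scene.
virtual_tip = remap_motion(raw_pos=[10.5, 200.0, 50.0],
                           anchor=[10.0, 200.0, 50.0],
                           gain=3.0)
```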
For people with abnormal (which really just means “non-standard”, or outside various N-tiles or sigmas of the bell curve) proprioceptive processing, or who are trying to adapt to changes in brain or body such as stroke patients, amputees, or other physical therapy participants, what can we learn about their abilities, limitations, and perceptions by measuring their performance against expected norms for a task? More importantly, what can we learn about the effectiveness of treatments (traditional, or newly devised using this or other research) in restoring function and quality of life?
That’s the gist, and the general what-ifs; what I propose to create to measure all of this will start with a simple set of environments and experiments. I’ll also need to come up with a reliable way of measuring the degree of adoption; skin conductivity works well (or at least is a known quantity) under stressful conditions, but what about surprise or confusion? What happens if someone is provided with unexpected stimuli, like feeling something they don’t expect (a touch, or encountering an obstacle during movement), or not encountering something they do expect based on other cues? If we establish a good baseline for these conditions, then we can also learn from a lack of response when interacting with an adopted environment. Coming up with solid measurement will be key.
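For the measurement side, something like the following could serve as a crude first pass at flagging skin-conductance responses so they can be time-locked to stimulus events in the experiment log; the threshold and window here are guesses, and a real analysis would use proper tonic/phasic decomposition.

```python
import numpy as np

def detect_scrs(signal_us, fs_hz, min_rise_us=0.05, window_s=4.0):
    """Flag candidate skin-conductance responses in a sampled GSR trace.

    A very crude sketch: a sample is flagged when conductance rises more than
    `min_rise_us` microsiemens above the minimum of the preceding window.
    Only meant to show how a "did they react, and when?" measure could be
    aligned with logged stimulus times.
    """
    signal_us = np.asarray(signal_us, dtype=float)
    win = max(1, int(window_s * fs_hz))
    events = []
    i = win
    while i < len(signal_us):
        baseline = signal_us[i - win:i].min()
        if signal_us[i] - baseline > min_rise_us:
            events.append(i / fs_hz)   # time of the candidate response, in seconds
            i += win                   # refractory skip so one response isn't counted repeatedly
        else:
            i += 1
    return events
```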
Specific experiments
The general simulation environment for these experiments will focus on being easy to reproduce (both computationally, requiring inexpensive computer resources, and physically where analogs are employed), both to reduce variables and increase the accessibility to other researchers. Subjects will be seated and primarily engaging in activities in the space immediately in front of them, requiring minimal torsion of either the head and neck or torso (e.g., they don’t have to turn very far in any direction, or for very long), and with appropriate consideration given to the metabolic load of the activity – limiting prolonged static positions, or hand and arm engagement that causes strain (there are good guidelines for this from existing work, and from recent commercial applications in gaming such as the Wii and the Kinect). Engagement with the physical environment will need to be simplified as well: walls, surfaces, and components (objects to be moved, toggled, or otherwise interacted with) should be easily constructed, positioned, measured, or simulated; for example, keeping track of cylinders is easier than cubes as rotation about the vertical axis is immaterial. Caveats still apply – people will have different reactions to the virtual reality apparatus itself, which if possible should be measured and recorded independently (or interdependently if there’s a good way of separating out the influences from the measurements) in order to eliminate or account for those variables.
Within that environment, then:
- Tactile calibration for adoption – exploration of techniques for mapping tactile stimulus to visible avatars, such as using the index finger of either hand to touch targets on the opposite hand, wrist, and arm, while observed by the subject. Tests will measure degree of adoption from different types of stimulus (either different targets or different instigators, if not the subject), the influence of timing and synchronicity (a rough sketch of how timing offsets could be scripted appears after this list), and avatar model resolution.
- Mirrors and environmental cues – exploration of how perception of elements in the environment influences adoption, in the form of lighting (material properties of the avatar, its shadow, etc.), reflection (standing water, windows, mirrors), audio (sound source identification and occlusion), and so on.
- Isomorphic boundary conditions – measuring persistence of adoption (given results of prior experiments and starting from a variety of adoptive states) and utility of engagement when deviating from the baseline tracking: exaggerating or reducing effects of movement, changes in scale and potency of tactile stimuli, modifications to avatar structure and attributes, etc.
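As a concrete example of the timing-and-synchronicity piece of the first experiment, here’s a small sketch of how visual delays might be assigned per trial. The offset values are placeholders, and the hookup between the plan and the actual rendering is left to the environment code.

```python
import random

def schedule_visual_offsets(trials, offsets_ms=(0, 50, 100, 200, 400)):
    """Assign each tactile-stimulus trial a visual delay from a fixed set.

    The idea: the physical touch happens at its natural time, and the visible
    avatar shows the corresponding touch `visual_delay_ms` later. Sweeping the
    offset across trials (randomized so subjects can't anticipate it) gives a
    curve of adoption strength versus asynchrony. `trials` is just a list of
    trial IDs.
    """
    plan = []
    for trial_id in trials:
        plan.append({"trial": trial_id, "visual_delay_ms": random.choice(offsets_ms)})
    return plan

# Example: a 30-trial session with delays drawn from the set above.
session_plan = schedule_visual_offsets(range(30))
```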
That’s pretty much it – nothing huge, but hopefully foundational. The first step is creating the tools and software for the basic environment, data collection, and assessing results (measuring engagement – I don’t have a biometric recording array, or even a skin conductivity sensor hooked up yet). Each step will build on the last, measuring as I go in order to compound lessons learned over the course of development. This may change the direction of the experiment(s), or invalidate some of the assumptions in either the design or the expected utility of the available tools (it could be that the Leap Motion resolution is insufficient, and/or needs predictive filtering added to overcome lag effects). Finding such limitations is one of the necessary risks of the undertaking, and still represents lessons learned which can then be used by others, or which can define the parameters necessary to try again in the future (using the next Rift, Leap, better rendering hardware, etc.).
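On the predictive-filtering point, here’s the simplest version I’d start with: linear extrapolation from the last two tracker samples to cover render-pipeline lag. The 30 ms lead time is a guess, not a measured figure, and something like a Kalman or One Euro filter would replace this if it proves too jittery.

```python
def predict_position(prev_pos, prev_t, curr_pos, curr_t, lead_s=0.030):
    """Linearly extrapolate a tracked point forward by `lead_s` seconds.

    Estimates velocity from the last two samples and projects ahead by roughly
    the render-pipeline latency, so the rendered hand lags the physical hand a
    little less. Positions are plain 3-element sequences; times are in seconds.
    """
    dt = curr_t - prev_t
    if dt <= 0:
        return curr_pos
    return [c + (c - p) / dt * lead_s for p, c in zip(prev_pos, curr_pos)]
```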
Caveat
A huge grain of salt here: I’m not a professional psychologist. I have some education in the field, and I’ve read a lot of material on both the general principles and this specific area, but this is still amateur work. It’s the psychology equivalent of someone watching all the seasons of Doctor Who and then thinking they can successfully write (and even produce) an episode themselves; there’s a huge difference between consuming and producing in any discipline, and though I try to talk a good game, I know I’m beyond my lettered expertise. My hope is that this work will be useful to other developers at least anecdotally, or can be adopted by students or professionals as an effective tool in their efforts (if anybody’s interested, drop me a line – I’d love to lend a hand).
Comments
As a game developer, I’m super interested in what you find with these experiments. A great game experience is one in which the user is engaged. I can’t wait to see what’s worth spending my time on and what won’t have much of an effect.
Very interesting post, Paul! I recently wrote my BA thesis in Media and Communication Science about pretty much the same stuff you write about here (unfortunately only available in German so far). I tested a well-balanced group of 20 people using the DK1 and two demos to determine whether the subjects’ sense of presence is affected by suboptimal control interfaces, and by direct interaction versus passive reception. If you’re curious or interested in the results, just send me an email. 🙂
I am afraid that you have made a small mistake right at the beginning of your article.
Proprioception is not the “sense of self and habitation of the body (…)”, at least not in the nomenclature of psychology and cognitive science. The term describes one of our senses, which gathers data about the position and rotation of your body, especially your joints. It is more abstract and sophisticated than, for example, the sense of taste, but its nature is not much different. The “sense of self” is constructed upon all of our senses, but it’s definitely not one of them. The phenomenon you are talking about is related to such things as consciousness and self-awareness; however, I do not know what to call it properly.
Since you are looking for a conscientious person, I thought you should know.
cheers, Dominik.
I abused the nomenclature a little bit, but the title is more nuanced: what I’m really talking about is proprioceptive adoption, in that the data coming from the virtual representation has been grafted into the process of proprioception itself – the subject references it as a means of determining “where am I” and even “what am I”. You are correct with regard to the strict definition and use in a clinical setting, where it would refer only to the sensory-integration process for reconciling relative limb and joint position (and, in that case, to the specific exclusion of visual stimuli, so it’s moot anyway). It’s a pretty fine hair to split, though.
Nice article, do you have any thoughts on VR combined with BCI as the primary input?
BCI and VR are a perfect combination, especially for rapid iteration and using simulations as proxies for other activities – which is to say, it’s easier to navigate a virtual space than a physical one, and to translate the results of that navigation and interaction into physical counterparts (in the case of mobility assistance, for example). However, most interfaces currently use statistical and analytical smoothing – they require a lot of training and don’t have an exceptionally detailed level of granularity. That translates into a limited number of discernible inputs and a fair amount of lag.
Increased resolution requires more involved scanning, either fMRI or direct wiring (invasive). There are good advances coming though, to be sure.