Sunday, December 14, 2014

Adventures in Analytics: Touchless Phone Locking and Unlocking

Everything Looks Easy at First.
When we first started working on this project, our initial thought was, "How hard can it be to break down 3D accelerometer data?"  It looked pretty straightforward at first, since phones also pair the accelerometer with a gyroscope to let us know which direction the device is facing in space.  The next image is a schematic showing the three dimensions of acceleration that the phone is able to detect.  When the Z-axis values are positive, the phone's screen is facing up, away from the ground.  The same logic applies to the other axes, which together tell us the orientation of the phone with respect to the ground.  But then we encountered Problem #1: The accelerometer only knows how it is oriented in space; it tells us NOTHING about what the user's body is actually doing.
Accelerometer Axes
Accelerometer axes for the phone. Values for the Z-axis are positive when the screen is facing away from gravity.
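For illustration, here is a minimal sketch (not our actual code) of the screen-up check described above, assuming accelerometer samples expressed in units of g:

```python
# Hypothetical helper: decide whether the screen faces up from a single
# (x, y, z) accelerometer sample in g. A phone lying flat, screen up,
# reads roughly (0, 0, +1).

def screen_is_face_up(sample, threshold=0.5):
    """Return True when the Z-axis reads a clearly positive gravity
    component, i.e. the screen faces away from the ground."""
    _, _, z = sample
    return z > threshold

print(screen_is_face_up((0.02, -0.05, 0.98)))  # True (flat, screen up)
print(screen_is_face_up((0.01, 0.03, -0.97)))  # False (screen down)
```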
We then quickly ran into Problem #2: Noise.  Accelerometer data, even when sampled at 100Hz (100 samples per second), are still really fuzzy.  These were test movements of the phone that we tried to make as smooth as possible.  While the movement looks smooth to the user and to the human eye, the acceleration pattern and the rotation of the phone in space are not.  The only clearly discernible features that were easy to detect were the positive "humps" on the Z-axis, indicating that the screen is face up.
Accelerometer Read
Here's what the raw data look like from the accelerometer during these movements.
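To give a flavor of the cleanup involved, here is a minimal smoothing sketch, assuming a 100Hz trace stored as a NumPy array (a simple moving average stands in for whatever low-pass filter you prefer):

```python
import numpy as np

def moving_average(trace, window=10):
    """Low-pass a 1-D accelerometer trace with a moving average.
    At 100 Hz, window=10 averages over 0.1 s of samples."""
    kernel = np.ones(window) / window
    return np.convolve(trace, kernel, mode="same")

# Synthetic example: a slow 1 Hz "hump" buried in sensor noise.
t = np.arange(0, 2, 0.01)                              # 2 s at 100 Hz
z = np.sin(2 * np.pi * t) + np.random.normal(0, 0.3, t.size)
z_smooth = moving_average(z)
```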
Then comes Problem #3: Redundancy.  Partly related to Problem #1, combined 3D acceleration patterns do not describe unique movements.  For example (and much to our chagrin), moving the phone up toward the ear generates a 3D acceleration pattern quite similar to rotating the phone from a screen-up to a screen-down position.  This was a big problem, because any algorithmic solution needed to stay stable when the phone was placed in a pocket or bag and not turn on the screen unnecessarily.

Stonewalled...
Problem #1 cannot truly be overcome without special hardware to track the user's body movements.  Problem #2 was probably just as tricky but at least had the potential to be overcome computationally.  Taking a data-centric approach, we launched an all-out assault.  We tried practically every biomechanical and computational trick in the book: filtering, 3D Euler angles, angular analyses, differentiation, integration...  you name it, we probably tried it.  Nothing worked.  Even the most complicated algorithms, with enough if/else/or clauses to fill an Olympic-sized swimming pool, failed.  Needless to say, Problem #3 is a bigger problem that cannot be solved without solving the first two.
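As one concrete taste of those tricks, tilt angles can be estimated from a gravity-dominated sample like this (an illustrative sketch, not our production code):

```python
import math

def tilt_angles(x, y, z):
    """Estimate pitch and roll (radians) from one accelerometer sample.
    Only trustworthy when the phone is near-static; during real movement,
    linear acceleration corrupts the gravity estimate, which is part of
    why these angular analyses kept failing."""
    pitch = math.atan2(-x, math.sqrt(y * y + z * z))
    roll = math.atan2(y, z)
    return pitch, roll

print(tilt_angles(0.0, 0.0, 1.0))  # flat, screen up: (0.0, 0.0)
```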
Folly.
Our frustration levels had pretty much boiled over, and we decided to turn our focus toward other projects.  While working on behavioral biometrics problems, we came to the realization that our analytical approach was just plain wrong.  Doing what anyone would do with biometrics, we had tried to develop a template movement against which we would match subsequent movements in order to tell the screen when to turn on or off.  Bad idea, especially since the human brain and body are naturally variable: every performance of the same movement is like a snowflake, same general pattern, different details.
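For concreteness, the abandoned idea amounted to something like this toy matcher (hypothetical code, assuming equal-length, resampled traces):

```python
import numpy as np

def matches_template(movement, template, threshold=0.9):
    """Naive normalized correlation between a recorded movement and a
    stored template. Natural movement variability makes any fixed
    threshold brittle: real repetitions score all over the place."""
    m = (movement - movement.mean()) / movement.std()
    t = (template - template.mean()) / template.std()
    return float(np.dot(m, t)) / len(m) >= threshold
```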

Occam's Razor.
Simplicity was key.  Trying to capture the actual movement of the user was not the right level of analysis.  Instead, we needed to be able to capture the holistic, general characteristic of the user's movements to engage and disengage with the phone.  Reinvigorated, we went back and attacked our problems, coming up with the following solutions:
Solution #1 - Assume that the only way users engage with their phone is in the hand (duh).
Solution #2 - Find a way to understand the general characteristics of each of the actions we want the phone to respond to.  Forget the small stuff.  This meant a generalized pattern in time and space that was a loose representation of the movements in question.
Solution #3 - Create a context-conditioned algorithm.  We had been worried only about the present, that is, the movement itself.  But our biggest error was that we forgot about the past.  The algorithm had to be cognizant of where the phone had been in order to know where it was going, and which events to account for and which to ignore (see the sketch after this list).
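Putting the three solutions together, a context-conditioned check might look something like this sketch (state names and thresholds are hypothetical, not our shipped algorithm):

```python
IN_POCKET, IN_HAND = "in_pocket", "in_hand"

def next_state(state, recent_z):
    """Update the phone's context from a short window of smoothed Z-axis
    readings (in g). A transient face-up blip inside a bag is ignored;
    only sustained evidence flips the state."""
    face_up = sum(1 for z in recent_z if z > 0.5) / len(recent_z)
    if state == IN_POCKET and face_up > 0.9:
        return IN_HAND      # sustained face-up posture: user engaged
    if state == IN_HAND and face_up < 0.1:
        return IN_POCKET    # sustained face-down/occluded: disengaged
    return state            # ambiguous evidence: the past wins

def screen_should_be_on(state):
    return state == IN_HAND
```

The point of the sketch is the last line of next_state: when the evidence is ambiguous, the algorithm defers to where the phone has been rather than reacting to the movement alone.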

Demo Time.
Our next step was to construct a working algorithm and deploy it on a smartphone as a proof-of-concept demonstration.  We will cover this in our next post and provide a narrative for the YouTube videos that we have floating around.
