Sunday, December 14, 2014
Randomness and Regularity on the Road to Sensor-Based RNG Seeding
So, what does a series of random numbers look like? Here we have an
image with 10,000 random integers. The data sequence is highly
unpredictable, making it very difficult to figure out what number will
come up next in the sequence. A data sequence like this makes for a
very good set of seeds to feed to a PRNG (pseudo-random number
generator), and that unpredictability is what keeps the output secure
when it is used for data encryption.
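For readers who want to produce a comparable sequence themselves, here is a minimal Python sketch. It is an assumption about how such data could be generated, not the code behind the image above: the helper name `random_integers` and the 0 to 255 range are illustrative choices, and the values come from the operating system's entropy source through the standard `secrets` module.

```python
import secrets


def random_integers(count=10_000, upper=256):
    """Return `count` integers drawn uniformly from [0, upper).

    Hypothetical helper for illustration; `secrets.randbelow` reads from
    the operating system's cryptographic entropy source.
    """
    return [secrets.randbelow(upper) for _ in range(count)]


sequence = random_integers()
# Print the length and the observed range; the values differ on every run.
print(len(sequence), min(sequence), max(sequence))
```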
What about sensor data? For simplicity, let's assume the accelerometer is
our sensor of choice. Now, we can be pretty sure that, with the phone
sitting untouched, the accelerometer data will look like the numbers
generated by the PRNG. But, let's be honest, most people are rarely
separated from their mobile devices. To see how random accelerometer
data are during use, we left an app running in the background on a
phone while the user went about their daily activities and
interactions with the phone. Here's what a 10,000-point data sequence
looks like with user interaction.
As per some of the points in the Stack Overflow discussion we included
in our previous post, the accelerometer data are far from random. Every
tap on the screen, every movement of the phone and every change in
orientation is captured, making this a set of bad seeds. You can see
how footsteps also introduce unwanted predictability into the data
sequence, so it's a no-go for direct entry of these data into the
PRNG.
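One way to put a number on this kind of predictability is the lag-1 autocorrelation: how strongly each sample resembles the one before it. The sketch below is illustrative only. The `autocorrelation` helper is a hypothetical name, and the "footsteps" signal is a synthetic sine wave plus noise standing in for walking, not the recorded data from the phone.

```python
import math
import random


def autocorrelation(values, lag=1):
    """Sample autocorrelation of `values` at `lag` (a value in [-1, 1])."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0:
        return 0.0  # a constant sequence has no variation to correlate
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Synthetic stand-in for walking: a periodic bounce plus a little noise.
steps = [math.sin(2 * math.pi * i / 50) + rng.gauss(0, 0.1) for i in range(10_000)]
# Unstructured noise for comparison.
noise = [rng.gauss(0, 1) for _ in range(10_000)]

print("footsteps:", autocorrelation(steps))
print("noise:    ", autocorrelation(noise))
```

A value close to 1 means the next sample is largely determined by the current one, which is exactly what a seed should not be. A value close to 0 is necessary for good seeds but does not by itself prove the data are unpredictable.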
As we said in our previous post, more is not merrier. Let's look at all 3
axes of motion on the accelerometer. What becomes clear immediately is
that all three data sequences are quite strongly correlated. They go up
and down at the same time, or in other cases, when one goes up, another
goes down.
Overall, there are 2 problems to solve in making the accelerometer data workable.
Problem #1 - Traces of user behavior produce predictable patterns. In
addition, if multiple data streams from different sensors in the same
device are being used, their natural correlations have to be removed.
Problem #2 - Insufficient unpredictability in the data sequence, even once Problem #1 has been solved.
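The correlation between axes can be measured with the Pearson coefficient. The following is a rough sketch under stated assumptions: `pearson` is a hypothetical helper, and the three axes are synthetic signals that share one underlying motion, made up purely to illustrate the effect rather than taken from the phone.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(1)  # fixed seed so the illustration is repeatable

# One shared motion drives all three synthetic axes.
motion = [math.sin(2 * math.pi * i / 80) for i in range(10_000)]
x = [m + rng.gauss(0, 0.2) for m in motion]        # follows the motion
y = [0.8 * m + rng.gauss(0, 0.2) for m in motion]  # rises and falls with x
z = [-m + rng.gauss(0, 0.2) for m in motion]       # falls when x rises

print("x vs y:", pearson(x, y))
print("x vs z:", pearson(x, z))
```

Coefficients near +1 or -1 mean the second and third axes add little new unpredictability on top of the first, which is why the correlations in Problem #1 have to be removed before the streams are combined.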
Our next post on this topic will cover the solutions to these problems, where we'll include images to take you through the data transformation process.