It turns out that a good way to generate random numbers from sensor data
is much like getting your laundry done. It's a 3-step process.

Step 1 - WASH: Clean out and remove major traces of user behavior from the data sequence.
Step 2 - RINSE: Increase unpredictability in the data sequence.
Step 3 - SPIN: The accelerometer data sequence goes into the random number generator as seeds and is spun (twisted, if you're using the Mersenne Twister) into a nice series of random numbers.

WASH

At this stage, the
main goal is to remove some of the major, slow drifts in the data
sequence. These are the most predictable parts of the sequence and
therefore our biggest source of predictability. It is important
to keep in mind that drift also occurs naturally in electronic sensors, even
if untouched, for example due to changes in ambient temperature. This makes the first step essential for eliminating the bulk of the predictability within the sequence.
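To make the idea of removing slow drift concrete, here is a minimal sketch of one generic approach: subtracting a centered moving average, which acts as a simple high-pass filter. This is a textbook stand-in and not the method used for the results in this post. The function name `remove_drift`, the window size, and the synthetic "sensor" data are all assumptions made purely for illustration.

```python
import random


def remove_drift(samples, window=51):
    """Subtract a centered moving average from each sample.

    Generic illustration only (hypothetical helper, not the post's method).
    Near the edges the window is truncated to the samples that exist.
    """
    half = window // 2
    n = len(samples)
    detrended = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        local_mean = sum(samples[lo:hi]) / (hi - lo)
        detrended.append(samples[i] - local_mean)
    return detrended


# Synthetic stand-in for sensor readings: a slow linear drift plus small jitter.
rng = random.Random(0)
raw = [0.01 * i + rng.gauss(0, 0.1) for i in range(1000)]
washed = remove_drift(raw)
```

After this step, the slow upward trend in `raw` should be largely gone from `washed`, leaving mostly the fast jitter. A wider window removes only the slowest changes, while a narrower one also strips out faster movement.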
For the sake of maintaining some of our "trade secrets," we cannot reveal
exactly how we "wash" the data to remove its major sources of
predictability. What you will see is that the effects are
clear in the image below. Once the wash cycle is completed, all that is
left are "pulses" at certain points in the data sequence. What's
important at this point is that we can test how well we have eliminated
the drift using measurements of "stationarity," which tell us whether the mean and variance of the data sequence change over time.
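One simple way to get a feel for stationarity is to split the sequence into windows and compare the mean and variance of each window. The sketch below is only an illustration under stated assumptions, not the measurement used for the results in this post: `window_stats`, `looks_stationary`, and the tolerance values are hypothetical choices, and the test data is synthetic.

```python
import random
import statistics


def window_stats(samples, n_windows=4):
    """Return a (mean, variance) pair for each equal-sized window."""
    size = len(samples) // n_windows
    stats = []
    for k in range(n_windows):
        chunk = samples[k * size:(k + 1) * size]
        stats.append((statistics.fmean(chunk), statistics.pvariance(chunk)))
    return stats


def looks_stationary(samples, n_windows=4, mean_tol=0.5, var_ratio_tol=2.0):
    """Rough heuristic with assumed tolerances, not a formal statistical test.

    The window means must stay within mean_tol overall standard deviations
    of each other, and the largest window variance must be at most
    var_ratio_tol times the smallest.
    """
    stats = window_stats(samples, n_windows)
    means = [m for m, _ in stats]
    variances = [v for _, v in stats]
    overall_sd = statistics.pstdev(samples)
    mean_ok = (max(means) - min(means)) <= mean_tol * overall_sd
    var_ok = max(variances) <= var_ratio_tol * min(variances)
    return mean_ok and var_ok


# Synthetic data: one sequence with no drift, one with a slow linear drift.
rng = random.Random(1)
flat = [rng.gauss(0, 1) for _ in range(2000)]
drifting = [0.005 * i + rng.gauss(0, 1) for i in range(2000)]

flat_ok = looks_stationary(flat)
drifting_ok = looks_stationary(drifting)
```

With this synthetic data, the sequence without drift should pass the check and the drifting one should fail, because its window means climb steadily. The thresholds are arbitrary and would need tuning for real sensor data.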
At this point, the data sequence is more unpredictable than it was
initially, but probably not unpredictable enough. Next comes the rinse
cycle. Just as with laundry, we've removed the dirt, and now the soap
needs to go too. The data now consists of pulses and "bare spots" where
there is little activity. The goal of this cycle is to transform
the data further so that it actually looks like a random sequence. To
protect against additional vulnerabilities, our method does not manipulate
the data directly. Instead, we transform the data into complex numbers
and change only the imaginary parts. And, here we go:
The final results are quite compelling. At least visually, they are comparable to
a 10,000-point data sequence generated with the
Mersenne Twister, displayed in an image from
our previous blog post. We now have an 8-bit integer sequence that can be
fed into a random number generator as seeds for encryption, password
management, etc.

What's Next?

In our upcoming post, we
will give you the hard facts. Using the NIST random number generation
and testing toolkit, we will test the quality of our results, measuring
the unpredictability against the NIST standard and other RNGs. We will
also outline the benefits and best practices of using this method. Stay
tuned!