Necessary randomness. We rarely (if ever) pay much
attention to the fact that all of our computing devices need to be able
to produce random number sequences. In reality, random numbers are
absolutely essential for so many functions. Perhaps the place where random
numbers affect us most directly is in games with an element of chance.
Without these random numbers, a game of Yahtzee on your mobile phone
would be really boring: every roll of the dice would come out the
same. Computer-generated random numbers are also needed to
allow statistical predictions to be made and forecasting to be done.
Here, random numbers are needed to introduce a level of uncertainty into
the data, allowing many simulations of a single condition, testing how
probable a given outcome might be. We see this process in weather
forecasting, where there is an X% chance of rain. Furthest from sight
and mind is the role of random numbers in secure communication and the
encryption of data. What encryption protocols do is “scramble the data”:
they prevent unwanted parties from gaining access by hiding
the actual information using random numbers. Unfortunately, like the
God in the Albert Einstein quote, "As I have said so many times, God doesn't play dice with the world,"
computers and computing devices are also unable to play dice. It's not
for the lack of arms and hands; producing random numbers is just a
really difficult process. To generate random number sequences,
computers rely on algorithms known as “pseudo-random number generators,”
or PRNGs. This recent article published by SC Magazine UK (an
outlet for IT security professionals) really sheds light on this
problem and how password managers are affected by the lack of good
quality seeds for the PRNG.
Not stork-delivered. Just like in biology, you need seeds to
produce random numbers with a PRNG. The process works by taking one or
more numbers as starting values and feeding them into a PRNG, which
then "twists" them into a
(pseudo-)random sequence. The Mersenne Twister (see Wikipedia entry
here: http://en.wikipedia.org/wiki/Mersenne_twister) is probably the
best-known example of this procedure. Unlike biology, seeds that grow into a
random number "tree" cannot be used to create new seeds and grow new
trees, for one major reason: if you use the same seed(s) and the
same PRNG algorithm, you will always get the same number sequence from the
PRNG. Now, this presents us with a big problem, especially on the
encryption front. If someone knows the seed and the PRNG algorithm,
they can easily decrypt the data, since all of the random numbers used to
scramble the data are known (see this link).
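To make the point concrete, here is a minimal sketch in Python, whose built-in `random` module happens to use the Mersenne Twister. The seed value, the message, and the helper names `keystream` and `scramble` are made up for illustration, and the XOR "scrambling" is a toy stand-in, not a real encryption protocol (Python's own documentation warns against using `random` for security purposes).

```python
import random

def keystream(seed: int, length: int) -> bytes:
    """Return `length` pseudo-random bytes from Python's Mersenne Twister."""
    rng = random.Random(seed)  # same seed -> same internal state
    return bytes(rng.randrange(256) for _ in range(length))

def scramble(data: bytes, seed: int) -> bytes:
    """XOR each byte with the keystream; applying it twice restores the data."""
    stream = keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))

secret = b"meet me at noon"
seed = 2014  # hypothetical seed, chosen only for this illustration

scrambled = scramble(secret, seed)

# Anyone who knows the seed and the algorithm regenerates the same stream
# and can therefore undo the scrambling.
recovered = scramble(scrambled, seed)
same_stream = keystream(seed, 8) == keystream(seed, 8)

print(recovered == secret, same_stream)
```

Nothing in the sketch is secret except the seed itself, which is why where the seed comes from matters so much.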
Sourcing seeds.
What becomes critical to security is that the seed itself comes from a
fairly unpredictable source. This is also not a trivial process. Just
take a quick look at this discussion on Stack Overflow.
Some might hesitate before using the CPU clock to seed a PRNG for data
encryption, since all anyone would need to know is when you actually
performed the encryption process in order to regenerate the random
numbers and gain access to your data. One possible alternative would be to
leverage the sensors that are embedded in our computing systems. For
example, the accelerometer is now being proposed as a possible source of PRNG seeds.
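The worry about clock-based seeds can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the timestamp, the message, the one-hour search window, the helper names, and the idea that the attacker knows the first few bytes of the original data. The toy XOR scrambling again stands in for a real cipher.

```python
import random

def keystream(seed: int, length: int) -> bytes:
    """Pseudo-random bytes from Python's Mersenne Twister for a given seed."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))

def scramble(data: bytes, seed: int) -> bytes:
    """Toy XOR scrambling; running it again with the same seed undoes it."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))

def guess_seed(ciphertext: bytes, known_prefix: bytes, earliest: int, latest: int):
    """Try every whole-second timestamp in [earliest, latest] as the seed."""
    head = ciphertext[:len(known_prefix)]
    for candidate in range(earliest, latest + 1):
        if scramble(head, candidate) == known_prefix:
            return candidate
    return None

# Hypothetical Unix timestamp (in seconds) used as the seed.
encrypted_at = 1_400_000_000
ciphertext = scramble(b"PASSWORD=hunter2", encrypted_at)

# The attacker only knows roughly when the encryption happened (+/- 1 hour)
# and how the plaintext begins.
found = guess_seed(ciphertext, b"PASSWORD=", encrypted_at - 3600, encrypted_at + 3600)
print(found == encrypted_at)
print(scramble(ciphertext, found))
```

A window of two hours is only 7,201 candidate seeds, so the search finishes almost instantly. A finer-grained clock would enlarge the search space, but the seed would still be tied to something an outsider can estimate.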
Sensor overload. Looking
at the Stack Overflow discussion, there is a problem with using sensors
for seeding purposes. Yes, if left idle, the accelerometer would
generate a pretty random sequence of numbers due to the electrical
background noise that exists in any electronic sensor. But sensors
that are embedded in our computing devices are designed to track and
obey the commands of a user, not to just sit idle. All a hacker would
then need to do is move the sensor through a known sequence of events
and the seeds are known. This could be anything from moving the device a
certain way to changing the temperature. In addition, values generated
by sensors in a device are available to the operating system, accessible
through software. And, as the discussants on Stack Overflow rightly
point out, human behavior is really quite predictable and systematic,
as even some of our own work has shown.
More is not merrier. There's
a scramble right now to jam in and integrate as many sensors as
possible in the device. On the surface, it would seem like having more
data streams would provide more random numbers for seeds. Not really:
because all of the sensors are housed together on the same device, the
data streams will be correlated to a good extent. This means that if we
wanted a designated sensor for this purpose, it would have to be housed
outside the device and transmit data across some connection, making
that a bad solution. To make the sensor solution workable, correlations
between data streams must be broken before the data sequences can be
used to seed a PRNG.
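As a rough illustration of what "correlated" means here, the Python sketch below builds two synthetic sensor streams that share the same device motion, plus a third stream that does not, and compares their Pearson correlation. The data is simulated, not recorded from real hardware; the stream names, the noise levels, and the choice of a gyroscope as the second sensor are assumptions made only for the example.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1000

# Synthetic "device motion" felt by every sensor housed in the same device.
motion = [rng.gauss(0, 1) for _ in range(n)]

# Each on-device sensor sees the shared motion plus its own electrical noise.
accelerometer = [m + rng.gauss(0, 0.3) for m in motion]
gyroscope = [m + rng.gauss(0, 0.3) for m in motion]

# A hypothetical sensor that does not share the device's motion.
isolated = [rng.gauss(0, 1) for _ in range(n)]

same_device = pearson(accelerometer, gyroscope)
separate = pearson(accelerometer, isolated)

print(f"same device: {same_device:.2f}, separate: {separate:.2f}")
```

In this toy setup the two on-device streams track each other closely, so the second one adds far less fresh unpredictability than its raw data volume suggests.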
How can these problems be solved? Our next post will present solutions to these problems and outline a method of obtaining seeds from sensors.