Copyright © 2005 jsd
Turbid User’s Guide
John S. Denker

ABSTRACT: We discuss how to configure and use turbid, which is a Hardware Random Number Generator (HRNG), also called a True Random Generator (TRNG). It is suitable for a wide range of applications, from the simplest benign applications to the most demanding high-stakes adversarial applications, including cryptography and gaming. It relies on a combination of physical process and cryptological algorithms, rather than either of those separately. It harvests randomness from physical processes, and uses that randomness efficiently. The hash saturation principle is used to distill the data, so that the output is virtually 100% random for all practical purposes. This is calculated based on physical properties of the inputs, not merely estimated by looking at the statistics of the outputs. In contrast to a Pseudo-Random Generator, it has no internal state to worry about. In particular, we describe a low-cost high-performance implementation, using the computer’s audio I/O system.

Other keywords: TRNG, HRNG, hardware random number generator, uniform hash, unbiased coin flip.

*   Contents

1  Goals and Non-Goals

1.1  Basics

The basic objective is to provide a copious supply of randomly-generated numbers, with quality and quantity suitable for a wide range of applications. Some specific applications are discussed in reference 1. Quality requires, among other things, defending against various possible attacks. Some well-known attacks are discussed in reference 1. Some background and theoretical ideas are outlined in section 6 and developed more fully in reference 1.

Right now, in this section, we give a broad outline of what is possible and what is not possible.

So far, we have analyzed the HRNG mainly in terms of the density of adamance, denoted by ρ. We choose a criterion ρmin very close to 100%, and then build a generator that produces ρ > ρmin. We are not pretending to achieve ρ = 100% exactly.

1.2  Tradeoffs

Here are some of the criteria on which a random generator can be judged:

  1. Efficient use of CPU cycles.
  2. Efficient use of the randomness coming from the raw inputs.
  3. Ability to meet short-term peak demand.
  4. Resistance to outside attack, i.e. cryptanalytic attack on the outputs.
  5. Rapid recovery from compromise, including inside attack, i.e. capture of the internal state.

These criteria conflict in complex ways:

These conflicts and tradeoffs leave us with more questions than answers. There are as-yet unanswered questions about the threat model, the cost of CPU cycles, the cost of obtaining raw randomness from physical processes, et cetera. The answers will vary wildly from machine to machine.

2  Emergency Operation

2.1  Big Hurry

If you are stuck somewhere without turbid or other fancy tools, and you need some hard randomness right away, you might try the following:

Open-circuit the mic input on the computer. On a desktop this is trivial. On a laptop with a built-in microphone, this means finding a piece of wood or plastic with a diameter between 3 and 3.5 mm and sticking it into the mic jack. Use alsamixer to turn all the gains all the way up. Then use the command:

        :; arecord --disable-softvol -f S32_LE -r 44100 -d 3 noise.wav
        :; aplay noise.wav
        :; cat noise.wav > /dev/random

Play back the file, and listen to it using speakers or headphones. If it doesn’t sound like white noise, try again until it does. Then stuff the file into /dev/random. You can use the .wav as-is; there is no need to reformat the data.

If the machine doesn’t have audio circuits at all, for a few dollars you can buy a USB audio dongle and use that.

If you have a decent A/D converter, it should have a bandwidth on the order of 18kHz or more, and should provide several bits of randomness per sample ... provided the mic input really is open-circuited and the gains are turned all the way up. If we degrade that to 1 kHz and 1 bit per sample, the noise.wav file should still contain 3000 bits of hard randomness, which should be plenty enough for initializing a cryptologically-strong PRNG.

Then use the PRNG to generate as many random samples as you like.

2.2  Closer Look

If you are in not such a big hurry, you should verify the randomness of the data. This can be done, approximately, using only relatively prosaic tools, as folllows:

        :; arecord -f S32_LE -r 44100 -t raw --disable-softvol -d 3 noise.raw
        :; aplay   -f S32_LE -r 44100 -t raw noise.raw
        :; <noise.raw od -t d4 -j400 -v -w4 -Anone -N8192 > noise.csv
        :; gnumeric noise-graphs.gnumeric
        :; gnumeric noise.csv   # copy-and-paste into the other worksheet

Then you can use a spreadsheet app to look at the data. First look at it as a function of time, as in figure 1.

Figure 1: Noise Voltage as a Function of Time

Then take the Fourier transform and look at it as a function of frequency, as in figure 2.

Figure 2: Noise Power as a Function of Frequency

The spreadsheet used to produce these figures is cited in reference 2.

It is worth doing a certain amount of checking, because it’s easy to get fooled. The red signal (in figure 5) has at most 1 bit of randomness per sample. As you can see, it’s a two-state system. Problems like this, with too little randomness per sample, can arise naturally when the audio system digitizes the signal too coarsely.

Figure 3: Noise Voltage as a Function of Time
Figure 4: Noise Power as a Function of Frequency

Things are even worse in figure 5. It is a low-pass filtered version version of the previous signal. The samples are now correlated with one another, to a significant degree. The red signal has an effective bandwidth of only about 6000 Hz, even though the sample rate is the same as the blue signal, namely 44100 Hz. So the red signal has much less total randomness.

You can’t determine any of this just by listening, because all three signals sound more-or-less the same.

Figure 5: Coin Toss Voltage as a Function of Time

The Fourier transform will tell you the bandwidth, which is important, but it won’t tell you much about the amount of randomness per sample, which is also important.

Figure 6: Coin Toss Power as a Function of Frequency

3  Installing the Ingredients

  1. Install ALSA (reference 3), including the drivers, utilities, and development libraries (libasound2-dev). The current turbid code is compiled against version 1.0.25 of the libraries. Slightly older versions should be satisfactory; much older versions are not. You might want to apply the patch in turbid/src/excl.patch, so that the mixer device can be opened in “exclusive” mode. To do that, cd to the directory in which you unpacked the turbid and alsa packages, and then incant something like patch -p1 < turbid/src/excl.patch. Then configure, make, and install the ALSA stuff (drivers, libraries, utilities). If you have never run ALSA on your system, you will need to run snddevices once. It may take some fussing to get the proper entries to describe your soundcard in /etc/modules.conf. Install the utils/alsasound script in /etc/rc.d/init.d/.
  2. Test the sound system. Verify that you can play a generic .wav file using aplay.
  3. Compile turbid. This should require nothing more than untarring the distribution and typing make in the turbid/src directory.
  4. If you want to run turbid as non-root, make dirs to create the needed directories.
  5. Test it and calibrate it as discussed in the next section. You don’t need to be root in order to compile and test turbid.
  6. Send me <jsd@av8n.com> the results of the configuration, including alsamixer’s .ctl file, and turbid’s -Q -R -B and -K settings. I will add it to the distribution, so that others may use that brand of cards without having fuss with calibration.
  7. You can (if you want) become root and do a make install.
  8. If you want turbid to be started and stopped automatically according to the system runlevel, you need to install a few things by hand. There is a file turbid/src/init.d/turbid which you can use as a model for something to put in to /etc/rc.d/init.d/ (but don’t forget to edit the options, according to the calibration, as described below). This is not installed automatically; you have to do it by hand. Similarly you need to install by hand the symlinks in /etc/rc.d/rc?.d/.

If turbid runs as non-root, it will open its output FIFOs in your $HOME/dev/ directory, look for its control files in your $HOME/etc/turbid/ directory, and scribble a .pid file in your $HOME/var/run/ directory.

If turbid runs as root, it will use the system /dev/ and /etc/turbid/ /var/run/ directories.

4  Configuration and Calibration

It is remarkably easy to mis-configure a soundcard in such a way that it produces much less randomness than it otherwise would. This section outlines the general procedure; see appendix B for details on specific soundcards.

Terminology: we use the term soundcard loosely, to apply not just to individual cards per se, but also to audio I/O systems embedded on the mainboard, and to USB dongles, external pods, et cetera.

4.1  Configuring Audio Output

Cables: You will need at least 3 audio cables. Color-code the cables. That is, wrap a bit if yellow tape near both ends of one cable, and red tape on another, et cetera. Otherwise you’ll drive yourself crazy.

Note that at any time, you can run turbid show args to see what it thinks it is doing. For example, the card line tells you what ALSA device it wants to use.

Turbid automatically takes commands from the configuration file turbid.tbd if it exists in the working directory. Other configuration files can be invoked using the “@” sign, as in turbid @foo.tbd for example.

Configure turbid to use the proper card. See section 7.3 for guidelines on choosing a suitable card. If you’re lucky, the default ALSA device (hw:0,0) corresponds to the proper card. Hook up headphones to the output port, then fire up turbid sine 1 and see what happens. You may need to adjust the volume using alsamixer. There may also be a physical mute button, volume controls, etc. on your keyboard.

If that doesn’t work, some investigation is required. The command aplay -L (capital letter L) and/or aplay -l (lowercase letter l) to see what audio devices are available. If there are none, or only trivial devices such as “null”, you might try bestowing membership in the “audio” group to the user who will be running turbid. If the card you want to use is not visible, check that the hardware is installed (lspci -v -v or lsusb -v -v). Then check that the appropriate drivers are loaded.

Once you have identified a suitable card, if it’s not the default, you will have to specify card ... on the turbid command line (or in some .tbd configuration file) from now on.

Choose an output channel. In order of preference, you want to plug into one of the following ports: Line-Out, Headphone-Out, or perhaps Speaker-Out.

Plug something such as a cable or a Y-adapter into the output port for the chosen channel and leave it plugged in from now on, at all times when turbid is running. That’s because I have seen situations where plugging and unplugging causes the mixer settings to get changed behind your back. The pulseaudio daemon is infamous for this, but there may be other culprits.

Temporarily connect headphones to the chosen output signal. You could perhaps use loudspeakers instead of headphones, but for simplicity and consistency this document will speak in terms of headphones.

Fire up turbid mixer "" sine 1 to invoke the calibration feature. It plays a chord: The left channel is Concert A (440 Hz) and by default the right channel is a perfect 4th above that (a factor of 1.3333).

Use alsamixer to adjust the output to some reasonable level. You don’t want the output to clipped or otherwise distorted.

Once you have a good mixer configuration, save it to disk, as follows: Use turbid ./turbid show args exit | grep mixer-ctl to find what turbid wants to use as the default mixer control filename. Then use the command alsactl store -f whatever.ctl to save the configuration to that file.

If you want to use some other filename, that’s fine, but then you will need to add the mixer someother.ctl option to every turbid invocation from now on.

Unplug the headphones. All critical measurements should be made with the headphones disconnected. 1

From this point on, we will calibrate one channel at a time. In theory you could speed things up by working on multiple channels at once, but there are an awful lot of things that could go wrong with that. The examples that follow assume channel 0.

Fire up turbid sine 1 ochan 0. Note that turbid numbers the channels starting from zero, as indicated in table 1. See figure 7 for the layout of a standard stereo connector. Use a voltmeter to measure the voltage of the chosen channel relative to ground (i.e. the shield). Use the “AC Volts” mode on the voltmeter.
Figure 7: Audio Plug : Tip, Ring, Shank

name:    tip    ring    shank     
name:    tip    ring    sleeve     
channel:    #0    #1    shield     
output usage:    left    right    ground     
line-in usage:    left    right    ground     
mic-in usage:    left    Vbias    ground     
Table 1: Audio Connector Properties

Adjust the calibrator amplitude so that the RMS voltage is around 0.2 volts. One way to do this is to pass the amplitude ... option to turbid, repeatedly restarting the command turbid ochan 0 sine ⋯ until you find something you like.

Another option is to leave the turbid amplitude alone and use alsamixer to adjust the mixer settings until the desired output level is obtained. Then re-save the configuration as described in step 7.

In all cases, when we talk about a measured voltage – output voltage or input voltage – we mean RMS voltage (unless otherwise stated). If you are using an oscilloscope or something else that measures peak voltage, keep in in mind that the zero-to-peak amplitude of a sine wave is 1.4 times the RMS value ... and the peak-to-peak excursion is 2.8 times the RMS value.

[Optional] Do a sanity check on the frequency dependence. Measure the voltage at (say) 220 Hz, 440 Hz, and 880 Hz. All readings should all be roughly the same. At some sufficiently high frequency, the voltmeter will no longer be reliable.

As a point of reference, the ancient voltmeter I use for this is rock solid up to 1 kHz. There is a 1-pole corner at 4 kHz. Above roughly 30 kHz the bottom drops out. You do not need to check your voltmeter with anywhere this level of detail.

Figure 8: Voltmeter Frequency Response

[Optional] Check to make sure the output is not clipping. At an output level of 0.2 volts, clipping is exceedingly unlikely, but it’s easy to check. Add an option sine -6 to the turbid command line. The observed voltage should go down by a factor of 2. Example: turbid ochan 0 sine -6.

Note that turbid applies the sine option cumulatively, so that turbid ochan 0 sine -3 sine -3 is the same as turbid ochan 0 sine -6.

Calculate the output calibration factor. XVout is the open-circuit RMS output voltage that we would hypothetically observe if the calibrator amplitude was 0 dBx. To calculate this, use something like:
        turbid sine  0 Vopen .272 show xvout exit
        turbid sine -6 Vopen .136 show xvout exit

This will print a line of the form:

  XVout: 0.272      #    freq: 440  sine_level: 0  Vopen: 0.272
  XVout: 0.271356   #    freq: 440  sine_level: -6  Vopen: 0.136

The two XVout values are not quite identical, because 6 dB is not exactly a factor of 2 in voltage. Also the voltmeter is subject to roundoff error and other inaccuracies.

Put the XVout: line into a file with a nice short name, perhaps “x”. Put @x on the turbid command line from now on.

Check output impedance. Load the output with a resistance small enough to decrease the voltage to roughly half of the open-circuit voltage. Plugging into the low-impedance port on the calibration box should do it. Fire up
        turbid @x ochan 0 sine 1

and measure the voltage across the resistor. Then get turbid to do the calculation for you

        turbid @x sine 1 Rload 100 Vloaded .173 show Rout exit

This will print a line of the form

  Rout: 57.2254   #  Rload: 100  Vloaded: 0.173  sine_level: 0

Stick that line into the “x” file. That allows turbid to know the XVout and Rout values. That means turbid knows what voltage it is outputting, so you don’t have to measure it with a voltmeter.

For an explanation of how this is calculated, see appendix E.

4.2  Choosing an Input Port

Choose an input port. On some machines, including some laptops, you have only one possible input, namely mono microphone-in. On other machines you have multiple possiblities, in which case the following considerations should guide your choice:
  1. More sensitivity is good, more bandwidth is good, and more channels is also good. Alas there is sometimes a tradeoff between sensitivity and multiplicity, as discussed below.
  2. Microphone inputs are sometimes more sensitive than line-level inputs.
  3. Microphone inputs are often mono. Beware of the following:
    • On a mono Mic port, the tip is the actual input. In a stereo situation this would be the left channel, but here it is the only channel. Beware that very commonly the ring is an output, namely a bias voltage for powering electret microphones. It is not an input at all. The typical “soundblaster” product supplies a +5V bias voltage through a 2.2kΩ resistor. However, you can’t count on this; I’ve seen some laptops where the open-circuit voltage is only 2.5 volts (again with a 2.2kΩ output impedance).

      This is scary, because if you try to monitor the input in such a way that the microphone bias voltage gets routed to a real stereo input, it is entirely possible for the DC on the right channel to blow out your speaker.

    • Some soundcards allow you to make a “stereo” recording from a mono Mic input. Hypothetically, it is conceivable that sometimes the two channels have uncorrelated noise. However we have no way of proving this, and no way of calibrating how much independent noise there is. Therefore, in such a situation, it is safest to assume that only one channel is usable as our source of unpredictability.
  4. I’ve seen some machines where the input port was configurable, such that it was optionally mono Mic-in and optionally some sort of stereo input.

    However, on most machines I don’t know how to do this. It may be that the hardware doesn’t support it, or the ALSA drivers don’t support it, or I’m not smart enough to figure it out. If anybody has good information on this, please let me know.

  5. If there is a tradeoff between more channels (stereo line-in) and more sensitivity (mono Mic-in), then:
    • Sometimes the line-level input is so insensitive as to be essentially useless. In this case, the mono Mic input is the correct choice.
    • If the line-level input has some reasonable sensitivity, it is unlikely that the Mic input is twice as good. In this case, the stereo input is your best choice.

    It may not be entirely obvious a priori what is your best is. You may have to go through the calibration process on each input to decide which provides the most unpredictability per unit time.

4.3  Configuring Full-Duplex Audio

Take a moment to think about the following issue: We want to set up full-duplex audio. Full duplex means that the input subsystem is independent of the output subsystem. Most likely, you will have to fiddle with mixer settings (using alsamixer) to get this to work. Presumably you already have the output working (section 4.1), so now we need to get the input working without messing up the output.

This is trickier than it might seem, because recording engineers very commonly want to monitor the signal that is being recorded, so mixers usually provide a path whereby the input signal can be looped back to some output. This kind of loopback interferes with what we are trying to do. This kind of mixer mis-configuration is sometimes hard to detect. Note the contrast:

Recording engineers want to do input, with immediate re-output via the mixer (hairpin).   We want to do output, with immediate re-input via the QA box (loopback).

Therefore, when configuring the input subsystem, by far the safest procedure is to use a signal that comes from somewhere else. So we need two systems:

  1. The computer being configured to run turbid; call this the target system.
  2. Something to produce a well-behaved test signal; call this the generator system. Your options for this include:

So ... here’s the plan:

Set up an audio generator system. Use the headphones to listen to its output. Verify that it is at a reasonable level. Then disconnect the headphones. Do not attempt to listen to the signal at the next stage, when it is connected to the computer input. In other words, do not use a “Y” splitter. That’s for two reasons:

Run a cable from signal generator to the chosen input port on the target system.

Fire up turbid show stats so you can see what turbid is seeing at the input. Use alsamixer to configure the recording-mode properties of the mixer so that turbid is getting a reasonable input signal. You don’t need to worry about exact signal levels at this stage, but the following qualitative criteria should be satisfied:
  1. Turbid should get a good-sized reading when the signal is connected.
  2. The reading should drop to a low level when the signal is removed, e.g. by pausing the signal generator.
  3. No vestige of the signal is heard at the output port, i.e. no “record monitor” behavior.
  4. When you restart turbid as turbid show stats freq 660 the calibrator signal is heard at the output ... and the stats should remain the same.

In other words, turbid needs to do input, it needs to do output, and the two things need to be independent.

To help you interpret the numbers that turbid puts put, please refer to the discussion of units (such as ibu and dBx) in appendix D. Here are some of the things you might observe at this stage:

Use alsactl store -f ... to save the mixer configuration.

If the generator system is borrowed, you can give it back now. The rest of the configuration can be done using your computer to calibrate itself.

4.4  Calibrating Audio Input

Hook things up so that the cable from the soundcard output port goes to the input of the calibration box, and the cable from the high-impedance output goes to the soundcard input port.

We assume the input data path was configured correctly, qualitatively, as discussed in section 4.3. Now we want to set the gain, quantitatively

Set the recording-mode controls on the mixer to enable the chosen input port as the capture source. To check for clipping, fire up

  turbid @x ochan 0 ichan 0 show clipping

For pedestrian listening-type mixer settings, the output might be something like this:

 Input clip check: top   1 times @  0.0402663 ibu   bot   1 times @ -0.0399549 ibu
 Input clip check: top   1 times @  0.0379069 ibu   bot   1 times @ -0.0378201 ibu
 Input clip check: top   1 times @  0.0433064 ibu   bot   1 times @ -0.0342267 ibu

Here’s how to decode that: An ibu is an input buffer unit. The value that can possibly occur in the input buffer is 1 ibu. In the example here, the signal in the input buffer achieved a high-water mark that is only a few percent of the maximum possible value, and it hit that mark only once. Similar words apply to the minimum value. Conclusion: Clipping is not a problem here. We have lots of headroom.

The next step is to turn the mixer gain sliders up as high as they can reasonably go (usually all the way). If there is a “Mic Boost” slider, turn it up also, as high as it can go without clipping when there is nothing but noise on the input. You want every element in the input-chain to be turned up as high as it can go without clipping.

Typical output, with the gain all the way up, feeding the input with high impedance:

 Input clip check: top   1 times @    0.1709 ibu   bot   1 times @   -0.1704 ibu
 Input clip check: top   1 times @    0.1733 ibu   bot   1 times @   -0.1754 ibu
 Input clip check: top   1 times @    0.1669 ibu   bot   1 times @   -0.1762 ibu

This is still good, no clipping, lots of headroom.

With the output still going to the calibration box, fire up turbid @x ochan 0 ichan 0 sine 1.

Use the voltmeter to re-measure the output level signal. It should be within 1% or so of the open-circuit voltage measured previously. It won’t be quite the same, because the calibration box imposes some slight load on the output. This mostly serves to make sure you did not inadvertently disturb the output-channel mixer settings during step 19.

Shift the input cable so that it is now coming from the low-impedance port on the calibration box.

Fire up turbid @x ochan 0 ichan 0 sine 1 show clipping to see what happens when the calibrator signal is applied to the input. The results may look like this:

 Input clip check: top 834 times @         1 ibu   bot 784 times @        -1 ibu
 Input clip check: top 831 times @         1 ibu   bot 787 times @        -1 ibu
 Input clip check: top 779 times @         1 ibu   bot 847 times @        -1 ibu
 Input clip check: top 799 times @         1 ibu   bot 821 times @        -1 ibu

Here’s how to decode that: The maximum possible value in the input buffer is 1.0 ibu. The signal achieved that level hundreds of times. There is rampant clipping going on.

Lower the calibrator amplitude to make the clipping go away. For example: turbid @x ochan 0 ichan 0 sine -15 show clipping. Typical output:

 Input clip check: top   1 times @    0.5824 ibu   bot   1 times @   -0.5783 ibu
 Input clip check: top   1 times @    0.5851 ibu   bot   1 times @   -0.5744 ibu
 Input clip check: top   1 times @    0.5764 ibu   bot   1 times @   -0.5774 ibu

A peak value on the order of 0.9 ibu is nice.

Do not cure saturation by lowering the gain of the input channel! That would reduce the rate of production of unpredictability. Instead, just lower the amplitude of the calibrator signal. In the unlikely event that you can’t lower the output signal enough, increase the amount of attenuation. This may require an additional resistor in the calibration box.

Lower the calibrator amplitude by 6 dB and verify that the input buffer reading goes down by a factor of 2. It’s OK if the new signal is slightly bigger than half the previous signal; we can attribute that to additive noise.
turbid @x ochan 0 ichan 0 sine -15 show clipping lbw 1 sine -6
 Input clip check: top   1 times @    0.2993 ibu   bot   1 times @   -0.3091 ibu
 Input clip check: top   1 times @    0.2938 ibu   bot   1 times @   -0.3074 ibu
 Input clip check: top   1 times @    0.3034 ibu   bot   1 times @   -0.3044 ibu

If you’re curious, you can measure the noise floor directly:

 turbid @x ochan 0 ichan 0 sine -11 show clipping lbw 1 sine -100
 Input clip check: top   1 times @   0.03504 ibu   bot   1 times @  -0.03155 ibu
 Input clip check: top   1 times @   0.03184 ibu   bot   1 times @  -0.03368 ibu
 Input clip check: top   1 times @   0.03663 ibu   bot   1 times @   -0.0304 ibu

The most important objective at this step is to make sure there is no AGC (automatic gain control) on the input. AGC will drive you crazy. If necessary, fiddle with the mixer controls to make sure AGC is turned off.

Note that you can give the sine ... option more than once, and turbid will do the arithmetic for you. So you can use the first sine ... to set the level at a nice round number, and then use another sine -6 to reduce the voltage by a factor of 2 from there.

4.5  Calibrating Input Impedance

Now hook up the QA cable. From now on we will not be jumpering across the QA cable resistors; we are using them as a standard of comparison for calibrating the input impedance of the soundcard. Attach the voltmeter to the high side of the resistor, the side toward the output stage of the soundcard. (The low side will be measured by the input stage of the soundcard.)

Restart turbid, passing it the new voltmeter reading. This might be slightly higher than the previous reading (because the high impedance of the QA cable loads the soundcard output somewhat less than a direct connection to the soundcard input does), but not much higher (and certainly not lower).

Using this information, turbid will be able to determine the input impedance R of the soundcard port you are using.

Restart turbid, passing it this R value. The command should look something like turbid -v -A -8 -F 440 -V .200 -Q 1.31e-6 -R 9400. As a check, observe that the Vin/Vmeter/divider number is close to unity. (The Vin/Vmeter number itself will be small compared to unity, because there is a voltage divider between what the voltmeter is measuring and what the soundcard input is measuring.)

4.6  Calibrating the Output Scale Factor

At this point turbid will also be displaying a number (Kout) which is basically the voltage (as would be measured by the voltmeter) corresponding to the internal representation of a certain “standard” sine wave.

Restart turbid, passing it this K value. From now on, we no longer need the voltmeter. You should erase the -V argument from the command line. We will rely on the calibrated output of the soundcard to complete the calibration of the input. This has the advantage that the soundcard can probably be trusted over a wider frequency range than the voltmeter can.

You can test this by changing the frequency of the calibrator. The command should look something like turbid -v -A -8 -F ... -Q 1.31e-6 -R 9400 -K .152. Verify that Vin/Vout/divider ratio remains near unity over a wide range of frequencies. It will start to drop off at high frequencies (above 15 or 20 kHz).

4.7  Measuring the Bandwidth

Restart turbid, passing it the option “-F BB”. Overall, command should look something like turbid -v -A -8 -F BB -Q 1.31e-6 -R 9400 -K .152. This does not actually emit a negative-frequency sine wave; instead it emist a broadband probe signal. The program will then print out a measurement of the bandwidth. See appendix C for an explanation of the measurement technique.

This completes the calibration process. See section 5 for a discussion of normal (non-calibration) operations.

4.8  File a Report

Please send me <jsd@av8n.com> the .ctl file for your card, and the values for the -Q -R -B and -K settings. I will add it to the distribution, so that others may use that brand of cards without having fuss with calibration.

4.9  How to Make a Calibration Box

Note the contrast:

Calibration Box   QA Box

A calibration box is used temporarily for calibration. You need only one, since it is unlikely that you will be calibrating more than one machine at a time. You don’t need this at all if somebody has previously calibrated an identical machine and given you the the .ctl file for your mixer, and the .tbd file for turbid itself.   For long-term quality assurance, a simpler box is used. For this you need one box per machine. You don’t need this at all if you’re not interested in integrity checking.

  If you are mass-producing things, you may want to build an QA cable with resistors built right into the cable (rather than a box per se). That is fine for integrity monitoring, but not convenient for calibration. We will pursue that topic here.

The following tools will be needed only temporarily for building the calibration box, so if you don’t have them on hand you will have to buy or borrow them:

Figure 10: Calibration Circuit Box

The following components will be needed, so if you don’t have them on hand you will have to go to an electronics-hobby store and buy them:

The following tools will be needed whenever you are calibrating a soundcard. This assumes you have already built the calibration box:

Figure 11 shows the circuit diagram for the QA box.

Figure 11: Calibration Box Circuit

5  Normal “Production” Operation

“Production mode” stands in contrast to “calibration mode”.

In production mode, you must provide values for Q, R, and B. No other parameters are needed, but there are a lot of options, as we now discuss.

To verify that turbid is working, you can look at the output, e.g. od -A n -t u1 -N 32 /dev/hrandom (or $HOME/dev/hrandom if you are running it as non-root). Also, turning up the verbosity level (turbid -v or turbid -vv) will display some information about the internal workings. The meaning of some of the dislayed items is documented in section 4.

5.1  Feeding the Kernel

If you are running a lot of legacy applications you might want to turn on the -f option as described in section 5.3. The program will write random bits to the output fifo, and will block waiting for somebody to read them. For example: turbid -Q 2.076e-07 -R 2800 -B 36200 -f. You can add the -d option if you want it to detach and run as a daemon.

5.2  Buffering

The output of turbid is double-buffered, 512 bytes per buffer. The main reasons for this include:

Note that you have to be careful when using dd, because it unwisely counts blocks, not bytes. If you want to use dd to capture, say, a megabyte from /dev/random, it is advisable to use two copies of dd in a pipeline:

dd if=/dev/random obs=512 | dd bs=512 count=2k

The first dd performs reblocking while the second dd does the counting. A similar scheme is required at the output of turbid if you want to use dd with a blocksize that is not a divisior of 512 (and possibly under other conditions).

In code that you write yourself, no reblocking is required. You can keep out of trouble very easily, just by paying attention to the number of bytes actually returned by each read() request (which will sometimes be less than the number requested).

While we’re discussing buffering, note that the Linux /dev/random can buffer approximately 4096 bits, but you can’t trust ioctl(, RNDGETENTCNT, ) to tell you how much is really there. Sometimes when the ioctl reports 4096 bits, the buffer can be exhausted by reading as little as 342 bytes (2736 bits).

5.3  Feeding the Kernel RNG

If you specify turbid -f, a subprocess will take some of the output and feeds it to /dev/random whenever the latter is running short of randomness. This means that applications such as GnuPG that rely on /dev/random will run nicely, without blocking.

Runtime efficiency would be improved by rewriting the applications to use /dev/hrandom directly, but this is not always convenient.

You should leave this feature turned off if you are not running as root, since /dev/random doesn’t trust deposits from non-root sources.

Also, you may want to leave it turned off under the following conditions: Suppose some application is consuming vast quantities of low-grade randomness from /dev/urandom (perhaps to obliterate a multi-gigabyte file or some such). Due to a misfeature of the Linux kernel random-number system, every byte you read from /dev/urandom takes one byte away from the pool of stored randomness in /dev/random. Therefore the feeder subprocess, in the attempt to keep /dev/random happy, will consume a great deal of randomness from /dev/hrandom, in competition with other processes that want to use /dev/hrandom.

As discussed in section 10, this problem is intrinsic to /dev/urandom. Avoid using /dev/urandom. The solution is to use /dev/srandom instead. See section 5.4.

5.4  Stretched Random Generator

The turbid program comes bundled with a Stretched Random Generator. Its density of randomnes is strictly greater than zero but less than 100%. To use it, read from /dev/srandom.

The SRNG seeds itself, and reseeds itself every so often, using random seed material from the HRNG. Reading from /dev/srandom consumes randomness 10,000 times more sparingly than reading from /dev/hrandom or /dev/random or /dev/urandom. It is also less compute-intensive than /dev/urandom.

The principle of operation is as follows: In addition to the seeding/reseeding process, there is a state variable, a deterministic update function, and a hash function. Between reseedings, at each iteration we deterministically update the state variable and feed it to the hash function. The output of the hash function is the output of the SRNG.

The update function treats the state variable as three 32-bit registers. Each register has a random starting point. The first register is updated by adding a random odd addend. The second is a Linear Feedback Shift Register. The third is another LFSR, with a different period. (This design was chosen because each register can be separately tested, in contrast to one big 96-bit LFSR which would be harder to test.) Each register separately has a period on the order of 232 and combined they give a period on the order of 296. The period is billions upon billions of times longer than it needs to be, because after 10,000 iterations the state variable is strongly perturbed using 128 bits of pure randomness from the HRNG.

It is commonly assumed that having a super-long period is a hallmark of a good PRNG. However, that’s neither necessary nor sufficient. A 4000-bit LFSR has a period of 24000 or so, but it can easily be cryptanalyzed on the basis of only 4000 observations (unless it is protected by a strong one-way function installed between the LFSR and the observable output). It is necessary to have a period long compared to the interval between reseedings, but it is not necessary to make it much longer than that. In summary, if the period is long enough, making it longer confers no advantage.

Unlike the HRNG, the SRNG must assume that the hash function has the one-way property. That is, we assume an adversary (even after ∼10,000 observations of the output of the hash function) cannot infer anything useful about the state variable.

The defining property of a SRNG is that it has an average density of adamance that is greater than zero but less than 100%. Let’s take a moment to discuss where in the output-stream this adamance is to be found. Suppose you have 1024 bits of adamance available for seeding your generator, which you then use to put out a very long output string.

6  Background: Basic Notions

6.1  Turbid: Basic Features

Turbid is meant to be easy to use, yet robust enough to handle a wide range of applications, including high-stakes adversarial applications. It is a True Randomness Generator (TRNG), specifically a Hardware Randomness Generator (HRNG)

This stands in contrast to garden-variety “Random Number Generators” that may be good for non-demanding applications, but fail miserably if attacked. See reference 1 for a discussion of the range of applications, and of some of the things that can go wrong.

We now explain some theoretical ideas that will be important:

We have implemented a generator using these principles. Salient engineering features include:

6.2  How to Define Randomness, or Not

If you ask three different people, you might get six or seven different definitions of “random”.

At a minimum, we need to consider the following ideas:

  1. At one extreme there is perfect randomness. The output is completely and unconditionally unpredictable.
  2. There is also pseudo-randomness, which means that there is a pattern, but it is computationally infeasible for the adversary to discover the pattern, so long as we keep secret the pattern and the method we used for generating the pattern. For a wide range of purposes, the adversary finds this to be just as unpredictable perfect randomness.
  3. At the opposite extreme, there is perfect determinism, i.e. no randomness at all, such as a source that produces an endless string of bytes all the same, or perhaps successive digits of π, or some other easily-recognizable pattern.
  4. There is also something I call squish, which is neither reliably predictable nor reliably unpredictable. See figure 12.

    For example: Once upon a time I was working on a project that required a random generator. The pointy-haired boss directed me to get rid of my LFSR and instead use the value of the program counter (PC) as saved by the last interrupt, because that was «unpredictable». I kid thee not. In fact the PC was a terrible RNG. Successive calls tended to return the same value, and even if you got two different values, one of them was likely to be the PC of the null job (which was hardly surprising, since it was an event-driven system).

    I tried to explain to him that even though the PC was not reliably predicatable, it was not reliably unpredictable. In other words, it was squish, with no useful lower bound on the amount of randomness.

  5. Combinations of the above. For example, there could be a distribution over 32-bit words having a few bits of randomness in each word, with the other bits being deterministic and/or squish. Figure 12 shows a simplified view of the space of possibilities. Note that randomness and pseudorandomness are not the same thing, but this diagram assumes they are more-or-less equally unpredictable, from the adversary’s point of view.
Figure 12: Predictability, Unpredictability, and Squish

The terminology is not well settled. I have heard intelligent experts call items 1, 2, 4, and 5 on the previous list “random”. I have also heard them call items 3, 4, and 5 “non-random” or “not really random”. Sometimes even item 2 is included in the catagory of “not really random”.

We consider the terms “random” and “non-random” to be vague, clumsy, and prone to misunderstanding. If a precise meaning is intended, careful clarification is necessary. This remains something of an unsolved problem, especially when talking about PRNGs.

Beware that in the cryptology community, the word «entropy» is rather widely abused. Sometimes it is thrown around as an all-purpose synonym for randomness, even in situations where the signal contains absolutely no real physics entropy. This is a bummer, because in physics, entropy has an agreed-upon, specific, and exceedingly useful meaning.

On the other hand, in the long run it’s not much of a problem, because it turns out that the physics entropy was almost certainly never what you wanted anyway. There exist distributions where the entropy is infinite but the adamance is only two bits. An adversary would have a rather easy time guessing some outputs of such a distribution, even though the entropy is infinite.

The essential concept here is probability distribution. The entropy is only one property of the distribution. See reference 1 for more discussion of this point.

In this document, and in reference 1, we use the word “entropy” in the strictest sense: “entropy” means physics entropy. The terms entropy and adamance are assigned very formal, specific meanings.

6.3  TRNG versus PRNG

The contrast between a TRNG and a PRNG can be understood as follows:

A Pseudo-Random Generator (PRNG) depends on an internal state variable. That means that if I wanted to, I could give my friend a copy of my PRNG (including the internal state) and then his PRNG would produce the same outputs as my PRNG.   A True Random Generator (TRNG) such as a Hardware Random Generator (HRNG) has the property that the outputs are unpredictable in principle. It does not have any internal state. If I give a copy to my friend, not only can he not predict the output of my TRNG, I cannot even predict the output of my own TRNG.

A high-quality PRNG depends on the assumption that it is computationally infeasible for any adversary to infer the internal state by observing the outputs. More generally, the assumption is that the adversary will find it not worthwhile to infer the internal state. This is a moving target, depending to some degree on the value of whatever the pseudo-random distribution is protecting.   A proper TRNG does not have any internal state. It does not need to be initialized.

Every PRNG depends on some crucial assumptions, namely the existence of a one-way function that is resistant to cryptanalytic attack.   To a lesser extent, the same is true for every practical HRNG that I know of.

This is probably not a serious limitation for a well-implemented PRNG. That’s because all of modern cryptography relies on similar assumptions. If these assumptions were to break down, the RNG would be the least of your problems. Still, one should always keep in mind that these assumptions are unproven.   This is even less of a problem for a HRNG, because any attack will be much more difficult to carry out. It is a known-codetext attack, with vastly less codetext available.

6.4  PRNG and SRNG

Pseudo-random generators, as a class, suffer from a number of problems. For starters, the state variable must be initialized, which is often remarkably problematic. Thereafter, the state variable must be preserved and defended for all time, which is problematic, too.

Initializing the secret internal state variable is called seeding the PRNG. A basic PRNG is seeded only once, at the start of operations. There is some finite amount of randomness in the seed. As the PRNG produces more and more output, the density of randomness in the output goes asymptotically to zero.

The rand() or random() functions provided by typical computer languages are in fact basic pseudo-random generators of this kind.

The concept of PRNG cannot stand on its own, because any PRNG requires a seed. We have to ask where the seed came from. If it came from some other PRNG, we have to ask where the other seed came from. The only way to avoid infinite regress is to use a HRNG to provide a seed.

To be clear: A HRNG can exist without a PRNG, but not vice versa.

Higher security can be obtained if the PRNG is wrapped within a larger system that re-seeds the PRNG every so often. The overall system could be considered an “improved” PRNG, but really it sits in a gray area between the basic PRNG and a TRNG. I prefer to call it a Stretched Random Generator (SRNG).

The SRNG is a hybrid, with some internal state and some physical source of randomness. This may be able to produce symbols more copiously than a purely stateless system, while remaining highly resistant to cryptanalysis.

7  Randomness in the Raw Data

7.1  Users and Analysts

Note the contrast:

7.2  Principles versus Intution

This section exists mainly to dispel some misconceptions. Sometimes people decide that the the laws of physics do not apply to electronics, or decide on the basis of “intuition” that the noise in their audio system is negligible. If you do not suffer from these particular misconceptions, feel free to skip to section 7.3.

  1. For many decades, ever since the dawn of the computer age, mathematicians and algorithm designers have tried and failed to find an algorithm for generating randomness. There is a solution to this problem, but the solution requires thinking outside the “mathematical algorithm” box:
  2. Intuition based on digital electronics can be misleading when extended to analog electronics.

    A digital logic gate has considerable noise immunity. Noise, if it is not too large, is eliminated at every step. This requires both nonlinearity and power dissipation.   A linear analog system cannot eliminate noise in the same way. The fundamental equations of motion do not allow it.

  3. Humans often do not pay much attention to noise. In an ordinary home situation, there is a considerable amount of background noise coming from household appliances, air conditioning, the wind in the trees outside, et cetera. In a concert hall, there is noise from audience members fidgeting in their seats. There is also noise from your own breathing and heartbeat. People learn to ignore these things. It may be that you have never noticed the thermal noise that is present in audio systems, but it is there nevertheless.
  4. In a typical audio system, the input impedance is considerably greater than the output impedance. The noise level is proportional to the resistance, as has been well observed and well explained since 1928 (reference 7 and reference 8). When line-out is connected to line-in, the input impedance is largely shorted out ... but if the same line-in port is open-circuited, we see the full input impedance, so there is two or three orders of magnitude more noise than you might have expected.
  5. Consumer-grade microphones tend to have high impedance and/or low signal levels. Therefore it is fairly common to find audio systems that can apply additional analog gain to the mic-in signal, upstream of the analog-to-digital converter. This can increase the amount of observable noise by another two orders of magnitude.

7.3  Choice of Input Device

In principle, there are many physical devices that could be used as the source of raw randomness. We prefer the audio system (the “soundcard”) because such things are cheap, widely available, and easy to calibrate. We have a provable lower bound on the rate at which they produce randomness, and this rate in sufficient for a wide range of applications. This provides a way to make typical hosts very much more secure than they are now.

In contrast, a video frame-grabber card would produce randomness more copiously, but would be much more costly. Calibration and quality-assurance monitoring would be much more difficult.

There are many other sources that “might” produce usable randomness, but for which a provable nonzero lower bound is not available.

7.4  Lower Bounds are Necessary

As mentioned before, we need a lower bound on the amount of randomness per sample. This is needed to ensure that the hash function’s buffer contains “enough” randomness, as discussed in reference 1.

Let us briefly explore the consequences of mis-estimating the amount of randomness in the raw data:

1a) If we make a slight overestimate, it means that the hash function, rather than putting out 2160 different outputs, will put out some rather smaller ensemble. In mild cases this may not be noticeable in practice.
1b) A major overestimate may allow an adversary to focus an attack on the few output-words being produced.
2a) If we modestly underestimate the amount of randomness in the raw data, there is no serious harm done.
2b) A major underestimate will reduce the total randomness production capability of the machine. This means the processes that consume random symbols may block, waiting for the HRNG to produce a usable output.

7.5  Other Contributions to the Audio Signal

We looked at the power spectrum of the signal produced by a number of different sound cards. Based on these observations, and on general engineering principles, we can list various hypotheses as to what physical processes could be at work, namely:

  1. Johnson noise, due to thermal fluctuations in the front-end input resistor.
  2. Johnson noise in other resistors and dissipative elements later in the processing chain.
  3. Shot noise and generation/recombination noise, arising in small structures because electrons are not infinitely small.
  4. 1/f noise.
  5. Hum at powerline frequency and harmonics thereof.
  6. Interference from other subsystems such as video cards and switching power supplies.
  7. Sums of the above.
  8. Intermodulation. For example, intermodulation occurs when we digitize a superposition of fairly-low-amplitude hum plus very-low-amplitude high-frequency noise. That’s because the digitizer is sensitive to the high-frequency noise component only when the hum component is near one of the digitizer steps.
  9. Various filter functions applied to the above.
  10. Deceptive signals injected by an adversary.

Some of these are undoubtedly fine sources of randomness. The only problem is that except for item #1, we cannot calculate their magnitude without a great deal more engineering knowledge about the innards of the card. Using these sources might produce a HRNG that is correct, but not provably correct. Since we want something that is provably correct, we simply ignore these other contributions. They can’t hurt, and almost certainly provide a huge margin of safety.

That is: We need randomness in the raw data, and we need to know a provable lower bound on the amount of randomness. Turbid consistently applies the principle that having extra randomness is OK, and underestimating the amount of randomness is OK. This may give us less-than-optimally efficient use of the available randomness, but it guarantees correctness. Turbid consistently chooses correctness over efficiency.

7.6  Examples of Soundcard Properties

Just to help you get the lay of the land, here are the key characteristics of a few soundcards. The older cards are of historical interest only. More recent cards perform better.

Card Q / µV RIn / Ω Kout   B   N σ / µV s / bits ROut / Ω
Deskpro onboard 6.6  8800.1541850048000 1.630.3 
Delta-1010 (pro)  0.22 3020.22336500960001.284.65 192
Delta-1010 (con)  1.31110001.2436500960002.553.041005
Thinkpad cs4239 (Line) 19.6 7900.037 9000441001.070 
Thinkpad cs4239 (Mike)  1.6510100.038 7300441001.091.58 
Extigy (Line) 22.520200.32718000480002.47e-5 
Extigy (Mike) 12.4 6100.33212000480001.13e-7 
Table 2: Soundcard Properties

Note that the CT4700 reports itself as “ENS1370” so it uses the ens1370.ctl control file.

8  The Role of Measurement and Testing

8.1  Issues of Principle; Cat and Mouse

By way of analogy, let us consider the relationship between code-breakers and code-makers. This is a complex ever-changing contest. The code-breakers make progress now and then, but the code-makers make progress, too. It is like an “arms race” but much less symmetric, so we prefer the term cat-and-mouse game.

At present, by all accounts, cryptographers enjoy a very lopsided advantage in this game.

There is an analogous relationship between those who make PRNGs and those who offer tools to test for randomness. The PRNG has a hidden pattern. Supposedly the tester “wants” the pattern to be found, while the PRNG-maker doesn’t.

We remark that in addition to the various programs that bill themselves as randomness-testers, any off-the-shelf compression routine can be used as a test: If the data is compressible, it isn’t random.

To deepen our understanding of the testing issue, let’s consider the following scenario: Suppose I establish a web server that puts out pseudo-random bytes. The underlying PRNG is very simple, namely a counter strongly encrypted with a key of my choosing. Each weekend, I choose a new key and reveal the old key.

The funny thing about this scenario is the difference between last week’s PRNG and this week’s PRNG. Specifically: this week’s PRNG will pass any empirical tests for randomness that my adversary cares to run, while last week’s PRNG can easily be shown to be highly non-random, by means of a simple test involving the now-known key.

As a modification of the previous scenario, suppose that each weekend I release a set of one billion keys, such that the key to last week’s PRNG is somewhere in that set. In this scenario, last week’s PRNG can still be shown to be highly non-random, but the test will be very laborious, involving on the order of a billion decryptions.

Note that to be effective against this PRNG, the randomness-testing program will need to release a new version each week. Each version will need to contain the billions of keys needed for checking whether my PRNG-server is “random” or not. This makes a mockery of the idea that there could ever be an all-purpose randomness tester. Even a tester that claims to be “universal” cannot be universal in any practical sense.2 An all-purpose randomness tester would be tantamount to an automatic all-purpose encryption-breaking machine.

8.2  Necessary but Not Sufficient

To paraphrase Dijkstra: Measurement can prove the absence of randomness, but it cannot prove the presence of randomness. More specifically, any attempt to apply statistical tests to the output of the RNG will give an upper bound on the amount of randomness, but what we need is a lower bound, which is something else entirely.

As described in reference 1, we can calculate the amount of randomness in a physica process, based on a few macroscopic physical properties of the hardware. It is entirely appropriate to measure these macroscopic properties, and to remeasure them occasionally to verify that the hardware hasn’t failed. This provides a lower bound on the randomness, which is vastly more valuable than a statistically-measured upper bound.

As discussed in reference 1, tests such as Diehard (reference 9) and Maurer’s Universal Statistical Test (reference 10) are far from sufficient to prove the correctess of turbid. They provide upper bounds, whereas we need a lower bound.

When applied to the raw data (at the input to the hash function) such tests report nothing more than the obvious fact that the raw data is not 100% random. When applied to the output of the hash function, they are unable to find patterns, even when the density of randomness is 0%, a far cry from the standard (100%) we have set for ourselves.

A related point: Suppose you suspect a hitherto-undetected weakness in turbid. There are only a few possibilities:

The foregoing is a special case of a more general rule: it is hard to discover a pattern unless you have a pretty good idea what you’re looking for. Anything that could automatically discover general patterns would be a universal automatic code-breaker, and there is no reason to believe such a thing will exist anytime soon.

There are some tests that make sense. For instance:

On the one hand, there is nothing wrong with making measurements, if you know what you’re doing. On the other hand, people have gotten into trouble, again and again, by measuring an upper bound on the randomness and mistaking it for a lower bound.

The raison d’etre of turbid is that it provides a reliable lower bound on the density of randomness, relying on physics, not relying on empirical upper bounds.

8.3  Actual Measurement Results

We ran Maurer’s Universal Statistical Test (reference 10) a few times on the output of turbid. We also ran Diehard. No problems were detected. This is totally unsurprising, and it must be emphasized that we are not touting this a serious selling point for turbid; we hold turbid to an incomparably higher standard. As discussed in section 8.2, for a RNG to be considered high quality, we consider it necessary but nowhere near sufficient for it to pass statistical tests.

8.4  Summary

To summarize this subsection: At runtime, turbid makes specific checks for common failures. As discussed in section 8.3 occasionally but not routinely apply general-purpose tests to the output.

We believe non-specific tests are very unlikely to detect deficiencies in the raw data (except the grossest of deficiencies), because the hash function conceals a multitude of sins. You, the user, are welcome to apply whatever additional tests you like; who knows, you might catch an error.

9  Error Detection versus Error Concealment

9.1  Basics

We must consider the possibility that something might go wrong with our randomness generator. For example, the front-end transistor in the sound card might get damaged, losing its gain (partially or completely). Or there could be a hardware or software bug in the computer that performs that hashing and control functions. We start by considering the case where the failure is detected. (The other case is considered in section 9.2.)

At this point, there are two major options:

We can ornament either of those options by printing an error message somewhere, but experience indicates that such error messages tend to be ignored.

If the throttling option is selected, you might want to have multiple independent generators, so that if one is down, you can rely on the other(s).

The choice of throttling versus concealment depends on the application. There are plenty of high-grade applications where concealment would not be appropriate, since it is tantamount to hoping that your adversary won’t notice the degradation of your random generator.

In any case, there needs to be an option for turning off error concealment, since it interferes with measurement and testing as described in section 8.

9.2  Combining Generators

We can also try to defend against undetected flaws in the system. Someone could make a cryptanalytic breakthrough, revealing a flaw in the hash algorithm. Or there could be a hardware failure that is undetected by our quality-assurance system.

One option would be to build two independent instances of the generator (using different hardware and different hash algorithms) and combine the outputs. The combining function could be yet another hash function, or something simple like XOR.

10  Context, History and Acknowledgments

For a discussion of the Fortuna class of random generators, and how that contrasts with turbid, see reference 11.

For a discussion of the Linux device driver for /dev/random and /dev/urandom, see reference 12.

There is a vast literature on hash functions in general. Reference 13 is a starting point. Reference 14 is a useful literature survey. Reference 15 was an influential early program for harvesting randomness from the audio system; it differs from turbid in not having calibration features. Reference 16 uses the physics of a spinning disk as a source of randomness.

Methods for removing bias from a coin-toss go back to von Neuman; see reference 17 and references therein; also reference 18.

Reference 19 suggests hashing 308 bits (biased 99% toward zero) and taking the low-order 5 bits of the hash to produce an unbiased result. This is correct as far as it goes, but it is inefficient, wasting 80% of the input adamance. Presumably that reflects an incomplete understanding of the hash-saturation principle, in particular the homogeneity property discussed in reference 1, which would have led immediately to consideration of wider hashcodes. There is also no attempt to quantify how closely the results come to being unbiased.

Thanks to Steve Bellovin, Paul Honig, and David Wagner for encouragement, incisive questions, and valuable suggestions.

11  Conclusions

Practical sources of randomness can be characterized in terms of the adamance density. The raw sources have an adamance density greater than zero and less than 100%. The methods disclosed here make it possible to produce an output that approaches 100% adamance density as closely as you wish, producing symbols that are random enough for any practical purpose. It is important to have a calculated lower bound on the adamance density. In contrast, statistical tests provide only an upper bound, which is not what we need.

It is possible to generate industrial-strength randomness at very low cost, for example by distilling the randomness that is present in ordinary audio interfaces.

A  The Definition of Randomness and Surprisal

A.1  Introduction and Intuitive Examples

In order to refine our notions of what is random and what is not, consider the following million-character strings. We start with

Example E1
: “xxxxxxxx...xx”      (a million copies of “x”)

That seems distinctly non-random, very predictable, not very complex. Next, consider the string

Example E2
: “xyxyxyxy...xy”      (half a million copies of “xy”)

That seems also non-random, very predictable, not very complex.

Example E3
: “31415926...99”      (the first million decimal digits of π/10)

That is somewhat more interesting. The digits pass all known tests for randomness with one narrow exception (namely tests that check for digits of π). However, they must still be considered completely predictable. Finally, consider the two strings

Example E4
: “AukA1sVA...A5”     
Example E5
: “Aukq1sVN...G5”     

The million-character string represented by example E5 is, as far as anybody knows, completely random and unpredictable. Example E4 is very similar, except that the letter “A” appears more often than it would by chance. This string is mostly unpredictable, but contains a small element of nonrandomness and predictability.

A.2  Definitions

Following Solomonoff (reference 20 and reference 21) and Chaitin (reference 22) we quantify the surprisal of a string of symbols as follows:

Let z be a string of symbols. The elements of z are denoted zk and are drawn from some alphabet Z. The number of symbols in the alphabet is denoted #Z.

Let PC(z) be the probability that computer programs, when run on a given computer C, will print z and then halt. The surprisal is defined to be the negative logarithm of this probability, denoted $C(z) := − logPC(z). In this paper we choose to use base-2 logarithms, so that surprisal is measured in bits. Surprisal is also known as the surprise value or equivalently the unexpectedness. It is intimately related to entropy, as discussed below.

Note that we are using one probability distribution (the probability of choosing a computer program) to derive another (the probability PC(z) of a symbol string). To get this process started, we need to specify a method for choosing the programs. Better yet, using the notions of measure theory, we realize that probability need not involve choosing at all; instead all we really need is a method for assigning weight (i.e. measure) to each of the possible programs. Here is a simple method:3 Consider all possible programs of length exactly L* (measured in bits, since we are representing all programs as bit-strings). Assign each such program equal measure, namely 2L*. Then test each one and count how many of them produce the desired output string z and then halt, when run on the given computer C. The measure of a string is the total measure of the set of programs that produce it. A string that can be produced by many different programs accumulates a large probability.

In this construction, a short program X (i.e. one which has a length L(X) less than L*) is not represented directly, but instead is represented by its children, i.e. programs formed from X by padding it with comments to reach the required length L*. There are at least 2L*L(X) ways of doing the padding, and each way contributes 2L* to the total measure. This means that a string z that can be produced by a short programs X will have a probability at least 2L(X), no matter how large L* is. We take the limit as L* becomes very large and use this in the definition of PC(z).

Usually the probability PC(z) is dominated by (the children of) the shortest program that produces z. Therefore some people like to use the length of this shortest program as an estimate of the surprisal. This is often a good estimate, but it should not be taken as the definition.

The explicit dependence of PC(z) on the choice of computer C calls attention to an element of arbitrariness that is inherent in any definition of surprisal. Different people will assign different values to “the” surprisal, depending on what resources they have, and on what a priori information they have about the situation.

In an adversarial situation such as cryptography, we suggest that probabilities be defined in terms of the adversary’s computer. If you have multiple adversaries, they can be treated as a single adversary with a parallel computer.

A.3  Properties of the Surprisal

In this section, for conciseness, we drop the subscript C ... but you should remember that the probabilities and related quantities are still implicitly dependent on the choice of C.

Upper bound:   If z is a string of bits, the surprisal $(z) can never exceed the length L(z) by more than a small amount, because we can write a program that contains z and just prints it verbatim. More generally, if z is a string of symbols, $(z) can never exceed L(z) log(#Z) by more than a small amount. Again, that’s because we can write a program that contains a literal representation of z and just prints it verbatim.
Tighter Upper Bound:   The surprisal cannot exceed the length of the compressed representation4 of z by more than a bounded amount, because we can write a program that contains the compressed representation and calls the uncompress utility.
Lower Bound:   With very few exceptions, the surprisal $(z) of any string z cannot be not much less than L(z) log(#Z). You can see why this must be so, using a simple pigeon-hole argument: Consider all bit-strings of length 500, and suppose that a certain subset contains 1% of the strings but still has 490 bits of entropy. Well, then we are talking about 1% of 2500 strings, each having probability 2−490 — which adds up to more than 100% probability. Oops.

A.4  Limitations

A relatively minor objection to this definition of surprisal is that PC(z) includes contributions from arbitrarily-long programs. That is a problem in theory, but in practice the sum is dominated by relatively short programs, and the limit converges quickly.

A much more serious objection is that even for modest-sized programs, the definition runs afoul of the halting problem. That is, there may well be programs that run for a million years without halting, and we don’t know whether they would eventually produce string z and then halt. This means the surprisal, as defined here, is a formally uncomputable quantity.

We will duck both these issues, except for the following remarks.

A.5  Surprise Density

In this section we shift attention from the unpredictability of strings to the unpredictability of the individual symbols making up the string.

Let us re-examine the examples given at the beginning of this section. Example E5 has surprisal $(z) very close to L(z) log(#Z). We classify strings of this sort as absolutely-random, by which we mean algorithmically-random.

Examples E1, E2, and E3 all have surprisal much less than their length. These strings are clearly not absolutely-random.

The interesting case is example E4. Intuitively, we think of this as “somewhat unpredictable” but more predictable than E5. To make this notion quantitative, we introduce the notion of surprise density. The following quantity tells us the surprise density of the string z in the region from i to j:

σj|i(z) := 
$(Front(zj)) − $(Front(z,i))

where Front(z,i) is the substring formed by taking the first i symbols of the string z.

The surprise density for examples E1, E2, and E3 is zero for any region not too near the beginning. The surprise density for example E5 is as large as it possibly could be, namely 100% of log(#Z). Example E4 illustrates the interesting case where the surprise density is quite a bit greater than zero, but not quite 100% of log(#Z).

As mentioned in section 6.2, we consider the unadorned term “random” to be ambiguous, because it means different things to different people. Some people think “random” should denote 100% surprise density, and anything less than that is “non-random” even if it contains a great deal of unpredictability. Other folks think that “random” refers to anything with an surprisal density greater than zero, and “non-random” means completely predictable. Yet other folks extend “random” to include pseudo-random strings, as long as they are “random enough” for some application of interest. Even some professionals in the field use the word this way; reference 23 and the more recent reference 24 speak of “deterministic random number generators”, although it could be argued that it is impossible in principle for a process to be both deterministic and random. The randomness of a PRNG comes from the seed. Calling a PRNG “deterministic” seriously understates the importance of the seed.

In the space of all possible strings, almost all strings are absolutely-random, i.e. are algorithmically-random, i.e. contain 100% surprise density. However, the strings we find in nature, and the strings we can easily describe, all have very little surprise density. We can summarize this by saying: “All strings are absolutely-random, except for the ones we know about”.

We can use similar ideas to describe PRNGs and contrast them with HRNGs.

Most of modern cryptography revolves around notions of computational intractability. (I’m not saying that’s good or bad; it is what it is.   In contrast, there is another set of notions, including adamance, entropy, and the related notion of unicity distance, that have got nothing to do with computational intractability.

The long strings produced by a good PRNG are conditionally compressible in the sense that we know there exists a shorter representation, but at the same time we believe them to be conditionally incompressible in the sense that the adversary has no feasible way of finding a shorter representation.   The long strings produced by a HRNG are unconditionally, absolutely incompressible. Specifically, the set of strings collectively cannot be compressed at all. As for each string individually, it is not compressible either, except possibly for trivially rare cases and/or trivial amounts of compression.

B  Soundcard Quirks

B.1  General Observations

This section is a collection of details about certain soundcards. (You can refer back to table 2 for a summary of the performance of a few soundcards. Also, section 4 provides step-by-step procedures that apply to “generic” soundcards.)

Soundcards generally include mixer functions. You have to initialize the mixer, to set the sliders on the features you want and/or to set the switches to bypass the features you don’t want. This is a bit of a black art, because the soundcards typically don’t come with any “principles of operation” documentation.

Graphical user interface (GUI) programs such as alsamixer may help you make some settings, but may not suffice to fully configure the mixer. All too often, the card has “routing switches” and other features that the GUI doesn’t know about. A reasonable procedure that gives complete control is to invoke “alsactl -f record.ctl store”, edit record.ctl, then “alsactl -f record.ctl restore”. Or you can use amixer to fiddle one setting without disturbing others.

B.2  Intel High-Definition Audio

The Intel High-Definition Audio spec (reference 25) describes the codec but says remarkably little about mixer functions.

B.3  Extigy

This is yet another example of hype and deception from Creative Labs. The device itself, the box it comes in, and the advertising all prominently mention USB, 24-bit audio, and 96 kHz sampling rates. However, according to all available evidence, the device will not transmit 24-bit audio via USB to the DACs or from the ADCs (no matter what the data rate). It also will not transmit 96 kHz samples via USB to the DACs or from the ADCs (no matter what the word size).

Apparently the only way in which the Extigy can deal with 24 bits or 96 kHz is by shipping it digitally (e.g. via S/PDIF) to/from some other device.

It also has astonishingly poor input sensitivity. It’s about 5 times worse than a typical cheap sound card, and about 100 times worse than a decent 24-bit soundcard, as you can see from table 2.

The Extigy is no good for present purposes. I recommend you do not buy this product. Many other Creative Labs products have problems, too, so in fact I recommend that you never buy anything made by them.

B.4  M-Audio Delta 1010

This is a high-end system, with eight analog inputs, eight analog outputs, 96000 frames per second, and 24 bits of resolution. M-Audio was previously known as Midiman.

Each of the eight analog input and each of the eight analog outputs has a pushbutton that I call the “pro/con” switch. Professional studio equipment (pro) runs at lower impedances and lower signal levels than consumer-grade home audio equipment (con). That is, the pro setting (switch pushed in) means lower impedance and lower open-circuit voltage on output, and lower impedance and greater voltage-sensitivity on input. Whether you care about the lower impedance depends on the impedance of whatever the card is hooked to.

The card does not have an 8×8 or 10×10 mixer board. It is apparently 10×2. We bypass it, as discussed below.

The alsa driver always expects 10 channels of output: 8 analog plus 2 S/PDIF. If you are only interested in a subset thereof, you have to pad things so the buffer has 10 samples per frame anyway. Similarly, the alsa driver always produces 12 samples per frame on input: 8 analog plus 2 S/PDIF plus 2 from the mixer. For present purposes we use the 8 analog inputs and ignore the others, since they contain no additional randomness. Therefore you must pass the --channel-mask 255 option to turbid; otherwise the program will have less randomness than it thinks it has, by a factor of 8/12ths.

Each of the eight analog inputs is always routed directly to its own DAC, so the PCM input functionality is always unaffected by the sliders and switches on the mixer.

The routing can be set so that each of the eight analog outputs is fed directly from the PCM data. This is recommended. This makes the PCM output functionality independent of the sliders on the mixer.

B.5  Thinkpad 600

Getting this to work requires the following tricks:

As usual, you will need the .ctl file (included in the turbid bundle) plus the settings in table 2.

The same jack doubles as a Line-In port and Mike-In port, depending on software settings. Unlike the typical Mike-In port as described in section 4.3, this Mike-In port is stereo and does not provide power to the microphone. This would be bad if you wanted to connect a microphone that needed power, but it’s ideal for our purposes. Basically it’s like an ordinary Line-In port, with higher sensitivity.

C  Bandwidth Measurement

For reasons discussed in reference 8, the Johnson noise in any Ohmic resistor is nice white noise. Voltage samples are independent and identically distributed (IID), according to a Gaussian normal distribution with zero mean and variance given by

d⟨V2 = k T   R   df

where k is Boltzmann’s constant, T is the temperature (300 K), B is the power-averaged bandwidth as defined in section 4.7, and the brackets ⟨⋯⟩ denote the ensemble average.

The principle of the bandwidth measurement is as follows: Equation 2 tells us the variance of the voltage right at the resistor. The quantity we actually measure has been amplified, according to some gain function G which depends on frequency. So a more complete description is

σo2 = 

 4 k T R |G(f)|2 df              (3)

where the subscript o on σo indicates this is the quantity we observe, and fmax is some “large enough” frequency; the real, relevant bandwidth B is hidden in G, which goes to zero at large frequencies. Since G is presumably an even function of frequency, we can rewrite this as

σo2 = 

− fmax
 2 k T R |G(f)|2 df              (4)

and in a discrete (sampled) system this becomes

σo2 = 
 2 k T R |G(k Δf)|2 Δf              (5)

where we identify the sampling frequency N = MΔf = 2 fmax, covering the whole range of positive and negative frequencies. We can write this in the suggestive form

σo2 = 4 k T R G*2 B              (6)

where G*2 is some nominal mid-band power gain, if we identify

B = 
 |G(k Δf)|2 Δf              (7)

and we call B the power-averaged bandwidth.

By definition the gain G is Vo / Vr, where Vr is the voltage right at the resistor, and Vo is what we observe.

B = 
|Vo(k Δf)|2 
|Vr(k Δf)|2 
Δf              (8)

This expression applies to any Vo/Vr ratio, whether Vr comes from thermal noise or from any other source.

We choose to inject a non-thermal probe signal Vr. It is constructed to be “white” – i.e. it has constant magnitude independent of frequency. That allows us to pull it outside the summation:

B = 
G*2 |Vr|2
|Vo(k Δf)|2 Δf
 ½ M Δf   
G*2 |Vr|2 Δf
|Vo(k Δf)|2 Δf

and by Parseval’s theorem we can rewrite that frequency-domain expression as a time-domain expression

B = 
 ½ M Δf   
G*2 |Vr|2 Δt
|Vo(k Δt)|2 Δt

It is easy to get a good lower bound on B. At compile time, we know the root-mean-square (RMS) magnitude of the internal representation of the probe signal; by construction it is -15dBFS. Then given K and R (and Zref) we know Vr2 in the passband. At frequencies outside the passband this voltage will roll off a little, because of the anti-aliasing filter in the soundcard output stage. This will cause us to under-estimate B by a little, which is harmless. Finally we determine ⟨Vo2⟩ by summing the squares of the observed voltage samples and multiplying by the appropriate calibration factors (involving Q).

You might have thought based on the form of equation 3 that we would need to take a Fourier transform, but we don’t, because of Parseval’s theorem.

D  Units: IBU and dBx

In the turbid software, whenever possible, voltages are measured in SI volts.

If that is not possible, e.g. when representing raw readings as they come from the hardware, before calibration, anything that is linear in voltage is measured in input buffer units such that 1.0 ibu is full scale i.e. the largest thing that can be represented.

Meanwhile, anything that is measured in dBx is measured relative to a reference level that is 15 dB below full scale. This means that a rail-to-rail square wave (such as might result from extreme clipping) peaks at 1.0 ibu, has an RMS of 1.0 ibu, and a power level of +15 dBx. Similarly a sine wave that peaks at exactly 1.0 ibu has an RMS of 0.707 ibu, and a power level of +12 dBx.

As always, a dB is a power measurement. This includes dBx. It does not depend directly on the peak voltage. As a corollary, the peak voltage of a zero-dBx sine wave √2 higher than the top of a zero-dBx square wave.

E  Calculating Impedances

E.1  Output

Suppose we have an output circuit with some voltage source and some Thévenin equivalent series resistance Z that we want to determine. We connect some external load resistance and measure the voltage. Then we connect some different load, and measure the voltage again. If one of the loads has infinite resistance, that gives us a direct measure of the open-circuit voltage.

On the ith measurement, the voltage is:

Vi = 
Ri + Z


(1 + 

So if we plot the inverse voltage (1/Vi) as a function of the inverse resistance (1/Ri), we should get a straight line with slope Z/Voc and intercept 1/Voc.

E.2  Input

The output port plus the QA box form a linear circuit with some known open-circuit voltage Vj and Thévenin equivalent series resistance Rj. The input port has some input impedance Z and some gain G, both of which we need to determine. We can observe the digitized input signal Xj.

Xj = 
Z + Rj


 (1 + 

So if we plot the observed inverse gain Vj/Xj as a function of the calibration network equivalent resistance Rj, we should get a straight line with slope 1/GZ and intercept 1/G. You can easily check this formula in the limiting cases Rj=0 and Rj→∞.

F  Noise in Voltage Dividers

For some perspectives on Johnson noise and the Nyquist formula, including what happens when there is a frequency-dependent gain, see reference 26.

This has some interesting implications and ramifications for our randomness generator. There are a couple of regimes to consider:

G  ALSA Timing

Three concepts:

G.1  Unlinked

Some observations:

Some ALSA timing is shown in figure 13. This is peculiar in a couple of ways. For one thing, the capture i/o buffer is a different size from the playback i/o buffer. Also, the first readi has been artificially delayed by 10 milliseconds relative to the first writei, just to clarify the relationship between the two streams.

Figure 13: Odd Timing

Each rectangle represents an i/o call. The bottom of the rectangle is the number of frames (input or output) before the call, and the top is the number of frames (input or output) after the call. Similarly the left edge is the time at the start of the call, and the right edge is the time upon return from the call.

In figure 13 the command was turbid show timing comb 0 freq 1000 ochan 0. You can see that the playback process has a lot of work to do, so the corner of one red rectangle does not meet the corner of the next. In contrast, the capture process has very little work to do, so the rectangles very nearly meet corner-to-corner.

G.2  Linked

Now consider what happens when we use snd_pcm_link(...) to synchronize the two streams. At the time of the first write to the playback device, the playback stream starts running (obviously) ... and the capture stream also starts running (because it is linked). The artificial 10 ms delay is still there, but it affects only the left edge of the first rectangle, not the right edge.

Figure 14: Linked Timing

H  References

John Denker
“Generating a Random Distribution”

John Denker,
“Spreadsheet to Plot Noise Time Series and Spectra”

Advanced Linux Sound Architecture

Feynman, Leighton and Sands,
The Feynman Lectures on Physics
Addison-Wesley, Reading, MA (1963).

All of Volume I is available online:

Horowitz and Hill
The Art of Electronics

Charles Surya,
“Network Noise and Intermodulation Distortion”
Chapter 3 of Lecture Notes,

J. Johnson,
“Thermal Agitation of Electricity in Conductors”
Phys. Rev. 32, 97 (1928) http://prola.aps.org/abstract/PR/v32/i1/p97_1

H. Nyquist,
“Thermal Agitation of Electric Charge in Conductors”
Phys. Rev. 32, 110 (1928) http://prola.aps.org/abstract/PR/v32/i1/p110_1

George Marsaglia, “A battery of tests for random number generators”.

Ueli Maurer,
“A Universal Statistical Test for Random Bit Generators”.
Advances in Cryptology - CRYPTO ’90,
Lecture Notes in Computer Science, Springer-Verlag, 537, pp. 409-420 (1990).

Final version in J. of Cryptology (1992).
Abstract: http://www.crypto.ethz.ch/cgi-bin/show?what=abstractlabel=Maurer90b
Code: http://matrix.irb.hr/~mario/ftp/pub/random/utest.tgz

John Denker,
“Discussion of the Fortuna Randomness Generator”

Theodore T’so,
“random.c – A strong random number generator”

Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone,
Handbook of Applied Cryptography,
CRC Press (1996). http://www.cacr.math.uwaterloo.ca/hac/
(The strict avalanche property is discussed on page 277.)

Terry Ritter,
“Random Number Machines: A Literature Survey”

Peter Gutman,
“Software Generation of Practically Strong Random Numbers”,

Don Davis, Ross Ihaka, and Philip Fenstermacher,
“Cryptographic Randomness from Air Turbulence in Disk Drives”,
in Proceedings of Crypto 94, Springer-Verlag Lecture Notes in Computer Science, No. 839 (1994).

Yuval Peres,
“Iterating von Neumann’s Procedure for Extracting Random Bits”
The Annals of Statistics pp 590-597 (1992).

Michael Mitzenmacher,
“Tossing a Biased Coin”

D. Eastlake, 3rd, S. Crocker, J. Schiller,
“Randomness Recommendations for Security” (1994)

R.J. Solomonoff,
“A Preliminary Report on a General Theory of Inductive Inference”,
Report V-131, Zator Co., Cambridge, Mass (4 Feb 1960).

R.J. Solomonoff,
“A Formal Theory of Inductive Inference I and II”,
Information and Control 7, 1-22 & 224-254.

Gregory J. Chaitin,
Algorithmic Information Theory,
Cambridge University Press (1987). http://www.cs.auckland.ac.nz/CDMTCS/chaitin/cup.pdf


Elaine Barker and John Kelsey,
NIST Special Publication: “Recommendation for Random Number Generation Using Deterministic Random Bit Generators”

Intel Corp.,
“High Definition Audio Specification” Revision 1.0a (17 June 2010)

John Denker,
“Perspectives on Johnson Noise”

F. Chabaud, and A. Joux,
“Differential collisions in SHA-0”
in CRYPTO’98, H. Krawczyk ed., LNCS 1462, pp 56–71, (1998)

Charanjit S. Jutla and Anindya C. Patthak
“Is SHA-1 conceptually sound?”
Cryptology ePrint (2005).

B. Yurke and J.S. Denker, “Quantum network theory”
Physical Review A - General Physics, 3rd Series 29, 1419–1437 (March 1984)

Ian Goldberg and David Wagner,
“Randomness and the Netscape Browser”
(Break of Netscape SSL PRNG.)

The VENONA project
especially http://www.nsa.gov/docs/venona/venona_remember.html

Common Vulnerabilities and Exposures,
“OpenSSL 0.9.8c-1 up to versions before 0.9.8g-9 on Debian-based operating systems uses a random number generator that generates predictable numbers, which makes it easier for remote attackers to conduct brute force guessing attacks against cryptographic keys.”

Bryn Dole, Steve Lodin, and Eugene Spafford
“Misplaced Trust: Kerberos 4 Session Keys”
in The Proceedings of the 1997 Symposium on Network and Distributed Systems Security

Dan Goodin,
“Fatal crypto flaw in some government-certified smartcards makes forgery a snap”
Ars Technica (16 Sept 2013)

John Viega and Gary McGraw,
Building Secure Software, Addison-Wesley (2001).
See especially the section “How to cheat in online gambling” in chapter 10.

Dan Goodin,
“Google confirms critical Android crypto flaw used in $5,700 Bitcoin heist”
Ars Technica (14 Aug 2013)

John Kelsey, David Wagner, Bruce Schneier, and Chris Hall
“Cryptanalytic Attacks on Pseudorandom Number Generators”

J. Kelsey, B. Schneier, and N. Ferguson,
“Yarrow-160: Notes on the Design and Analysis of the Yarrow Cryptographic Pseudorandom Number Generator”,
in Sixth Annual Workshop on Selected Areas in Cryptography, Springer Verlag, August (1999).

National Security Agency
“TEMPEST Fundamentals”.

You can compromise on this, if you are sure the headphones aren’t affecting the measurements. This might be OK if you have a super low-impedance output and super high-impedance headphones ... but it’s not very likely.
I do not wish to discuss other less-practical more-technical notions of universality, which are very widely misunderstood.
Other methods are possible. It can be shown that the choice of method doesn’t matter much.
As always, we have to choose some units for measuring length and surprisal, and use those units consistently. This is tantamount to choosing a consistent base for our logarithms and exponentials. A conventional and convenient unit is the bit.
Copyright © 2005 jsd