
The "entropy per bit" value is misleading (for non-bit-oriented entropy sources) #13

Open
joshuaehill opened this issue Sep 20, 2018 · 9 comments

Comments

@joshuaehill

The "Min Entropy per bit" value that the tool outputs is likely to be misinterpreted and abused.

The tool supplies an average value for the min entropy per bit (the per-symbol min entropy divided by the bits per symbol). It is likely that folks will make the assumption that entropy is uniformly distributed throughout the symbol (a wildly incorrect assumption for many sources!) and attempt to sub-divide the symbols and credit the proportional entropy for the sub-portion of the symbol that is being used.

@dj-on-github
Owner

It reports both. I'm not sure why this matters - the entropy per bit is useful for getting normalized results for comparisons. From a certification perspective, you want to show that you are meeting the input requirements of the extractor, and all the vetted extractors take multi-symbol inputs. So (number_of_input_symbols_to_ext * entropy_per_symbol) == (number_of_input_bits_to_ext * entropy_per_bit).
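A minimal sketch of that identity (the figures are made up for illustration): when whole symbols are consumed, the per-bit number is just the per-symbol number rescaled, so the two products agree by construction.

```python
# Hypothetical figures: an 8-bit-per-symbol source assessed at 5.2 bits
# of min entropy per symbol.
bits_per_symbol = 8
h_symbol = 5.2                       # assessed min entropy per symbol
h_bit = h_symbol / bits_per_symbol   # per-bit average the tool reports

n_symbols = 1000                     # whole symbols fed to the extractor
n_bits = n_symbols * bits_per_symbol

# Both accountings credit the same total entropy when whole symbols are used.
assert n_symbols * h_symbol == n_bits * h_bit
print(f"total credited: {n_symbols * h_symbol:.1f} bits")
```

The disagreement in the rest of this thread is only about what happens when *partial* symbols are used.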

@joshuaehill
Author

That last equality statement is not true for many sources if you truncate the samples. It is true if complete samples are used.

For example, one common scheme is to sample a fast-running counter (e.g., a TSC value), where the sampling occurs as a consequence of some event whose exact timing is difficult for an attacker to guess. If you look at how the min entropy is distributed in the samples from such a system, the low-order bits are often more difficult for an attacker to predict than the high-order bits (the high-order bits are often essentially wholly known to any suitably informed attacker). Thus the low-order bits tend to have more min entropy than the high-order bits.

Providing a min entropy assessment as a per-bit average suggests that one can freely subdivide a sample and credit each bit of the sample with the stated average. If one includes the entire sample, then (by definition) you get the total sample entropy, and the equality you state is clearly true. If you instead subdivide the sample, it is hard to say anything about the entropy of the part that remains, and for systems where min entropy isn't uniformly distributed, it's very likely that the number of bits multiplied by the per-bit average won't be the correct value to credit.
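A toy simulation (purely illustrative, not output from the tool) of the counter-sampling scheme described above makes the non-uniformity concrete:

```python
import math
import random

# Toy model of a TSC-style source: the attacker knows the counter's coarse
# value, so the high 4 bits of each 8-bit sample are fixed (here 0xA),
# while timing jitter makes the low 4 bits close to uniform.
random.seed(1)
samples = [0xA0 | random.getrandbits(4) for _ in range(100_000)]

def min_entropy_of_bit(samples, pos):
    """Empirical min entropy of a single bit position: -log2(p_max)."""
    ones = sum((s >> pos) & 1 for s in samples)
    p_max = max(ones, len(samples) - ones) / len(samples)
    return -math.log2(p_max)

for pos in range(8):
    print(f"bit {pos}: {min_entropy_of_bit(samples, pos):.3f}")
# The low-order bits come out near 1.0 bit of min entropy each; the fixed
# high-order bits come out at exactly 0. The per-bit average (about 0.5)
# describes neither group, so crediting a truncated sample at the average
# rate overstates the high bits and understates the low bits.
```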

I have witnessed this occurring "in the wild" on several occasions, and the results are sometimes unfortunate.

@dj-on-github
Owner

dj-on-github commented Sep 20, 2018 via email

@joshuaehill
Author

You may want to wait before putting a bunch of time into making the output look like NIST's 2016 Python implementation, as NIST plans on releasing a completely different C++ tool "real soon now". The last I heard (about a month ago), they had all the development done and were performing testing.

@dj-on-github
Owner

dj-on-github commented Sep 20, 2018 via email

@dj-on-github
Owner

CSV is in. Multi file isn't.

@joshuaehill
Author

NIST released their updated reference implementation today.

@dj-on-github
Owner

dj-on-github commented Sep 21, 2018 via email

@yuyinw
Contributor

yuyinw commented Jul 6, 2021

When I use CPU jitter to collect 3840 bytes (30720 bits) and run this Python tool, it outputs Minimum Min Entropy = 0.6581506573264674. So is the final result (30720 * 0.6581506573264674 ≈ 20218 bits)?
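If the whole 30720-bit sequence is used intact, that arithmetic is right; the caveat from earlier in this thread is that the credit no longer holds if the samples are truncated or subdivided. Checking the numbers:

```python
n_bits = 3840 * 8              # 3840 bytes collected = 30720 bits
h_bit = 0.6581506573264674     # tool's reported Minimum Min Entropy per bit
total = n_bits * h_bit         # total min entropy credit for the whole sequence
print(f"{total:.0f}")          # 20218, matching the figure in the question
```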
