CovertMark.analytics.entropy module
- class CovertMark.analytics.entropy.EntropyAnalyser
Bases: object
Entropy and entropy-based distribution tests, primarily designed for obfs4 but useful to analyse a wide range of client-to-server packets that include encrypted handshake messages.
- anderson_darling_dist_test(input_bytes, block_size)
Perform an Anderson-Darling hypothesis test of whether input_bytes was likely drawn from the same distribution as a uniformly random sample, based on the Shannon entropy of individual fixed-size blocks.
- Parameters
input_bytes (bytes) – input in bytes to be tested.
block_size (int) – the size in bytes of each block over which entropy is calculated.
- Returns
{min_threshold, p}, where min_threshold is a float giving the minimum threshold at which the null hypothesis can be rejected: a value between 0.25 and 0.01, 1 if it cannot be rejected (definitely from a random distribution), or 0 if it can always be rejected (definitely not from a random distribution); p is the p-value from the test.
- Raises
TypeError – if the input is not supplied as bytes or the block size is not a valid integer.
ValueError – if the block size is greater than the number of bytes supplied.
- static byte_entropy(input_bytes)
Calculate the Shannon entropy of the input bytes.
- Parameters
input_bytes (bytes) – input in bytes.
- Returns
the base-2 Shannon entropy of input_bytes.
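As a minimal sketch of the calculation described above (not the module's own code), base-2 Shannon entropy over byte frequencies can be computed with the standard library alone. The function name `shannon_entropy` and the choice to return 0.0 for empty input are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Base-2 Shannon entropy of data, in bits per byte (0.0 to 8.0).

    Hypothetical stand-in for byte_entropy; returning 0.0 for empty input
    is an assumption, not documented behaviour.
    """
    if not isinstance(data, bytes):
        raise TypeError("data must be bytes")
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1 / p) over every byte value that occurs at least once.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )
```

A payload in which all 256 byte values occur equally often reaches the maximum of 8 bits per byte, while a payload made of a single repeated byte scores 0.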
- entropy_estimation(input_bytes, window_size)
Estimate the level of entropy of the input bytes by running a sliding window through the payload bytes and counting the number of distinct values in each window. This is a fast O(n) operation with a low leading constant.
- Parameters
input_bytes (bytes) – input in bytes to be tested.
window_size (int) – the size of the sliding window.
- Returns
the mean proportion of windows tested with distinct values.
- Raises
TypeError – if the input is not supplied as bytes.
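The sliding-window idea can be sketched as follows. This is an illustration under stated assumptions rather than the module's implementation: it reads the return value as the number of distinct byte values per window divided by the window size, averaged over all windows, and it raises ValueError for an unusable window size, which the reference above does not document. The name `estimate_entropy` is hypothetical.

```python
from collections import Counter


def estimate_entropy(data: bytes, window_size: int) -> float:
    """Mean fraction of distinct byte values per sliding window (assumed metric)."""
    if not isinstance(data, bytes):
        raise TypeError("data must be bytes")
    if window_size < 1 or window_size > len(data):
        # Assumption: the documented method lists no ValueError.
        raise ValueError("window_size must be between 1 and len(data)")

    # Count the first window once, then update incrementally so the whole
    # pass stays O(n) instead of recounting every window from scratch.
    counts = Counter(data[:window_size])
    distinct_total = len(counts)
    windows = 1
    for i in range(window_size, len(data)):
        outgoing = data[i - window_size]
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]
        counts[data[i]] += 1
        distinct_total += len(counts)
        windows += 1
    return distinct_total / (windows * window_size)
```

Under this reading, a result near 1.0 means almost every window held no repeated byte values, which is what high-entropy payloads tend to produce for small windows.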
- kolmogorov_smirnov_dist_test(input_bytes, block_size)
Perform a Kolmogorov-Smirnov hypothesis test of whether input_bytes was likely drawn from the same distribution as a uniformly random sample, based on the Shannon entropy of individual fixed-size blocks.
- Parameters
input_bytes (bytes) – input in bytes to be tested.
block_size (int) – the size in bytes of each block over which entropy is calculated.
- Returns
the p-value from the KS two-sample test. The null hypothesis is rejectable if p is very small (usually <0.1), meaning the input was likely drawn from a non-uniform distribution.
- Raises
TypeError – if the input is not supplied as bytes or the block size is not a valid integer.
ValueError – if the block size is greater than the number of bytes supplied.
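To make the block-entropy comparison concrete, here is a standard-library sketch of a two-sample KS test applied to per-block entropies. It is not the module's implementation: the helper names are hypothetical, the reference sample comes straight from `os.urandom`, trailing bytes that do not fill a block are ignored, and the p-value uses the asymptotic Kolmogorov series with a common small-sample correction, so it is only approximate for small block counts.

```python
import math
import os
from bisect import bisect_right
from collections import Counter


def block_entropies(data: bytes, block_size: int) -> list:
    """Shannon entropy (bits per byte) of each full block; leftovers are dropped."""
    entropies = []
    for start in range(0, len(data) - block_size + 1, block_size):
        counts = Counter(data[start:start + block_size])
        entropies.append(
            sum((c / block_size) * math.log2(block_size / c) for c in counts.values())
        )
    return entropies


def ks_two_sample(sample_a, sample_b):
    """Return (D, approximate p) for the two-sample KS test."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    # D is the largest gap between the two empirical CDFs.
    d = max(
        abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
        for x in set(a) | set(b)
    )
    effective = math.sqrt(n * m / (n + m))
    lam = (effective + 0.12 + 0.11 / effective) * d
    if lam < 0.2:
        # The Kolmogorov survival function is indistinguishable from 1 here.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2) for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


def ks_entropy_p_value(data: bytes, block_size: int) -> float:
    """Compare block entropies of data against a fresh os.urandom reference."""
    if not isinstance(data, bytes) or not isinstance(block_size, int):
        raise TypeError("data must be bytes and block_size an int")
    if block_size < 1 or block_size > len(data):
        raise ValueError("block_size must be between 1 and len(data)")
    reference = os.urandom(len(data))
    _, p = ks_two_sample(
        block_entropies(data, block_size), block_entropies(reference, block_size)
    )
    return p
```

With a payload such as `b"A" * 4096` and a block size of 32, every input block has zero entropy while the reference blocks do not, so the two empirical distributions do not overlap and the p-value is vanishingly small.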
- kolmogorov_smirnov_uniform_test(input_bytes)
Perform a Kolmogorov-Smirnov hypothesis test of whether input_bytes is likely uniformly distributed (tested on the bytes directly, not on entropy values).
- Parameters
input_bytes (bytes) – input in bytes to be tested.
- Returns
the p-value from the KS two-sample test. The null hypothesis is rejectable if p is very small (usually <0.1), meaning the input is likely not uniformly distributed.
- Raises
TypeError – if the input is not supplied as bytes.
- request_random_bytes(request_size, block_size)
Generating a fresh uniform distribution each time a block is analysed is computationally expensive, so a constant uniformly distributed sample is kept and enlarged only when a request exceeds its size. Each regeneration is repeated five times and the highest-entropy sample is taken, to prevent an accidentally low-entropy sample from being used.
- Parameters
request_size (int) – the number of uniformly distributed bytes requested.
block_size (int) – the number of bytes in each block.
- Returns
list of blocks of uniformly distributed bytes of the size required.
- Raises
ValueError – on an invalid block size or request size.
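The caching strategy described for request_random_bytes can be sketched like this. The class `RandomBlockCache`, its use of `os.urandom`, and the exact conditions treated as invalid are all assumptions for illustration; only the keep-and-enlarge behaviour and the best-of-five regeneration come from the description above.

```python
import math
import os
from collections import Counter


def _entropy(data: bytes) -> float:
    """Base-2 Shannon entropy of data, used to rank candidate samples."""
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in Counter(data).values())


class RandomBlockCache:
    """Hypothetical cache of uniformly distributed bytes, served in blocks."""

    ATTEMPTS = 5  # regenerations per enlargement, as in the description

    def __init__(self):
        self._sample = b""

    def request(self, request_size: int, block_size: int) -> list:
        if not isinstance(request_size, int) or not isinstance(block_size, int):
            raise ValueError("sizes must be integers")
        # Assumption: a request smaller than one block counts as invalid.
        if block_size < 1 or request_size < block_size:
            raise ValueError("need 1 <= block_size <= request_size")
        if request_size > len(self._sample):
            # Enlarge only when needed; keep the highest-entropy candidate.
            candidates = [os.urandom(request_size) for _ in range(self.ATTEMPTS)]
            self._sample = max(candidates, key=_entropy)
        usable = request_size - request_size % block_size
        return [self._sample[i:i + block_size] for i in range(0, usable, block_size)]
```

Repeated requests that fit within the stored sample reuse the same bytes, so the cost of generating and ranking candidates is paid only when a larger request arrives.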