CovertMark.analytics.entropy module

class CovertMark.analytics.entropy.EntropyAnalyser[source]

Bases: object

Entropy and entropy-based distribution tests, primarily designed for obfs4 but also useful for analysing a wide range of client-to-server packets that include encrypted handshake messages.

anderson_darling_dist_test(input_bytes, block_size)[source]

Perform an Anderson-Darling hypothesis test of whether input_bytes was likely drawn from the same distribution as random bytes, based on the Shannon entropy of individual fixed-size blocks.

Parameters
  • input_bytes (bytes) – input in bytes to be tested.

  • block_size (int) – the block size for each entropy-calculation block.

Returns

{min_threshold, p}, where min_threshold (float) is the minimum threshold under which the null hypothesis can be rejected, and p is the p-value from the test. min_threshold lies between 0.25 and 0.01, or is 1 if the null hypothesis is non-rejectable (definitely from a random distribution), or 0 if it is always rejectable (definitely not from a random distribution).

Raises
  • TypeError – if the input was not supplied as bytes or the block size is not a valid integer.

  • ValueError – if the block size is greater than the number of bytes supplied.

static byte_entropy(input_bytes)[source]

Calculate the Shannon entropy of the input bytes.

Parameters

input_bytes (bytes) – input in bytes.

Returns

the base-2 Shannon entropy of input_bytes.
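A minimal sketch of a base-2 Shannon entropy calculation over byte values, H = -Σ p(b) · log2 p(b), is shown below. It is an illustration rather than the module's own code: the standalone function, the TypeError check, and the choice to return 0.0 for empty input are assumptions.

```python
import math
from collections import Counter


def byte_entropy(input_bytes: bytes) -> float:
    """Base-2 Shannon entropy of the byte values, in bits per byte (0 to 8)."""
    if not isinstance(input_bytes, bytes):
        raise TypeError("input_bytes must be bytes")  # assumed behaviour
    total = len(input_bytes)
    if total == 0:
        return 0.0  # assumed convention for empty input
    entropy = 0.0
    for count in Counter(input_bytes).values():
        probability = count / total
        entropy -= probability * math.log2(probability)
    return entropy
```

A payload consisting of one repeated byte scores 0 bits, while a payload in which all 256 byte values occur equally often scores the maximum of 8 bits.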

entropy_estimation(input_bytes, window_size)[source]

Estimate the level of entropy of input bytes by running a sliding window through the payload bytes and counting the number of distinct values in each window. A fast O(n) operation with a low leading constant.

Parameters
  • input_bytes (bytes) – input in bytes to be tested.

  • window_size (int) – the size of the sliding window.

Returns

the mean proportion of distinct values across the windows tested.

Raises

TypeError – if the input was not supplied as bytes.
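The sliding-window idea can be sketched as follows, keeping a running count of byte values so that each step only adjusts for the byte leaving and the byte entering the window. This is one reading of the description above, not the module's own code: the function name, the step size of one byte, and the ValueError on an unusable window size are assumptions.

```python
from collections import Counter


def distinct_value_estimate(input_bytes: bytes, window_size: int) -> float:
    """Mean fraction of distinct byte values per window, stepping one byte at a time."""
    if not isinstance(input_bytes, bytes):
        raise TypeError("input_bytes must be bytes")
    if not isinstance(window_size, int) or not 1 <= window_size <= len(input_bytes):
        raise ValueError("window_size must be between 1 and len(input_bytes)")  # assumed

    # Count the byte values in the first window.
    counts = Counter(input_bytes[:window_size])
    total = len(counts) / window_size
    windows = 1

    # Slide the window: drop the outgoing byte, add the incoming one.
    for i in range(window_size, len(input_bytes)):
        outgoing = input_bytes[i - window_size]
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]
        counts[input_bytes[i]] += 1
        total += len(counts) / window_size
        windows += 1

    return total / windows
```

Values near 1 suggest that most windows contain no repeated bytes, as expected of encrypted or random payloads when the window is small relative to 256.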

kolmogorov_smirnov_dist_test(input_bytes, block_size)[source]

Perform a Kolmogorov-Smirnov hypothesis test of whether input_bytes was likely drawn from the same distribution as random bytes, based on the Shannon entropy of individual fixed-size blocks.

Parameters
  • input_bytes (bytes) – input in bytes to be tested.

  • block_size (int) – the block size for each entropy-calculation block.

Returns

the p-value from the KS two-sample test. The null hypothesis is rejectable if p is very small (usually <0.1), meaning that the input was likely drawn from a non-uniform distribution.

Raises
  • TypeError – if the input was not supplied as bytes or the block size is not a valid integer.

  • ValueError – if the block size is greater than the number of bytes supplied.
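As a rough illustration of the block-based approach, the sketch below splits the input into fixed-size blocks, computes the Shannon entropy of each block, and then computes the two-sample KS statistic D between two such lists of entropies. The function names are hypothetical, a trailing partial block is dropped by assumption, and only D is computed. A p-value would normally come from a statistics library, for example scipy.stats.ks_2samp.

```python
import bisect
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Base-2 Shannon entropy of a non-empty byte string."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def block_entropies(input_bytes: bytes, block_size: int) -> list:
    """Entropy of each complete block; a trailing partial block is ignored (assumed)."""
    if not isinstance(input_bytes, bytes) or not isinstance(block_size, int):
        raise TypeError("input_bytes must be bytes and block_size an int")
    if block_size < 1 or block_size > len(input_bytes):
        raise ValueError("block_size must be between 1 and len(input_bytes)")
    usable = len(input_bytes) - len(input_bytes) % block_size
    return [shannon_entropy(input_bytes[i:i + block_size])
            for i in range(0, usable, block_size)]


def ks_two_sample_statistic(sample_a, sample_b) -> float:
    """Largest gap between the empirical CDFs of the two samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

In use, one list would hold the block entropies of the payload under test and the other the block entropies of a reference sample of random bytes of the same block size. A D close to 0 indicates similar entropy distributions, and a D close to 1 indicates clearly different ones.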

kolmogorov_smirnov_uniform_test(input_bytes)[source]

Perform a Kolmogorov-Smirnov hypothesis test of whether the values in input_bytes were likely uniformly distributed (tested on the byte values directly, not on entropy values).

Parameters

input_bytes (bytes) – input in bytes to be tested.

Returns

the p-value from the KS two-sample test. The null hypothesis is rejectable if p is very small (usually <0.1), meaning that the input was likely not uniformly distributed.

Raises

TypeError – if the input was not supplied as bytes.
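To illustrate what a uniformity check on raw byte values measures, the sketch below computes the largest gap between the empirical CDF of the byte values and the CDF of a discrete uniform distribution over 0 to 255. It is a simplified one-sample variant, not the two-sample test described above, and it returns only the statistic rather than a p-value. The function name and the ValueError on empty input are assumptions.

```python
from collections import Counter


def uniform_ks_statistic(input_bytes: bytes) -> float:
    """Largest gap between the byte-value ECDF and the uniform CDF over 0..255."""
    if not isinstance(input_bytes, bytes):
        raise TypeError("input_bytes must be bytes")
    if not input_bytes:
        raise ValueError("input_bytes must not be empty")  # assumed behaviour

    total = len(input_bytes)
    counts = Counter(input_bytes)
    cumulative = 0
    d = 0.0
    for value in range(256):
        cumulative += counts.get(value, 0)
        empirical = cumulative / total   # fraction of bytes <= value
        expected = (value + 1) / 256     # uniform CDF at this value
        d = max(d, abs(empirical - expected))
    return d
```

A statistic near 0 means the byte values are spread evenly across the full range, while heavily skewed input such as a run of a single byte value gives a statistic near 1.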

request_random_bytes(request_size, block_size)[source]

Generating a fresh uniform distribution each time a block is analysed is computationally expensive, so a constant uniformly distributed sample is kept and regenerated only when a larger request size requires it. Each regeneration is repeated five times and the highest-entropy sample is kept, to prevent an accidentally low-entropy distribution from being used.

Parameters
  • request_size (int) – the size of requested uniformly distributed bytes.

  • block_size (int) – the number of bytes in each block.

Returns

list of blocks of uniformly distributed bytes of the size required.

Raises

ValueError – on an invalid block size or request size.
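The caching strategy described for request_random_bytes can be sketched as below. This is a sketch under stated assumptions rather than the module's own code: the RandomBlockCache class, its request method, the use of os.urandom as the random source, and the handling of a request size that is not a multiple of the block size are all hypothetical.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Base-2 Shannon entropy of a non-empty byte string."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


class RandomBlockCache:
    """Keeps one random sample and regenerates it only when a larger one is needed."""

    ATTEMPTS = 5  # candidates generated per regeneration, as described above

    def __init__(self):
        self._sample = b""

    def request(self, request_size: int, block_size: int) -> list:
        for value in (request_size, block_size):
            if not isinstance(value, int) or isinstance(value, bool) or value < 1:
                raise ValueError("sizes must be positive integers")
        if block_size > request_size:
            raise ValueError("block_size must not exceed request_size")

        # Enlarge the cached sample only when the request outgrows it.
        if request_size > len(self._sample):
            candidates = [os.urandom(request_size) for _ in range(self.ATTEMPTS)]
            self._sample = max(candidates, key=shannon_entropy)

        # Return complete blocks only; leftover bytes are dropped (assumed).
        usable = request_size - request_size % block_size
        return [self._sample[i:i + block_size] for i in range(0, usable, block_size)]
```

Because the sample is reused, repeated requests of the same or a smaller size return blocks cut from the same bytes, which keeps comparisons across analysed payloads consistent and avoids regenerating random data.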