CovertMark.analytics.entropy module

class CovertMark.analytics.entropy.EntropyAnalyser
Bases: object
Entropy and entropy-based distribution tests, primarily designed for obfs4 but useful for analysing a wide range of client-to-server packets that include encrypted handshake messages.

anderson_darling_dist_test(input_bytes, block_size)
Perform an Anderson-Darling distribution hypothesis test on whether input_bytes was likely drawn from the same distribution as random data, based on the Shannon entropy of individual fixed-size blocks.
Parameters:  input_bytes (bytes) – input in bytes to be tested.
 block_size (int) – the block size for each entropy-calculation block.
Returns: {min_threshold, p}, where min_threshold (float) is the minimum threshold under which the null hypothesis can be rejected, between 0.25 and 0.01; it is 1 if non-rejectable (definitely from a random distribution) and 0 if always rejectable (definitely not from a random distribution). p is the p-value from the test.
Raises:  TypeError – if the input is not supplied as bytes or the block size is not a valid integer.
 ValueError – if the block size is greater than the number of bytes supplied.
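The block-based tests compare Shannon entropies of fixed-size blocks rather than raw bytes. The following is a minimal sketch of that preprocessing step only, not CovertMark's implementation: the helper name `block_entropies` is hypothetical, and dropping a trailing partial block is an assumption this page does not confirm.

```python
import math
from collections import Counter
from typing import List


def block_entropies(data: bytes, block_size: int) -> List[float]:
    """Shannon entropy (bits per byte) of each full block of `data`.

    Hypothetical helper. Assumption: a trailing partial block is dropped.
    """
    if not isinstance(data, bytes) or not isinstance(block_size, int):
        raise TypeError("data must be bytes and block_size must be an int")
    if block_size < 1 or block_size > len(data):
        raise ValueError("block_size must be between 1 and len(data)")

    entropies = []
    for start in range(0, len(data) - block_size + 1, block_size):
        block = data[start:start + block_size]
        counts = Counter(block)  # byte value -> occurrences in this block
        entropies.append(
            sum((c / block_size) * math.log2(block_size / c) for c in counts.values())
        )
    return entropies


# Two full blocks are scored; the single trailing byte is ignored.
print(block_entropies(b"aaaaabab" + b"x", 4))
```

The resulting list, together with one computed the same way from reference random bytes, is the kind of input a k-sample test such as SciPy's `scipy.stats.anderson_ksamp` accepts.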

static byte_entropy(input_bytes)
Calculate the Shannon entropy of the input bytes.
Parameters: input_bytes (bytes) – input in bytes.
Returns: the base-2 Shannon entropy of input_bytes.
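For reference, base-2 Shannon entropy over byte frequencies is H = Σ p(b) · log2(1 / p(b)), summed over the byte values b that occur. A minimal sketch follows; the function name `shannon_entropy` and the choice to return 0.0 for empty input are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Base-2 Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0  # assumption: empty input is treated as zero entropy
    total = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(shannon_entropy(b"aaaa"))            # one repeated symbol: no uncertainty
print(shannon_entropy(bytes(range(256))))  # every byte value once: the 8-bit maximum
```

Values near 8 bits per byte are what encrypted or random payloads are expected to produce, while plain text usually scores noticeably lower.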

entropy_estimation(input_bytes, window_size)
Estimate the level of entropy of the input bytes by running a sliding window through the payload bytes and counting the number of distinct values in each window. A fast O(n) operation with a low leading constant.
Parameters:  input_bytes (bytes) – input in bytes to be tested.
 window_size (int) – the size of the sliding window.
Returns: the mean proportion of windows tested with distinct values.
Raises: TypeError – if the input is not supplied as bytes.
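One way to keep a sliding-window distinct-value count at O(n) is to update a frequency table as bytes enter and leave the window instead of rebuilding a set per window. The sketch below is an illustration under assumptions, not the library's code: the name `estimate_entropy`, the one-byte stride, the per-window score (distinct values divided by window size) and the `ValueError` check are all hypothetical.

```python
from collections import Counter


def estimate_entropy(data: bytes, window_size: int) -> float:
    """Mean fraction of distinct byte values per sliding window (stride 1)."""
    if not isinstance(data, bytes):
        raise TypeError("data must be bytes")
    if window_size < 1 or window_size > len(data):
        raise ValueError("window_size must be between 1 and len(data)")

    counts = Counter(data[:window_size])  # occurrences inside the current window
    distinct = len(counts)
    total_distinct = distinct
    windows = 1

    for i in range(window_size, len(data)):
        outgoing, incoming = data[i - window_size], data[i]
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            distinct -= 1  # last copy of this value left the window
        if counts[incoming] == 0:
            distinct += 1  # value is new to the window
        counts[incoming] += 1
        windows += 1
        total_distinct += distinct

    return total_distinct / (windows * window_size)


print(estimate_entropy(b"abcd", 2))  # every window holds two distinct values
print(estimate_entropy(b"aaaa", 2))  # every window holds one distinct value
```

Each byte is added once and removed once, so the cost per step is constant regardless of window size.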

kolmogorov_smirnov_dist_test(input_bytes, block_size)
Perform a Kolmogorov-Smirnov distribution hypothesis test on whether input_bytes was likely drawn from the same distribution as random data, based on the Shannon entropy of individual fixed-size blocks.
Parameters:  input_bytes (bytes) – input in bytes to be tested.
 block_size (int) – an integer block size for each entropy-calculation block.
Returns: the p-value from the KS two-sample test. The null hypothesis is rejectable if p is very small (usually <0.1), meaning the input was likely drawn from a non-uniform distribution.
Raises:  TypeError – if the input is not supplied as bytes or the block size is not a valid integer.
 ValueError – if the block size is greater than the number of bytes supplied.

kolmogorov_smirnov_uniform_test(input_bytes)
Perform a Kolmogorov-Smirnov distribution hypothesis test on whether input_bytes was likely uniformly distributed (tested on the byte values directly, not on entropy values).
Parameters: input_bytes (bytes) – input in bytes to be tested.
Returns: the p-value from the KS two-sample test. The null hypothesis is rejectable if p is very small (usually <0.1), meaning the input is likely not uniformly distributed.
Raises: TypeError – if the input is not supplied as bytes.
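The two-sample KS test is built on a single statistic: the largest vertical gap between the empirical CDFs of the two samples. The sketch below computes only that statistic in pure Python, comparing payload byte values against reference bytes from `os.urandom`. The function name `ks_two_sample_statistic` is hypothetical, and no p-value is derived here; SciPy's `scipy.stats.ks_2samp` returns both the statistic and a p-value.

```python
import bisect
import os


def ks_two_sample_statistic(sample_a, sample_b) -> float:
    """Largest absolute difference between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect.bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect.bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Iterating over bytes yields integers 0..255, so payloads can be passed directly.
payload = b"GET /index.html HTTP/1.1\r\n" * 40
reference = os.urandom(len(payload))
print(ks_two_sample_statistic(payload, reference))                    # large gap: ASCII text
print(ks_two_sample_statistic(os.urandom(len(payload)), reference))   # small gap: random data
```

A large statistic corresponds to a small p-value, which is the case the documentation describes as the hypothesis being rejectable.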

request_random_bytes(request_size, block_size)
It is computationally expensive to generate a fresh uniform distribution each time a block is analysed, so a constant uniformly distributed sample is kept unless it must be enlarged to satisfy the request size. Each regeneration is repeated five times and the highest-entropy sample is taken, to prevent an accidentally low-entropy distribution from being used.
Parameters:  request_size (int) – the size of requested uniformly distributed bytes.
 block_size (int) – the number of bytes in each block.
Returns: list of blocks of uniformly distributed bytes of the size required.
Raises: ValueError – on an invalid block size or request size.
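The caching strategy described for request_random_bytes can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the class `RandomSampleCache`, the use of `os.urandom` as the random source, the exact validity checks, and the dropping of a trailing partial block are all hypothetical.

```python
import math
import os
from collections import Counter
from typing import List


def _entropy(data: bytes) -> float:
    """Base-2 Shannon entropy of a non-empty byte string."""
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in Counter(data).values())


class RandomSampleCache:
    """Hypothetical cache of reference random bytes, regrown only when too small."""

    ATTEMPTS = 5  # regeneration is tried five times, as the description states

    def __init__(self) -> None:
        self._sample = b""

    def request(self, request_size: int, block_size: int) -> List[bytes]:
        # Assumed meaning of "invalid": non-positive block, or request smaller than a block.
        if block_size < 1 or request_size < block_size:
            raise ValueError("need 1 <= block_size <= request_size")

        if request_size > len(self._sample):
            # Enlarge: draw several candidates and keep the highest-entropy one.
            candidates = [os.urandom(request_size) for _ in range(self.ATTEMPTS)]
            self._sample = max(candidates, key=_entropy)

        usable = request_size - request_size % block_size  # assumption: drop the remainder
        return [self._sample[i:i + block_size] for i in range(0, usable, block_size)]


cache = RandomSampleCache()
blocks = cache.request(64, 16)        # first call generates the sample
same_sample = cache.request(32, 8)    # smaller request reuses the cached bytes
print(len(blocks), len(same_sample))
```

Reusing the cached sample keeps repeated analyses cheap, at the cost of every test comparing against the same reference bytes until a larger request forces regeneration.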
