CovertMark.analytics.traffic module

CovertMark.analytics.traffic.get_window_stats(windowed_packets, client_ips, feature_selection=None)[source]

Calculate the following features for the windowed packets:

  • ‘max_entropy_up’: max entropy of upstream TCP payloads;
  • ‘min_entropy_up’: min entropy of upstream TCP payloads;
  • ‘mean_entropy_up’: mean entropy of upstream TCP payloads;
  • ‘mean_interval_up’: mean interval between upstream TCP ACKs;
  • ‘bin_#_interval_up’: the number of intervals between upstream TCP frames that fall into each bin, with bin edges at 0, 1000, 10000, 100000 and 1000000 microseconds; ‘#’ is the upper edge of the bin. Only the first frame bearing each unique sequence number is counted;
  • ‘top1_tcp_len_up’: the most common upstream TCP payload length;
  • ‘top2_tcp_len_up’: the second most common upstream TCP payload length;
  • ‘mean_tcp_len_up’: mean upstream TCP payload length.
  • ‘bin_#_len_up’: binned TCP payload lengths in each direction of flow, divided between 100, 200, 500, 1000, 1500 bytes.
  • ‘push_ratio_up’: ratio of TCP ACKs with PSH flags set, indicating reuse of TCP handshake for additional data;
  • (all of the attributes above, calculated for downstream traffic and named ‘…_down’);
  • ‘up_down_ratio’: ratio of upstream to downstream packets.

Only the features chosen through feature_selection are calculated and returned; see below.

Parameters:
  • windowed_packets (list) – a segment of TCP packets, assumed to have been sorted by time in ascending order.
  • client_ips (list) – the IP addresses/subnets of the suspected PT clients.
  • feature_selection – chooses the sets of features to calculate and include in the output. If None, all features are included. Options:

    • USE_ENTROPY : Entropy features
    • USE_INTERVAL : Mean interval
    • USE_INTERVAL_BINS : Binned intervals
    • USE_TCP_LEN : Top and mean TCP lengths
    • USE_TCP_LEN_BINS : Binned TCP lengths
    • USE_PSH : Ratio of PSH packets in ACK packets
Returns:

three-tuple: a dictionary containing the stats described above, a set of remote IP addresses seen in the window, and a set of client IPs seen in the window.
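
The entropy and interval-bin features listed above can be sketched as follows. This is a minimal illustration rather than the library's implementation: the helper names shannon_entropy and bin_intervals are hypothetical, payloads are assumed to be bytes, timestamps are assumed to be UNIX seconds, and the treatment of bin boundaries and of gaps above 1000000 microseconds is a guess.

```python
import math
from collections import Counter

def shannon_entropy(payload: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 for empty input)."""
    if not payload:
        return 0.0
    counts = Counter(payload)
    total = len(payload)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Upper bin edges from the feature description, in microseconds.
INTERVAL_BINS = [1000, 10000, 100000, 1000000]

def bin_intervals(timestamps):
    """Count gaps between consecutive timestamps, keyed by the bin's upper edge.

    Assumption: a gap equal to an edge falls into the lower bin, and gaps
    larger than the last edge are not counted.
    """
    bins = {upper: 0 for upper in INTERVAL_BINS}
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap_us = (later - earlier) * 1_000_000
        for upper in INTERVAL_BINS:
            if gap_us <= upper:
                bins[upper] += 1
                break
    return bins

# Hypothetical upstream payloads and arrival times for one window.
payloads = [b"aaaa", bytes(range(256))]
entropies = [shannon_entropy(p) for p in payloads]
stats = {
    "max_entropy_up": max(entropies),
    "min_entropy_up": min(entropies),
    "mean_entropy_up": sum(entropies) / len(entropies),
}
interval_bins = bin_intervals([0.0, 0.0005, 0.0505, 2.0505])
```

A constant payload has zero entropy and a payload containing every byte value once has the maximum of 8 bits per byte, which is why high upstream entropy is a useful signal for encrypted or obfuscated traffic.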

CovertMark.analytics.traffic.group_packets_by_ip_fixed_size(packets, clients, window_size)[source]

Group packets into fixed-size segments that contain bidirectional traffic from and towards individual predefined clients. Individual inputs should normally come from time-windowing by window_packets_time_series() (e.g. 60s), to simulate realistic firewall operating conditions.

Parameters:
  • packets (list) – a list of parsed packets. Packets in this 1D list are assumed to be chronologically ordered.
  • clients (list) – a predefined list of Python subnet objects describing the clients that are considered to be within the firewall’s control.
  • window_size (int) – threshold to start a new segment.
Returns:

a dictionary indexed by a tuple of individual client and target pair, containing one 2D list for each pair. Each 2D list contains segmented packets by fixed size, with remainders retained.
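
The grouping described above can be sketched roughly as below. The packet representation (dictionaries with "src" and "dst" keys), the use of the standard ipaddress module for the subnet objects, and the helper name group_fixed_size are all assumptions made for illustration; the library's own parsed-packet format is not shown in this section.

```python
import ipaddress
from collections import defaultdict

def group_fixed_size(packets, clients, window_size):
    """Group packets by (client, target) pair, then cut each group into segments.

    Packets in either direction between a client and a target share one key.
    The final, shorter segment of each pair is kept (remainders retained).
    """
    grouped = defaultdict(list)
    for pkt in packets:
        src = ipaddress.ip_address(pkt["src"])
        dst = ipaddress.ip_address(pkt["dst"])
        if any(src in net for net in clients):
            key = (pkt["src"], pkt["dst"])  # upstream: client -> target
        elif any(dst in net for net in clients):
            key = (pkt["dst"], pkt["src"])  # downstream: target -> client
        else:
            continue  # traffic not involving any controlled client
        grouped[key].append(pkt)
    return {
        key: [pkts[i:i + window_size] for i in range(0, len(pkts), window_size)]
        for key, pkts in grouped.items()
    }

# Hypothetical input: five packets between one client and one target, plus noise.
clients = [ipaddress.ip_network("10.0.0.0/24")]
packets = [
    {"src": "10.0.0.5", "dst": "1.2.3.4"},
    {"src": "1.2.3.4", "dst": "10.0.0.5"},
    {"src": "10.0.0.5", "dst": "1.2.3.4"},
    {"src": "8.8.8.8", "dst": "9.9.9.9"},
    {"src": "1.2.3.4", "dst": "10.0.0.5"},
    {"src": "10.0.0.5", "dst": "1.2.3.4"},
]
segments = group_fixed_size(packets, clients, window_size=2)
```

With a window size of 2, the five matching packets yield two full segments and one remainder segment of a single packet under the key ("10.0.0.5", "1.2.3.4").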

CovertMark.analytics.traffic.ordered_tcp_payload_length_frequency(packets, tls_only=False, bandwidth=3)[source]

Utilises meanshift to cluster input TCP frames by their payload length to within a certain difference (bandwidth), and returns the clusters in descending order of frequency. This is useful if the PT sends a lot of unidirectional payloads of equal or similar length, for which the packets should have been filtered by source or destination IP.

Parameters:
  • packets (list) – a list of parsed packets; non-TCP packets will be ignored.
  • tls_only (bool) – if True, ignore non-TLS frames, including TCP frames that carry segmented TLS data but no TLS headers.
  • bandwidth (int) – the maximum distance within clusters, i.e. max difference between payload lengths.
Returns:

a list of sets containing clustered values ordered from most frequent to least. Packets with TCP payload lengths close to a likely MTU limit are not included in clustering to prevent bias against commonly seen non-Large Segment Offload traces captured on devices.
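
To illustrate the idea of clustering payload lengths within a bandwidth, here is a small pure-Python sketch of a one-dimensional flat-kernel mean shift. It is a stand-in written for this example, not the library's code: the function name mean_shift_1d, the convergence tolerance, and the rule for merging nearby modes are assumptions, and the exclusion of near-MTU lengths mentioned above is not modelled.

```python
def mean_shift_1d(values, bandwidth=3, max_iter=100):
    """Cluster numbers with a flat-kernel mean shift; return sets, largest first."""
    points = list(values)
    modes = []
    for p in points:
        centre = float(p)
        for _ in range(max_iter):
            # Flat kernel: every point within `bandwidth` of the centre counts equally.
            neighbours = [q for q in points if abs(q - centre) <= bandwidth]
            if not neighbours:
                break
            new_centre = sum(neighbours) / len(neighbours)
            if abs(new_centre - centre) < 1e-6:
                break
            centre = new_centre
        modes.append(centre)

    # Assumption: points whose converged modes lie within bandwidth / 2 of the
    # previous mode (in sorted order) belong to the same cluster.
    clusters = []  # each entry: [last_mode, member_points]
    for mode, point in sorted(zip(modes, points)):
        if clusters and mode - clusters[-1][0] <= bandwidth / 2:
            clusters[-1][0] = mode
            clusters[-1][1].append(point)
        else:
            clusters.append([mode, [point]])

    # Order clusters by how many packets fell into them, most frequent first.
    clusters.sort(key=lambda c: len(c[1]), reverse=True)
    return [set(members) for _, members in clusters]

# Hypothetical TCP payload lengths from packets already filtered by direction.
lengths = [100, 101, 102, 100, 101, 500, 501, 1200]
ordered = mean_shift_1d(lengths, bandwidth=3)
```

Here five packets fall near 100 bytes, two near 500 bytes and one at 1200 bytes, so the clusters come back in that order.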

CovertMark.analytics.traffic.ordered_udp_payload_length_frequency(packets, bandwidth=3)[source]

Utilises meanshift to cluster input UDP frames by their packet length to within a certain difference (bandwidth), and returns the clusters in descending order of frequency. This is useful if the PT sends a lot of unidirectional UDP payloads of equal or similar length, for which the packets should have been filtered by source or destination IP.

Parameters:
  • packets (list) – a list of parsed packets; non-UDP packets will be ignored.
  • bandwidth (int) – the maximum distance within clusters, i.e. max difference between payload lengths.
Returns:

a list of sets containing clustered values ordered from the most frequent to the least.

CovertMark.analytics.traffic.synchronise_packets(packets, target_time, sort=False)[source]

Synchronise the input packets with another trace by shifting the time of the first frame to align with the target time supplied.

Parameters:
  • packets (list) – input packets to be time-shifted. They should be chronologically ordered, or sort should be set to True; otherwise the results will be erroneous.
  • target_time (float) – a valid UNIX timestamp with at least 6 decimal places.
  • sort (bool) – if True, the function will chronologically sort the input packets first.
Returns:

time shifted input packets.
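
The time shift amounts to adding one constant offset to every packet, as in the sketch below. The "time" key on each packet dictionary and the helper name shift_to_target are assumptions made for this example.

```python
def shift_to_target(packets, target_time, sort=False):
    """Shift all packet times so that the first packet lands on target_time."""
    if sort:
        packets = sorted(packets, key=lambda p: p["time"])
    if not packets:
        return []
    # The first packet defines the offset, hence the need for chronological order.
    offset = target_time - packets[0]["time"]
    # Return shifted copies so the caller's packets are left unmodified.
    return [{**p, "time": p["time"] + offset} for p in packets]

# Hypothetical trace aligned to start at UNIX time 100.0.
trace = [{"time": 10.0}, {"time": 10.5}, {"time": 12.0}]
shifted = shift_to_target(trace, target_time=100.0)
```

The gaps between packets are preserved; only the absolute position of the trace changes. If the input is unsorted and sort is left False, the offset is taken from whichever packet happens to be first, which is the erroneous case the parameter description warns about.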

CovertMark.analytics.traffic.window_packets_fixed_size(packets, window_size)[source]

Segment packets into fixed-size windows, discarding any remainder.

Parameters:
  • packets (list) – a list of parsed packets.
  • window_size (int) – the constant frame-count of each windowed segment, which will be segmented in chronological order.
Returns:

a 2-D list containing windowed packets.

Raises:

ValueError if the fixed window size is invalid.
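
A minimal sketch of fixed-size windowing with the remainder discarded is shown below. The helper name window_fixed is hypothetical, and what counts as an invalid window size here (anything that is not a positive integer) is an assumption; the library may apply additional checks.

```python
def window_fixed(packets, window_size):
    """Split packets into consecutive windows of exactly window_size items."""
    if not isinstance(window_size, int) or window_size < 1:
        raise ValueError("window_size must be a positive integer")
    # Ignore the trailing packets that cannot fill a whole window.
    full = len(packets) - len(packets) % window_size
    return [packets[i:i + window_size] for i in range(0, full, window_size)]

# Seven placeholder packets in windows of three: the seventh is discarded.
windows = window_fixed(list(range(7)), 3)
```

This differs from group_packets_by_ip_fixed_size(), which retains the remainder.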

CovertMark.analytics.traffic.window_packets_time_series(packets, chronological_window, sort=True)[source]

Segment packets into windows of fixed duration.

Parameters:
  • packets (list) – a list of parsed packets.
  • chronological_window (int) – the number of microseconds covered by each windowed segment, in chronological order.
  • sort (bool) – if True, packets will be sorted again into chronological order, which is useful if packet times are not guaranteed to be chronologically ascending. True by default.
Returns:

a 2-D list containing windowed packets.
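
One way to window by elapsed time is sketched below. It assumes each packet is a dictionary with a "time" key holding a UNIX timestamp in seconds, and it starts each new window at the first packet that falls outside the current one; the library may instead align windows to a fixed grid, so treat the boundary rule and the helper name window_by_time as assumptions.

```python
def window_by_time(packets, chronological_window, sort=True):
    """Split packets into windows spanning chronological_window microseconds."""
    if sort:
        packets = sorted(packets, key=lambda p: p["time"])
    windows = []
    window_start = None
    for pkt in packets:
        t_us = pkt["time"] * 1_000_000  # seconds -> microseconds
        # Open a new window for the first packet, or once the span is exceeded.
        if window_start is None or t_us - window_start >= chronological_window:
            windows.append([])
            window_start = t_us
        windows[-1].append(pkt)
    return windows

# Hypothetical packets, deliberately out of order, windowed at 60 seconds.
trace = [{"time": t} for t in (20.0, 0.0, 59.0, 125.0, 61.0)]
windowed = window_by_time(trace, chronological_window=60_000_000)
```

The 60-second window matches the example duration suggested for feeding group_packets_by_ip_fixed_size().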