module regex_match_metrics

RegexMatchMetrics class.


class RegexMatchMetrics

Class to measure what we encounter as we iterate over all matches of a relatively simple byte-level regex.

It tracks things like how many of the matched bytes we were able to decode easily vs. by force vs. not at all, and whether some encodings have a higher percentage of success than others (which may indicate that part of our mystery data is encoded that way).

Example: running "Find bytes between quotes" against a relatively large pool of close-to-random encrypted binary data.

Attributes:

  • match_count (int): Total number of matches found.
  • bytes_matched (int): Total number of bytes matched across all matches.
  • matches_decoded (int): Number of matches where we were able to decode at least some of the matched bytes.
  • easy_decode_count (int): Number of matches where we were able to decode the matched bytes without forcing.
  • forced_decode_count (int): Number of matches where we were only able to decode the matched bytes by forcing.
  • undecodable_count (int): Number of matches where we were unable to decode any of the matched bytes.
  • skipped_matches_lengths (defaultdict): Dictionary mapping lengths of skipped matches to their counts.
  • bytes_match_objs (list): List of BytesMatch objects for all matches encountered.
  • per_encoding_stats (defaultdict): Dictionary mapping encoding names to their respective RegexMatchMetrics.

TODO: use @dataclass decorator https://realpython.com/python-data-classes/
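
As a rough illustration of the TODO above, the listed attributes could be declared with @dataclass along these lines. This is only a sketch: the class name RegexMatchMetricsSketch is hypothetical, the default values are assumptions, and bytes_match_objs is typed as a plain list because BytesMatch is not defined here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RegexMatchMetricsSketch:
    """Hypothetical dataclass version of the documented attributes."""

    match_count: int = 0
    bytes_matched: int = 0
    matches_decoded: int = 0
    easy_decode_count: int = 0
    forced_decode_count: int = 0
    undecodable_count: int = 0
    # Maps skipped match length -> number of skipped matches of that length.
    skipped_matches_lengths: defaultdict = field(
        default_factory=lambda: defaultdict(int)
    )
    # The real class stores BytesMatch objects; a plain list stands in here.
    bytes_match_objs: list = field(default_factory=list)
    # Maps encoding name -> nested metrics object, created on first access.
    per_encoding_stats: defaultdict = field(
        default_factory=lambda: defaultdict(RegexMatchMetricsSketch)
    )


metrics = RegexMatchMetricsSketch()
metrics.per_encoding_stats["utf-8"].match_count += 1
```

Mutable attributes use default_factory so that each instance gets its own containers rather than sharing one across instances.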

method __init__

__init__() → None

method num_matches_skipped_for_being_empty

num_matches_skipped_for_being_empty() → int

Number of matches skipped for being empty (0 length).


method num_matches_skipped_for_being_too_big

num_matches_skipped_for_being_too_big() → int

Number of matches skipped for being too big to decode.
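
One plausible way to derive both skip counts from skipped_matches_lengths is sketched below. It assumes that length 0 marks an empty match and that any skipped match with a positive length was skipped for being too big; the helper function names are hypothetical and the sample lengths are made up for illustration.

```python
from collections import defaultdict

# Maps skipped match length -> number of skipped matches of that length.
skipped_matches_lengths = defaultdict(int)

# Made-up sample: two empty matches and three oversized ones.
for length in [0, 0, 5000, 12000, 5000]:
    skipped_matches_lengths[length] += 1


def count_skipped_empty(lengths):
    """Count skipped matches of length 0 (hypothetical helper)."""
    return lengths.get(0, 0)


def count_skipped_too_big(lengths):
    """Count skipped matches of any positive length (hypothetical helper)."""
    return sum(count for length, count in lengths.items() if length > 0)


empty = count_skipped_empty(skipped_matches_lengths)
too_big = count_skipped_too_big(skipped_matches_lengths)
```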


method tally_match

tally_match(decoder: BytesDecoder) → None

Tally statistics from a BytesDecoder after it has processed a match.

Args:

  • decoder (BytesDecoder): The BytesDecoder that processed a match.
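
The tallying idea can be sketched as follows. Everything here is an assumption made for illustration: FakeDecoder is a hypothetical stand-in whose match_length and outcomes fields are not the real BytesDecoder interface, and TallySketch only mirrors the counters documented above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FakeDecoder:
    """Hypothetical stand-in for BytesDecoder (not the real interface)."""

    match_length: int
    # Encoding name -> "easy", "forced", or "failed".
    outcomes: dict = field(default_factory=dict)


@dataclass
class TallySketch:
    match_count: int = 0
    bytes_matched: int = 0
    matches_decoded: int = 0
    easy_decode_count: int = 0
    forced_decode_count: int = 0
    undecodable_count: int = 0
    per_encoding_stats: defaultdict = field(
        default_factory=lambda: defaultdict(TallySketch)
    )

    def tally_match(self, decoder):
        # Overall totals for every processed match.
        self.match_count += 1
        self.bytes_matched += decoder.match_length
        # Per-encoding totals, split by how the decode attempt went.
        for encoding, outcome in decoder.outcomes.items():
            stats = self.per_encoding_stats[encoding]
            stats.match_count += 1
            stats.bytes_matched += decoder.match_length
            if outcome == "easy":
                stats.matches_decoded += 1
                stats.easy_decode_count += 1
            elif outcome == "forced":
                stats.matches_decoded += 1
                stats.forced_decode_count += 1
            else:
                stats.undecodable_count += 1


tally = TallySketch()
tally.tally_match(FakeDecoder(10, {"utf-8": "easy", "utf-16": "failed"}))
tally.tally_match(FakeDecoder(4, {"utf-8": "forced", "utf-16": "failed"}))

# Share of matches each encoding managed to decode, easily or by force.
success_rates = {
    encoding: stats.matches_decoded / stats.match_count
    for encoding, stats in tally.per_encoding_stats.items()
}
```

Comparing success_rates across encodings is the kind of signal the class description refers to: an encoding that decodes far more matches than the others may hint at how part of the data is encoded.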

This file was automatically generated via lazydocs.