module regex_match_metrics
RegexMatchMetrics class.
class RegexMatchMetrics
Class to measure what we encounter as we iterate over all matches of a relatively simple byte-level regex.
Things like how many of our matched bytes we were able to decode easily vs. by force vs. not at all, and whether some encodings have a higher percentage of success than others (indicating that part of our mystery data might be encoded that way).
Example: "Find bytes between quotes" against a relatively large pool of close to random encrypted binary data.
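As a rough illustration of the kind of scan the example describes, the sketch below runs a "bytes between quotes" regex over a short byte string and tries a strict decode of each match with a few encodings. The pattern, the sample data, the encoding list, and the scan() helper are all assumptions made for illustration and are not taken from the library.

```python
import re
from collections import defaultdict

# Hypothetical pattern: the shortest run of bytes between two double quotes.
QUOTED_BYTES = re.compile(rb'"(.*?)"', re.DOTALL)

# Illustrative encodings only; the real tool may try a different set.
ENCODINGS = ["utf-8", "utf-16-le", "latin-1"]


def scan(data: bytes) -> dict:
    """Count matches, matched bytes, and per-encoding strict decode successes."""
    stats = {"match_count": 0, "bytes_matched": 0, "decoded_by": defaultdict(int)}
    for match in QUOTED_BYTES.finditer(data):
        matched = match.group(1)
        stats["match_count"] += 1
        stats["bytes_matched"] += len(matched)
        for encoding in ENCODINGS:
            try:
                matched.decode(encoding)  # strict decode, no forcing
            except UnicodeDecodeError:
                continue
            stats["decoded_by"][encoding] += 1
    return stats


sample = b'\x00\x9f"hello"\xfe\xff"\xff\xfe\xfd"\x01'
result = scan(sample)
```

An encoding whose success count stands well above the others on real data would be the kind of hint the class description refers to.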
Attributes:
match_count (int): Total number of matches found.
bytes_matched (int): Total number of bytes matched across all matches.
matches_decoded (int): Number of matches where we were able to decode at least some of the matched bytes.
easy_decode_count (int): Number of matches where we were able to decode the matched bytes without forcing.
forced_decode_count (int): Number of matches where we were only able to decode the matched bytes by forcing.
undecodable_count (int): Number of matches where we were unable to decode any of the matched bytes.
skipped_matches_lengths (defaultdict): Dictionary mapping lengths of skipped matches to their counts.
bytes_match_objs (list): List of BytesMatch objects for all matches encountered.
per_encoding_stats (defaultdict): Dictionary mapping encoding names to their respective RegexMatchMetrics.
TODO: use @dataclass decorator https://realpython.com/python-data-classes/
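One way the TODO might be approached is sketched below, assuming every counter starts at zero and the containers start empty. The class name RegexMatchMetricsSketch and all defaults are hypothetical; only the attribute names and types come from the list above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RegexMatchMetricsSketch:
    """Hypothetical dataclass version of the documented attributes."""
    match_count: int = 0
    bytes_matched: int = 0
    matches_decoded: int = 0
    easy_decode_count: int = 0
    forced_decode_count: int = 0
    undecodable_count: int = 0
    # Mutable defaults need default_factory so instances do not share state.
    skipped_matches_lengths: defaultdict = field(
        default_factory=lambda: defaultdict(int)
    )
    bytes_match_objs: list = field(default_factory=list)
    # Each encoding name lazily gets its own nested metrics object.
    per_encoding_stats: defaultdict = field(
        default_factory=lambda: defaultdict(RegexMatchMetricsSketch)
    )


metrics = RegexMatchMetricsSketch()
metrics.per_encoding_stats["utf-8"].match_count += 1
```

The default_factory lambdas matter here: a bare list or defaultdict as a class-level default would be rejected by @dataclass, and the nested factory is only evaluated when an encoding is first looked up.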
method __init__
__init__() → None
method num_matches_skipped_for_being_empty
num_matches_skipped_for_being_empty() → int
Number of matches skipped for being empty (0 length).
method num_matches_skipped_for_being_too_big
num_matches_skipped_for_being_too_big() → int
Number of matches skipped for being too big to decode.
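Both skip counts could plausibly be derived from skipped_matches_lengths. The sketch below assumes a skipped match of length 0 was empty and that any skipped match of nonzero length was too big; that split, and the two helper function names, are assumptions rather than the documented behavior.

```python
from collections import defaultdict


def num_skipped_empty(skipped_matches_lengths) -> int:
    """Count of skipped matches recorded with length 0."""
    return skipped_matches_lengths.get(0, 0)


def num_skipped_too_big(skipped_matches_lengths) -> int:
    """Count of skipped matches recorded with any nonzero length."""
    return sum(
        count for length, count in skipped_matches_lengths.items() if length > 0
    )


# Illustrative data: three empty matches and three oversized ones.
skipped = defaultdict(int)
for length in [0, 0, 5000, 0, 12000, 5000]:
    skipped[length] += 1

empty_count = num_skipped_empty(skipped)
too_big_count = num_skipped_too_big(skipped)
```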
method tally_match
tally_match(decoder: BytesDecoder) → None
Tally statistics from a BytesDecoder after it has processed a match.
Args:
decoder (BytesDecoder): The BytesDecoder that processed a match.
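The interface of BytesDecoder is not described here, so the following is only a sketch of how a tally step could update the documented counters. FakeDecoder and its fields are hypothetical stand-ins, Tally covers only a subset of the attributes, and the rule that an easy decode takes precedence over a forced one is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDecoder:
    """Hypothetical stand-in for BytesDecoder; these fields are assumptions."""
    matched_bytes: bytes
    easy_encodings: list = field(default_factory=list)    # decoded without forcing
    forced_encodings: list = field(default_factory=list)  # decoded only by forcing


@dataclass
class Tally:
    """Hypothetical subset of the counters documented for RegexMatchMetrics."""
    match_count: int = 0
    bytes_matched: int = 0
    matches_decoded: int = 0
    easy_decode_count: int = 0
    forced_decode_count: int = 0
    undecodable_count: int = 0

    def tally_match(self, decoder: FakeDecoder) -> None:
        """Update the counters from one processed match."""
        self.match_count += 1
        self.bytes_matched += len(decoder.matched_bytes)
        if decoder.easy_encodings:
            self.matches_decoded += 1
            self.easy_decode_count += 1
        elif decoder.forced_encodings:
            self.matches_decoded += 1
            self.forced_decode_count += 1
        else:
            self.undecodable_count += 1


tally = Tally()
tally.tally_match(FakeDecoder(b"hello", easy_encodings=["utf-8"]))
tally.tally_match(FakeDecoder(b"\xff\xfe", forced_encodings=["utf-8"]))
tally.tally_match(FakeDecoder(b"\x81"))
```

Under these assumptions every match lands in exactly one of the easy, forced, or undecodable buckets, so the three counts always sum to match_count.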
This file was automatically generated via lazydocs.