module encoding_detector

EncodingDetector class for managing chardet encoding detection.

Global Variables

  • ENCODING
  • CONFIDENCE_SCORE_RANGE

class EncodingDetector

Manager class that simplifies working with the chardet encoding detection library.

Each instance of this class manages a chardet.detect_all() scan on a single set of bytes.

Attributes:

  • bytes (bytes): The bytes to analyze.
  • bytes_len (int): The length of the bytes.
  • table (Table): A rich Table object summarizing the chardet results.
  • assessments (List[EncodingAssessment]): List of EncodingAssessment objects from chardet results.
  • unique_assessments (List[EncodingAssessment]): Unique assessments by encoding, highest confidence only.
  • raw_chardet_assessments (List[dict]): Raw list of dicts returned by chardet.detect_all().
  • force_decode_assessments (List[EncodingAssessment]): Assessments above force decode threshold.
  • force_display_assessments (List[EncodingAssessment]): Assessments above force display threshold.
  • has_any_idea (Optional[bool]): True if chardet had any idea what the encoding might be, False if not, None if chardet wasn't run yet.
  • force_display_threshold (float): [class variable] Default confidence threshold for forcing display in decoded table.
  • force_decode_threshold (float): [class variable] Default confidence threshold for forcing a decode attempt.
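
To illustrate how raw_chardet_assessments, unique_assessments, and has_any_idea might relate, here is a minimal sketch that does not import chardet. The sample list only mimics the shape of chardet.detect_all() output (dicts with "encoding", "confidence", and "language" keys). The helper names unique_by_encoding and has_any_idea, the sample values, and the case-insensitive comparison are assumptions for illustration, not the class's actual implementation.

```python
from typing import Dict, List, Optional

# Stand-in data shaped like the list of dicts returned by chardet.detect_all().
# The values are made up for illustration.
raw_chardet_assessments: List[dict] = [
    {"encoding": "utf-8", "confidence": 0.87, "language": ""},
    {"encoding": "windows-1251", "confidence": 0.73, "language": "Russian"},
    {"encoding": "windows-1251", "confidence": 0.41, "language": "Bulgarian"},
]


def unique_by_encoding(raw: List[dict]) -> List[dict]:
    """Keep one result per encoding, preferring the highest confidence (hypothetical helper)."""
    best: Dict[str, dict] = {}
    for result in raw:
        if result["encoding"] is None:
            continue  # chardet reports encoding=None when it has no guess
        key = result["encoding"].lower()  # assumption: compare names case-insensitively
        if key not in best or result["confidence"] > best[key]["confidence"]:
            best[key] = result
    # Highest confidence first
    return sorted(best.values(), key=lambda r: r["confidence"], reverse=True)


def has_any_idea(raw: Optional[List[dict]]) -> Optional[bool]:
    """None if no scan was run, else whether any result names an encoding (hypothetical helper)."""
    if raw is None:
        return None
    return any(r["encoding"] is not None for r in raw)


unique = unique_by_encoding(raw_chardet_assessments)
```

In this sketch, the two windows-1251 guesses collapse into the one with the higher confidence, which mirrors the "highest confidence only" description of unique_assessments.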

method __init__

__init__(_bytes: bytes) → None

Args:

  • _bytes (bytes): The bytes to analyze with chardet.

method assessments_above_confidence

assessments_above_confidence(cutoff: float) → List[EncodingAssessment]

Return the assessments above the given confidence cutoff.
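
A confidence cutoff of this kind can be sketched as a simple filter. The Assessment dataclass below is a hypothetical stand-in for EncodingAssessment, the 0.0 to 1.0 scale follows chardet's raw scores (the real class may use a different scale), and treating the cutoff as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Assessment:
    """Hypothetical stand-in for EncodingAssessment."""
    encoding: str
    confidence: float


def assessments_above_confidence(assessments: List[Assessment], cutoff: float) -> List[Assessment]:
    """Return assessments whose confidence meets the cutoff (inclusive by assumption)."""
    return [a for a in assessments if a.confidence >= cutoff]


# Made-up sample data
sample = [
    Assessment("utf-8", 0.87),
    Assessment("windows-1251", 0.73),
    Assessment("ascii", 0.10),
]

confident = assessments_above_confidence(sample, cutoff=0.5)
```

The same filter, applied with two different thresholds, would be one way to derive lists such as force_decode_assessments and force_display_assessments.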


method get_encoding_assessment

get_encoding_assessment(encoding: str) → EncodingAssessment

Get the chardet assessment for a specific encoding.

Args:

  • encoding (str): The encoding to look for.

Returns:

  • EncodingAssessment: Assessment for the given encoding if it exists, otherwise a dummy with 0 confidence.
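
The "dummy with 0 confidence" fallback described above can be sketched as follows. Assessment is a hypothetical stand-in for EncodingAssessment, and the case-insensitive name match is an assumption rather than confirmed behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Assessment:
    """Hypothetical stand-in for EncodingAssessment."""
    encoding: str
    confidence: float


def get_encoding_assessment(assessments: List[Assessment], encoding: str) -> Assessment:
    """Return the matching assessment, or a zero-confidence placeholder if none matches."""
    wanted = encoding.lower()  # assumption: names are matched case-insensitively
    for assessment in assessments:
        if assessment.encoding.lower() == wanted:
            return assessment
    return Assessment(encoding=encoding, confidence=0.0)


# Made-up sample data
sample = [Assessment("utf-8", 0.87), Assessment("windows-1251", 0.73)]

found = get_encoding_assessment(sample, "UTF-8")
missing = get_encoding_assessment(sample, "shift_jis")
```

Returning a placeholder instead of None lets callers compare confidence values directly without a separate missing-value check.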

method has_enough_bytes

has_enough_bytes() → bool

Return True if we have enough bytes to run chardet.detect().
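
A length guard like this might look like the sketch below. The constant MIN_BYTES_FOR_DETECTION and its value are hypothetical; the documentation above does not state the real minimum.

```python
# Hypothetical minimum; the actual threshold is not given in this documentation.
MIN_BYTES_FOR_DETECTION = 9


def has_enough_bytes(data: bytes, minimum: int = MIN_BYTES_FOR_DETECTION) -> bool:
    """Return True if `data` is long enough to be worth scanning for an encoding."""
    return len(data) >= minimum


short_ok = has_enough_bytes(b"abc")
long_ok = has_enough_bytes(b"a longer run of bytes to analyze")
```

Very short inputs give a detector little statistical evidence, so skipping the scan for them avoids acting on low-quality guesses.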


This file was automatically generated via lazydocs.