module encoding_detector
EncodingDetector class for managing chardet encoding detection.
Global Variables
- ENCODING
- CONFIDENCE_SCORE_RANGE
class EncodingDetector
Manager class that simplifies working with the chardet encoding detection library.
Each instance of this class manages a chardet.detect_all() scan on a single set of bytes.
Attributes:
- bytes (bytes): The bytes to analyze.
- bytes_len (int): The length of the bytes.
- table (Table): A rich Table object summarizing the chardet results.
- assessments (List[EncodingAssessment]): List of EncodingAssessment objects from chardet results.
- unique_assessments (List[EncodingAssessment]): Unique assessments by encoding, highest confidence only.
- raw_chardet_assessments (List[dict]): Raw list of dicts returned by chardet.detect_all().
- force_decode_assessments (List[EncodingAssessment]): Assessments above force decode threshold.
- force_display_assessments (List[EncodingAssessment]): Assessments above force display threshold.
- has_any_idea (Optional[bool]): True if chardet had any idea what the encoding might be, False if not, None if chardet wasn't run yet.
- force_display_threshold (float): [class variable] Default confidence threshold for forcing display in decoded table.
- force_decode_threshold (float): [class variable] Default confidence threshold for forcing a decode attempt.
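The relationship between raw_chardet_assessments and unique_assessments can be illustrated with a minimal sketch. It is not the class's actual implementation: the helper name unique_by_encoding, the case-insensitive comparison, and the sample data are assumptions made for illustration. The sample is hand-written in the list-of-dicts shape that chardet.detect_all() returns (keys encoding, confidence, language), so the snippet runs without chardet installed.

```python
from typing import Dict, List

# Hand-written sample in the shape chardet.detect_all() returns.
# These values are illustrative, not real chardet output.
raw_results: List[dict] = [
    {"encoding": "utf-8", "confidence": 0.87, "language": ""},
    {"encoding": "Windows-1252", "confidence": 0.40, "language": "French"},
    {"encoding": "Windows-1252", "confidence": 0.73, "language": "Spanish"},
]


def unique_by_encoding(results: List[dict]) -> List[dict]:
    """Keep only the highest-confidence result for each encoding (hypothetical helper)."""
    best: Dict[str, dict] = {}

    for result in results:
        encoding = result["encoding"]

        # chardet reports encoding=None when it has no guess at all.
        if encoding is None:
            continue

        key = encoding.lower()  # Assumption: encodings are compared case-insensitively.

        if key not in best or result["confidence"] > best[key]["confidence"]:
            best[key] = result

    # Most confident first.
    return sorted(best.values(), key=lambda r: r["confidence"], reverse=True)


unique = unique_by_encoding(raw_results)
```

Here the two Windows-1252 entries collapse into the one with the higher confidence, which mirrors the "highest confidence only" wording of the attribute description.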
method __init__
__init__(_bytes: bytes) → None
Args:
_bytes (bytes): The bytes to analyze with chardet.
method assessments_above_confidence
assessments_above_confidence(cutoff: float) → List[EncodingAssessment]
Return the assessments above the given confidence cutoff.
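A cutoff filter of this kind might look like the sketch below. The Assessment dataclass is a hypothetical stand-in for EncodingAssessment, whose real fields are not documented here. The 0.0 to 1.0 confidence scale (the one chardet itself uses) and the strict "greater than" comparison are also assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assessment:
    """Hypothetical stand-in for EncodingAssessment."""
    encoding: str
    confidence: float  # Assumption: 0.0 to 1.0, as chardet reports it.


def above_confidence(assessments: List[Assessment], cutoff: float) -> List[Assessment]:
    """Return the assessments whose confidence exceeds the cutoff."""
    # Assumption: "above" means strictly greater than the cutoff.
    return [a for a in assessments if a.confidence > cutoff]


sample = [
    Assessment("utf-8", 0.87),
    Assessment("windows-1252", 0.73),
    Assessment("ascii", 0.10),
]

confident = above_confidence(sample, cutoff=0.5)
```

The same filter, applied with the force decode and force display thresholds, would yield lists like force_decode_assessments and force_display_assessments.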
method get_encoding_assessment
get_encoding_assessment(encoding: str) → EncodingAssessment
Get the chardet assessment for a specific encoding.
Args:
encoding (str): The encoding to look for.
Returns:
EncodingAssessment: Assessment for the given encoding if it exists, otherwise a dummy with 0 confidence.
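The "return a dummy with 0 confidence" behavior described above can be sketched as a lookup with a fallback. The Assessment dataclass and find_assessment function are hypothetical names, and the case-insensitive match is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assessment:
    """Hypothetical stand-in for EncodingAssessment."""
    encoding: str
    confidence: float


def find_assessment(assessments: List[Assessment], encoding: str) -> Assessment:
    """Return the assessment for `encoding`, or a zero-confidence placeholder."""
    wanted = encoding.lower()  # Assumption: lookups ignore case.

    for assessment in assessments:
        if assessment.encoding.lower() == wanted:
            return assessment

    # Nothing matched, so hand back a dummy instead of None or an exception.
    return Assessment(encoding=wanted, confidence=0.0)


sample = [Assessment("utf-8", 0.87), Assessment("windows-1252", 0.73)]

found = find_assessment(sample, "UTF-8")
missing = find_assessment(sample, "shift_jis")
```

Returning a placeholder rather than None means callers can read the confidence directly without a separate existence check.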
method has_enough_bytes
has_enough_bytes() → bool
Return True if we have enough bytes to run chardet.detect().
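A length guard like this is typically a single comparison against a minimum size. In the sketch below, the constant MIN_BYTES_FOR_DETECTION, its value, and the inclusive comparison are all hypothetical. The real threshold is not stated in this reference.

```python
# Hypothetical threshold: the actual minimum used by the class is not documented here.
MIN_BYTES_FOR_DETECTION = 9


def has_enough_bytes(data: bytes, minimum: int = MIN_BYTES_FOR_DETECTION) -> bool:
    """Return True if `data` is long enough to be worth scanning."""
    return len(data) >= minimum


long_enough = has_enough_bytes(b"hello, world!")
too_short = has_enough_bytes(b"hi")
```

Skipping detection on very short inputs avoids acting on guesses drawn from too little data.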
This file was automatically generated via lazydocs.