module encoding_detector
EncodingDetector class for managing chardet encoding detection.
Global Variables
- ENCODING
- CONFIDENCE_SCORE_RANGE
class EncodingDetector
Manager class to ease dealing with the encoding detection library chardet.
Each instance of this class manages a chardet.detect_all() scan on a single set of bytes.
Attributes:
- bytes (bytes): The bytes to analyze.
- bytes_len (int): The length of the bytes.
- table (Table): A rich Table object summarizing the chardet results.
- assessments (List[EncodingAssessment]): List of EncodingAssessment objects from chardet results.
- unique_assessments (List[EncodingAssessment]): Unique assessments by encoding, highest confidence only.
- raw_chardet_assessments (List[dict]): Raw list of dicts returned by chardet.detect_all().
- force_decode_assessments (List[EncodingAssessment]): Assessments above the force decode threshold.
- force_display_assessments (List[EncodingAssessment]): Assessments above the force display threshold.
- has_any_idea (Optional[bool]): True if chardet had any idea what the encoding might be, False if not, None if chardet wasn't run yet.
- force_display_threshold (float): [class variable] Default confidence threshold for forcing display in the decoded table.
- force_decode_threshold (float): [class variable] Default confidence threshold for forcing a decode attempt.
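The idea behind unique_assessments (one entry per encoding, highest confidence only) can be sketched on raw chardet.detect_all() style output. This is a minimal illustration, not the class's actual implementation: the sample data is hand-written, and the helper name unique_by_encoding and the case-insensitive grouping are assumptions.

```python
from typing import Dict, List

# Hand-written sample in the shape chardet.detect_all() returns:
# a list of dicts with 'encoding', 'confidence', and 'language' keys.
# In real use this would come from chardet.detect_all(some_bytes).
raw_results: List[dict] = [
    {"encoding": "utf-8", "confidence": 0.87, "language": ""},
    {"encoding": "Windows-1252", "confidence": 0.40, "language": "French"},
    {"encoding": "Windows-1252", "confidence": 0.73, "language": "Spanish"},
    {"encoding": None, "confidence": 0.0, "language": None},
]


def unique_by_encoding(results: List[dict]) -> List[dict]:
    """Keep the most confident result per encoding, sorted by confidence (hypothetical helper)."""
    best: Dict[str, dict] = {}
    for result in results:
        encoding = result.get("encoding")
        if encoding is None:  # chardet can report None when it has no guess
            continue
        key = encoding.lower()  # assumption: group encodings case-insensitively
        if key not in best or result["confidence"] > best[key]["confidence"]:
            best[key] = result
    return sorted(best.values(), key=lambda r: r["confidence"], reverse=True)


unique = unique_by_encoding(raw_results)
# Loosely mirrors has_any_idea: True when at least one real guess survived.
has_any_idea = len(unique) > 0
```

Grouping by lowercased name avoids treating "Windows-1252" and "windows-1252" as different encodings; whether the real class does this is not stated in the documentation above.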
method __init__
__init__(_bytes: bytes) → None
Args:
- _bytes (bytes): The bytes to analyze with chardet.
method assessments_above_confidence
assessments_above_confidence(cutoff: float) → List[EncodingAssessment]
Return the assessments above the given confidence cutoff.
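A confidence filter of this kind might look like the sketch below. SketchAssessment is a hypothetical stand-in for EncodingAssessment (whose real fields are not documented here), the function is written as a free function rather than a method, and treating the cutoff as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchAssessment:
    """Hypothetical stand-in for EncodingAssessment; the real class may differ."""
    encoding: str
    confidence: float


def assessments_above_confidence(
    assessments: List[SketchAssessment], cutoff: float
) -> List[SketchAssessment]:
    """Return assessments whose confidence is at least `cutoff` (inclusive by assumption)."""
    return [a for a in assessments if a.confidence >= cutoff]


sample = [
    SketchAssessment("utf-8", 0.87),
    SketchAssessment("windows-1252", 0.73),
    SketchAssessment("ascii", 0.10),
]
confident = assessments_above_confidence(sample, cutoff=0.5)
```

The sample uses chardet's own 0.0 to 1.0 confidence scale; if the assessments store confidence on a different scale, the cutoff has to be given on that same scale.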
method get_encoding_assessment
get_encoding_assessment(encoding: str) → EncodingAssessment
Get the chardet assessment for a specific encoding.
Args:
- encoding (str): The encoding to look for.
Returns:
EncodingAssessment: Assessment for the given encoding if it exists, otherwise a dummy with 0 confidence.
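The "dummy with 0 confidence" fallback means callers never have to handle a missing result. A minimal sketch of that lookup pattern follows; SketchAssessment is a hypothetical stand-in for EncodingAssessment, and the case-insensitive comparison is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchAssessment:
    """Hypothetical stand-in for EncodingAssessment; the real class may differ."""
    encoding: str
    confidence: float


def get_encoding_assessment(
    assessments: List[SketchAssessment], encoding: str
) -> SketchAssessment:
    """Return the matching assessment, or a zero-confidence placeholder if none exists."""
    wanted = encoding.lower()  # assumption: encoding names match case-insensitively
    for assessment in assessments:
        if assessment.encoding.lower() == wanted:
            return assessment
    # No match: hand back a dummy so callers can read .confidence unconditionally.
    return SketchAssessment(encoding=encoding, confidence=0.0)


known = [SketchAssessment("utf-8", 0.87), SketchAssessment("windows-1252", 0.73)]
found = get_encoding_assessment(known, "UTF-8")
missing = get_encoding_assessment(known, "shift_jis")
```

Returning a placeholder rather than None lets code such as a threshold comparison work without a separate existence check.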
method has_enough_bytes
has_enough_bytes() → bool
Return True if we have enough bytes to run chardet.detect().
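Very short byte strings tend to give detection libraries little to work with, so a length guard before scanning is a common pattern. The sketch below assumes a hypothetical minimum, MIN_BYTES_FOR_DETECTION; the actual threshold the class uses, and where it is configured, are not given in this documentation.

```python
# Hypothetical minimum length; the real threshold is not documented here.
MIN_BYTES_FOR_DETECTION = 9


def has_enough_bytes(data: bytes, minimum: int = MIN_BYTES_FOR_DETECTION) -> bool:
    """Return True if `data` is long enough to be worth scanning."""
    return len(data) >= minimum


too_short = has_enough_bytes(b"abc")
long_enough = has_enough_bytes(b"a reasonably long byte string")
```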
This file was automatically generated via lazydocs.