module character_encodings
Constants related to character encodings.
Helpful links:
- UTF-8: www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec
Global Variables
- NEWLINE_BYTE
- ENCODING
- ASCII
- UTF_8
- UTF_16
- UTF_32
- ISO_8859_1
- WINDOWS_1252
- BOMS
- UNPRINTABLE_ASCII
- UNPRINTABLE_ISO_8859_1
- UNPRINTABLE_UTF_8
- UNPRINTABLE_WIN_1252
- UNPRINTABLE_ISO_8859_7
- ENCODINGS_TO_ATTEMPT
- SINGLE_BYTE_ENCODINGS
- WIDE_UTF_ENCODINGS
- ENCODINGS
function scrub_c1_control_chars
scrub_c1_control_chars(char_map: dict) → None
Fill in a dict with integer keys/values for the positions where a given character encoding has no characters because that range is reserved for C1 control characters (AKA the "undefined" part of most character maps).
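As a rough illustration of the idea, the sketch below fills a dict in place for the C1 control range. It assumes the range is code points 128 through 159 (0x80 to 0x9F, the C1 block in ISO 8859-1). The function name `fill_c1_control_chars`, the constant `C1_CONTROL_RANGE`, and the placeholder label format are hypothetical; the module's actual keys and values may differ.

```python
# Hypothetical stand-in for scrub_c1_control_chars; not the module's own code.
# Assumption: the C1 control range is code points 0x80..0x9F (128..159).
C1_CONTROL_RANGE = range(0x80, 0xA0)


def fill_c1_control_chars(char_map: dict) -> None:
    """Mutate char_map in place, adding a placeholder entry per C1 code point."""
    for code_point in C1_CONTROL_RANGE:
        # The label format is an assumption made for this example.
        char_map[code_point] = f"C1.CHAR{code_point}"


char_map = {0x0A: "NEWLINE"}  # one pre-existing entry
fill_c1_control_chars(char_map)
print(len(char_map))  # 33: the original entry plus 32 C1 placeholders
print(char_map[128])  # C1.CHAR128
```

Like the documented signature, the sketch returns None and modifies the dict passed in, so the caller keeps a single map per encoding.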
function encoding_offsets
encoding_offsets(encoding: str) → list
Get the possible offsets for a given encoding. If the encoding is not in WIDE_UTF_ENCODINGS, return [0].
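The lookup described here can be sketched as a dict access with a default. The table `WIDE_UTF_OFFSETS`, its keys, its offset lists, and the function name `offsets_for` are assumptions made for illustration; the real contents of WIDE_UTF_ENCODINGS are not shown in this reference.

```python
# Hypothetical stand-in for WIDE_UTF_ENCODINGS; keys and offset lists are assumed.
WIDE_UTF_OFFSETS = {
    "utf-16": [0, 1],
    "utf-32": [0, 1, 2, 3],
}


def offsets_for(encoding: str) -> list:
    """Return candidate start offsets; encodings not listed as wide get [0]."""
    return WIDE_UTF_OFFSETS.get(encoding, [0])


# Decoding the same bytes at each offset shows why alignment matters:
# one stray leading byte shifts every 2-byte code unit.
data = b"\x00" + "hi".encode("utf-16-le")
for offset in offsets_for("utf-16"):
    chunk = data[offset:]
    chunk = chunk[: len(chunk) - len(chunk) % 2]  # trim to whole 2-byte units
    print(offset, repr(chunk.decode("utf-16-le", errors="replace")))
```

In this example only the decode starting at offset 1 recovers the text "hi", which is presumably why a scanner would try every offset for a wide encoding rather than only offset 0.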
function encoding_width
encoding_width(encoding: str) → int
Get the width of a character in bytes for a given encoding, which is the number of possible offsets.
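The relationship between width and offsets can be sketched as follows. The table `ASSUMED_WIDE_OFFSETS` and the function name `width_of` are hypothetical and exist only for this example.

```python
# Hypothetical offsets table, assumed for illustration only.
ASSUMED_WIDE_OFFSETS = {"utf-16": [0, 1], "utf-32": [0, 1, 2, 3]}


def width_of(encoding: str) -> int:
    """Width in bytes, taken as the number of candidate offsets."""
    return len(ASSUMED_WIDE_OFFSETS.get(encoding, [0]))


for name in ("utf-16", "utf-32", "latin-1"):
    print(name, width_of(name))

# Sanity check against Python's codecs for a basic (BMP) character.
print(len("A".encode("utf-16-le")), len("A".encode("utf-32-le")))  # 2 4
```

Under this definition any encoding that is not a wide UTF encoding has width 1. That includes UTF-8, even though a single UTF-8 character can occupy more than one byte.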
function is_wide_utf
is_wide_utf(encoding: str) → bool
Check if the encoding is a wide UTF encoding (UTF-16 or UTF-32).
This file was automatically generated via lazydocs.