`module` `character_encodings`

Constants related to character encodings.

Helpful links:

ISO-8859: www.mit.edu/people/kenta/two/iso8859.html
UTF-8: www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec

Global Variables

NEWLINE_BYTE
ENCODING
ASCII
UTF_8
UTF_16
UTF_32
ISO_8859_1
WINDOWS_1252
BOMS
UNPRINTABLE_ASCII
UNPRINTABLE_ISO_8859_1
UNPRINTABLE_UTF_8
UNPRINTABLE_WIN_1252
UNPRINTABLE_ISO_8859_7
ENCODINGS_TO_ATTEMPT
SINGLE_BYTE_ENCODINGS
WIDE_UTF_ENCODINGS
ENCODINGS

`function` `scrub_c1_control_chars`

scrub_c1_control_chars(char_map: dict) → None

Fill a dict, in place, with integer keys and values for the positions where a given character encoding defines no characters because that range is reserved for C1 control characters (the "undefined" part of most character maps).
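As a rough illustration of the range involved, the sketch below marks the C1 block (code points 0x80 through 0x9F) in a dict and checks that ISO-8859-1 decodes those bytes to control characters. The helper `mark_c1_range` and its placeholder values are hypothetical and are not taken from this module; the values stored by the real `scrub_c1_control_chars` may differ.

```python
import unicodedata

# C1 control characters occupy code points 0x80 through 0x9F (128 to 159).
C1_RANGE = range(0x80, 0xA0)


def mark_c1_range(char_map: dict) -> None:
    """Hypothetical stand-in: flag every C1 code point in char_map, in place."""
    for code in C1_RANGE:
        # The placeholder value is an assumption, not the module's actual value.
        char_map[code] = f"C1_CONTROL_{code}"


char_map = {}
mark_c1_range(char_map)

# In ISO-8859-1 each of these bytes decodes to a control character
# (Unicode general category "Cc"), so none of them is printable.
c1_categories = {
    unicodedata.category(bytes([code]).decode("iso-8859-1")) for code in C1_RANGE
}
print(len(char_map), c1_categories)
```

Like the documented function, the helper returns `None` and mutates the dict it is given.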

`function` `encoding_offsets`

encoding_offsets(encoding: str) → list

Get possible offsets for a given encoding. If the encoding is not in WIDE_UTF_ENCODINGS, return [0].
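To see why a wide encoding has more than one possible offset, consider text that does not start on a code-unit boundary. The sketch below is only an illustration: `decode_at_offsets` is a hypothetical helper, and the codec name `"utf-16-le"` and the offset list `[0, 1]` are assumptions rather than values read from this module.

```python
def decode_at_offsets(data: bytes, encoding: str, offsets: list) -> dict:
    """Hypothetical helper: decode data once per candidate starting offset."""
    width = len(offsets)  # assumed: one offset per byte of a code unit
    results = {}
    for offset in offsets:
        chunk = data[offset:]
        # Drop trailing bytes that do not form a whole code unit.
        usable = len(chunk) - (len(chunk) % width)
        results[offset] = chunk[:usable].decode(encoding, errors="replace")
    return results


# One stray byte in front of UTF-16 text shifts the alignment by one.
data = b"\x00" + "hi".encode("utf-16-le")
results = decode_at_offsets(data, "utf-16-le", [0, 1])

for offset, text in results.items():
    print(offset, repr(text))
```

Only the decode at offset 1 recovers `hi`; at offset 0 the bytes pair up differently and decode to unrelated characters. A single-byte encoding has no such ambiguity, which is consistent with the `[0]` return value described above.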

`function` `encoding_width`

encoding_width(encoding: str) → int

Get the width in bytes of a character in a given encoding, which equals the number of possible offsets for that encoding.

`function` `is_wide_utf`

is_wide_utf(encoding: str) → bool

Check if the encoding is a wide UTF encoding (UTF-16 or UTF-32).
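The three helpers above appear to be closely related, and the relationship can be sketched as follows. Everything here is an assumption made for illustration: the encoding name strings, the `WIDE_UTF_OFFSETS` table, and the function names `offsets_for`, `width_of`, and `is_wide` are hypothetical and do not reproduce the module's source.

```python
# Assumed offsets: one per byte in a code unit (2 for UTF-16, 4 for UTF-32).
WIDE_UTF_OFFSETS = {
    "utf-16": [0, 1],
    "utf-32": [0, 1, 2, 3],
}


def offsets_for(encoding: str) -> list:
    """Return the candidate offsets, or [0] for any non-wide encoding."""
    return WIDE_UTF_OFFSETS.get(encoding, [0])


def width_of(encoding: str) -> int:
    """Width in bytes, defined as the number of possible offsets."""
    return len(offsets_for(encoding))


def is_wide(encoding: str) -> bool:
    """True only for the wide UTF encodings in the table above."""
    return encoding in WIDE_UTF_OFFSETS


for name in ("ascii", "utf-8", "utf-16", "utf-32"):
    print(name, offsets_for(name), width_of(name), is_wide(name))
```

Under this definition every encoding outside the wide set, including variable-length UTF-8, reports a width of 1 because its only offset is 0.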

This file was automatically generated via lazydocs.
