Minetest  5.4.0
irr::core::unicode Namespace Reference

Classes

class  hash
 Hashing algorithm for hashing a ustring.
 

Enumerations

enum  EUTF_ENCODE {
  EUTFE_NONE = 0 , EUTFE_UTF8 , EUTFE_UTF16 , EUTFE_UTF16_LE ,
  EUTFE_UTF16_BE , EUTFE_UTF32 , EUTFE_UTF32_LE , EUTFE_UTF32_BE
}
 Unicode encoding type.
 
enum  EUTF_ENDIAN { EUTFEE_NATIVE = 0 , EUTFEE_LITTLE , EUTFEE_BIG }
 Unicode endianness.
 

Functions

uchar32_t toUTF32 (uchar16_t high, uchar16_t low)
 Convert a UTF-16 surrogate pair into a UTF-32 character.
 
uchar16_t swapEndian16 (const uchar16_t &c)
 Swaps the endianness of a 16-bit value.
 
uchar32_t swapEndian32 (const uchar32_t &c)
 Swaps the endianness of a 32-bit value.
 
core::array< u8 > getUnicodeBOM (EUTF_ENCODE mode)
 Returns the specified Unicode byte order mark in a byte array.
 
EUTF_ENCODE determineUnicodeBOM (const char *data)
 Detects whether the given data stream starts with a Unicode BOM.
 

Variables

const irr::u16 UTF_REPLACEMENT_CHARACTER = 0xFFFD
 The Unicode replacement character. Used to replace invalid characters.
 
const u16 BOM = 0xFEFF
 The Unicode byte order mark.
 
const u8 BOM_UTF8_LEN = 3
 The size of the Unicode byte order mark in terms of the Unicode character size.
 
const u8 BOM_UTF16_LEN = 1
 
const u8 BOM_UTF32_LEN = 1
 
const u8 BOM_ENCODE_UTF8 [3] = { 0xEF, 0xBB, 0xBF }
 Unicode byte order marks for file operations.
 
const u8 BOM_ENCODE_UTF16_BE [2] = { 0xFE, 0xFF }
 
const u8 BOM_ENCODE_UTF16_LE [2] = { 0xFF, 0xFE }
 
const u8 BOM_ENCODE_UTF32_BE [4] = { 0x00, 0x00, 0xFE, 0xFF }
 
const u8 BOM_ENCODE_UTF32_LE [4] = { 0xFF, 0xFE, 0x00, 0x00 }
 
const u8 BOM_ENCODE_UTF8_LEN = 3
 The size in bytes of the Unicode byte order marks for file operations.
 
const u8 BOM_ENCODE_UTF16_LEN = 2
 
const u8 BOM_ENCODE_UTF32_LEN = 4
 

Enumeration Type Documentation

◆ EUTF_ENCODE

Unicode encoding type.

Enumerator
EUTFE_NONE 
EUTFE_UTF8 
EUTFE_UTF16 
EUTFE_UTF16_LE 
EUTFE_UTF16_BE 
EUTFE_UTF32 
EUTFE_UTF32_LE 
EUTFE_UTF32_BE 

◆ EUTF_ENDIAN

Unicode endianness.

Enumerator
EUTFEE_NATIVE 
EUTFEE_LITTLE 
EUTFEE_BIG 

Function Documentation

◆ determineUnicodeBOM()

EUTF_ENCODE irr::core::unicode::determineUnicodeBOM ( const char *  data)
inline

Detects whether the given data stream starts with a Unicode BOM.

Parameters
data  The data stream to check.
Returns
The encoding indicated by the BOM at the start of the data stream, or EUTFE_NONE if none was found.

References BOM_ENCODE_UTF16_BE, BOM_ENCODE_UTF16_LE, BOM_ENCODE_UTF32_BE, BOM_ENCODE_UTF32_LE, BOM_ENCODE_UTF8, EUTFE_NONE, EUTFE_UTF16_BE, EUTFE_UTF16_LE, EUTFE_UTF32_BE, EUTFE_UTF32_LE, and EUTFE_UTF8.

Referenced by irr::core::ustring16< TAlloc >::loadDataStream().
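
The detection described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: `BomKind` and `detectBom` are hypothetical names, and the explicit length parameter is an assumption added so the sketch never reads past the end of a short buffer. The byte values are the ones listed under BOM_ENCODE_UTF8, BOM_ENCODE_UTF16_* and BOM_ENCODE_UTF32_*.

```cpp
#include <cstddef>

// Hypothetical stand-in for EUTF_ENCODE, limited to what a BOM can identify.
enum class BomKind { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Classifies the leading bytes of a buffer of n bytes.
constexpr BomKind detectBom(const unsigned char* p, std::size_t n)
{
    // Test UTF-32 first: FF FE 00 00 begins with the UTF-16 LE mark FF FE.
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return BomKind::Utf32LE;
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return BomKind::Utf32BE;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return BomKind::Utf8;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return BomKind::Utf16LE;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return BomKind::Utf16BE;
    return BomKind::None;
}
```

The order of the checks is a design choice of this sketch. A UTF-16 LE stream whose first character after the BOM is U+0000 has the same four leading bytes as a UTF-32 LE BOM, so no ordering can resolve that case from the bytes alone.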


◆ getUnicodeBOM()

core::array<u8> irr::core::unicode::getUnicodeBOM ( EUTF_ENCODE  mode)
inline

Returns the specified Unicode byte order mark in a byte array.

The byte order mark is the first few bytes of a text file, which signify its encoding.

Parameters
mode  The Unicode encoding method that we want to get the byte order mark for. If EUTFE_UTF16 or EUTFE_UTF32 is passed, it uses the native system endianness.
Returns
An array that contains a byte order mark.

References BOM_ENCODE_UTF16_BE, BOM_ENCODE_UTF16_LE, BOM_ENCODE_UTF16_LEN, BOM_ENCODE_UTF32_BE, BOM_ENCODE_UTF32_LE, BOM_ENCODE_UTF32_LEN, BOM_ENCODE_UTF8, BOM_ENCODE_UTF8_LEN, COPY_ARRAY, EUTFE_NONE, EUTFE_UTF16, EUTFE_UTF16_BE, EUTFE_UTF16_LE, EUTFE_UTF32, EUTFE_UTF32_BE, EUTFE_UTF32_LE, and EUTFE_UTF8.
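
As a rough illustration of the behaviour described for the mode parameter, the sketch below resolves the endian-neutral values to the host byte order before picking the bytes. It is a standalone approximation, not the Irrlicht code: `Encoding`, `hostIsLittleEndian` and `bomBytes` are hypothetical names, `std::vector` stands in for `core::array<u8>`, and returning an empty array for the "none" case is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for EUTF_ENCODE, in the same order as the enum.
enum class Encoding { None, Utf8, Utf16, Utf16LE, Utf16BE, Utf32, Utf32LE, Utf32BE };

// Runtime probe: true when the least significant byte is stored first.
inline bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}

// Returns the BOM bytes for an encoding; empty for Encoding::None (assumed).
inline std::vector<std::uint8_t> bomBytes(Encoding mode)
{
    // Map the endian-neutral values onto the host byte order first.
    if (mode == Encoding::Utf16)
        mode = hostIsLittleEndian() ? Encoding::Utf16LE : Encoding::Utf16BE;
    else if (mode == Encoding::Utf32)
        mode = hostIsLittleEndian() ? Encoding::Utf32LE : Encoding::Utf32BE;

    switch (mode) {
    case Encoding::Utf8:    return {0xEF, 0xBB, 0xBF};
    case Encoding::Utf16LE: return {0xFF, 0xFE};
    case Encoding::Utf16BE: return {0xFE, 0xFF};
    case Encoding::Utf32LE: return {0xFF, 0xFE, 0x00, 0x00};
    case Encoding::Utf32BE: return {0x00, 0x00, 0xFE, 0xFF};
    default:                return {};
    }
}
```

A caller would typically write the returned bytes at the very start of an output file, before any encoded text.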

◆ swapEndian16()

uchar16_t irr::core::unicode::swapEndian16 ( const uchar16_t &  c)
inline

Swaps the endianness of a 16-bit value.

Returns
The new value.

Referenced by irr::core::ustring16< TAlloc >::append(), and irr::core::ustring16< TAlloc >::toUTF16().
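
A 16-bit byte swap can be sketched in a few lines. This is an illustrative standalone version, not the library's code; `swap16` is a hypothetical name and `std::uint16_t` is assumed to correspond to `uchar16_t`.

```cpp
#include <cstdint>

// Exchanges the high and low bytes of a 16-bit value.
constexpr std::uint16_t swap16(std::uint16_t c)
{
    // c is promoted to int before shifting; the cast drops the extra bits.
    return static_cast<std::uint16_t>((c << 8) | (c >> 8));
}
```

Applied to the BOM value 0xFEFF, this yields 0xFFFE, which is what a reader sees when a UTF-16 stream was written in the opposite byte order.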


◆ swapEndian32()

uchar32_t irr::core::unicode::swapEndian32 ( const uchar32_t &  c)
inline

Swaps the endianness of a 32-bit value.

Returns
The new value.

Referenced by irr::core::ustring16< TAlloc >::append(), and irr::core::ustring16< TAlloc >::toUTF32().
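
The 32-bit case reverses all four bytes. The following is a minimal sketch under the same assumptions as a plain mask-and-shift implementation; `swap32` is a hypothetical name and `std::uint32_t` is assumed to correspond to `uchar32_t`.

```cpp
#include <cstdint>

// Reverses the order of the four bytes in a 32-bit value.
constexpr std::uint32_t swap32(std::uint32_t c)
{
    return ((c & 0x000000FFu) << 24) |
           ((c & 0x0000FF00u) << 8)  |
           ((c & 0x00FF0000u) >> 8)  |
           ((c & 0xFF000000u) >> 24);
}
```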


◆ toUTF32()

uchar32_t irr::core::unicode::toUTF32 ( uchar16_t  high, uchar16_t  low )
inline

Convert a UTF-16 surrogate pair into a UTF-32 character.

Parameters
high  The high (leading) surrogate of the pair.
low  The low (trailing) surrogate of the pair.
Returns
The UTF-32 character expressed by the surrogate pair.

Referenced by irr::core::ustring16< TAlloc >::_ustring16_iterator_access::_get(), irr::core::ustring16< TAlloc >::lastChar(), irr::core::ustring16< TAlloc >::remove(), and irr::core::ustring16< TAlloc >::removeChars().
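
The standard UTF-16 decoding arithmetic for a surrogate pair is shown below as a standalone sketch. `surrogatesToUtf32` is a hypothetical name, and the sketch assumes the caller has already checked that high lies in 0xD800..0xDBFF and low in 0xDC00..0xDFFF; whether the library function validates its inputs is not stated here.

```cpp
#include <cstdint>

// Combines a UTF-16 surrogate pair into one code point.
// Precondition (not checked): high in [0xD800, 0xDBFF], low in [0xDC00, 0xDFFF].
constexpr std::uint32_t surrogatesToUtf32(std::uint16_t high, std::uint16_t low)
{
    // Each surrogate carries 10 payload bits; the result is offset by 0x10000.
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}
```

For example, the pair 0xD83D, 0xDE00 decodes to U+1F600: (0x3D << 10) + 0x200 + 0x10000 = 0x1F600.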


Variable Documentation

◆ BOM

const u16 irr::core::unicode::BOM = 0xFEFF

◆ BOM_ENCODE_UTF16_BE

const u8 irr::core::unicode::BOM_ENCODE_UTF16_BE[2] = { 0xFE, 0xFF }

◆ BOM_ENCODE_UTF16_LE

const u8 irr::core::unicode::BOM_ENCODE_UTF16_LE[2] = { 0xFF, 0xFE }

◆ BOM_ENCODE_UTF16_LEN

const u8 irr::core::unicode::BOM_ENCODE_UTF16_LEN = 2

◆ BOM_ENCODE_UTF32_BE

const u8 irr::core::unicode::BOM_ENCODE_UTF32_BE[4] = { 0x00, 0x00, 0xFE, 0xFF }

◆ BOM_ENCODE_UTF32_LE

const u8 irr::core::unicode::BOM_ENCODE_UTF32_LE[4] = { 0xFF, 0xFE, 0x00, 0x00 }

◆ BOM_ENCODE_UTF32_LEN

const u8 irr::core::unicode::BOM_ENCODE_UTF32_LEN = 4

◆ BOM_ENCODE_UTF8

const u8 irr::core::unicode::BOM_ENCODE_UTF8[3] = { 0xEF, 0xBB, 0xBF }

◆ BOM_ENCODE_UTF8_LEN

const u8 irr::core::unicode::BOM_ENCODE_UTF8_LEN = 3

The size in bytes of the Unicode byte order marks for file operations.

Referenced by irr::core::ustring16< TAlloc >::append(), and getUnicodeBOM().

◆ BOM_UTF16_LEN

const u8 irr::core::unicode::BOM_UTF16_LEN = 1

◆ BOM_UTF32_LEN

const u8 irr::core::unicode::BOM_UTF32_LEN = 1

◆ BOM_UTF8_LEN

const u8 irr::core::unicode::BOM_UTF8_LEN = 3

The size of the Unicode byte order mark in terms of the Unicode character size.

Referenced by irr::core::ustring16< TAlloc >::append(), irr::core::ustring16< TAlloc >::toUTF8(), and irr::core::ustring16< TAlloc >::toUTF8_s().

◆ UTF_REPLACEMENT_CHARACTER

const irr::u16 irr::core::unicode::UTF_REPLACEMENT_CHARACTER = 0xFFFD

The Unicode replacement character. Used to replace invalid characters.

Referenced by irr::core::ustring16< TAlloc >::append(), irr::gui::CGUITTFont::getGlyphIndexByChar(), and irr::core::ustring16< TAlloc >::validate().
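
One plausible use of the replacement character is sketched below. Which inputs the library treats as invalid is not spelled out on this page, so the sketch assumes two common cases: values above U+10FFFF and values in the surrogate range. `kReplacement` and `sanitizeCodePoint` are hypothetical names.

```cpp
#include <cstdint>

// Mirrors the value of UTF_REPLACEMENT_CHARACTER.
constexpr std::uint32_t kReplacement = 0xFFFD;

// Returns cp unchanged when it is a valid Unicode scalar value,
// otherwise the replacement character (assumed definition of "invalid").
constexpr std::uint32_t sanitizeCodePoint(std::uint32_t cp)
{
    return (cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu)) ? kReplacement : cp;
}
```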