Panda3D
TextEncoder Class Reference

This class can be used to convert text between multiple representations, e.g. UTF-8 to Unicode. More...

#include "textEncoder.h"

Inheritance diagram for TextEncoder: derived classes include TextNode, FrameRateMeter, and SceneGraphAnalyzerMeter.

Public Types

enum  Encoding { E_iso8859, E_utf8, E_unicode }
 

Public Member Functions

 TextEncoder (const TextEncoder &copy)
 
void append_text (const string &text)
 Appends the indicated string to the end of the stored text. More...
 
void append_unicode_char (int character)
 Appends a single character to the end of the stored text. More...
 
void append_wtext (const wstring &text)
 Appends the indicated string to the end of the stored wide-character text. More...
 
void clear_text ()
 Removes the text from the TextEncoder. More...
 
wstring decode_text (const string &text) const
 Returns the given encoded string decoded to a wide-character string, via the current encoding system. More...
 
string encode_wtext (const wstring &wtext) const
 Encodes a wide-text string into a single-char string, according to the current encoding. More...
 
string get_encoded_char (int index) const
 Returns the nth char of the stored text, as a one-, two-, or three-byte encoded string. More...
 
string get_encoded_char (int index, Encoding encoding) const
 Returns the nth char of the stored text, as a one-, two-, or three-byte encoded string. More...
 
Encoding get_encoding () const
 Returns the encoding by which the string set via set_text() is to be interpreted. More...
 
int get_num_chars () const
 Returns the number of characters in the stored text. More...
 
string get_text () const
 Returns the current text, as encoded via the current encoding system. More...
 
string get_text (Encoding encoding) const
 Returns the current text, as encoded via the indicated encoding system. More...
 
string get_text_as_ascii () const
 Returns the text associated with the node, converted as nearly as possible to a fully-ASCII representation. More...
 
int get_unicode_char (int index) const
 Returns the Unicode value of the nth character in the stored text. More...
 
const wstring & get_wtext () const
 Returns the text associated with the TextEncoder, as a wide-character string. More...
 
wstring get_wtext_as_ascii () const
 Returns the text associated with the node, converted as nearly as possible to a fully-ASCII representation. More...
 
bool has_text () const
 
bool is_wtext () const
 Returns true if any of the characters in the string returned by get_wtext() are out of the range of an ASCII character (and, therefore, get_wtext() should be called in preference to get_text()). More...
 
void make_lower ()
 Adjusts the text stored within the encoder to all lowercase letters (preserving accent marks correctly). More...
 
void make_upper ()
 Adjusts the text stored within the encoder to all uppercase letters (preserving accent marks correctly). More...
 
void set_encoding (Encoding encoding)
 Specifies how the string set via set_text() is to be interpreted. More...
 
void set_text (const string &text)
 Changes the text that is stored in the encoder. More...
 
void set_text (const string &text, Encoding encoding)
 The two-parameter version of set_text() accepts an explicit encoding; the text is immediately decoded and stored as a wide-character string. More...
 
void set_unicode_char (int index, int character)
 Sets the Unicode value of the nth character in the stored text. More...
 
void set_wtext (const wstring &wtext)
 Changes the text that is stored in the encoder. More...
 

Static Public Member Functions

static wstring decode_text (const string &text, Encoding encoding)
 Returns the given encoded string decoded to a wide-character string, via the given encoding system. More...
 
static string encode_wchar (wchar_t ch, Encoding encoding)
 Encodes a single wide char into a one-, two-, or three-byte string, according to the given encoding system. More...
 
static string encode_wtext (const wstring &wtext, Encoding encoding)
 Encodes a wide-text string into a single-char string, according to the given encoding. More...
 
static Encoding get_default_encoding ()
 Returns the default encoding that is used for all subsequently created TextEncoder objects. More...
 
static string lower (const string &source)
 Converts the string to lowercase, assuming the string is encoded in the default encoding. More...
 
static string lower (const string &source, Encoding encoding)
 Converts the string to lowercase, assuming the string is encoded in the indicated encoding. More...
 
static string reencode_text (const string &text, Encoding from, Encoding to)
 Given the indicated text string, which is assumed to be encoded via the encoding "from", decodes it and then reencodes it into the encoding "to", and returns the newly encoded string. More...
 
static void set_default_encoding (Encoding encoding)
 Specifies the default encoding to be used for all subsequently created TextEncoder objects. More...
 
static bool unicode_isalpha (int character)
 Returns true if the indicated character is an alphabetic letter, false otherwise. More...
 
static bool unicode_isdigit (int character)
 Returns true if the indicated character is a numeric digit, false otherwise. More...
 
static bool unicode_islower (int character)
 Returns true if the indicated character is a lowercase letter, false otherwise. More...
 
static bool unicode_ispunct (int character)
 Returns true if the indicated character is a punctuation mark, false otherwise. More...
 
static bool unicode_isspace (int character)
 Returns true if the indicated character is a whitespace character, false otherwise. More...
 
static bool unicode_isupper (int character)
 Returns true if the indicated character is an uppercase letter, false otherwise. More...
 
static int unicode_tolower (int character)
 Returns the lowercase equivalent of the given Unicode character. More...
 
static int unicode_toupper (int character)
 Returns the uppercase equivalent of the given Unicode character. More...
 
static string upper (const string &source)
 Converts the string to uppercase, assuming the string is encoded in the default encoding. More...
 
static string upper (const string &source, Encoding encoding)
 Converts the string to uppercase, assuming the string is encoded in the indicated encoding. More...
 

Detailed Description

This class can be used to convert text between multiple representations, e.g. UTF-8 to Unicode. You may use it as a static class object, passing the encoding each time, or you may create an instance and use that object, which will record the current encoding and retain the current string.

This class is also a base class of TextNode, which inherits this functionality.

Definition at line 37 of file textEncoder.h.

Member Function Documentation

◆ append_text()

void TextEncoder::append_text ( const string &  text)
inline

Appends the indicated string to the end of the stored text.

Definition at line 193 of file textEncoder.I.

References append_unicode_char(), and get_text().

Referenced by TextNode::append_text(), and get_text().

◆ append_unicode_char()

void TextEncoder::append_unicode_char ( int  character)
inline

Appends a single character to the end of the stored text.

This may be a wide character, up to 16 bits in Unicode.

Definition at line 206 of file textEncoder.I.

References get_num_chars(), and get_wtext().

Referenced by append_text(), and TextNode::append_unicode_char().

◆ append_wtext()

void TextEncoder::append_wtext ( const wstring &  text)
inline

Appends the indicated string to the end of the stored wide-character text.

Definition at line 545 of file textEncoder.I.

References encode_wtext(), and get_wtext().

Referenced by TextNode::append_wtext(), and get_wtext().

◆ clear_text()

void TextEncoder::clear_text ( )
inline

Removes the text from the TextEncoder.

Definition at line 140 of file textEncoder.I.

References get_text().

Referenced by TextNode::clear_text(), and set_text().

◆ decode_text() [1/2]

wstring TextEncoder::decode_text ( const string &  text) const
inline

Returns the given encoded string decoded to a wide-character string, via the current encoding system.

Definition at line 568 of file textEncoder.I.

References get_text(), and set_wtext().

Referenced by TextNode::calc_width(), encode_wtext(), get_wtext(), PGEntry::is_wtext(), ButtonEvent::read_datagram(), reencode_text(), WinGraphicsWindow::set_properties_now(), set_text(), and PGEntry::set_text().

◆ decode_text() [2/2]

wstring TextEncoder::decode_text ( const string &  text,
TextEncoder::Encoding  encoding 
)
static

Returns the given encoded string decoded to a wide-character string, via the given encoding system.

Definition at line 200 of file textEncoder.cxx.

References StringDecoder::get_next_character(), and StringDecoder::is_eof().
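
To make the decoding step concrete, the following is a minimal standalone sketch of turning UTF-8 bytes into a wide-character string. It is not Panda3D's implementation: the function name decode_utf8 is hypothetical, only one- to three-byte sequences are handled, and continuation bytes are not validated. Unsupported or truncated input is replaced with '?' here purely as an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Panda3D): decodes UTF-8 sequences of up
// to three bytes into wide characters. Continuation bytes are not validated.
std::wstring decode_utf8(const std::string &text) {
  std::wstring out;
  std::size_t i = 0;
  while (i < text.size()) {
    unsigned int b0 = static_cast<unsigned char>(text[i]);
    unsigned int ch;
    std::size_t len;
    if (b0 < 0x80) {                  // 0xxxxxxx: plain ASCII
      ch = b0;
      len = 1;
    } else if ((b0 & 0xE0) == 0xC0) { // 110xxxxx: two-byte sequence
      ch = b0 & 0x1F;
      len = 2;
    } else if ((b0 & 0xF0) == 0xE0) { // 1110xxxx: three-byte sequence
      ch = b0 & 0x0F;
      len = 3;
    } else {                          // unsupported lead byte: placeholder
      ch = '?';
      len = 1;
    }
    if (i + len > text.size()) {      // truncated sequence at end of input
      out += L'?';
      break;
    }
    // Fold in the low six bits of each continuation byte.
    for (std::size_t k = 1; k < len; ++k) {
      ch = (ch << 6) | (static_cast<unsigned char>(text[i + k]) & 0x3F);
    }
    out += static_cast<wchar_t>(ch);
    i += len;
  }
  return out;
}
```

With this sketch, the two bytes 0xC3 0xA9 decode to the single wide character U+00E9.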

◆ encode_wchar()

string TextEncoder::encode_wchar ( wchar_t  ch,
TextEncoder::Encoding  encoding 
)
static

Encodes a single wide char into a one-, two-, or three-byte string, according to the given encoding system.

Definition at line 128 of file textEncoder.cxx.

References encode_wtext(), and UnicodeLatinMap::look_up().

Referenced by encode_wtext(), and is_wtext().
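
The "one-, two-, or three-byte" wording matches how UTF-8 represents 16-bit code points. A minimal standalone sketch of that encoding follows; the name encode_utf8_char is hypothetical, this is not Panda3D's code, and surrogate values are not treated specially.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Panda3D): encodes one code point in the
// range 0..0xFFFF as a one-, two-, or three-byte UTF-8 string.
std::string encode_utf8_char(unsigned int ch) {
  assert(ch <= 0xFFFF);  // this sketch covers only 16-bit code points
  std::string out;
  if (ch < 0x80) {
    // Up to 7 bits: a single byte.
    out += static_cast<char>(ch);
  } else if (ch < 0x800) {
    // Up to 11 bits: 110xxxxx 10xxxxxx.
    out += static_cast<char>(0xC0 | (ch >> 6));
    out += static_cast<char>(0x80 | (ch & 0x3F));
  } else {
    // Up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx.
    out += static_cast<char>(0xE0 | (ch >> 12));
    out += static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (ch & 0x3F));
  }
  return out;
}
```

For example, U+0041 yields one byte, U+00E9 yields the two bytes 0xC3 0xA9, and U+20AC yields the three bytes 0xE2 0x82 0xAC.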

◆ encode_wtext() [1/2]

string TextEncoder::encode_wtext ( const wstring &  wtext) const
inline

Encodes a wide-text string into a single-char string, according to the current encoding.

◆ encode_wtext() [2/2]

string TextEncoder::encode_wtext ( const wstring &  wtext,
TextEncoder::Encoding  encoding 
)
static

Encodes a wide-text string into a single-char string, according to the given encoding.

Definition at line 183 of file textEncoder.cxx.

References decode_text(), and encode_wchar().

◆ get_default_encoding()

TextEncoder::Encoding TextEncoder::get_default_encoding ( )
inlinestatic

Returns the default encoding that is used for all subsequently created TextEncoder objects.

See set_encoding().

Definition at line 97 of file textEncoder.I.

References set_text().

Referenced by MouseWatcherParameter::get_candidate_string_encoded(), lower(), ButtonEvent::read_datagram(), set_default_encoding(), upper(), and ButtonEvent::write_datagram().

◆ get_encoded_char() [1/2]

string TextEncoder::get_encoded_char ( int  index) const
inline

Returns the nth char of the stored text, as a one-, two-, or three-byte encoded string.

Definition at line 264 of file textEncoder.I.

References get_encoding().

Referenced by set_unicode_char().

◆ get_encoded_char() [2/2]

string TextEncoder::get_encoded_char ( int  index,
TextEncoder::Encoding  encoding 
) const
inline

Returns the nth char of the stored text, as a one-, two-, or three-byte encoded string.

Definition at line 275 of file textEncoder.I.

References encode_wtext(), get_text_as_ascii(), and get_unicode_char().

◆ get_encoding()

TextEncoder::Encoding TextEncoder::get_encoding ( ) const
inline

Returns the encoding by which the string set via set_text() is to be interpreted.

See set_encoding().

Definition at line 73 of file textEncoder.I.

References set_default_encoding().

Referenced by get_encoded_char(), and set_encoding().

◆ get_num_chars()

int TextEncoder::get_num_chars ( ) const
inline

Returns the number of characters in the stored text.

This is a count of wide characters, after the string has been decoded according to set_encoding().

Definition at line 219 of file textEncoder.I.

References get_unicode_char(), and get_wtext().

Referenced by append_unicode_char().
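
The distinction between a character count and a byte count can be illustrated outside Panda3D. The sketch below assumes the text is well-formed UTF-8 and uses a hypothetical helper name, count_utf8_chars; it counts every byte that is not a continuation byte.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Panda3D): counts the characters in a
// well-formed UTF-8 string by skipping continuation bytes (10xxxxxx).
std::size_t count_utf8_chars(const std::string &text) {
  std::size_t count = 0;
  for (unsigned char byte : text) {
    if ((byte & 0xC0) != 0x80) {
      ++count;  // a lead byte or an ASCII byte starts a new character
    }
  }
  return count;
}
```

The UTF-8 string "h\xC3\xA9llo" occupies six bytes but holds five characters, which is the kind of count get_num_chars() is described as returning.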

◆ get_text() [1/2]

string TextEncoder::get_text ( ) const
inline

Returns the current text, as encoded via the current encoding system.

◆ get_text() [2/2]

string TextEncoder::get_text ( TextEncoder::Encoding  encoding) const
inline

Returns the current text, as encoded via the indicated encoding system.

Definition at line 182 of file textEncoder.I.

References append_text(), encode_wtext(), and get_wtext().

◆ get_text_as_ascii()

string TextEncoder::get_text_as_ascii ( ) const
inline

Returns the text associated with the node, converted as nearly as possible to a fully-ASCII representation.

This means replacing accented letters with their unaccented ASCII equivalents.

It is possible that some characters in the string cannot be converted to ASCII. (The string may involve symbols like the copyright symbol, for instance, or it might involve letters in some other alphabet such as Greek or Cyrillic, or even Latin letters like thorn or eth that are not part of the ASCII character set.) In this case, as much of the string as possible will be converted to ASCII, and the nonconvertible characters will remain encoded in the encoding specified by set_encoding().

Definition at line 300 of file textEncoder.I.

References encode_wtext(), get_wtext_as_ascii(), and reencode_text().

Referenced by get_encoded_char().

◆ get_unicode_char()

int TextEncoder::get_unicode_char ( int  index) const
inline

Returns the Unicode value of the nth character in the stored text.

This may be a wide character (greater than 255), after the string has been decoded according to set_encoding().

Definition at line 232 of file textEncoder.I.

References get_wtext(), and set_unicode_char().

Referenced by get_encoded_char(), and get_num_chars().

◆ get_wtext()

const wstring & TextEncoder::get_wtext ( ) const
inline

Returns the text associated with the TextEncoder, as a wide-character string.

◆ get_wtext_as_ascii()

wstring TextEncoder::get_wtext_as_ascii ( ) const

Returns the text associated with the node, converted as nearly as possible to a fully-ASCII representation.

This means replacing accented letters with their unaccented ASCII equivalents.

It is possible that some characters in the string cannot be converted to ASCII. (The string may involve symbols like the copyright symbol, for instance, or it might involve letters in some other alphabet such as Greek or Cyrillic, or even Latin letters like thorn or eth that are not part of the ASCII character set.) In this case, as much of the string as possible will be converted to ASCII, and the nonconvertible characters will remain in their original form.

Definition at line 76 of file textEncoder.cxx.

References get_wtext(), is_wtext(), and UnicodeLatinMap::look_up().

Referenced by get_text_as_ascii(), and make_lower().
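
The "convert what you can, leave the rest" behavior described above can be sketched with a small lookup. The table here is a tiny hypothetical stand-in covering five letters; the page indicates that Panda3D consults UnicodeLatinMap::look_up(), whose contents are not reproduced here.

```cpp
#include <string>

// Hypothetical five-entry table (not Panda3D's UnicodeLatinMap): maps a few
// accented Latin letters to their unaccented ASCII equivalents.
wchar_t fold_to_ascii(wchar_t ch) {
  switch (ch) {
    case 0x00C9: return L'E';  // E with acute
    case 0x00E0: return L'a';  // a with grave
    case 0x00E9: return L'e';  // e with acute
    case 0x00F1: return L'n';  // n with tilde
    case 0x00FC: return L'u';  // u with diaeresis
    default:     return ch;    // no ASCII equivalent known: keep as-is
  }
}

// Converts as much of the string as possible; characters without a mapping
// (for example thorn, U+00FE) remain in their original form.
std::wstring to_ascii_where_possible(const std::wstring &text) {
  std::wstring out;
  out.reserve(text.size());
  for (wchar_t ch : text) {
    out += fold_to_ascii(ch);
  }
  return out;
}
```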

◆ is_wtext()

bool TextEncoder::is_wtext ( ) const

Returns true if any of the characters in the string returned by get_wtext() are out of the range of an ASCII character (and, therefore, get_wtext() should be called in preference to get_text()).

Definition at line 108 of file textEncoder.cxx.

References encode_wchar(), and get_wtext().

Referenced by get_wtext_as_ascii().
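
As a standalone illustration of the test described above, the following sketch reports whether any wide character falls outside the 7-bit ASCII range. The name has_non_ascii is hypothetical, and this is only an assumption about the check's general shape, not Panda3D's implementation.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not part of Panda3D): true if any character in the
// wide string lies outside the ASCII range 0..0x7F.
bool has_non_ascii(const std::wstring &text) {
  return std::any_of(text.begin(), text.end(), [](wchar_t ch) {
    return static_cast<unsigned long>(ch) > 0x7F;
  });
}
```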

◆ lower() [1/2]

string TextEncoder::lower ( const string &  source)
inlinestatic

Converts the string to lowercase, assuming the string is encoded in the default encoding.

Definition at line 488 of file textEncoder.I.

References get_default_encoding().

Referenced by upper().

◆ lower() [2/2]

string TextEncoder::lower ( const string &  source,
TextEncoder::Encoding  encoding 
)
inlinestatic

Converts the string to lowercase, assuming the string is encoded in the indicated encoding.

Definition at line 499 of file textEncoder.I.

References get_text(), make_lower(), set_encoding(), set_text(), and set_wtext().

◆ make_lower()

void TextEncoder::make_lower ( )

Adjusts the text stored within the encoder to all lowercase letters (preserving accent marks correctly).

Definition at line 47 of file textEncoder.cxx.

References get_wtext(), get_wtext_as_ascii(), and unicode_tolower().

Referenced by lower(), and make_upper().
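
Working on the decoded wide characters is what allows accented letters to be lowercased correctly. The sketch below shows the idea for the ISO 8859-1 (Latin-1) range only; the helper names are hypothetical, and Panda3D's own unicode_tolower() is documented as covering Unicode more broadly.

```cpp
#include <string>

// Hypothetical helper (not part of Panda3D): lowercases one character in
// the Latin-1 range. In that range, uppercase letters sit exactly 0x20
// below their lowercase forms; U+00D7 (multiplication sign) is not a letter.
wchar_t latin1_tolower(wchar_t ch) {
  bool ascii_upper = (ch >= 0x41 && ch <= 0x5A);
  bool latin1_upper = (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7);
  if (ascii_upper || latin1_upper) {
    return static_cast<wchar_t>(ch + 0x20);
  }
  return ch;
}

// Returns a lowercased copy of the wide string, accents preserved.
std::wstring to_lower(std::wstring text) {
  for (wchar_t &ch : text) {
    ch = latin1_tolower(ch);
  }
  return text;
}
```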

◆ make_upper()

void TextEncoder::make_upper ( )

Adjusts the text stored within the encoder to all uppercase letters (preserving accent marks correctly).

Definition at line 30 of file textEncoder.cxx.

References get_wtext(), make_lower(), and unicode_toupper().

Referenced by upper().

◆ reencode_text()

string TextEncoder::reencode_text ( const string &  text,
TextEncoder::Encoding  from,
TextEncoder::Encoding  to 
)
inlinestatic

Given the indicated text string, which is assumed to be encoded via the encoding "from", decodes it and then reencodes it into the encoding "to", and returns the newly encoded string.

This does not change or affect any properties on the TextEncoder itself.

Definition at line 314 of file textEncoder.I.

References decode_text(), encode_wtext(), and unicode_isalpha().

Referenced by get_text_as_ascii().
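
The decode-then-reencode pattern can be sketched for one concrete pair of encodings, ISO 8859-1 to UTF-8. This is a standalone illustration with a hypothetical function name, latin1_to_utf8, not Panda3D's code. Like reencode_text(), it is a pure function that leaves no state behind.

```cpp
#include <string>

// Hypothetical helper (not part of Panda3D): re-encodes ISO 8859-1 text as
// UTF-8 by decoding each byte to its code point and encoding it again.
std::string latin1_to_utf8(const std::string &text) {
  std::string out;
  for (char c : text) {
    // Decode: in ISO 8859-1 the byte value is the code point (0..0xFF).
    unsigned int ch = static_cast<unsigned char>(c);
    // Encode: code points below 0x80 are one byte in UTF-8, the rest two.
    if (ch < 0x80) {
      out += static_cast<char>(ch);
    } else {
      out += static_cast<char>(0xC0 | (ch >> 6));
      out += static_cast<char>(0x80 | (ch & 0x3F));
    }
  }
  return out;
}
```

Here the Latin-1 byte 0xE9 becomes the UTF-8 pair 0xC3 0xA9, while ASCII bytes pass through unchanged.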

◆ set_default_encoding()

void TextEncoder::set_default_encoding ( TextEncoder::Encoding  encoding)
inlinestatic

Specifies the default encoding to be used for all subsequently created TextEncoder objects.

See set_encoding().

Definition at line 85 of file textEncoder.I.

References get_default_encoding().

Referenced by get_encoding().

◆ set_encoding()

void TextEncoder::set_encoding ( TextEncoder::Encoding  encoding)
inline

Specifies how the string set via set_text() is to be interpreted.

The default, E_iso8859, means a standard string with one-byte characters (i.e. ASCII and its 8-bit ISO 8859 extensions). Other encodings make it possible to use character sets with more than 256 characters.

This affects only future calls to set_text(); it does not change text that was set previously.

Definition at line 59 of file textEncoder.I.

References get_encoding(), get_text(), and get_wtext().

Referenced by Filename::from_os_specific_w(), ExecutionEnvironment::get_cwd(), Filename::get_fullpath_w(), lower(), Filename::pattern_filename(), Filename::scan_directory(), Filename::to_os_long_name(), Filename::to_os_short_name(), Filename::to_os_specific_w(), and upper().

◆ set_text() [1/2]

void TextEncoder::set_text ( const string &  text)
inline

Changes the text that is stored in the encoder.

The text should be encoded according to the method indicated by set_encoding(). Subsequent calls to get_text() will return this same string, while get_wtext() will return the decoded version of the string.

Definition at line 112 of file textEncoder.I.

Referenced by PNMTextMaker::calc_width(), PNMTextMaker::generate_into(), get_default_encoding(), Filename::get_fullpath_w(), lower(), TextNode::set_text(), Filename::to_os_specific_w(), and upper().

◆ set_text() [2/2]

void TextEncoder::set_text ( const string &  text,
TextEncoder::Encoding  encoding 
)
inline

The two-parameter version of set_text() accepts an explicit encoding; the text is immediately decoded and stored as a wide-character string.

Subsequent calls to get_text() will return the same text re-encoded using whichever encoding is specified by set_encoding().

Definition at line 130 of file textEncoder.I.

References clear_text(), decode_text(), and set_wtext().

◆ set_unicode_char()

void TextEncoder::set_unicode_char ( int  index,
int  character 
)
inline

Sets the Unicode value of the nth character in the stored text.

This may be a wide character (greater than 255), after the string has been decoded according to set_encoding().

Definition at line 249 of file textEncoder.I.

References get_encoded_char(), and get_wtext().

Referenced by get_unicode_char().

◆ set_wtext()

void TextEncoder::set_wtext ( const wstring &  wtext)
inline

Changes the text that is stored in the encoder.

Subsequent calls to get_wtext() will return this same string, while get_text() will return the encoded version of the string.

Definition at line 516 of file textEncoder.I.

References get_wtext().

Referenced by decode_text(), Filename::from_os_specific_w(), ExecutionEnvironment::get_cwd(), lower(), Filename::pattern_filename(), Filename::scan_directory(), ConnectionManager::scan_interfaces(), set_text(), TextNode::set_wtext(), WinGraphicsWindow::static_window_proc(), Filename::to_os_long_name(), and Filename::to_os_short_name().

◆ unicode_isalpha()

bool TextEncoder::unicode_isalpha ( int  character)
inlinestatic

Returns true if the indicated character is an alphabetic letter, false otherwise.

This is akin to ctype's isalpha(), extended to Unicode.

Definition at line 327 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and unicode_isdigit().

Referenced by reencode_text().

◆ unicode_isdigit()

bool TextEncoder::unicode_isdigit ( int  character)
inlinestatic

Returns true if the indicated character is a numeric digit, false otherwise.

This is akin to ctype's isdigit(), extended to Unicode.

Definition at line 344 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and unicode_ispunct().

Referenced by unicode_isalpha().

◆ unicode_islower()

bool TextEncoder::unicode_islower ( int  character)
inlinestatic

Returns true if the indicated character is a lowercase letter, false otherwise.

This is akin to ctype's islower(), extended to Unicode.

Definition at line 415 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and unicode_toupper().

Referenced by unicode_isspace().

◆ unicode_ispunct()

bool TextEncoder::unicode_ispunct ( int  character)
inlinestatic

Returns true if the indicated character is a punctuation mark, false otherwise.

This is akin to ctype's ispunct(), extended to Unicode.

Definition at line 362 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and unicode_isupper().

Referenced by unicode_isdigit().

◆ unicode_isspace()

bool TextEncoder::unicode_isspace ( int  character)
inlinestatic

Returns true if the indicated character is a whitespace character, false otherwise.

This is akin to ctype's isspace(), extended to Unicode.

Definition at line 395 of file textEncoder.I.

References unicode_islower().

Referenced by unicode_isupper().

◆ unicode_isupper()

bool TextEncoder::unicode_isupper ( int  character)
inlinestatic

Returns true if the indicated character is an uppercase letter, false otherwise.

This is akin to ctype's isupper(), extended to Unicode.

Definition at line 379 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and unicode_isspace().

Referenced by unicode_ispunct().

◆ unicode_tolower()

int TextEncoder::unicode_tolower ( int  character)
inlinestatic

Returns the lowercase equivalent of the given Unicode character.

This is akin to ctype's tolower(), extended to Unicode.

Definition at line 447 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and upper().

Referenced by make_lower(), Filename::make_true_case(), and unicode_toupper().

◆ unicode_toupper()

int TextEncoder::unicode_toupper ( int  character)
inlinestatic

Returns the uppercase equivalent of the given Unicode character.

This is akin to ctype's toupper(), extended to Unicode.

Definition at line 431 of file textEncoder.I.

References UnicodeLatinMap::look_up(), and unicode_tolower().

Referenced by make_upper(), and unicode_islower().

◆ upper() [1/2]

string TextEncoder::upper ( const string &  source)
inlinestatic

Converts the string to uppercase, assuming the string is encoded in the default encoding.

Definition at line 462 of file textEncoder.I.

References get_default_encoding().

Referenced by unicode_tolower().

◆ upper() [2/2]

string TextEncoder::upper ( const string &  source,
TextEncoder::Encoding  encoding 
)
inlinestatic

Converts the string to uppercase, assuming the string is encoded in the indicated encoding.

Definition at line 473 of file textEncoder.I.

References get_text(), lower(), make_upper(), set_encoding(), and set_text().


The documentation for this class was generated from the following files: