decompiler
1.0.0
An implementation of StringManager that understands terminated unicode strings.
#include <stringmanage.hh>
Public Member Functions
StringManagerUnicode (Architecture *g, int4 max)
Constructor.
virtual const vector< uint1 > & | getStringData (const Address &addr, Datatype *charType, bool &isTrunc)
Retrieve string data at the given address as a UTF8 byte array.
bool | writeUnicode (ostream &s, uint1 *buffer, int4 size, int4 charsize)
Translate/copy unicode to UTF8.
Public Member Functions inherited from StringManager
StringManager (int4 max)
Constructor.
virtual | ~StringManager (void)
Destructor.
void | clear (void)
Clear out any cached strings.
bool | isString (const Address &addr, Datatype *charType)
void | saveXml (ostream &s) const
Save cached strings to a stream as XML.
void | restoreXml (const Element *el, const AddrSpaceManager *m)
Restore string cache from XML.
Private Member Functions
int4 | checkCharacters (const uint1 *buf, int4 size, int4 charsize) const
Make sure buffer has valid bounded set of unicode.
Private Attributes
Architecture * | glb
Underlying architecture.
uint1 * | testBuffer
Temporary buffer for pulling in loadimage bytes.
Additional Inherited Members
Static Public Member Functions inherited from StringManager
static bool | hasCharTerminator (const uint1 *buffer, int4 size, int4 charsize)
Check for a unicode string terminator.
static int4 | readUtf16 (const uint1 *buf, bool bigend)
Read a UTF16 code point from a byte array.
static void | writeUtf8 (ostream &s, int4 codepoint)
Write unicode character to stream in UTF8 encoding.
static int4 | getCodepoint (const uint1 *buf, int4 charsize, bool bigend, int4 &skip)
Extract next unicode codepoint.
Protected Attributes inherited from StringManager
map< Address, StringData > | stringMap
Map from address to string data.
int4 | maximumChars
Maximum characters in a string before truncating.
An implementation of StringManager that understands terminated unicode strings.
This class understands UTF8, UTF16, and UTF32 encodings. It reports a string if it sees a valid encoding that is null terminated.
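The null-termination requirement can be illustrated with a minimal sketch. It assumes a terminator is one whole character of zero bytes, aligned to the character size; the function name `hasTerminatorSketch` is hypothetical and this is not the decompiler's own hasCharTerminator() code.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative only: step through the buffer one character at a time and
// report whether any whole character is all zero bytes.
// charsize is the encoding width in bytes (1=UTF8, 2=UTF16, 4=UTF32).
bool hasTerminatorSketch(const std::uint8_t *buffer, std::size_t size,
                         std::size_t charsize) {
  if (charsize == 0) return false;  // guard against a meaningless width
  for (std::size_t i = 0; i + charsize <= size; i += charsize) {
    bool allZero = true;
    for (std::size_t j = 0; j < charsize; ++j) {
      if (buffer[i + j] != 0) {  // a non-zero byte rules out a terminator here
        allZero = false;
        break;
      }
    }
    if (allZero) return true;
  }
  return false;
}
```

Note that the same bytes can be terminated under one character size and not under another, which is why the character size has to be known before scanning.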
StringManagerUnicode::StringManagerUnicode (Architecture *g, int4 max)
Constructor.
g | is the underlying architecture (and loadimage)
max | is the maximum number of bytes to allow in a decoded string
References glb, and testBuffer.
int4 StringManagerUnicode::checkCharacters (const uint1 *buf, int4 size, int4 charsize) const [private]
Make sure buffer has valid bounded set of unicode.
Check that the given buffer contains valid unicode. If the string is encoded in UTF8 or ASCII, each character provides (on average) about one bit of validity checking. For UTF16, the reserved surrogate range provides at least some checking.
buf | is the byte array to check
size | is the size of the buffer in bytes
charsize | is the UTF encoding (1=UTF8, 2=UTF16, 4=UTF32)
References StringManager::getCodepoint(), glb, Translate::isBigEndian(), and Architecture::translate.
Referenced by getStringData().
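The surrogate check mentioned for UTF16 can be sketched as follows. This is a standalone illustration, not the decompiler's implementation: the names `readUnit16`, `decodeUtf16Sketch`, and `countUtf16Sketch` are hypothetical, and the convention of returning -1 for invalid or unterminated input is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Read one 16-bit code unit using the given byte order.
static std::uint32_t readUnit16(const std::uint8_t *p, bool bigEndian) {
  return bigEndian ? (std::uint32_t(p[0]) << 8) | p[1]
                   : (std::uint32_t(p[1]) << 8) | p[0];
}

// Decode one code point from buf, with avail bytes remaining.
// On success, returns the code point and sets skip to the bytes consumed.
// Returns -1 (assumed convention) for truncated input or an unpaired surrogate.
std::int32_t decodeUtf16Sketch(const std::uint8_t *buf, std::size_t avail,
                               bool bigEndian, int &skip) {
  if (avail < 2) return -1;
  std::uint32_t first = readUnit16(buf, bigEndian);
  if (first < 0xD800 || first > 0xDFFF) {  // ordinary BMP character
    skip = 2;
    return std::int32_t(first);
  }
  if (first > 0xDBFF) return -1;  // low surrogate without a leading high one
  if (avail < 4) return -1;       // high surrogate cut off at buffer end
  std::uint32_t second = readUnit16(buf + 2, bigEndian);
  if (second < 0xDC00 || second > 0xDFFF) return -1;  // not a low surrogate
  skip = 4;
  return std::int32_t(0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00));
}

// Count the characters before the null terminator.
// Returns -1 if the encoding is invalid or no terminator is found.
int countUtf16Sketch(const std::uint8_t *buf, std::size_t size, bool bigEndian) {
  int count = 0;
  std::size_t i = 0;
  while (i < size) {
    int skip = 0;
    std::int32_t cp = decodeUtf16Sketch(buf + i, size - i, bigEndian, skip);
    if (cp < 0) return -1;
    if (cp == 0) return count;
    ++count;
    i += std::size_t(skip);
  }
  return -1;
}
```

Only the 2048 surrogate values can be rejected this way, which is why UTF16 gives a weaker check than UTF8, where every multi-byte sequence must follow a fixed bit pattern.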
virtual const vector< uint1 > & StringManagerUnicode::getStringData (const Address &addr, Datatype *charType, bool &isTrunc) [virtual]
Retrieve string data at the given address as a UTF8 byte array.
If the address does not represent string data, a zero-length vector is returned. Otherwise, the string data is fetched, converted to a UTF8 encoding, cached, and returned.
addr | is the given address
charType | is a character data-type indicating the encoding
isTrunc | passes back whether the string is truncated
Implements StringManager.
References checkCharacters(), Datatype::getSize(), glb, StringManager::hasCharTerminator(), Datatype::isOpaqueString(), Architecture::loader, LoadImage::loadFill(), StringManager::maximumChars, StringManager::stringMap, testBuffer, and writeUnicode().
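The fetch, convert, cache, and return flow described above resembles a lookup-or-compute cache keyed by address. The sketch below shows that pattern in isolation. The class `StringCacheSketch`, its integer address key, and its callback-based decoder are all hypothetical stand-ins rather than the decompiler's types, and the choice to also cache the empty "not a string" result is an assumption of this sketch.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an address-keyed string cache.
class StringCacheSketch {
public:
  using Bytes = std::vector<std::uint8_t>;
  // The decoder returns UTF8 bytes for an address, or an empty vector
  // when no string is found there.
  using Decoder = std::function<Bytes(std::uint64_t)>;

  explicit StringCacheSketch(Decoder d) : decode(std::move(d)) {}

  // Return the cached entry if present; otherwise decode once and remember
  // the result, including the empty "not a string" answer.
  const Bytes &get(std::uint64_t addr) {
    auto it = cache.find(addr);
    if (it == cache.end())
      it = cache.emplace(addr, decode(addr)).first;
    return it->second;
  }

  // Drop every cached entry.
  void clear() { cache.clear(); }

private:
  Decoder decode;
  std::map<std::uint64_t, Bytes> cache;
};
```

Returning a reference into the map avoids copying the byte array on every query; references to `std::map` elements stay valid until the entry is erased or the map is cleared.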
bool StringManagerUnicode::writeUnicode (ostream &s, uint1 *buffer, int4 size, int4 charsize)
Translate/copy unicode to UTF8.
Assume the buffer contains a null terminated unicode encoded string. Write the characters out (as UTF8) to the stream.
s | is the output stream
buffer | is the given byte buffer
size | is the number of bytes in the buffer
charsize | specifies the encoding (1=UTF8, 2=UTF16, 4=UTF32)
References StringManager::getCodepoint(), glb, Translate::isBigEndian(), StringManager::maximumChars, Architecture::translate, and StringManager::writeUtf8().
Referenced by getStringData().
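As a rough illustration of the translation step, the sketch below transcodes a null terminated UTF32 buffer to UTF8. It is a standalone example under stated assumptions, not the decompiler's code: the names `appendUtf8Sketch` and `utf32ToUtf8Sketch` are hypothetical, output goes to a `std::string` rather than an ostream, and no truncation limit is applied.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one code point to out as UTF8.
// Returns false for surrogate values and anything above U+10FFFF.
bool appendUtf8Sketch(std::string &out, std::uint32_t cp) {
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogates are not characters
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp <= 0x10FFFF) {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    return false;
  }
  return true;
}

// Transcode UTF32 characters up to the null terminator.
// Returns false (assumed convention) on an invalid code point or a missing terminator.
bool utf32ToUtf8Sketch(const std::uint8_t *buf, std::size_t size,
                       bool bigEndian, std::string &out) {
  for (std::size_t i = 0; i + 4 <= size; i += 4) {
    const std::uint8_t *p = buf + i;
    std::uint32_t cp =
        bigEndian ? (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                        (std::uint32_t(p[2]) << 8) | p[3]
                  : (std::uint32_t(p[3]) << 24) | (std::uint32_t(p[2]) << 16) |
                        (std::uint32_t(p[1]) << 8) | p[0];
    if (cp == 0) return true;  // terminator reached; it is not written out
    if (!appendUtf8Sketch(out, cp)) return false;
  }
  return false;
}
```

The byte order is passed in explicitly here; in the documented class it would come from the architecture, as the reference to Translate::isBigEndian() suggests.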