Development/String Classes

Two different families of string classes are in use throughout the codebase. Each has different features, but efforts are ongoing to limit use to a single family, with OUString for UTF-16 strings and OString for 8-bit strings. For better or worse, OString/OUString are modelled after the immutable Java String class, except for a contradictory additional operator+=.

Favored Family, C++

 * OUString include/rtl/ustring.hxx, favored C++ 16bit (UTF-16) string


 * OString include/rtl/string.hxx, favored C++ 8bit string

Favored Family, C

 * rtl_uString include/rtl/ustring.h, raw C-API for OUString


 * rtl_String include/rtl/string.h, raw C-API for OString

Favored Family, C++ Helpers

 * OUStringBuffer include/rtl/ustrbuf.hxx, favored C++ 16bit (UTF-16) mutable string buffer


 * OStringBuffer include/rtl/strbuf.hxx, favored C++ 8bit mutable string buffer

Favored Family, C Helpers

 * OUStringBuffer include/rtl/ustrbuf.h, raw C-API OUStringBuffer equivalent


 * OStringBuffer include/rtl/strbuf.h, raw C-API OStringBuffer equivalent

Namespacing of OUString/OString
The rtl::OUString and rtl::OString classes are in the rtl namespace, but since they are already in practice namespaced by the O- prefix, for ease of reading it is not necessary to explicitly refer to the rtl namespace when using them in code that is not part of public API (i.e. most of the codebase).

Accessing the Data Buffer
Use SAL_DEBUG or SAL_INFO for debug output.

If you want to dump these strings with gdb, see Development/How to debug.

Iterating through an OUString
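Since sal_Unicode is a 16-bit type, a simple for loop over the individual code units of an OUString might cause problems with code points that are encoded as two UTF-16 code units (surrogate pairs). If you need to iterate over an OUString that might contain such code points, use OUString::iterateCodePoints.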
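To illustrate why a per-code-unit loop miscounts, here is a minimal sketch that walks a UTF-16 string one code point at a time. It deliberately uses std::u16string as a stand-in so that it compiles outside the codebase; the helper names nextCodePoint and countCodePoints are hypothetical and are not part of the OUString API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode the code point that starts at index i and
// advance i past it. The caller must ensure i < s.size().
char32_t nextCodePoint(const std::u16string& s, std::size_t& i) {
    char16_t hi = s[i++];
    // A high surrogate followed by a low surrogate forms one code point.
    if (hi >= 0xD800 && hi <= 0xDBFF && i < s.size()) {
        char16_t lo = s[i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            ++i;
            return 0x10000 + ((static_cast<char32_t>(hi) - 0xD800) << 10)
                           + (static_cast<char32_t>(lo) - 0xDC00);
        }
    }
    // BMP character, or an unpaired surrogate returned unchanged.
    return hi;
}

// Hypothetical helper: count code points rather than 16-bit code units.
std::size_t countCodePoints(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++n)
        nextCodePoint(s, i);
    return n;
}
```

A string holding one character outside the Basic Multilingual Plane between two ASCII letters has four code units but only three code points, which is the discrepancy a plain index loop would miss.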

Iterating through a String
There are many loops over the UTF-16 code units in a String, and usually the loops use index variables of type xub_StrLen (a typedef for sal_uInt16). Since OUString has a higher capacity, these loops must be converted so that the index variable is of type sal_Int32. It may be even better to use OUString::iterateCodePoints, which iterates over Unicode code points; see the previous section.
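The capacity problem can be sketched with standard types. This assumes the legacy index type is a 16-bit unsigned integer and the OUString index type is a 32-bit signed integer; std::uint16_t and std::int32_t stand in for them here, and canIndex and incrementPastMax16 are hypothetical helpers written only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if every valid index of a string with `length`
// code units can be represented in IndexT.
template <typename IndexT>
bool canIndex(std::size_t length) {
    return length == 0
        || length - 1 <= static_cast<std::size_t>(std::numeric_limits<IndexT>::max());
}

// Hypothetical helper: shows what happens when a 16-bit unsigned index is
// incremented past its maximum value. It wraps around to 0, so a loop of
// the form `i < length` never terminates for longer strings.
std::uint16_t incrementPastMax16() {
    std::uint16_t i = std::numeric_limits<std::uint16_t>::max();
    ++i;
    return i;
}
```

With these definitions, canIndex<std::uint16_t> fails as soon as the string has more than 65536 code units, while canIndex<std::int32_t> still holds, which is why the index variable has to change type along with the string class.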

Favored Methods Utilizing String Literals
Both rtl::OString and rtl::OUString have specialized overloads that handle string literals efficiently. String literals are automatically converted to rtl::OString/rtl::OUString when needed, or the rtl::OString/rtl::OUString functions have overloads that accept them directly.

Please note that this auto-conversion does not work in the following cases, and you still need an explicit OUString constructor:


 * bFoo ? "bar" : "baz" [ you need bFoo ? OUString("bar") : OUString("baz"), and so on ]
 * return "foo"
 * aAny << "foo"
 * CPPUNIT_ASSERT_EQUAL("foo")

The RTL_CONSTASCII_STRINGPARAM and RTL_CONSTASCII_USTRINGPARAM macros are obsolete, and so are the various *AsciiL functions.

An empty string can be written simply as "" instead of OUString().