What C++ string classes/systems exist that have good Unicode support and a decent interface?
Using strings in C++ development is always a bit more complicated than in languages like Java or scripting languages. I think some of the complexity comes from a performance focus in C++ and some is historical.
I know of the following major string systems and would like to find out how they differ and what specific deficiencies each has:
- MFC CString
I accept that there can be no definitive answer, but I think the SO voting system should show the preferences (and thus the validity of the arguments) of people who are actually using a given string system.
Added from the answers:
- UTF8-CPP
I can say that much of this is historical. Two pieces of history in particular:
- In the early days, everyone was using 7-bit or 8-bit character encodings. Because of this, the concepts of char and "byte" became conflated.
- C++ programmers quickly recognized the desirability of having a string class rather than raw char *. Unfortunately, they had to wait 15 years for one to be officially standardized. In the meantime, people wrote their own string classes, which we are still stuck with today.
Anyway, I have used two of the classes you mentioned:
MFC CString
There are really two CString classes: CStringA uses char with "ANSI" encoding, and CStringW uses wchar_t with UTF-16 encoding. CString is a typedef for one of them, selected by a preprocessor macro. (Many things in Windows come in "ANSI" and "Unicode" versions.)
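To make the typedef mechanism concrete, here is a minimal portable sketch. The names NarrowString, WideString, MyString and the macro MY_UNICODE_BUILD are made up for illustration and std::basic_string stands in for the real classes; MFC keys the same decision off _UNICODE.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for CStringA and CStringW; not the real MFC classes.
using NarrowString = std::basic_string<char>;     // plays the role of CStringA
using WideString   = std::basic_string<wchar_t>;  // plays the role of CStringW

// MY_UNICODE_BUILD is a made-up macro for this sketch.
#ifdef MY_UNICODE_BUILD
using MyString = WideString;
#else
using MyString = NarrowString;
#endif

// Code written against MyString compiles unchanged for either build.
inline std::size_t code_unit_count(const MyString& s) {
    return s.size();
}

// Reports which underlying character type was selected at compile time.
constexpr bool uses_wide_chars() {
    return std::is_same<MyString::value_type, wchar_t>::value;
}
```

The point of the design is that application code names only the alias, so switching between the narrow and wide build is a compiler flag rather than a source change.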
You could use UTF-8 with the char-based version, but the problem is that Microsoft refuses to support UTF-8 as an ANSI code page. Thus, functions like Trim(const char * pszTargets), which depend on being able to recognize character boundaries, will not work correctly if you use them with non-ASCII characters.
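The character-boundary problem can be reproduced without MFC. The sketch below uses a hypothetical byte-oriented trim_left_bytes (an assumption for illustration, not the real CString implementation) on UTF-8 data, where two different letters happen to share a lead byte.

```cpp
#include <cassert>
#include <string>

// Hypothetical byte-oriented trim, NOT the real CString::Trim: it removes
// leading chars that appear anywhere in `targets`, one byte at a time.
inline std::string trim_left_bytes(const std::string& s, const char* targets) {
    const std::string::size_type first = s.find_first_not_of(targets);
    return first == std::string::npos ? std::string() : s.substr(first);
}

// Shows the failure: trimming "é" from a string that starts with "à".
inline std::string demo_broken_trim() {
    const std::string text = "\xC3\xA0" "bc";  // UTF-8 for "àbc"
    const char* targets = "\xC3\xA9";          // UTF-8 for "é"
    std::string out = trim_left_bytes(text, targets);
    // The lead byte 0xC3 is shared by "à" and "é", so it gets stripped
    // and the result starts with a stray continuation byte.
    assert(out == std::string("\xA0" "bc"));
    return out;
}
```

The input contains no "é" at all, yet the result is no longer valid UTF-8. With plain ASCII targets such as spaces, the same function behaves as expected.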
Since UTF-16 is natively supported, you will probably prefer the wchar_t-based version.
The main disadvantages are:
- Slow performance for very large strings. (Last time I checked, anyway.)
- Lack of integration with the C++ standard library. No iterators, not even << and >> for streams.
- It is Windows-only.
(That last point has caused me a lot of frustration, because I was put in charge of porting our code to Linux. We ended up with our own string class, which is a clone of CString but cross-platform.)
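For contrast with the standard-library point in the list above, here is a small sketch of what that integration looks like with std::string. The helper names count_char, greet and first_word are made up for illustration.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Iterators let standard algorithms work on a std::string directly.
inline long count_char(const std::string& s, char c) {
    return static_cast<long>(std::count(s.begin(), s.end(), c));
}

// operator<< writes a std::string to any output stream.
inline std::string greet(const std::string& name) {
    std::ostringstream out;
    out << "Hello, " << name << "!";
    return out.str();
}

// operator>> reads one whitespace-delimited token from any input stream.
inline std::string first_word(const std::string& line) {
    std::istringstream in(line);
    std::string word;
    in >> word;
    return word;
}
```

A string class without iterators or stream operators needs a conversion step before any of this works.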
std::basic_string
The good thing is that it is standard.
The bad thing about it is that it does not have Unicode support. OTOH, it does not actively not support Unicode either, because it lacks member functions like upper() / lower() that convert letter case and would therefore depend on the character encoding. In that sense, it is really more of a "dynamic array of code units" than a "string".
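The "array of code units" point shows up as soon as the data is not ASCII. A minimal sketch, assuming the string holds well-formed UTF-8; utf8_code_points and sample are hypothetical helpers, not part of the standard library:

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes `s` holds well-formed UTF-8; no validation is done.
inline std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) ++n;
    }
    return n;
}

// "héllo" encoded as UTF-8: the "é" takes two code units (0xC3 0xA9).
inline std::string sample() {
    return std::string("h\xC3\xA9" "llo");
}
```

Here sample().size() reports 6 code units while utf8_code_points(sample()) reports 5 code points. std::string only knows about the former; anything that needs the latter has to come from your own code or a library.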
There are libraries that give you UTF-8 support on top of it, such as the UTF8-CPP mentioned above and some functions in the library.
For the size of the characters, see.