Introduction:
In C++ programming, handling strings is a common task. Wide strings (std::wstring), which store wide characters (wchar_t) and are often used to support internationalization, add complexity. Sometimes you need to convert a wide string to a standard string (std::string), for example to write UTF-8 text to a file or to call an API that expects narrow characters. In this blog, we will explore two methods to achieve this conversion in C++.
Method 1: Using Standard Library Functions
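One way to convert a wide string to a string in C++ is to use standard library functions. Specifically, we can use the std::wstring_convert class together with the std::codecvt_utf8 facet to perform the conversion. Note that both have been deprecated since C++17, so newer compilers may emit deprecation warnings for this code.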
#include <iostream>
#include <string>
#include <locale>
#include <codecvt>

int main() {
    // Wide string literal containing ASCII and non-ASCII characters
    std::wstring wideStr = L"Hello, 你好";

    // Converter between wchar_t strings and UTF-8 encoded byte strings
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    std::string str = converter.to_bytes(wideStr);

    std::cout << "Converted string: " << str << std::endl;
    return 0;
}
Output:
Converted string: Hello, 你好
- We start by including the necessary header files for string manipulation (<string>), input/output (<iostream>), locales (<locale>), and character conversion (<codecvt>).
- Next, we define a wide string wideStr containing both ASCII and non-ASCII characters.
- Then, we instantiate an object of std::wstring_convert with std::codecvt_utf8<wchar_t> as the template parameter, which specifies the conversion facet for UTF-8 encoding.
- We use the to_bytes member function of std::wstring_convert to convert the wide string wideStr to a standard string str.
- Finally, we print the converted string str to the console.
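The steps above can also be wrapped in small reusable helpers. When no error string is passed to the std::wstring_convert constructor, to_bytes and from_bytes throw std::range_error if a conversion fails, so the sketch below catches that exception and reports failure through a bool. The names tryWideToUtf8 and tryUtf8ToWide are hypothetical and chosen only for this illustration; treat this as a minimal sketch rather than a definitive implementation.

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a wide string to UTF-8.
// Returns false instead of letting std::range_error propagate.
bool tryWideToUtf8(const std::wstring& wideStr, std::string& out) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    try {
        out = converter.to_bytes(wideStr);
        return true;
    } catch (const std::range_error&) {
        return false;  // the input could not be encoded
    }
}

// Hypothetical helper for the reverse direction: UTF-8 to wide string.
bool tryUtf8ToWide(const std::string& utf8Str, std::wstring& out) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    try {
        out = converter.from_bytes(utf8Str);
        return true;
    } catch (const std::range_error&) {
        return false;  // the input was not valid UTF-8 for this facet
    }
}
```

A caller would declare a std::string, pass it as the second argument to tryWideToUtf8, and check the returned bool before using the result. Converting back with tryUtf8ToWide is a quick way to confirm that nothing was lost along the way.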
Method 2: Using Wide Character Iterators
Another approach to convert a wide string to a string involves iterating through each wide character and converting it to its corresponding UTF-8 representation.
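#include <cstdint>
#include <iostream>
#include <string>

// Encodes each wide character as UTF-8. Assumes every wchar_t holds one
// complete Unicode code point (true where wchar_t is 32 bits, e.g. Linux).
// UTF-16 surrogate pairs (16-bit wchar_t, e.g. Windows) are not combined.
std::string wideStrToUTF8(const std::wstring& wideStr) {
    std::string utf8Str;
    for (wchar_t wideChar : wideStr) {
        // Use an unsigned value so the shifts are safe even if wchar_t is signed
        const std::uint32_t cp = static_cast<std::uint32_t>(wideChar);
        if (cp < 0x80) {                // 1 byte:  0xxxxxxx
            utf8Str += static_cast<char>(cp);
        } else if (cp < 0x800) {        // 2 bytes: 110xxxxx 10xxxxxx
            utf8Str += static_cast<char>(0xC0 | (cp >> 6));
            utf8Str += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            utf8Str += static_cast<char>(0xE0 | (cp >> 12));
            utf8Str += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8Str += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            utf8Str += static_cast<char>(0xF0 | (cp >> 18));
            utf8Str += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            utf8Str += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8Str += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return utf8Str;
}

int main() {
    std::wstring wideStr = L"Hello, 你好";
    std::string str = wideStrToUTF8(wideStr);
    std::cout << "Converted string: " << str << std::endl;
    return 0;
}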
Output:
Converted string: Hello, 你好
- We define a function wideStrToUTF8 that takes a wide string wideStr as input and returns a UTF-8 encoded string.
- Within the function, we iterate through each wide character of wideStr.
- For each wide character, we check its value to determine how many bytes its UTF-8 encoding needs.
- Based on the Unicode value of the wide character, we construct the corresponding UTF-8 byte sequence.
- Finally, we return the UTF-8 encoded string.
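To make the bit manipulation concrete, take the character 你, which is U+4F60. It falls in the three-byte range, so its 16 bits are split into groups of 4, 6, and 6 bits: 0x4F60 >> 12 is 0x4, (0x4F60 >> 6) & 0x3F is 0x3D, and 0x4F60 & 0x3F is 0x20. Adding the UTF-8 marker bits gives 0xE0 | 0x4 = 0xE4, 0x80 | 0x3D = 0xBD, and 0x80 | 0x20 = 0xA0. The sketch below encodes a single code point and formats the result as hex so the bytes can be inspected. The helper names encodeCodePoint and toHex are hypothetical and exist only for this illustration. The sketch also assumes the input is a valid Unicode code point and includes the four-byte form that UTF-8 uses for code points above U+FFFF.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8.
// Assumes cp is a valid code point (no validation is performed).
std::string encodeCodePoint(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                 // 1 byte
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: formats bytes as space-separated uppercase hex.
std::string toHex(const std::string& bytes) {
    std::string hex;
    char buf[4];
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
        if (!hex.empty()) hex += ' ';
        hex += buf;
    }
    return hex;
}
```

Calling toHex(encodeCodePoint(0x4F60)) yields "E4 BD A0", matching the hand calculation above.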
Conclusion:
In this blog, we have explored two methods to convert wide strings to strings in C++. The first method using standard library functions offers simplicity and ease of use, while the second method using wide character iterators provides more control over the conversion process.