Sai A
Updated date Mar 06, 2024
In this blog, we will learn how to convert wide strings to standard strings in C++. Explore methods using standard library functions and wide character iterators.

Introduction:

In C++ programming, handling strings is a common task. Wide strings (std::wstring) store wchar_t elements rather than single bytes and are often used to support internationalization, which adds complexity. Sometimes you need to convert wide strings to standard strings (std::string), where non-ASCII text is typically stored as a multibyte encoding such as UTF-8. In this blog, we will explore two methods to achieve this conversion in C++.

Method 1: Using Standard Library Functions

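One way to convert a wide string to a standard string in C++ is to use standard library functions. Specifically, we can use the std::wstring_convert class template together with the std::codecvt_utf8 facet to perform the conversion. Note that both have been deprecated since C++17, so newer compilers may emit deprecation warnings for this code.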

#include <iostream>
#include <string>
#include <locale>
#include <codecvt>

int main() {
    std::wstring wideStr = L"Hello, 你好";

    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    std::string str = converter.to_bytes(wideStr);

    std::cout << "Converted string: " << str << std::endl;

    return 0;
}

Output:

Converted string: Hello, 你好
  • We start by including necessary header files for string manipulation (<string>), input/output (<iostream>), locales (<locale>), and character conversion (<codecvt>).
  • Next, we define a wide string wideStr containing both ASCII and non-ASCII characters.
  • Then, we instantiate an object of std::wstring_convert with std::codecvt_utf8<wchar_t> as the template parameter, which specifies the conversion facet for UTF-8 encoding.
  • We use the to_bytes member function of std::wstring_convert to convert the wide string wideStr to a standard string str.
  • Finally, we print the converted string str to the console.
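
To check that the conversion loses nothing, the same converter can be run in the opposite direction with its from_bytes member function. The following is a minimal sketch, not a definitive implementation: the helper names toUtf8 and fromUtf8 are hypothetical (they are not part of the standard library), and because std::wstring_convert is deprecated since C++17, the compiler may print warnings.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: wide string -> UTF-8 encoded std::string.
std::string toUtf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wide);
}

// Hypothetical helper: UTF-8 encoded std::string -> wide string.
std::wstring fromUtf8(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.from_bytes(utf8);
}
```

If fromUtf8(toUtf8(wideStr)) compares equal to wideStr, the round trip preserved every character.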

Method 2: Using Wide Character Iterators

Another approach to convert a wide string to a string involves iterating through each wide character and converting it to its corresponding UTF-8 representation.

#include <iostream>
#include <string>

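// Converts a wide string to UTF-8. Works whether wchar_t holds UTF-32
// code points (e.g. Linux) or UTF-16 code units (e.g. Windows).
std::string wideStrToUTF8(const std::wstring& wideStr) {
    std::string utf8Str;
    for (std::wstring::size_type i = 0; i < wideStr.size(); ++i) {
        char32_t cp = static_cast<char32_t>(wideStr[i]);
        // Combine a UTF-16 surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wideStr.size()) {
            char32_t low = static_cast<char32_t>(wideStr[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {            // 1 byte: 0xxxxxxx
            utf8Str += static_cast<char>(cp);
        } else if (cp < 0x800) {    // 2 bytes: 110xxxxx 10xxxxxx
            utf8Str += static_cast<char>(0xC0 | (cp >> 6));
            utf8Str += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            utf8Str += static_cast<char>(0xE0 | (cp >> 12));
            utf8Str += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8Str += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            utf8Str += static_cast<char>(0xF0 | (cp >> 18));
            utf8Str += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            utf8Str += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8Str += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return utf8Str;
}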

int main() {
    std::wstring wideStr = L"Hello, 你好";
    std::string str = wideStrToUTF8(wideStr);
    std::cout << "Converted string: " << str << std::endl;
    return 0;
}

Output:

Converted string: Hello, 你好
  • We define a function wideStrToUTF8 that takes a wide string wideStr as input and returns a UTF-8 encoded string.
  • Within the function, we iterate through each wide character of wideStr.
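  • For each wide character, we check its value to determine how many bytes its UTF-8 encoding needs: one byte below 0x80, two bytes below 0x800, and more for larger values.
  • Based on the Unicode value of the wide character, we construct the corresponding UTF-8 byte sequence: a lead byte that carries the high bits, followed by continuation bytes of the form 10xxxxxx that each carry six bits.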
  • Finally, we return the UTF-8 encoded string.
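
The bit arithmetic is easier to follow on a single character. The sketch below is an illustration only: encodeCodePoint and toHex are hypothetical helper names introduced here, and the code assumes its input is a valid Unicode code point (it does not reject surrogates or values above 0x10FFFF).

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
std::string encodeCodePoint(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: renders bytes as space-separated uppercase hex.
std::string toHex(const std::string& bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string hex;
    for (unsigned char b : bytes) {
        if (!hex.empty()) hex += ' ';
        hex += digits[b >> 4];
        hex += digits[b & 0x0F];
    }
    return hex;
}
```

For 你 (U+4F60), toHex(encodeCodePoint(0x4F60)) gives E4 BD A0: the lead byte is 0xE0 | (0x4F60 >> 12) = 0xE4, the middle byte is 0x80 | ((0x4F60 >> 6) & 0x3F) = 0xBD, and the last byte is 0x80 | (0x4F60 & 0x3F) = 0xA0.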

Conclusion:

In this blog, we have explored two methods to convert wide strings to standard strings in C++. The first method, using standard library functions, offers simplicity and ease of use, while the second method, which encodes each wide character by hand, provides more control over the conversion process.
