Saving a file in UTF-8 format (Writing UTF-8 files in C++)

The words are all simple, so I will not translate the article.

Original article: http://mariusbancila.ro/blog/2008/10/20/writing-utf-8-files-in-c/

 

Let’s say you need to write an XML file with this content:

<?xml version="1.0" encoding="UTF-8"?>
<root description="this is a naïve example">
</root>

How do we write that in C++?

At first glance, you could be tempted to write it like this:

#include <fstream>
#include <string>

int main()
{
    std::ofstream testFile;

    // Binary mode: the bytes of the string are written unchanged.
    testFile.open("demo.xml", std::ios::out | std::ios::binary);

    std::string text =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        "<root description=\"this is a naïve example\">\n</root>";

    testFile << text;

    testFile.close();

    return 0;
}

When you open the file in IE, for instance, surprise! It is not rendered correctly.
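One way to see what went wrong is to look at the bytes. Assuming the source file was saved in a single-byte code page such as Windows-1252 (an assumption; the original post does not say), the ï in the narrow string literal ends up in the file as the single byte 0xEF, which is not a well-formed UTF-8 sequence even though the XML declaration promises UTF-8. The sketch below is a small checker, is_valid_utf8 (a name made up for this illustration), that tests whether a byte string is well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the original article.
// Returns true if `bytes` is well-formed UTF-8: it rejects stray
// continuation bytes, truncated sequences, overlong forms,
// surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;     // continuation bytes expected after the lead byte
        unsigned long cp;      // code point being decoded
        unsigned long min_cp;  // smallest code point allowed for this length

        if (lead < 0x80)                { extra = 0; cp = lead;        min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (bytes.size() - i <= extra) return false;  // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // outside Unicode

        i += extra + 1;
    }
    return true;
}
```

With this checker, is_valid_utf8("na\xEF" "ve") returns false, while is_valid_utf8("na\xC3\xAF" "ve") returns true, because UTF-8 encodes ï (U+00EF) as the two bytes 0xC3 0xAF.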

So you could be tempted to say "let's switch to wstring and wofstream".

int main()
{
    std::wofstream testFile;

    testFile.open("demo.xml", std::ios::out | std::ios::binary);

    std::wstring text =
        L"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        L"<root description=\"this is a naïve example\">\n</root>";

    testFile << text;

    testFile.close();

    return 0;
}

And when you run it and open the file again, nothing has changed. So where is the problem? The problem is that neither ofstream nor wofstream writes the text in UTF-8 format by default. If you want the file to really be in UTF-8, you have to encode the output buffer in UTF-8 yourself. To do that we can use WideCharToMultiByte(). This Windows API maps a wide character string to a new character string (which is not necessarily from a multibyte character set). The first argument indicates the code page; for UTF-8 we need to specify CP_UTF8.

The following helper functions encode a std::wstring into a UTF-8 stream, wrapped into a std::string.

#include <windows.h>
#include <string>

// Encodes len wide characters starting at buffer as UTF-8.
std::string to_utf8(const wchar_t* buffer, int len)
{
    // First call: ask for the size of the output, in bytes.
    int nChars = ::WideCharToMultiByte(
        CP_UTF8, 0, buffer, len, NULL, 0, NULL, NULL);
    if (nChars == 0) return "";

    std::string newbuffer;
    newbuffer.resize(nChars);

    // Second call: convert directly into the string's storage.
    ::WideCharToMultiByte(
        CP_UTF8, 0, buffer, len, &newbuffer[0], nChars, NULL, NULL);

    return newbuffer;
}

std::string to_utf8(const std::wstring& str)
{
    return to_utf8(str.c_str(), (int)str.size());
}
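WideCharToMultiByte() is only available on Windows. As a rough illustration of what the conversion does, here is a hand-written sketch of the same wstring to UTF-8 step that does not depend on the Windows API. The names append_utf8 and to_utf8_portable are invented for this example, and replacing unpaired surrogates and out-of-range values with U+FFFD is a choice made here, not something the original article specifies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to out.
void append_utf8(std::string& out, unsigned long cp)
{
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical portable counterpart of to_utf8(const std::wstring&).
// Works whether wchar_t holds UTF-16 code units (as on Windows)
// or whole code points (as on most Unix-like systems).
std::string to_utf8_portable(const std::wstring& str)
{
    std::string out;
    for (std::size_t i = 0; i < str.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(str[i]);

        // Combine a UTF-16 surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < str.size()) {
            unsigned long lo = static_cast<unsigned long>(str[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }

        // Unpaired surrogates and out-of-range values become U+FFFD.
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) cp = 0xFFFD;

        append_utf8(out, cp);
    }
    return out;
}
```

Called as to_utf8_portable(L"na\u00EFve"), this yields the bytes of "na", then 0xC3 0xAF for ï, then "ve". On Windows the to_utf8 helper above remains the simpler option, since the operating system does the work.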

With that in hand, all you have to do is make the following changes:

int main()
{
    std::ofstream testFile;

    testFile.open("demo.xml", std::ios::out | std::ios::binary);

    std::wstring text =
        L"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        L"<root description=\"this is a naïve example\">\n</root>";

    // Convert the wide string to UTF-8 bytes before writing.
    std::string outtext = to_utf8(text);

    testFile << outtext;

    testFile.close();

    return 0;
}

And now when you open the file, you get what you wanted in the first place.


And that is all!

Source: https://www.cnblogs.com/lebronjames/archive/2013/03/05/2944007.html
