Let's bust a myth that is a source of many subtle bugs. Are you sure that you can simply drop UTF-8-encoded text in char-based strings that expect ASCII text, and your C++ code will still work fine?| Giovanni Dicanio's Blog
While researching a very weird bug in Koha I had to figure out a way to chop a string to a specific maximum length. In bytes and not in characters, because in that case the horrible format USMARC is used, whose spec starts with two red flags: It's from January 2000, and it's an "implementation of the American national standard", so you can bet that it only works (well) with ASCII and will be ... interesting when handling Unicode. But it's generally broken for longer strings.| domm.plix.at
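To make the byte-versus-character distinction in the excerpt above concrete, here is a minimal Python sketch of cutting a string to a maximum number of UTF-8 bytes without splitting a multi-byte character. It is an illustration only, not the approach taken in the linked post or in Koha; the function name `truncate_utf8` is hypothetical, and it assumes the byte-oriented format stores the text as UTF-8.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of `text` whose UTF-8 encoding fits in `max_bytes`.

    Hypothetical helper for illustration; assumes the target format counts UTF-8 bytes.
    """
    if max_bytes <= 0:
        return ""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cut at the byte limit. The input was valid UTF-8, so the only possible
    # damage is an incomplete multi-byte sequence at the very end, which
    # errors="ignore" silently drops.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

For example, `truncate_utf8("naïve", 3)` gives `"na"`: the third byte is only the first half of the two-byte `ï`, so it is dropped rather than left dangling. Note that this works per code point, so a combining accent can still be separated from its base letter.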
I have some SQL script files on Windows 7. When opened with Notepad++, in the "Encoding" menu some of them are reported to have an encoding of "UCS-2 Little Endian" and some of ...| Software Engineering Stack Exchange