Printing Non-ASCII Characters to the Console on Windows

…A subtitle could read as: “From Greek letters to Os”.

I came across some Windows C++ source code that printed the Greek small letter pi (π, U+03C0) and capital letter sigma (Σ, U+03A3) to the console, as part of some mathematical formulae. A minimal reproducible code snippet based on the original code looks like this:

#include <stdio.h>

#define CHAR_SIGMA 227
#define CHAR_PI 228

int main()
{
    printf("Greek pi: %c\n", CHAR_PI);
    printf("Greek Sigma: %c\n", CHAR_SIGMA);
}

When executed on the original author’s PC, the above code prints the expected Greek letters correctly. However, when I run the exact same code on my PC, I got this wrong output:

Greek letters wrongly printed to the console
Greek letters wrongly printed to the console

What’s going wrong here?

The problem is that the author of that code probably looked up the codes for those Greek letters in some code page, probably code page 437.

However, on my Windows PC, the default code page for the console seems to be a different one, i.e. code page 850; in fact, in this code page (850) the codes 227 and 228 are mapped to Ò and õ (instead of Σ and π) respectively.

How to fix that?

I would suggest using Unicode instead of code pages. On Windows, the default “native” Unicode encoding is UTF-16. Using UTF-16, π (U+03C0) is encoded as the 16-bit code unit 0x03C0 ; Σ (U+03A3) is encoded as 0x03A3.

This code snippet shows UTF-16 console output in action:

#include <fcntl.h>
#include <io.h>
#include <stdio.h>

int wmain( /* int argc, wchar_t* argv[] */ )
{
    // Enable Unicode UTF-16 output to console
    _setmode(_fileno(stdout), _O_U16TEXT);

    // Use UTF-16 encoding for Greek letters
    static const wchar_t kPi = 0x03C0;
    static const wchar_t kSigma = 0x03A3;
    
    wprintf(L"Greek pi: %c\n", kPi);
    wprintf(L"Greek Sigma: %c\n", kSigma);
}
Greek letters correctly printed to the console
Greek letters correctly printed to the console

Note the use of _setmode and _O_U16TEXT to enable UTF-16 output to the console, and the associated <fcntl.h> and <io.h> additional headers. (More details on those can be found in this blog post.)

P.S. Bonus reading:

Unicode Encoding Conversions with STL Strings and Win32 APIs

 

Leave a Reply

Your email address will not be published. Required fields are marked *