I'm trying to figure out the term for these types of characters:
\M-C\M-6 (corresponds to German "ö")
\M-C\M-$ (corresponds to German "ä")
\M-C\M^_ (corresponds to German "ß")
I want to know the term for this kind of output so that I can convert it into the UTF-8 characters it actually represents in Go, instead of building a mapping for each sequence I come across.
What is the term for these? Unicode? What would be the best way to convert these "characters" to their actual human-readable characters in Go?
It is the vis encoding of UTF-8 encoded text.
Here's an example:
The UTF-8 encoding of the rune ö in bytes is [0303, 0266].
vis encodes the byte 0303 as the bytes \M-C and the byte 0266 as the bytes \M-6.
Putting the two levels of encoding together, the rune ö is encoded as the bytes \M-C\M-6.
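To check the first level of encoding yourself, a short Go sketch like the following prints the UTF-8 bytes of ö in octal. The helper name octalBytes is made up for this illustration.

```go
package main

import "fmt"

// octalBytes (hypothetical helper) returns each byte of s's UTF-8
// encoding formatted as an octal literal with a leading 0.
func octalBytes(s string) []string {
	out := make([]string, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, fmt.Sprintf("%#o", s[i]))
	}
	return out
}

func main() {
	// "\u00f6" is the rune ö; Go stores strings as UTF-8 bytes.
	fmt.Println(octalBytes("\u00f6")) // prints [0303 0266]
}
```

Setting aside the high bit of each byte leaves 0103 ('C') and 066 ('6'), which is where the C and 6 in \M-C\M-6 come from.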
You can either write a decoder using the documentation on the vis man page or search for a decoding package. The Go standard library does not include such a decoder.
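If you go the hand-written route, here is a minimal sketch of a decoder. It is not a complete unvis implementation: it only handles the \M-c, \M^c, \^c and \\ forms, and it assumes the input uses no other escapes (octal, C-style, and so on). The names decodeVis and ctrl are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// decodeVis (hypothetical name) reverses a subset of vis escapes:
//   \M-c  -> byte c with the high (meta) bit set
//   \M^c  -> control character for c, with the high bit set
//   \^c   -> control character for c
//   \\    -> a literal backslash
// Any other escape is reported as an error.
func decodeVis(s string) (string, error) {
	var out strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' {
			out.WriteByte(s[i])
			continue
		}
		rest := s[i+1:] // everything after the backslash
		switch {
		case strings.HasPrefix(rest, "\\"):
			out.WriteByte('\\')
			i++
		case strings.HasPrefix(rest, "M-") && len(rest) >= 3:
			out.WriteByte(rest[2] | 0x80)
			i += 3
		case strings.HasPrefix(rest, "M^") && len(rest) >= 3:
			out.WriteByte(ctrl(rest[2]) | 0x80)
			i += 3
		case strings.HasPrefix(rest, "^") && len(rest) >= 2:
			out.WriteByte(ctrl(rest[1]))
			i += 2
		default:
			return "", fmt.Errorf("unsupported escape at offset %d", i)
		}
	}
	return out.String(), nil
}

// ctrl maps c to its control character: '?' stands for DEL (0x7f),
// anything else keeps only its low five bits.
func ctrl(c byte) byte {
	if c == '?' {
		return 0x7f
	}
	return c & 0x1f
}

func main() {
	for _, in := range []string{`\M-C\M-6`, `\M-C\M-$`, `\M-C\M^_`} {
		out, err := decodeVis(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", in, out)
	}
}
```

For the three sequences from the question this prints ö, ä and ß. The decoder works on bytes, not runes, because vis escapes each byte of the UTF-8 encoding separately; the decoded bytes only become valid UTF-8 once they are put back together.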