[aprssig] Please, standardize UTF-8 for APRS
Heikki Hannikainen
hessu at hes.iki.fi
Fri Dec 18 01:48:03 EST 2009
On Thu, 17 Dec 2009, Robert Bruninga wrote:
> To clarify things, I have now created a UTF-8 discussion
> document that summarizes any issues with UTF-8 and suggests
> bounds on where it can be used. I welcome any suggestions or
> additions to this document to make sure we are all working from
> the same sheet. It is the first link listed on:
> http://aprs.org/aprs12.html
I think you're making it too complicated on that page, for the casual
developer, and mixing a lot of terminology. UTF-8 and ASCII are exactly
THE SAME for values 0 to 127.
Bytes in the ASCII range of 0–127 represent themselves in UTF-8, thereby
providing backward compatibility.
The whole APRS packet, and the mic-e packet, can be transmitted in UTF-8
encoded format, as the UTF-8 encoded version of the APRS/mic-e protocol
data is EXACTLY equal to ASCII. So most of the text on that page is
actually rather unnecessary!
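
The equivalence described above can be checked with a minimal Python sketch. The function names and the sample packet text are made up for illustration and are not taken from any APRS specification; the sketch only assumes Python's standard "ascii" and "utf-8" codecs.

```python
def encodes_identically(text: str) -> bool:
    """Return True if text yields the same bytes in ASCII and in UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains a character above 127, so it has no ASCII form.
        return False


def decode_as_utf8(raw: bytes) -> str:
    """Decode received bytes as UTF-8; plain ASCII bytes decode unchanged."""
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Hypothetical packet text, pure ASCII.
    sample = "OH7LZB>APRS:>Status text in plain ASCII"
    print(encodes_identically(sample))
    print(decode_as_utf8(sample.encode("ascii")) == sample)
```

For pure-ASCII input both checks should hold, so a receiver that decodes as UTF-8 sees exactly what an ASCII-only sender transmitted.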
I think the required specification boils down to:
-------------------- cut here ------------------
- Using UTF-8 is recommended for all free-form message data:
  - APRS messaging data
  - Comment field in compressed and uncompressed packets
  - Mic-E Status Text Field
  - Status messages
  - Bulletin message content
  - Beacon text
- Using 7-bit printable ASCII characters is mandatory for:
  - All callsigns
  - Bulletin / message destinations
  - Object and item names
The whole APRS packet can be encoded as UTF-8, but character values above
127 must not be used in callsign fields which have strict
requirements for length (in bytes) in the encoded form.
-------------------- cut here ------------------
Could you please replace the utf-8.txt contents with that?
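
As a rough illustration of the ASCII-only rule for fixed-length fields in the proposed text, a sketch in Python might look like the following. The helper names are hypothetical, and treating 0x20 to 0x7E as "7-bit printable ASCII" is my assumption about the intended range, not something the proposal spells out.

```python
def is_printable_ascii(field: str) -> bool:
    """True if every character is 7-bit printable ASCII.

    Assumption: "printable" means code points 0x20 through 0x7E.
    """
    return all(0x20 <= ord(ch) <= 0x7E for ch in field)


def utf8_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # A callsign-style field passes the check and is one byte per character.
    print(is_printable_ascii("OH7LZB-9"), utf8_length("OH7LZB-9"))
    # A character above 127 fails the check and takes more than one byte,
    # which is why it would break a field with a fixed length in bytes.
    print(is_printable_ascii("\u00d6H7LZB"), utf8_length("\u00d6H7LZB"))
```

The second case shows the reason for the restriction: a six-character string with one non-ASCII letter no longer fits in six bytes.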
Also, it is not necessary to rant about not using UTF-8 in international
communications. I think it's rather obvious that when writing to you I
should write in English. English uses characters in the 0 to 127 range,
which are EQUAL in UTF-8 and ASCII - it will just work even if I am
transmitting in UTF-8 and you are expecting ASCII. So that's covered
automatically.
- Hessu