[aprssig] UTF-8 for APRS (testing)
Matti Aarnio
oh2mqk at sral.fi
Wed Sep 23 05:40:06 EDT 2009
On Tue, Sep 22, 2009 at 07:06:40PM -0400, Robert Bruninga wrote:
> > http://www.rfc-editor.org/rfc/rfc3629.txt
> >
> > Bob B. should at the very least read the abstract.
>
> To boil it all down to impact on present APRS systems, my
> understanding of the impact of this encoding scheme (with
> character sets other than 7 bit ascii) is simply the
> introduction of any number of 8-bit bytes into any number of
> places where text is used in APRS. There are many issues as I
> have elaborated before:
>
> 1) Delivery through digipeaters is not guaranteed
Delivery failure due to content is highly unlikely, unless there
is really bad software at the digi, or the datastream goes via
some 7E1 serial link that corrupts all data anyway.
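To illustrate what such a link does, here is a minimal sketch. It assumes a deliberately simplified model in which a 7E1 link just drops the top bit of every byte (real hardware may also raise parity errors). The function name and the sample text are only for illustration.

```python
def through_7bit_link(data: bytes) -> bytes:
    """Simplified model of a 7-data-bit link: the 8th bit of each byte is lost."""
    return bytes(b & 0x7F for b in data)

sent = "Hyvää päivää".encode("utf-8")      # contains bytes >= 0x80
received = through_7bit_link(sent)

plain = b"Hello APRS"                       # pure 7-bit US-ASCII
print(received)                             # no longer valid UTF-8 for the original text
print(through_7bit_link(plain) == plain)    # ASCII-only traffic is unaffected
```

Under this model pure US-ASCII traffic passes unchanged, and every byte with the 8th bit set is altered.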
> 2) Delivery through IGates is not guaranteed
Basic APRS message delivery is not guaranteed either, so what is
the difference? Of course, if somebody is foolish enough
to run the TNC in "conversation mode" (or whatever it is called)
instead of KISS, then all bets are off.
Why use KISS? Because it removes variations due to TNC software.
Why not use KISS? Because producing a binary AX.25 frame in
KISS format is a bit more difficult than putting out a line of text?
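For a sense of how much "a bit more difficult" is, here is a rough sketch of the KISS framing step alone. It assumes the AX.25 frame (addresses, control, PID, payload) has already been built elsewhere. The function name is hypothetical. The byte values are the standard KISS FEND/FESC/TFEND/TFESC codes.

```python
FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD

def kiss_encode(ax25_frame: bytes, port: int = 0) -> bytes:
    """Wrap an already-built AX.25 frame in a KISS data frame."""
    out = bytearray([FEND, (port << 4) & 0xF0])  # low nibble 0 = data frame
    for b in ax25_frame:
        if b == FEND:
            out += bytes([FESC, TFEND])          # escape a literal 0xC0
        elif b == FESC:
            out += bytes([FESC, TFESC])          # escape a literal 0xDB
        else:
            out.append(b)                        # all other bytes pass through as-is
    out.append(FEND)
    return bytes(out)

print(kiss_encode(b"\xc0\xdbA").hex())
```

Only two byte values are escaped. Everything else, including bytes with the 8th bit set, goes through untouched.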
> 3) Display on current clients in most cases will not work
Tough. Variations of Katakana seem to be available on SOME
models of Kenwood radios. (Katakana is a Japanese "letter" writing
system.)
These variations are, of course, completely proprietary.
> 4) 8 bit bytes with high order bits set may cause problems to
> existing hardware and clients.
>
> The only concern I have ever had is point 4.
People who want to see characters outside the US-ASCII 7-bit range
are also willing to get new hardware and clients that are able to
do it.
People who never need it will be very unlikely to ever encounter any,
and if they do, they can inform the sender that "sorry, my software/
device does not understand your messages."
User education can help a lot here.
Btw: People are sending messages containing bytes with the 8th bit set
all the time. It is just that there are many ad-hoc methods, each in its
own style, which work only between similar client software.
> If we can thoroughly test and get a handle on #4, then I think
> we are heading in the right direction. Issues 1-3 are expected
> for new introductions.
>
> In my previous email I suggested we need a table for every system
> to be tested. I suggested these test items. I hope someone on
> the APRSSIG can take on this huge project and coordinate all the
> documentation of the results of these tests. Better yet, write
> up a TEST plan and SPEC, so that all testing on every device and
> client is consistent, and does give us the information we need.
>
> > 1) Does it properly find the Line number?
> > 2) does it properly ACK the message
> * *) Does it properly react to the ACK
> > 3) Will it properly REJect the message if full,
> > 4) can you "edit" the message... Etc.
Those are all application issues. Because UTF-8 has a unique
representation for the '{' character, and no "extended" (multi-byte)
character can contain that particular byte value, UTF-8-unaware systems
should be able to find the '{' plus the following sequence number, and
present ACK/REJs.
So aside from packet length issues, I think these are highly likely to work.
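The point about '{' can be sketched as follows. The function name, the callsign N0CALL and the sample text are made up for illustration. The code works on raw bytes and never decodes the text, which is how a UTF-8-unaware client would behave.

```python
def split_message(info: bytes):
    """Split message bytes into (body, sequence number) at the last '{'.

    Returns (info, None) when there is no sequence number.
    """
    body, sep, seq = info.rpartition(b"{")
    if not sep:
        return info, None
    return body, seq

# Every byte of a multi-byte UTF-8 sequence is >= 0x80, so the byte
# 0x7B can only ever be a real '{'.
payload = ":N0CALL   :Hyvää päivää{42".encode("utf-8")
body, seq = split_message(payload)
print(seq)  # the sequence number, found without decoding the text
```

The same byte-wise search works whether the body is US-ASCII or UTF-8.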
> > 5) Do intermediate Igates see it as a valid message and pass it to RF?
Depends on how deeply the IGate looks into the packet
to be relayed. Most check only that it is a message, and that the
recipient's call is of the form NNNN-N.
Some IGates may communicate with the TNC over a serial link set to 7E1,
which of course will at the very least corrupt the characters
with the 8th bit set. KISS is the way to go (or soundmodem.)
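A shallow check of the kind described above (is it a message, and does the recipient look like a callsign) might look roughly like this. The regular expression and the function name are assumptions for illustration, not taken from any particular IGate. The layout, ':' followed by a 9-character addressee field and another ':', is the usual APRS message format.

```python
import re

# Assumed pattern: callsign, optional SSID, space-padded to 9 bytes.
ADDRESSEE = re.compile(rb"[A-Z0-9]{1,6}(-[A-Z0-9]{1,2})? *")

def looks_like_message(info: bytes) -> bool:
    """Check only the message header; the text after it is never inspected."""
    if len(info) < 11 or info[0:1] != b":" or info[10:11] != b":":
        return False
    return ADDRESSEE.fullmatch(info[1:10]) is not None

print(looks_like_message(":N0CALL-7 :Hyvää päivää{1".encode("utf-8")))
print(looks_like_message(b">just a status text"))
```

Because the check stops at the second ':', whatever bytes follow in the message text do not affect the result.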
> > 6) Does it get captured in the message list on the various APRS
> > internet servers
Yes. (aprs.fi & findu.com both get them. findu shows them correctly
as well, and Hessu will fix aprs.fi.)
> * 7) what does each existing DIGIpeater implementation do with it
Depends on maximum packet sizes, doesn't it?
UIDIGI is binary transparent, whatever is used in Japan is binary
transparent...
I will check digi_ned by doing source code review in the evening.
(I am 99% certain that it is binary transparent, but only a review will tell.)
> * 8) What does the APRS-IS do with it?
The APRS-IS core is very close to binary transparent, and javAPRSSrvr happily
treats them as part of the message packet. Only bytes 0x0A and 0x0C get
treated as end-of-line. Other software present at IGates and at APRS-IS
edges treats UTF-8 sequences as text, but may treat 0x00 bytes from one
of the alternate ad-hoc schemes as end-of-line.
This is also evident when receiving the sample messages from the UTF-8 robot.
> * 9) What changes are needed in Igates?
None with a decent one? Others may need: a) a KISS interface to the TNC,
b) to permit 8th-bit-set bytes in payloads.
Especially: an IGate using a TNC2 in "conversation mode" with 7E1 serial
settings WILL CORRUPT the data.
How many such are out there? Could I offer a bounty on them?
Fixing their communication configurations should not be a problem,
but if their software strips the 8th bit, then the only way
is to use them in KISS mode - or replace them with a KISS modem.
> > 10) Anything else we need to verify?
> >
> > In other words, I don't care if gibberish produces gibberish,
> > what counts is whether the APRS protocol still works. And
> > we want to see the above tests on every major system...
>
> Bob, WB4APR
73 de Matti, OH2MQK