[aprssig] UTF-8 for APRS (testing)

Matti Aarnio oh2mqk at sral.fi
Wed Sep 23 05:40:06 EDT 2009


On Tue, Sep 22, 2009 at 07:06:40PM -0400, Robert Bruninga wrote:
> > http://www.rfc-editor.org/rfc/rfc3629.txt
> > 
> > Bob B. should at the very least read the abstract.
> 
> To boil it all down to impact on present APRS systems, my
> understanding of the impact of this encoding scheme (with
> character sets other than 7 bit ascii) is simply the
> introduction of any number of 8-bit bytes into any number of
> places where text is used in APRS.  There are many issues as I
> have elaborated before:
> 
> 1) Delivery through digipeaters is not guaranteed

  Delivery failure due to content is highly unlikely, unless there
  is really bad software at the digi, or the datastream goes via
  some 7E1 serial link (7 data bits, even parity, 1 stop bit), which
  corrupts any 8-bit data anyway.

> 2) Delivery through Igates is not guaranteed

  Basic APRS message delivery is not guaranteed anyway, so what is
  the difference?  Of course, if somebody is foolish enough
  to run the TNC in "conversation mode" (or whatever it is called)
  instead of KISS, then all bets are off.

  Why use KISS?  Because it removes variations due to TNC software.
  Why not use KISS?  Because producing a binary AX.25 frame in
  KISS format is a bit more difficult than putting out a line of text?
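
  To show how little extra work the KISS side is, here is a minimal
  sketch in Python.  It assumes the standard KISS special bytes
  (FEND 0xC0, FESC 0xDB, TFEND 0xDC, TFESC 0xDD) and an already built
  AX.25 frame without FCS; the name kiss_encode is made up for this
  example and does not come from any particular TNC library.

```python
# KISS special bytes (assumed from the KISS protocol description).
FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD


def kiss_encode(ax25_frame: bytes, port: int = 0) -> bytes:
    """Wrap a raw AX.25 frame in a KISS data frame for the given TNC port."""
    # High nibble of the command byte is the port, low nibble 0 means "data".
    out = bytearray([FEND, (port & 0x0F) << 4])
    for b in ax25_frame:
        if b == FEND:
            out += bytes([FESC, TFEND])   # escape a literal 0xC0
        elif b == FESC:
            out += bytes([FESC, TFESC])   # escape a literal 0xDB
        else:
            out.append(b)                 # everything else passes unchanged
    out.append(FEND)
    return bytes(out)


# Bytes with the 8th bit set (here UTF-8 for "ä") go through untouched.
framed = kiss_encode("ä".encode("utf-8"))
```

  The payload is treated purely as bytes, so the TNC firmware never
  gets a chance to interpret or strip anything.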

> 3) Display on current clients in most cases will not work

  Tough.  Variations of Katakana seem to be available on SOME
  models of Kenwood radios.  (Katakana is one of the Japanese
  "letter" writing systems.)

  These variations are, of course, completely proprietary.

> 4) 8 bit bytes with high order bits set may cause problems to
>    existing hardware and clients.
> 
> The only concern I have ever had is point 4.

  People who want to see characters outside the US-ASCII 7-bit range
  are also willing to get new hardware and clients that are able to
  do it.

  People who never need it will be very unlikely to ever encounter any,
  and if they do, they can inform the sender that "sorry, my software/
  device does not understand your messages."

  User education can help a lot here.

  Btw: people already send messages containing bytes with the 8th bit
  set all the time.  It is just that there are many ad-hoc methods,
  each working only between similar client software, each in its own
  style.


> If we can thoroughly test and get a handle on #4, then I think
> we are heading in the right direction.  Issues 1-3 are expected
> for new introductions.
> 
> In my previous email I suggested we need a table for every system
> to be tested.  I suggested these test items.  I hope someone on
> the APRSSIG can take on this huge project and coordinate all the
> documentation of the results of these tests.  Better yet, write
> up a TEST plan and SPEC, so that all testing on every device and
> client is consistent, and does give us the information we need.
> 
> > 1) Does it properly find the Line number?
> > 2) does it properly ACK the message
> * *) Does it properly react to the ACK
> > 3) Will it properly REJect the message if full,
> > 4) can you "edit" the message... Etc.

  Those are all application issues.  Because UTF-8 has a unique
  representation for the '{' character (the single byte 0x7B), and no
  multi-byte ("extended") character can contain that particular byte
  value, UTF-8-unaware systems should be able to find the '{' plus
  the following sequence number, and send ACKs/REJs.

  So, aside from packet length issues, I think these are highly likely
  to work.
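
  The argument can be checked with a few lines of Python.  This is
  only a sketch: split_msgid is a hypothetical helper, the sample text
  is invented, and real clients will parse the message format in
  their own way.

```python
def split_msgid(text_field: bytes):
    """Split an APRS message text field into (text, message id or None).

    Works on raw bytes, so the caller needs no knowledge of UTF-8.
    """
    idx = text_field.rfind(b"{")
    if idx < 0:
        return text_field, None
    return text_field[:idx], text_field[idx + 1:]


# Every byte of a multi-byte UTF-8 sequence has the high bit set,
# so none of them can be mistaken for '{' (0x7B).
multibyte_ok = all(
    b >= 0x80
    for ch in "äöå日本語€"
    for b in ch.encode("utf-8")
)

sample = "Hyvää päivää, 日本語".encode("utf-8") + b"{042"
text, msgid = split_msgid(sample)
```

  A byte-oriented client ends up with the sequence number 042 and can
  ACK it, even if it displays the text itself as garbage.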

> > 5) Do intermediate Igates see it as a valid message and pass it to RF?

  Depends on how deeply that IGate looks inside the packet
  to be relayed.  Most check only that it is a message, and that the
  recipient's call is of the form NNNN-N.

  Some IGates may communicate with the TNC over a serial link set to
  7E1, which of course will at the very least corrupt the characters
  with the 8th bit set.  KISS is the way to go (or soundmodem.)
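
  What such a link does to UTF-8 text can be imitated in Python.
  This is a simplified model: it assumes the net effect of a 7-bit
  link is that the 8th bit of every byte arrives cleared, and
  through_7bit_link is a made-up name for the illustration.

```python
def through_7bit_link(data: bytes) -> bytes:
    """Model a 7-data-bit serial link by clearing the top bit of each byte."""
    return bytes(b & 0x7F for b in data)


original = "Hyvää".encode("utf-8")       # b'Hyv\xc3\xa4\xc3\xa4'
mangled = through_7bit_link(original)    # b'HyvC$C$'

# Plain 7-bit ASCII survives the same link unchanged.
ascii_text = b"Hello {042"
ascii_survives = through_7bit_link(ascii_text) == ascii_text
```

  Each 'ä' (0xC3 0xA4) turns into "C$", and there is no way for the
  receiver to recover the original bytes.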

> > 6) Does it get captured in the message list on the various APRS
> >    internet servers

  Yes.  (aprs.fi & findu.com  both get them.  findu shows them correctly
  as well, and Hessu will fix aprs.fi.)

> * 7) what does each existing DIGIpeater implementation do with it

  Depends on maximum packet sizes, doesn't it?
  UIDIGI is binary transparent, whatever is used in Japan is binary
  transparent...

  I will check  digi_ned  by doing a source code review in the evening.
  (I am 99% certain that it is binary transparent, but only a review
  will tell.)

> * 8) What does the APRS-IS do with it?

  APRS-IS core is very close to binary transparent, and javAPRSSrvr happily
  treats them as part of the message packet.  Only bytes 0x0A (LF) and
  0x0D (CR) get treated as end-of-line.  Other software present at IGates
  and at the APRS-IS edges treats UTF-8 sequences as text, but may treat
  0x00 bytes from one of the alternate ad-hoc schemes as end-of-line.
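
  A rough Python sketch of why line-oriented handling is harmless for
  UTF-8.  It assumes the usual CR/LF line terminators; the packets,
  the callsigns and the helper name split_lines are invented for the
  example and are not taken from any server's source.

```python
import re


def split_lines(stream: bytes):
    """Split a byte stream into packets at CR and/or LF, dropping empty lines."""
    return [line for line in re.split(rb"[\r\n]+", stream) if line]


stream = (
    b"OH2MQK>APRS::OH7LZB   :" + "päivää".encode("utf-8") + b"{1\r\n"
    + b"N0CALL>APRS:>status text\r\n"
)
packets = split_lines(stream)

# Non-ASCII characters encoded as UTF-8 never produce NUL, LF or CR bytes,
# so they cannot end a line early.
utf8_bytes = "päivää 日本語".encode("utf-8")
has_terminator = any(b in (0x00, 0x0A, 0x0D) for b in utf8_bytes)
```

  The first packet comes out whole, with its UTF-8 text and the {1
  sequence number intact.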

  This is also evident when receiving the sample messages from the UTF-8 robot.

> * 9) What changes are needed in Igates?

  None with a decent one?  Others may need:  a) a KISS interface to the TNC,
  b) to permit 8th-bit-set bytes in payloads.

  Especially:  an IGate using a TNC2 in "conversation mode" with 7E1
  serial settings WILL CORRUPT the data.

  How many such are out there?  Could I offer a bounty on them?
  Fixing their communication configurations should not be a problem,
  but if their software strips the 8th bit, then the only way
  is to run them in KISS mode, or to replace them with a KISS modem.

> > 10) Anything else we need to verify?
> > 
> > In other words, I don't care if gibberish produces gibberish,
> > what counts is whether the APRS protocol still works.  And
> > we want to see the above tests on every major system...
> 
> Bob, WB4APR

73 de Matti, OH2MQK



