[aprssig] APRS protocol replacement ideas: protobuf

Heikki Hannikainen hessu at hes.iki.fi
Sat Feb 12 03:03:09 EST 2022


On Fri, 11 Feb 2022, Weston Bustraan wrote:

> I can say that it was not easy to navigate the many exceptions and 
> oddities in the APRS 1.0.1 spec when implementing a parser, and OpenTRAC 
> seems much more consistent, but the OpenTRAC spec does not seem to be 
> fully complete. For example, I would expect that any new protocol that 
> would supplant APRS would be supporting a UTF-8 character set instead of 
> just ASCII.

Yes! Implementing all variations and details of APRS is quite difficult 
indeed, as there are so many completely different position formats, and 
various quirks about the order in which extension fields should be 
decoded, and the specifications do not detail all of those. It is 
unnecessarily complex and an awful lot of work.

Just last week there was a question on the Facebook APRS group, where one 
user was sending uncompressed APRS position packets *without* a 
timestamp, but with a '@' packet type identifier, which indicates that a 
timestamp should follow. Their igate appliance requires the user to 
configure a raw packet string template, and it's a little bit difficult 
to do that correctly for someone new to APRS. Aprsdirect parsed the 
position from the packet just fine, even though it was technically 
incorrect. aprs.fi, and likely many other decoders, would try to parse 
out a timestamp after the '@' and fail. From the user's point of view, it 
looked like aprs.fi was at fault. To add to the confusion, the remainder 
of the packet (after a suitable number of invalid timestamp bytes) 
happened to look like a perfectly valid *compressed* packet, and aprs.fi 
managed to decode it to some random coordinates on the correct continent! 
Now I'm wondering whether I should ignore all data after an invalid 
timestamp, or keep skipping the invalid timestamp as it has been doing so 
far.
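
To make the framing problem concrete, here is a minimal sketch (my own 
illustration, not aprs.fi code) of the timestamp check a decoder has to 
make after an '@' type identifier - six digits followed by 'z', 'h' or 
'/':

#include <ctype.h>

/* Return 1 if the 7 bytes at p look like a valid APRS timestamp
 * (DHM zulu "092345z", HMS "234517h" or DHM local "092345/"). */
static int valid_aprs_timestamp(const char *p)
{
        for (int i = 0; i < 6; i++)
                if (!isdigit((unsigned char)p[i]))
                        return 0;
        return p[6] == 'z' || p[6] == 'h' || p[6] == '/';
}

When this check fails, the decoder has to choose between rejecting the 
whole packet and skipping the 7 bytes anyway - exactly the dilemma above.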

We have a fairly complete decoder (by OH2KKU and myself), but still not 
fully complete (Area Objects? DF?), and quirks like this come up every now 
and then. It would not be very motivating to implement yet another 
hand-written packet encoder and decoder, and it is likely that slightly 
incompatible encoders and decoders would be produced by different 
developers.

If a replacement is actually planned, I would propose battling against the 
NIH syndrome and using an existing low-level serializer which is widely 
used elsewhere, such as Protocol Buffers 
(https://developers.google.com/protocol-buffers). It is widely used 
outside Google and it's open source. The encoded on-the-wire (well, 
on-air) format is tight, with little overhead, and message formats can be 
extended without breaking existing decoders. Integer types use an 
efficient variable-length encoding, so for a timestamp one could use an 
int64 which would not actually take 64 bits in the packet until we reach 
a date which requires it.
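
As an illustration of that variable-length encoding, here is a sketch of 
protobuf's base-128 varint scheme, written out by hand just to show the 
principle (the generated code does this for you):

#include <stdint.h>
#include <stddef.h>

/* Encode an unsigned integer as a protobuf base-128 varint: 7 bits per
 * byte, most significant bit set on all bytes except the last. */
static size_t varint_encode(uint64_t value, uint8_t *out)
{
        size_t i = 0;
        while (value >= 0x80) {
                out[i++] = (uint8_t)((value & 0x7f) | 0x80);
                value >>= 7;
        }
        out[i++] = (uint8_t)value;
        return i;
}

A current Unix timestamp such as 1644652989 encodes to 5 bytes this way, 
even when declared as int64 in the schema.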

With Protocol Buffers one writes a schema file which describes the data 
format. A supplied compiler (protoc) compiles the schema file to source 
code, which is then included in the application to encode and decode 
packets. Code generators are available for C++, Java, Python, 
Objective-C, Kotlin, Dart, Go, Ruby, and C#, with more languages to come 
(plain C for small systems is covered by the nanopb generator mentioned 
below). A very simple example format would look like this:

syntax = "proto3"; // the 'optional' labels require protoc 3.15 or newer

message aprspacket {
        string srccall = 1;
        int64 timestamp = 2;
        optional float latitude = 3;
        optional float longitude = 4;
        optional int32 course = 5;
        optional int32 heading = 6;
        optional int32 altitude = 7;
        optional string comment = 10;
        optional string object_name = 11; // set a value to make it an object
}

Naturally you can have nested structures too - in the real world there 
would be a top-level packet structure with optional position, text 
message, weather and telemetry structures below it.
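
A sketch of what that could look like (all message and field names here 
are invented for illustration, not taken from any existing spec):

syntax = "proto3";

message Position {
        float latitude = 1;
        float longitude = 2;
        optional int32 altitude = 3;    // metres
}

message TextMessage {
        string dstcall = 1;
        string text = 2;
}

message WxReport {
        optional float temperature = 1; // degrees C
        optional float humidity = 2;    // %
}

message Packet {
        string srccall = 1;
        int64 timestamp = 2;
        // submessage fields always track presence in proto3, so a
        // packet carries whichever of these apply
        Position position = 3;
        TextMessage message = 4;
        WxReport wx = 5;
}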

The upside of this is that whoever implements an application does not need 
to actually write the low-level encoder or decoder - the protobuf compiler 
builds it from the schema file. Developers download the schema file, 
compile their application with it, call the generated encoder/decoder 
functions/methods (aprspacket_unpack_message(), aprspacket_pack_message()) 
and get nice structures out.
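
For example, with the nanopb C generator (mentioned below) and the schema 
above, encoding and decoding could look roughly like this - a sketch 
only, and it assumes an aprspacket.options file giving the string fields 
a max_size so they become plain char arrays:

#include <string.h>
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include "pb_encode.h"
#include "pb_decode.h"
#include "aprspacket.pb.h"      /* generated by nanopb from the schema */

int main(void)
{
        uint8_t buf[128];

        /* Fill in and encode a packet */
        aprspacket msg = aprspacket_init_zero;
        strncpy(msg.srccall, "OH7LZB-9", sizeof(msg.srccall) - 1);
        msg.timestamp = 1644652989;
        msg.has_latitude = true;        /* proto3 optional => has_ flag */
        msg.latitude = 60.1699f;
        msg.has_longitude = true;
        msg.longitude = 24.9384f;

        pb_ostream_t os = pb_ostream_from_buffer(buf, sizeof(buf));
        if (!pb_encode(&os, aprspacket_fields, &msg))
                return 1;

        /* Decode it back */
        aprspacket in = aprspacket_init_zero;
        pb_istream_t is = pb_istream_from_buffer(buf, os.bytes_written);
        if (!pb_decode(&is, aprspacket_fields, &in))
                return 1;

        if (in.has_latitude && in.has_longitude)
                printf("%s at %.4f,%.4f (%u bytes on air)\n",
                       in.srccall, in.latitude, in.longitude,
                       (unsigned)os.bytes_written);
        return 0;
}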

Adding support for new fields in a new protocol version? Download a new 
schema file, recompile, and find the decoded values in the output 
structure.

Adding a new custom field to experiment with? Add it to the schema file 
with a suitably large random field identifier in a "reserved for 
experimentation" range and experiment - other apps will ignore it. Want 
to deploy it in the field for others to use? Submit a request to the 
Protocol Committee, which allocates an identifier and releases a new 
version of the schema file.
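
In the experimenter's local copy of the schema that might look like this 
(the range and the field are made up here; protobuf itself only reserves 
field numbers 19000-19999 for its own use):

message aprspacket {
        // ... allocated fields 1 to 11 as above ...

        // local experiment, number picked from an agreed experimental
        // range - apps built without this line simply skip the field
        optional bytes my_experiment = 9471;
}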

The decoders will ignore any new fields that might be added to packets by 
apps which have been built with a new version of the schema, so old apps 
will not blow up when they receive packets from new apps. Since version 
3.5, protobuf also retains these unknown fields in the decoder output 
structure, and re-encodes them if the structure is serialized again, so 
an old app can even re-encode and forward a packet with fields it does 
not implement.

For embedded systems there's https://github.com/nanopb/nanopb : "It is 
especially suitable for use in microcontrollers, but fits any memory 
restricted system."

That said, I find it quite unlikely that a completely new protocol 
encoding format would gain momentum, given the large amount of legacy 
hardware out there and the difficulty of migration and backwards 
compatibility on the RF side. Fragmentation would be really bad and a lot 
of old things would never be upgraded. Maybe this sort of thing could be 
used in the LoRa world, which is still in development - it would speed up 
the development of new things quite a lot if the encoder/decoder code 
were generated rather than written by hand.

   - Hessu



