[nos-bbs] TCP/IP over AX.25 on VC - a study

Tue Jun 5 19:59:43 EDT 2007

----- Original Message ----- 
From: "(Skip) K8RRA" <k8rra at ameritech.net>
To: "TAPR xNOS - Mail List" <nos-bbs at lists.tapr.org>
Sent: Tuesday, June 05, 2007 4:23 PM
Subject: [nos-bbs] TCP/IP over AX.25 on VC - a study

>
> For discussion purposes, start with the text originated from the JNOS40
> Config Manual written by Johan Reinalda and William Thompson c. 1994, in
> the section "Of PACLEN, MTU, MSS, and More".  From that work some of
> the configuration may be formulated, then:
>  a) ip rtimer        reassembly restart timer
>  b) tcp timertype    retransmit delay formula
>  c) tcp blimit       retransmit maximum count
>  d) tcp maxwait      retransmit maximum wait time
>  e) ax25 timertype   retransmit delay formula
>  f) ax25 blimit      retransmit maximum count
>  g) ax25 maxwait     retransmit maximum wait time
>  h) ax25 t2          reply pace time
>  i) ax25 t3          channel keep-alive timer
>  j) ax25 t4          channel idle closure timer
> require setting to complete the job.  Additionally, both ends of the
> link need compatible parameters and may not be independently set.

 Skip, could you explain why this is so with Jnos but not with any other
NOS, or with any other OS of any type? I ask because, for example,
my Jnos runs wildly different parameters than, say, google.com, or
download.nvidia.com, which I usually use for testing. I am just wondering
what it is specifically about IP over AX.25 that requires scientific
inspection and testing of all those parameters. Most of all,
why do they need to be the same at each end?
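
For anyone following along, the timertype / blimit / maxwait trio in the
quoted list can be pictured as a capped backoff. The sketch below is only an
illustration: the function name, the defaults, and the reading of blimit as a
cap on how many times the delay grows are assumptions, not the actual Jnos
formula.

```python
def retry_delay(base_s, attempt, timertype="exponential", blimit=5, maxwait_s=None):
    """Delay in seconds before retry number `attempt` (0 = first retry).

    Hypothetical model: the delay grows with each retry, stops growing
    after `blimit` steps, and never exceeds `maxwait_s` if one is given.
    """
    step = min(attempt, blimit)           # growth stops at blimit
    if timertype == "linear":
        delay = base_s * (step + 1)       # base, 2*base, 3*base, ...
    else:
        delay = base_s * (2 ** step)      # base, 2*base, 4*base, ...
    if maxwait_s is not None:
        delay = min(delay, maxwait_s)     # hard ceiling on the wait
    return delay
```

The same shape applies to both the tcp and the ax25 settings, which is why two
layers backing off independently over one link can end up fighting each other.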

>  II) b), c), d), and e), f), g), demonstrate similar approaches to data
> retransmission for both AX.25 and TCP/IP protocols
>  III) h), i), and j), set end-to-end relationships that effectively
> avoid unwarranted overhead.

 It really all depends on what you are doing and how. Sending data to a node
that is reflected back on yourself may cause some problems. Sending data to
a node that is going to retransmit that data on the same frequency on
downlink to your partner may require something different. And yet
again, a TheNet netrom/IP compatible node that ingests your packets,
shuffles them along a backbone, and spits them out on a different
frequency for downlink is going to need something different
entirely. Then there is background noise, enhanced versus non-enhanced
VHF propagation, the audio drive to the VHF rigs, the quality of the modem
in the TNCs, and the speed of the TNC's link to the serial port. There are
a lot of factors here and a lot of timing parameters, so
I am not sure one specific FTP is exactly a scientific test.

>
> In my research before testing I found the concept of TCP/IP over AX.25
> on a Virtual Channel (I) compelling for the benefits of quicker error
> detection and repair, plus the redundancy of data validation in both
> protocols.  While working thru configuration, I discovered the similar
> approaches to retransmission of data in both protocols do not play well
> together.  More disappointing is the aspect that in a mixed environment
> of both (UI) and (I) links there is not a good solution - one mode must
> suffer if the other is to succeed.

 Who is mixing UI and VC on the same node? The concept of
establishing a TCP/IP capable node on a specific frequency
assumes the sysop of that node is going to provide some
timing parameters based on documentation and common sense.

For example, on the node frequency, there will be no direct station to
station transfers, because this causes HTS. And you do not mix UI and VC. You
assign each station an IP and add it to the node's ARP tables, and there you
establish whether that user's link is going to be UI or VC. I have never
heard of a "Mix".

I also do not understand why an AX.25 frame carrying IP data should somehow
be different from an AX.25 frame carrying, say, 7plus data. How does the raw
binary format of the IP frame affect the timing parameters and functioning
of the AX.25 frame? Unless those AX.25 parameters are guessed at rather than
set with common sense.

For example, setting the retry timer on the IP side of things faster than
the retry on the AX.25 side of things will cause the IP to back up. Or worse,
using IP datagrams that have a larger size limit than the AX.25 side of
things may cause the dreaded "fragmentation".
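
To put a number on the fragmentation point, here is a minimal sketch of the
standard IPv4 arithmetic: each fragment carries its own 20-byte header, and
every fragment except the last must carry a multiple of 8 data bytes. The
function name and the example sizes are hypothetical, and this says nothing
about how any particular NOS segments at the AX.25 layer.

```python
import math

IP_HEADER = 20  # bytes, IPv4 header without options

def ip_fragment_count(datagram_len, link_mtu):
    """Number of IP fragments needed to carry one datagram over a link."""
    if link_mtu < IP_HEADER + 8:
        raise ValueError("MTU too small to carry any fragment data")
    if datagram_len <= link_mtu:
        return 1                                      # fits, no fragmentation
    payload = datagram_len - IP_HEADER                # data bytes to carry
    per_fragment = (link_mtu - IP_HEADER) // 8 * 8    # rounded down to 8
    return math.ceil(payload / per_fragment)
```

With these assumptions a 576-byte datagram over a 256-byte MTU becomes three
fragments, so three frames must all arrive before the datagram is usable.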

>
> Here is the test bed...  I have a couple 100K ascii files that require
> around 1/2hr to transfer under reasonable conditions.  For the no-stress
> test, just start one ftp "get" or "put" and wait until the transfer is
> complete.  This is a single-threaded example and works really well.  For
> the stress test, start two ftp sessions, one get and one put, then add a
> telnet BBS session attempting a NET/ROM connect, and also add "ping",
> "SMTP", and "finger", access from remote sites, all in the time window
> required for ftp to complete.  To watch progress I created a script to
> "source" from F10 console including "ax25 status", "arp", "tcp view", and
> "tcp irtt" commands that could be run frequently on demand.

 This channel, assuming you are using a net/rom IP capable
firmware node such as TheNet X1J, sounds more loaded
than the channels we used in the "heyday" of packet around
here. FTP was limited to nighttime use, for example,
because we had to get along with vanilla AX.25 users.

I also do not understand why you are attempting a get and a put using the
same link. It sounds to me like you are trying to pull a dump truck
up Mount Washington with a Ford Pinto.

>
> What I found in the stress test was quite disappointing - the AX25-srtt
> doubled to 7sec but the IP-srtt rose tenfold to 326sec.  The
> retransmission rate rose from nil to values exceeding the good data
> values.  It seems clear to me that the TCP/IP protocol retransmission
> testing conflicted with the AX.25 protocol retransmission testing.  The
> IP data was being needlessly retransmitted so rapidly that the
> underlying AX channel was overloaded by the excess.

Exactly what I mentioned above regarding IP timers versus
AX.25 timers. Just because your AX.25 channel has a 7 sec
frack to whatever its destination is does not mean
that BEHIND that AX.25 connection IP data is not piling up.

You also have several IP connections all attempting
to use the single AX.25 channel. AX.25 may be getting
its point-to-point data to its destination in 7 sec,
but since all those TCP sockets are sharing that connection
they divide it. And..."Shudder"..all those TCP ACKS.
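
A rough back-of-envelope sketch of that piling up: if every session keeps a
full window queued behind one slow link, a segment waits roughly the total
queued bytes divided by the link rate, and TCP resends anything that waits
longer than its retransmit timeout. The function names and all the numbers in
the usage note are hypothetical.

```python
def queue_delay_s(sessions, window_bytes, link_bytes_per_s):
    """Seconds for the link to drain one full window from every session."""
    return sessions * window_bytes / link_bytes_per_s

def spurious_retransmit(sessions, window_bytes, link_bytes_per_s, rto_s):
    """True if data is still queued when TCP's retransmit timer fires."""
    return queue_delay_s(sessions, window_bytes, link_bytes_per_s) > rto_s
```

For instance, with a 1000-byte window, a link draining 50 bytes/sec, and a
30 sec timeout, one session waits 20 sec and is fine, while two sessions wait
40 sec and start resending data that AX.25 has not even had a chance to send.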

> In my experience so far, some conflict is unavoidable, the level of
> conflict may only be reduced but not eliminated.  Some retransmission
> will be required by multi-threaded or independent-but-concurrent use of
> the IP over AX on VC channel.  The worst news is that as VC channel gets
> better parameter settings to reduce retransmission the datagram channel
> gets worse performance by protracted times of inactivity.
>
> The "proof" of the above is that once one of the two ftp sessions
> completed, the remaining session returned to normal and required no
> further retransmission.  Further, the srtt values for both protocols
> began to return to the smaller (better) values.

 So once you stopped overloading the channel, the rest of
the traffic had better timing values? I'm not
too surprised.

> The best news is that even under stress, the data got thru without
> error.  Jnos software was bulletproof.  It seems that use of datagram
> (UI) mode for IP over AX is more suitable than connected (I) mode.

UI mode will, regardless of channel activity, or in some instances
regardless of whether the node is even THERE, spit out as many AX.25 frames
as it can to keep up with the IP traffic. Bombard the UI link with IP traffic
and it will back up in the buffer and keep transmitting long after the
bombardment has stopped, much to the displeasure of anyone else using the
frequency.

> However the logic presented in 1994 still seems compelling to me.

1994 was a busy time for IP over radio ax.25 on VHF.

> For the betterment of jnos, I'd like to float a concept based on this
> work.  "When IP is over AX on VC - arbitrarily defeat the pacing timers
> shown above as b), c), and d),

 Do that and you defeat the basic premise of TCP. You might as well
get rid of its checksumming features too and stop the TCP
ACKS from taking up those 40 bytes on return.

 I'm sorry that your scientific test did not work out as you had
hoped, but I'm not quite sure that optimizing Jnos in its
entirety, and especially hacking the TCP timing parameters,
is going to help the NOS community as a whole. It may help you in your
very own personal perception of the documents, but in the
grand scheme of things it adds more complexity, and worse, it makes Jnos
fall away from the specific protocol specification to which
it was designed. Those TCP timing parameters work fine on the internet
and everywhere TCP is used; sorry if they do not work with the
personal set of rules you have established on your RF net. Maybe
it has more to do with the structure of your own RF net when it is
loaded down with a put, get, finger, ping, and even SMTP? Also, is your
RF net a 9600, 19.2k, or even higher link?

 That might help, but then the same problem arises, you do not
have your own personal specification of the TCP
protocol to use at your disposal, so things may still go awry.

Steven - N1OHX