Copyright © 2004 jsd
1. Basic Observations
Executive summary:
- The "orinoco" wireless device driver that ships with
current versions of Linux has a bug that produces horrific numbers of
duplicate packets under certain conditions. I debugged it.
Patches are available.
- It's a somewhat interesting story, because before you can debug
something you have to decide what "normal" and/or "desirable" behavior
looks like. That's not so easy, because even on good days, the
behavior of wireless networks is intrinsically different from
conventional wired networks, as discussed in section 2.
A number of conventional assumptions need to be re-thought. There
are opportunities for further improvement.
- Also, the IPv6 "Duplicate Address Detection" is a mess, as
discussed in section 3. While trying to solve
a mostly-imaginary problem, it creates real problems. It should
be easy to fix, but it would be even easier to get rid of it entirely.
So ... here's the story:
- All observations refer to an 802.11 network in Managed mode (aka
Infrastructure mode), managed by a Netgear MR314. Features of the
MR314 include
- Wireless LAN base station. In official 802.11 parlance
this is called an Access Point, which is the simplest case of a
Distribution System.
- Router, including NAT function.
- Four ports for wired LAN connections.
- A distinct wired port for a WAN connection, which in this case
is connected to the Internet.
- There are four hosts on this network:
- a) IBM laptop, Linux 2.4.26, wireless via "Lucent WaveLAN GOLD
TURBO 11Mb" PCMCIA card, using the "orinoco_cs" driver distributed with
Linux.
- b) Homebrew desktop (Intel mainboard), Linux 2.4.26,
wireless via "Proxim ORiNOCO 8482-WD" 802.11a/b/g PCI card, using the
Atheros driver from MadWiFi.
- c) IBM laptop, win2k, wireless via "Netgear WG11T" 802.11a/b/g
PCMCIA card.
- d) Toshiba laptop, win98, 100baseT ether wired to the MR314
natbox.
- All the wireless-enabled boxes are sufficiently close together
that they can be in direct radio contact with each other, if/when they
choose to be.
- All Managed-mode wireless networks use a hub-and-spokes
topology. The uplinks are distinct from downlinks. That is,
when host (a) wants to communicate with host (b), it does not
communicate directly, which may seem somewhat surprising. Instead
host (a) uplinks a packet to the base station, which downlinks it to
host (b). This makes a certain amount of sense if you have a WiFi
setup where the base station is in the middle of the service area and
the two hosts are near opposite edges of the area, by allowing the two
hosts to communicate even though they are not in direct radio contact
with each other. It makes even more sense if there are multiple
Access Points hooked together to form a Distribution System extending
over a large area. Still, the hub-and-spoke topology is
unnecessary and indeed wasteful if all the hosts are close enough to be
in direct radio contact with each other, as they are in the network
under discussion. Furthermore, it would be very unhelpful in a
situation where the nodes are strung out along a line such as (a)
<--> (b) <--> (base) where (a) and (b) are in direct radio
contact, but (a) is out of range of the base station. In the
latter case, (a) and (b) would be able to talk to each other in Ad-hoc
mode, but not in Managed mode.
- I assume that no interface can actually receive while it is
transmitting ... but OTOH if circumstances indicate that this host
should receive the packet (either because it is multicast, or because
the interface is in promiscuous mode) then the packet gets looped back
in software, so it does actually show up in the input queue.
- In the actual 802.11 packet format that is carried over the air,
uplink traffic is distinguished from downlink traffic by use of the
ToDS and FromDS bits, meaning To the Distribution System and From the
Distribution System. However, when the downlinked packet arrives
at its destination, before it is handed over to the operating system
for processing, these bits have been lost, and all further processing
is based on the 48-bit MAC addresses, as it would be for ether packets.
- The concept of promiscuity is defined at the socket level, on a
socket-by-socket basis ... and also defined at the hardware
level. If you are using a non-promiscuous socket, you shouldn't
(in the absence of bugs) need to care whether the hardware is
promiscuous or not, because the software will filter out packets not
meant for you, by looking at the destination MAC address.
Meanwhile, an application such as tcpdump
can make good use of a promiscuous socket. Whenever one or more
promiscuous sockets are open, the operating system puts the hardware
into promiscuous mode. In addition, the ifconfig command can be used to
lock the hardware into promiscuous mode, although there is rarely (if
ever) any advantage in doing so. The printout from the ifconfig command will tell you
whether it has locked the
hardware in promiscuous mode, but will not tell you if the operating
system has promiscuified the hardware for some other reason.
Hint: the command "ip -6 addr ls"
is useful for (among other things) finding out which interfaces are
actually in promiscuous mode. Another hint: the
command "tcpdump -neli eth0 proto
99" is a convenient way to temporarily put the interface into
promiscuous mode. Since protocol 99 doesn't exist, the command
won't do much work and won't generate any output. Terminating the
program undoes the effect.
- When host (a) is in promiscuous mode, it blissfully receives all
of the uplink traffic and all of the downlink traffic. This is a bug. As a
consequence,
for example, if host (c) is pinging host (b), host (a) will see
two copies of the echo-request packet followed by two copies of the
echo-reply packet. This looks pretty weird the first time you see
it.
- In contrast, host (b) never receives uplink traffic, even when it
is in promiscuous mode (with the semi-exception of software loopback of
packets originating from host (b) itself, as described above).
That's because the MadWiFi driver apparently sets up the hardware to
only deliver downlink packets -- even in promiscuous mode -- and
furthermore (using the belt-and-suspenders approach) the driver
software would instantly throw away uplink packets if it ever got its
hands on them.
- That means for instance that if the hardwired host (d) is pinging
host (a), host (b) will see only the echo-request packet as it is
downlinked from the base station to host (a), and will not see the
echo-reply. Conversely, if host (a) is pinging host (d), host (b)
will see only the echo-reply packet, not the echo-request. This
also looks pretty weird the first time you see it. This paragraph
assumes host (b) is in promiscuous mode; otherwise it would see
none of this traffic, since it is nominally not a party to the
conversation.
- Things get even weirder in the case where a wireless host such as
(c) is pinging the buggy host (a). If the hardware on host (a) is in
promiscuous mode, it gets one copy of the echo-request packet as it is
uplinked from (c) to the base station, and gets another copy as it is
downlinked from the base station to (a). Both of these look like
valid requests -- with the right MAC addresses and everything -- so
host (a) replies to them both. This affects all applications, not
just ping. One way of understanding this is to realize the
application has opened a non-promiscuous socket. It is relying on
the per-socket software filtering to get rid of undesired packets, but
this filtering is failing. It fails because the filtering relies
on MAC addresses, and the MAC addresses look the same for the uplink
traffic and downlink traffic. If you are using tcpdump on host (b) to observe
this, you see three packets: the echo-request downlinked from the
base station to (a), followed by the two replies downlinked from the
base station to (c). If you are watching this from host (a)
itself, you ordinarily see six
packets: two copies of the echo-request and two copies of each of
the echo-replies. (Sometimes, especially if the packets are long,
you see fewer than six, presumably because the hardware cannot receive
so many packets in such a short time.) This is bad news, because
it means the behavior of one process affects the behavior of another
process. Specifically, the ping process will see duplication of
packets or not, depending on whether or not some other
seemingly-unrelated process such as tcpdump has opened a
promiscuous socket. Duplication is sometimes very serious and
sometimes less serious, as discussed in section 3.
- Things get even weirder if we consider broadcast or multicast
packets. Suppose host (a) sends a multicast packet. You
might expect host (a) would receive two copies: one from software
loopback (since it is a legitimate recipient of the packet being
uplinked) and another from receiving the downlinked multicast packet
... but somehow usually only one copy shows up in the input
queue. I'm pretty sure the received packet is the downlink
packet, and that the software chooses not to loopback multicast
messages to itself. Meanwhile host (b) will receive only one copy
of this packet, namely the downlinked copy. It doesn't
matter whether (a) and/or (b) are in promiscuous mode, since they are
legitimate recipients of these packets, as specified by the layer2
headers.
- If host (b) sends a multicast packet, again you might expect it
to receive two copies ... but somehow usually only one shows up
in the input queue. Just to keep things weird, if host (a) is
watching all this in promiscuous mode, it will receive two copies of this packet, one from
the uplink and one from the downlink.
- You may be wondering how anything so messed up could possibly
survive in the marketplace. I don't know, but here are some
conjectural partial answers:
- Possibly part of the answer may lie in the fact that the system
more-or-less works if the hosts on the wireless LAN never talk to each
other. Instead, host (a) talks via the base station to the
internet, while independently host (b) talks via the base station to
the internet, et cetera. Apparently simply connecting to the WAN
is the intended application, and if you don't stray from that simple
scenario you're more-or-less OK.
- Possibly part of the answer is that many of the fundamental IP
protocols were designed to tolerate duplicate packets. Therefore if
the hosts on the wireless LAN try to communicate with each other, some
applications function semi-correctly. There are penalties involved,
including a tremendous reduction in throughput due to all the duplicate
traffic flying around.
- Possibly another part of the answer is that it is somewhat
uncommon for a machine to be in promiscuous mode at the times
when things are happening that are sensitive to packet
duplication. (That's unless you happen to be trying to debug
networking hardware or software, in which case you're going to suffer
from some nasty Heisenbugs.)
- In some cases, alas, packet duplication has severe consequences,
because not all IP protocols tolerate duplication. One example
that leaps to mind is IPv6 Duplicate Address Detection, as will be
discussed in section 3.
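The packet counts reported in the list above can be reproduced with a toy model. This is only a sketch under stated assumptions: the `Frame` tuple and every function name are hypothetical, and the two "seen by" rules simply encode the behavior observed above (the orinoco host in promiscuous mode sees uplink and downlink alike, with its own uplink arriving via software loopback, while the MadWiFi host sees downlink only).

```python
from collections import namedtuple

# One frame on the air: payload sender, payload recipient, kind, direction.
Frame = namedtuple("Frame", "src dst kind direction")

def on_air(src, dst, kind):
    """In Managed mode every packet crosses the air twice: up, then down."""
    return [Frame(src, dst, kind, "uplink"), Frame(src, dst, kind, "downlink")]

def ping_once(pinger="c", target="a", target_promiscuous=True):
    """Frames on the air when `pinger` sends one echo-request to `target`."""
    frames = on_air(pinger, target, "echo-request")
    # Assumption: a buggy promiscuous target accepts the uplink copy as well
    # as the downlink copy, and answers each accepted copy.
    accepted = 2 if target_promiscuous else 1
    for _ in range(accepted):
        frames += on_air(target, pinger, "echo-reply")
    return frames

def seen_by_orinoco(frames):
    """Assumed model of host (a) in promiscuous mode: it sees every frame."""
    return list(frames)

def seen_by_madwifi(frames):
    """Assumed model of host (b) in promiscuous mode: downlink frames only."""
    return [f for f in frames if f.direction == "downlink"]

frames = ping_once()
print(len(seen_by_orinoco(frames)))  # 6
print(len(seen_by_madwifi(frames)))  # 3
```

Calling `ping_once(target_promiscuous=False)` yields four frames on the air and a single reply, which is the behavior an application would normally expect.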
2. Discussion
So, what can we learn from all this?
One important take-home message is that even in the absence of bugs,
Infrastructure-mode WiFi networks intrinsically
behave differently from wired ethernets. The main intrinsic
difference is that it is possible for a host to receive (via the
downlink) a packet that the host itself recently sent (via the
uplink). (In contrast, this can never happen in an ordinary
baseband ethernet, nor in an Ad-hoc-mode wireless network. In
"normal" networks, a transmitted packet cannot return to the sender.)
This behavior is due to a combination of factors:
- Odd factor #1 is that the base station is performing a store-and-forward
operation at layer 2. This is unusual; it is more common for
store-and-forward operations to be conducted at layer 3.
- Odd factor #2 is that the network is using free space as its transmission
medium, so the base station cannot control where the forwarded
(downlinked) packets go. (In contrast, packet switches and
routers that operate at layer 2 in the wired world are generally clever
enough to not send a packet back down the wire it came in on.)
- Odd factor #3 is that 802.11 is just plain different from 802.3
(i.e. ether). People are accustomed to seeing IP, and even stuff
that isn't IP (e.g. ARP, IPX, netbeui) can still move along an ether as
802.3 packets. Many software applications want to know which
802.3 protocol-type they are dealing with, and the question is
unanswerable for 802.11 packets.
The very existence of an uplink distinct from the downlink raises all
sorts of thorny issues. How should the interfaces present uplink
and downlink traffic to the higher layers? IMHO box (a) is much
too forward, wantonly tossing the uplink traffic in with the downlink
traffic. On the other side of the same coin, IMHO box (b) is much
too backward, never receiving the uplink traffic, even though it could,
and even though I might very much want to see it. The next
question is, how should utilities like tcpdump display uplink data as
distinct from downlink data? The layer2 MAC headers are the same,
and all the higher-level bits are the same, so any analysis that starts
with the conventional layer-model is going to have problems.
The best way out of this, as far as I can see, is to face up to some
facts that are concealed by the interfaces as they have heretofore been
structured. The key fact is that there is encapsulation going on. The
operating system and higher-level application programs would like to
pretend that all layer2 links are ether-like, with a 48-bit MAC address
for the source and a 48-bit MAC address for the destination. But
that pretense just doesn't agree with reality. The link between
host (a) and the base station is not a piece of ether; it is an
802.11 link. An 802.11 link actually has three or four MAC
addresses, the meaning of which is determined by the ToDS and FromDS
bits.
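As an illustration of how the ToDS and FromDS bits determine the meaning of the addresses, here is a minimal sketch that classifies a raw 802.11 data frame and names its address fields. The function names and the example MAC addresses are hypothetical; the bit positions and address roles follow the standard 802.11 frame layout.

```python
TO_DS = 0x01    # bit 0 of the second frame-control byte
FROM_DS = 0x02  # bit 1 of the second frame-control byte

def fmt(mac_bytes):
    return ":".join(f"{b:02x}" for b in mac_bytes)

def address_roles(frame):
    """Classify a raw 802.11 data frame and name its address fields."""
    flags = frame[1]
    a1, a2, a3 = fmt(frame[4:10]), fmt(frame[10:16]), fmt(frame[16:22])
    to_ds, from_ds = bool(flags & TO_DS), bool(flags & FROM_DS)
    if to_ds and from_ds:
        a4 = fmt(frame[24:30])  # fourth address follows sequence control
        return "bridge", {"receiver": a1, "transmitter": a2,
                          "destination": a3, "source": a4}
    if to_ds:
        return "uplink", {"bssid": a1, "source": a2, "destination": a3}
    if from_ds:
        return "downlink", {"destination": a1, "bssid": a2, "source": a3}
    return "direct", {"destination": a1, "source": a2, "bssid": a3}

# Hypothetical addresses for host (a), host (b), and the base station.
HOST_A = bytes.fromhex("020000000a0a")
HOST_B = bytes.fromhex("020000000b0b")
BASE = bytes.fromhex("02000000ba5e")

def data_frame(flags, a1, a2, a3):
    # frame control (type = data), duration, three addresses, sequence control
    return bytes([0x08, flags]) + bytes(2) + a1 + a2 + a3 + bytes(2)

uplink = data_frame(TO_DS, BASE, HOST_A, HOST_B)
downlink = data_frame(FROM_DS, HOST_B, BASE, HOST_A)
print(address_roles(uplink))
print(address_roles(downlink))
```

Both frames report the same source and destination, namely (a) and (b). Once the ToDS and FromDS bits are discarded, nothing in the ether-style addressing tells the uplink copy from the downlink copy.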
Encapsulation is a tried-and-true concept. Well-known examples
include IP-in-IP encapsulation, IPsec, and GRE. In the present
case, we have a physical
802.11 link from host (a) to the base station, and similarly a physical
802.11 link from the base station to host (b). At an abstract
non-physical layer, we have a virtual
ether link going virtually directly from host (a) to host (b).
The way this works is that the ether packet is encapsulated in an envelope. Viewed from the
outside, the envelope is a raw 802.11 packet that moves over the
physical link from (a) to the base station. Then another envelope
moves from the base station to (b). Finally at (b) the packet is decapsulated. That is, the
ether packet is extracted from the envelope and forwarded to its final
destination.
The current WiFi cards pretend to implement an ether device. The
interface is literally called eth0 on my host (a). It is called ath0
on my host (b), but it comes to the same thing, because it is designed
to act like an ether device. In the future, to make sense of what is
going on, it would help to have an additional type of device on each
host. Call it wifi0. The wifi0 device is connected to the
physical link and is suitable for carrying envelopes, not naked ether
packets. If you look at wifi0 using tcpdump, you will see
envelopes. You may see envelopes going from host (a) to the base
station, and you may see envelopes going from the base station to host
(b), but you will never see an envelope going from (a) to (b), because
there is no such 802.11 link. The wifi0 device, in order to
provide what the operating system expects, will present the envelopes
to the higher layers in a form that looks ether-like to the extent of
having a source-MAC-address and a destination-MAC-address. For
instance, for a packet going from host (a) to the base station, the
destination-MAC-address will be the MAC address of the base
station. The MAC address of the intended ultimate recipient, i.e.
host (b), is encoded deep inside the payload of the envelope, and is
not part of the addressing of the envelope. These envelopes will
be very different, both in their addressing and in their internal
structure, from the packets that are currently being pulled off the
uplink and downlink.
Meanwhile, if you attach tcpdump
to the eth0 interface, you will never see an uplink or a
downlink. Eth0 in my scheme is not a radio device; it is a
virtual device. The ether packet is absorbed by the eth0 device
on host (a) and encapsulated. It moves across the virtual ether
in one hop --- no uplink, no downlink. The ether packet becomes
visible again when it is decapsulated by the eth0 (or ath0) device on
host (b).
Tangential idea: By having wifi0 distinct from eth0, it may be
possible to get some of the benefits of Ad-hoc mode at the same time
you are using Managed mode. You fiddle with the routing tables so
that Managed traffic is routed via eth0, while Ad-hoc traffic is routed
via wifi0.
Objectives
- Most important: Non-promiscuous applications on the
destination-host do not receive any ToDS packets.
- Highly important: All packets received by non-promiscuous
applications are bit-for-bit the same as they have always been ... no
changes to MAC addresses or anything else.
- Highly desirable: Promiscuous applications such as tcpdump
can snoop all traffic (including ToDS traffic) that arrives at the
antenna.
- Desirable: ToDS traffic should be presented to tcpdump in a way
that is non-confusing and preserves as much information as
possible.
Interim solutions
The foregoing scheme, with a physical wifi0 device distinct from the
virtual eth0 device, is the best solution. In the mean time, it
may be expedient to implement a halfway solution, based on the
following stratagem: There is only one driver, called eth0 or
ath0 or whatever. Mostly it acts like a virtual device, as a
source and sink for ether-type packets.
However, the ability to snoop on uplink traffic is valuable.
A quick and simple way to receive uplink traffic without causing
confusion due to duplicated packets is as follows: Whenever the
driver gets its hands on uplink traffic, present it to the networking
stack as an 802.11 packet
without decapsulating it. This allows us to achieve all of the
objectives enumerated above.
I patched the orinoco driver to do this. It works great.
Now the software-based per-socket filtering on non-promiscuous sockets
works fine -- applications no longer suffer from duplicated
packets. Meanwhile tcpdump
and programs of that ilk are still able to see all the traffic, uplink
as well as downlink.
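The receive-path decision just described might be sketched roughly as follows. This is not the code of the patch: the function names are hypothetical and MAC addresses are plain strings for readability. The four `PACKET_*` values mirror the Linux packet-type constants, and the socket rule is an assumed simplification of the per-socket filtering.

```python
# Values mirror the Linux PACKET_* packet-type constants.
PACKET_HOST, PACKET_BROADCAST, PACKET_MULTICAST, PACKET_OTHERHOST = range(4)
BROADCAST = "ff:ff:ff:ff:ff:ff"

def receive(to_ds, dst_mac, my_mac):
    """Return (presentation, pkt_type) for one received data frame."""
    if to_ds:
        # Uplink traffic: hand it up without decapsulating, marked "not ours".
        return "802.11 (not decapsulated)", PACKET_OTHERHOST
    if dst_mac == my_mac:
        return "ether", PACKET_HOST
    if dst_mac == BROADCAST:
        return "ether", PACKET_BROADCAST
    if int(dst_mac.split(":")[0], 16) & 0x01:  # group bit set
        return "ether", PACKET_MULTICAST
    return "ether", PACKET_OTHERHOST

def socket_receives(pkt_type, promiscuous):
    """Assumed rule: only promiscuous sockets see PACKET_OTHERHOST."""
    return promiscuous or pkt_type != PACKET_OTHERHOST

# Host (c) pings host (a); host (a) hears both copies of the request.
ME = "02:00:00:00:0a:0a"  # hypothetical MAC address of host (a)
copies = [receive(True, ME, ME), receive(False, ME, ME)]
print(sum(socket_receives(t, promiscuous=False) for _, t in copies))  # 1
print(sum(socket_receives(t, promiscuous=True) for _, t in copies))   # 2
```

The ordinary socket gets exactly one copy of the request, while a snooping socket still sees both.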
Part of the trick was to re-arrange a couple fields in the 802.11
packet to make it resemble an 802.3 packet so that 802.3-oriented
applications such as tcpdump
know how to deal with it. One way of thinking of this is that we
have a very light-weight encapsulation of 802.11 in 802.3, which I call
802.311. We can be confident that naive applications will not be
confused by the 802.311 packet, for at least two reasons: For
one, the MAC addresses on the 802.311 packet indicate that the packet
is not addressed to this host. This stands in contrast to what
would happen if we decapsulated this packet, since the payload of the
802.11 frame can be considered an ether frame that is addressed to this host.
(As a related point, the sk_buff.pkt_type is set to PACKET_OTHERHOST,
just to cover weird cases such as some wise-guy broadcasting in Ad-Hoc
mode.) A second reason is that the 802.311 packet is given a
distinctive ether-protocol-type (0x8311).
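To make the idea concrete, here is one way such a re-arrangement could look. The field order below is an assumption chosen for illustration, not a description of the actual patch; the only values taken from the text are the ether-type 0x8311 and byte offset 43, which appears in the tcpdump filter given later in this section. All names and MAC addresses are hypothetical.

```python
ETHERTYPE_802311 = 0x8311  # the ether-protocol-type named in the text

def to_802311(frame):
    """Rearrange a three-address 802.11 data frame so it resembles 802.3.

    ASSUMED layout (not taken from the patch):
      addr1 | addr2 | 0x8311 | frame control + duration | addr3 | seq | body
    """
    fc_and_duration = frame[0:4]
    addr1, addr2 = frame[4:10], frame[10:16]
    remainder = frame[16:]  # addr3, sequence control, LLC/SNAP, payload
    return (addr1 + addr2 + ETHERTYPE_802311.to_bytes(2, "big")
            + fc_and_duration + remainder)

# Hypothetical addresses for the base station, host (a), and host (c).
BASE = bytes.fromhex("02000000ba5e")
HOST_A = bytes.fromhex("020000000a0a")
HOST_C = bytes.fromhex("020000000c0c")

LLC_SNAP_IPV4 = bytes.fromhex("aaaa030000000800")
ipv4_header = bytearray(20)
ipv4_header[0] = 0x45  # version 4, header length 5 words
ipv4_header[9] = 6     # protocol = TCP

# Uplink frame from (c) to (a): addr1 = base station, addr2 = (c), addr3 = (a).
uplink = (bytes([0x08, 0x01]) + bytes(2) + BASE + HOST_C + HOST_A + bytes(2)
          + LLC_SNAP_IPV4 + bytes(ipv4_header))

packet = to_802311(uplink)
print(packet[0:6] == BASE)  # True: the destination MAC is the base station
print(packet[12:14].hex())  # 8311
print(packet[43])           # 6
```

Under this assumed layout the destination MAC seen by ether-oriented software is that of the base station rather than host (a), and the IP protocol byte lands at offset 43.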
Of course tcpdump doesn't (yet?) understand the new encapsulation
scheme, so when it sees a packet with ether-type 8311 it can only deal
with it at a low level. It will print out the packet in
hex. Similarly, if you are accustomed to using the command
tcpdump -s 150 -neli eth0 'not tcp'
you will have to start using
tcpdump -s 150 -neli eth0 'not tcp and
not (ether proto 0x8311 and ether[43] == 6)'
where the part in parentheses is the detector for encapsulated tcp packets.
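The logic of that filter expression can be restated as a small predicate. This is a simplified sketch with hypothetical function names: it handles only IPv4, whereas tcpdump's "tcp" primitive covers more cases. Offset 43 is 34 + 9, so the detector implies that the encapsulated IPv4 header starts at byte 34, the protocol field being the tenth byte of an IPv4 header.

```python
def is_plain_tcp(pkt):
    """Ordinary Ethernet frame carrying IPv4 with protocol 6 (TCP)."""
    return len(pkt) > 23 and pkt[12:14] == b"\x08\x00" and pkt[23] == 6

def is_encapsulated_tcp(pkt):
    """Mirror of: ether proto 0x8311 and ether[43] == 6."""
    return len(pkt) > 43 and pkt[12:14] == b"\x83\x11" and pkt[43] == 6

def keep(pkt):
    """Mirror of: not tcp and not (ether proto 0x8311 and ether[43] == 6)."""
    return not is_plain_tcp(pkt) and not is_encapsulated_tcp(pkt)

def fake_packet(ethertype, proto_offset, proto):
    """Build a dummy 64-byte packet with one ether-type and one protocol byte."""
    pkt = bytearray(64)
    pkt[12:14] = ethertype.to_bytes(2, "big")
    pkt[proto_offset] = proto
    return bytes(pkt)

print(keep(fake_packet(0x0800, 23, 6)))   # False: plain TCP
print(keep(fake_packet(0x8311, 43, 6)))   # False: encapsulated TCP
print(keep(fake_packet(0x8311, 43, 1)))   # True: encapsulated ICMP
```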
There is one slight infelicity: Suppose host (a) is in
promiscuous
mode and transmits a packet. As usual, one copy gets received via
software
loopback. This is conceptually and virtually an uplink packet, so
ideally it should be packaged as an 802.311 packet and handled the same way
as other hosts' uplink traffic. But the software loopback goes
through different machinery, and I don't at the moment know how to get
my hands on looped-back packets in order to fiddle their MAC
addresses. Soon afterward, another copy will be received by
radio, downlinked from the base station. So tcpdump
will receive two identical copies of such packets; neither copy
will have 802.311 encapsulation. This is not a problem for
ordinary non-promiscuous sockets, which will receive neither copy,
since the packets are not addressed to host (a).
To use the bugfixes and features described here,
please use the following procedure:
- Make sure your kernel is configured to load the orinoco driver as
a module (not compiled into the kernel).
- Download the latest version of the driver software using the command:
cvs -z3 -d:ext:anoncvs@savannah.nongnu.org:/cvsroot/orinoco co orinoco
- Apply the patch: http://www.av8n.com/computer/orinoco-cvs.patch
- Perform the usual compilation steps: make; make install.
- Perform the following steps, where NW=/etc/init.d/network on Red Hat
systems and NW=/etc/init.d/networking on Debian systems:
- $NW stop
- rmmod orinoco_cs
- rmmod orinoco
- rmmod hermes
- modprobe orinoco_cs
- $NW start
- iwconfig ......
- That's all.
I also fixed up the "atheros" driver from MadWiFi
to make it work in promiscuous mode. Heretofore all attempts to
put the interface into promiscuous mode had no effect. You can
download a patch from here:
Also, if you want to fetch the ultra-new "WPA" branch of the driver
using the command
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/madwifi co -rWPA madwifi
and use that, then there is an untested
but plausible patch for that, too:
You can think of this patch as having two parts: (1) Making
promiscuous mode do something at all, and (2) making sure it works
properly (i.e. not confusing higher-level applications) by using the
802.311 trick as previously discussed. The first step was a
bit of an adventure, because the driver made three checks in series
(one in hardware, two in software) to get rid of everything except
downlink packets addressed to this host.
3. IPv6 Duplicate Address Detection and other nonsense
Whenever an IPv6 address is newly assigned to an interface, it is
considered tentative.
The interface then multicasts an IPv6 Neighbor Solicitation to see if
anybody else on the link is using that address. If so, the
address will be disabled. Flaw #1 in IPv6 is that if the
interface sees a Neighbor Solicitation for its own tentative address,
it (with few exceptions) assumes that somebody else is tentatively
using the address, in which case both are supposed to disable the
address. That would make sense if the other NS arrived before we sent our NS. But
when the NS arrives immediately after
we send ours, the overwhelming likelihood is that we are seeing our own
packet bouncing back to us. Flaw #2 in IPv6 is that the NS packet
payload should contain some sort of fingerprint (such as a long random
number) so that we can instantly and conclusively distinguish our NS
packets from anybody else's NS packets, even if the layer2 and layer3
addresses are the same ... but no such fingerprint is provided.
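The fingerprint idea could work roughly as sketched below. Everything here is hypothetical: the class, the dictionary standing in for an NS packet, and the 8-byte nonce length are illustrative choices, not part of the protocol as described above.

```python
import secrets

class TentativeAddress:
    """Duplicate detection for one tentative address, with a fingerprint."""

    def __init__(self, address):
        self.address = address
        self.sent_nonces = set()
        self.duplicate = False

    def make_solicitation(self):
        """Build an NS carrying a fresh random fingerprint and remember it."""
        nonce = secrets.token_bytes(8)
        self.sent_nonces.add(nonce)
        return {"target": self.address, "nonce": nonce}

    def on_solicitation(self, ns):
        """Handle a received NS; flag a duplicate only if it is not our own."""
        if ns["target"] != self.address:
            return
        if ns.get("nonce") in self.sent_nonces:
            return  # our own NS bouncing back via the downlink: ignore it
        self.duplicate = True

ours = TentativeAddress("fe80::211:22ff:fe33:4455")
ns = ours.make_solicitation()
ours.on_solicitation(ns)   # echo of our own packet
print(ours.duplicate)      # False

theirs = TentativeAddress("fe80::211:22ff:fe33:4455")
ours.on_solicitation(theirs.make_solicitation())  # genuinely someone else
print(ours.duplicate)      # True
```

With the fingerprint in the payload, the echo is recognized even though its layer2 and layer3 addresses are identical to those of a real competitor.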
One often hears the suggestion
that this song-and-dance with NS packets could be improved as
follows: if our interface is tentatively using an address and it
sees somebody else's NS mentioning that address, rather than sending
our own NS we could send some other type of message, some sort of NACK
explicitly indicating that we are (tentatively) using the
address. But alas that doesn't solve the problem, because if we
are being confused by our own NS messages we will also be confused by
our own NACK messages.
Flaw #3 in IPv6 is the very notion that Duplicate Address Detection is
valuable. Situations where the detector is falsely triggered
(e.g. due to duplicated packets) vastly outnumber situations where
there really is a duplicate address that needs to be detected.
Remember that DAD only ensures that the address is unique on the local
link, not necessarily unique globally, so its value is limited at best.
Among other things, DAD is applied to link-local addresses that start
with the fe80 prefix. The suffix of those addresses is supposed
to be derived from the MAC address of the interface card. So all
we are really detecting is the case where two cards on the same local
link have the same MAC address. That is really quite an unusual
situation, and in my opinion if you try something like that you should
expect weird failures, and you should not depend on higher-level
protocols to protect you. The vastly more common case involves
non-fe80 addresses, in which case it is overwhelmingly likely that the
two offenders will be on different links, and DAD will fail to detect
the problem.
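For reference, the usual derivation of an fe80 address from a MAC address (the modified EUI-64 rule) can be sketched as follows. The function name and the example MAC address are hypothetical.

```python
import ipaddress

def link_local_from_mac(mac):
    """Derive the fe80::/64 address from a 48-bit MAC (modified EUI-64)."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC address.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)

print(link_local_from_mac("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```

Because the mapping is deterministic, two interfaces produce the same link-local address only when they have the same MAC address, which is the one case DAD can catch for fe80 addresses.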
DAD is a noble goal, and it would be great to have a DAD method that
actually works. But the current scheme does vastly more harm than
good. It has too many false negatives and too many false
positives --- i.e. real duplications go undetected while innocuous
echoes cause the interface to shut down. In the attempt to
solve an almost-imaginary problem it creates real problems. (It
could probably be promoted from "harmful" to "harmless" by the addition
of fingerprints as discussed above.)
References
- A Technical Tutorial on the IEEE 802.11 Protocol http://www.sss-mag.com/pdf/802_11tut.pdf