Refactor management frame structures to take into account optional
presence of some parts of the header:
* drop the single structure for management header and body since
the body offset is variable.
* add mmpdu_get_body to locate the start of frame body.
* drop the union of different management frame type bodies.
* prefix names specific to management frames with "mmpdu" instead
of "mpdu" including any enums based on 802.11-2012 section 8.4.
* move the FC field to the mmpdu_header structure.
If the kernel device driver or the kernel nl80211 version doesn't
support the new RSSI threshold list CQM monitoring, implement similar
logic in iwd with periodic polling. This is only active when an RSSI
agent is registered to receive the events. I tested this with the same
testRSSIAgent autotests that tests the driver-side rssi monitoring
except with all timeouts multiplied by ~20.
Function to allow netdev.c to explicitly tell eapol.c whether to expect
EAP / 4-Way handshake. This is to potentially make the code more
descriptive, until now we'd look at sm->handshake->ptk_complete to see
if a new PTK was needed.
A 4-Way handshake is required on association to an AP except after FT.
Modify netdev_get_iftype, which was until now unused, and add
netdev_set_iftype. Don't skip interfaces with types other than STATION
on startup, instead reset the type to STATION in device.c.
netdev_get_iftype is modified to use our own interface type enum to
avoid forcing users to include "nl80211.h".
Note that setting an interface UP and DOWN wouldn't generally reset the
iftype to STATION. Another process may still change the type while iwd
is running and iwd would not detect this as it would detect another
interface setting interface DOWN, not sure how far we want to go in
monitoring all of the properties this way.
Allow attempts to connect to a new AP using the Reassociation frame even
if netdev->operational is false. This is needed if we want to continue
an ongoing roam attempt after the original connection broke and will be
needed when we start using cached PMKSAs in the future.
There are situations including after beacon loss and during FT where the
cfg80211 will detect we're now disconnected (in some cases will send a
Deauthenticate frame too) and generate this event, or the driver may do
this. For example in ieee80211_report_disconnect in net/mac80211/mlme.c
will (through cfg80211) generate a CMD_DEAUTHENTICATE followed by a
CMD_DISCONNECT.
The kernel doesn't reset the netdev's state to disconnected when it
sends us a beacon loss event so we can't either unless we automatically
send a disconnect command to the kernel.
It seems the handling of beacon loss depends on the driver. For example
in mac80211 only after N beacon loss events (default 7) a probe request is
sent to the AP and a deauthenticate packet is sent if no probe reply is
receiver within T (default 500ms).
CMD_DEAUTHENTICATE is not available for FullMAC based cards. We already
use CMD_CONNECT in the non-FT cases, which works on all cards. However,
for some reason we kept using CMD_DEAUTHENTICATE instead of CMD_DISCONNECT.
For FT (error) cases, keep using CMD_DEAUTHENTICATE.
Certain WiFi drivers do not support using CMD_SET_STATION (e.g.
mwifiex). It is not completely clear how such drivers handle the
AUTHORIZED state, but they don't seem to take it into account. So for
such drivers, ignore the -ENOTSUPP error return from CMD_SET_STATION.
These flags are documented in RFC2863 and kernel's
Documentation/networking/operstates.txt. Operstate doesn't have any
siginificant effect on normal connectivity or on our autotests because
it is not used by the kernel except in some rare cases but it is
supposed to affect some userspace daemons that watch for RTM_NEWLINK
events, so I believe we *should* set them according to this
documentation. Changes:
* There's no point setting link_mode or operstate of the netdev when
we're bringing the admin state DOWN as that overrides operstate.
* Instead of numerical values for link_mode use the if.h defines.
* Set IF_OPER_UP when association succeeds also in the Fast Transition
case. The driver will have set carrier off and then on so the
operstate should be IF_OPER_DORMANT at this point and needs to be
reset to UP.
Add an methods and an event using the new
NL80211_EXT_FEATURE_CQM_RSSI_LIST kernel feature to request RSSI
monitoring with notifications only when RSSI moves from one of the N
intervals requested to another.
device.c will call netdev_set_rssi_report_levels to request
NETDEV_EVENT_RSSI_LEVEL_NOTIFY events every time the RSSI level changes,
level meaning one of the intervals delimited by the threshold values
passed as argument. Inside the event handler it can call
netdev_get_rssi_level to read the new level.
There's no fallback to periodic polling implemented in this patch for
the case of older kernels and/or the driver not supporting
NL80211_EXT_FEATURE_CQM_RSSI_LIST.
netdev_reassociate transitions to another BSS without FT. Similar to
netdev_connect but uses reassociation instead of association and
requires and an existing connection.
Handle the changes of interface address in RTNL New Link messages
similarly to the name changes, emit a NETDEV_WATCH_EVENT_ADDRESS_CHANGE
event and a propety change on dbus.
Note this can only happen when the interface is down so it doesn't
break anything but we need to handle it anyway.
Right now the code checks for is_rsn to wait for the 4-way handshake and
sends the NETDEV_EVENT_4WAY_HANDSHAKE. However, is_rsn condition is not
true for WSC connections since they do not set an RSN field. Still,
they are EAP based handshakes and should be treated in the same manner.
We relax the is_rsn check to instead check for netdev->sm. Currently
netdev->sm is only non-NULL if handshake->own_ie field is not NULL or in
the case of eap-wsc connections.
Make sure that the Neighbor Report timeout is cancelled when connection
breaks or device is being destroyed, and call the callback. Add an
errno parameter to the callback to indicate the cause.
Validate the fourth message of the fast transition sequence and save the
new keys and state as current values in the netdev object. The
FT-specific IE validation that was already present in the initial MD
is moved to a new function.
Build and send the FT Authentication Request frame, the initial Fast
Transition message.
In this version the assumption is that once we start a transition attempt
there's no going back so the old handshake_state, scan_bss, etc. can be
replaced by the new objects immediately and there's no point at which both
the old and the new connection states are needed. Also the disconnect
event for the old connection is implicit. At netdev level the state
during a transition is almost the same with a new connection setup.
The first disconnect event on the netlink socket after the FT Authenticate
is assumed to be the one generated by the kernel for the old connection.
The disconnect event doesn't contain the AP bssid (unlike the
deauthenticate event preceding it), otherwise we could check to see if
the bssid is the one we are interested in or could check connect_cmd_id
assuming a disconnect doesn't happen before the connect command finishes.
Action Frames are sent by nl80211 as unicast data. We're not receiving
any other unicast packets in iwd at this time so let netdev directly
handle all unicast data on the genl socket.
There are situations when a CMD_DISCONNECT or deauthenticate will be
issued locally because of an error detected locally where netdev would
not be able to emit a event to the device object. The CMD_DISCONNECT
handler can only send an event if the disconnect is triggered by the AP
because we don't have an enum value defined for other diconnects. We
have these values defined for the connect callback but those errors may
happen when the connect callback is already NULL because a connection
has been estabilshed. So add an event type for local errors.
These situations may occur in a transition negotiation or in an eapol
handshake failure during rekeying resulting in a call to
netdev_handshake_failed.
The kernel parses NL80211_ATTR_USE_MFP to mean an enumeration
nl80211_mfp. So instead of using a boolean, we should be using the
value NL80211_MFP_REQUIRED.
Use the NLMSG_ALIGN macro on the family header size (struct ifinfomsg in
this case). The ascii graphics in include/net/netlink.h show that both
the netlink header and the family header should be padded. The netlink
header (nlmsghdr) is already padded in ell. To "document" this
requirementin ell what we could do is take two buffers, one for the
family header and one for the attributes.
This doesn't change anything for most people because ifinfomsg is
already 16-byte long on the usual architectures.
Remove the keys and other data from struct eapol_sm, update device.c,
netdev.c and wsc.c to use the handshake_state object instead of
eapol_sm. This also gets rid of eapol_cancel and the ifindex parameter
in some of the eapol functions where sm->handshake->ifindex can be
used instead.
If an MD IE is supplied to netdev_connect, pass that MD IE in the
associate request, then validate and handle the MD IE and FT IE in the
associate response from AP.
Split eapol_start into two calls, one to register the state machine so
that the PAE read handler knows not to discard frames for that ifindex,
and eapol_start to actually start processing the frames. This is needed
because, as per the comment in netdev.c, due to scheduling the PAE
socket read handler may trigger before the CMD_CONNECT event handler,
which needs to parse the FTE from the Associate Response frame and
supply it to the eapol SM before it can do anything with the message 1
of 4 of the FT handshake.
Another issue is that depending on the driver or timing, the underlying
link might not be marked as 'ready' by the kernel. In this case, our
response to Message 1 of the 4-way Handshake is written and accepted by
the kernel, but gets dropped on the floor internally. Which leads to
timeouts if the AP doesn't retransmit.
If the handshake fails, we trigger a deauthentication prior to reporting
NETDEV_RESULT_HANDSHAKE_FAILED. If a netdev_disconnect is invoked in
the meantime, then the caller will receive -ENOTCONN. This is
incorrect, since we are in fact logically connected until the connect_cb
is notified.
Tweak the behavior to keep the connected variable as true, but check
whether disconnect_cmd_id has been issued in the netdev_disconnect_event
callback.
We used to open a socket for each wireless interface. This patch uses a
single socket with an attached BPF to handle all EAPoL traffic via a
single file descriptor.
CMD_DISCONNECT fails on some occasions when CMD_CONNECT is still
running. When this happens the DBus disconnect command receives an
error reply but iwd's device state is left as disconnected even though
there's a connection at the kernel level which times out a few seconds
later. If the CMD_CONNECT is cancelled I couldn't reproduce this so far.
src/network.c:network_connect()
src/network.c:network_connect_psk()
src/network.c:network_connect_psk() psk:
69ae3f8b2f84a438cf6a44275913182dd2714510ccb8cbdf8da9dc8b61718560
src/network.c:network_connect_psk() len: 32
src/network.c:network_connect_psk() ask_psk: false
src/device.c:device_enter_state() Old State: disconnected, new state:
connecting
src/scan.c:scan_notify() Scan notification 33
src/device.c:device_netdev_event() Associating
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/device.c:device_dbus_disconnect()
src/device.c:device_connect_cb() 6, result: 5
src/device.c:device_enter_state() Old State: connecting, new state:
disconnecting
src/device.c:device_disconnect_cb() 6, success: 0
src/device.c:device_enter_state() Old State: disconnecting, new state:
disconnected
src/scan.c:scan_notify() Scan notification 34
src/netdev.c:netdev_mlme_notify() MLME notification 19
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 37
src/netdev.c:netdev_authenticate_event()
src/scan.c:get_scan_callback() get_scan_callback
src/scan.c:get_scan_done() get_scan_done
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 19
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 38
src/netdev.c:netdev_associate_event()
src/netdev.c:netdev_mlme_notify() MLME notification 46
src/netdev.c:netdev_connect_event()
<delay>
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 20
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 20
src/netdev.c:netdev_mlme_notify() MLME notification 39
src/netdev.c:netdev_deauthenticate_event()
This is to make sure device_remove and netdev_connect_free are called
early so we don't continue setting up a connection and don't let DBus
clients power device back up after we've called netdev_set_powered.
Prevents situations like this:
src/device.c:device_enter_state() Old State: connecting, new state:
connected
src/scan.c:scan_periodic_stop() Stopping periodic scan for ifindex: 3
src/device.c:device_dbus_disconnect()
src/device.c:device_connect_cb() 3
src/device.c:device_disassociated() 3
src/device.c:device_enter_state() Old State: connected, new state:
autoconnect
Try to make the connect and disconnect operations look more like a
transaction where the callback is always called eventually, also with a
clear indication if the operation is in profress. The connected state
lasts from the start of the connection attempt until the disconnect.
1. Non-null netdev->connected or disconnect_cb indicate that the operation
is active.
2. Every entry-point in netdev.c checks if connected is still set
before executing the next step of the connection setup. CMD_CONNECT and
the subsequent commands may succeed even if CMD_DISCONNECT is called
in the middle so they can't only rely on the error value for that.
3. netdev->connect_cb and other elements of the connection state are
reset by netdev_connect_free which groups the clean-up operations to
make sure we don't miss anything. Since the callback pointers are
reset device.c doesn't need to check that it receives a spurious
event in those callbacks for example after calling netdev_disconnect.
If initial bring up returns ERFKILL proceed and the inteface can be
explicitly brought up by the client once rfkill is disabled.
Also fix the error number returned to netdev_set_powered callback to be
negative as expected by netdev_initial_up_cb.
==3059== 7 bytes in 1 blocks are still reachable in loss record 1 of 2
==3059== at 0x4C2C970: malloc (vg_replace_malloc.c:296)
==3059== by 0x50BB319: strndup (in /lib64/libc-2.22.so)
==3059== by 0x417B4D: l_strndup (util.c:180)
==3059== by 0x417E1B: l_strsplit (util.c:311)
==3059== by 0x4057FC: netdev_init (netdev.c:1658)
==3059== by 0x402E26: nl80211_appeared (main.c:112)
==3059== by 0x41F577: get_family_callback (genl.c:1038)
==3059== by 0x41EE3F: process_unicast (genl.c:390)
==3059== by 0x41EE3F: received_data (genl.c:506)
==3059== by 0x41C6F4: io_callback (io.c:120)
==3059== by 0x41BAA9: l_main_run (main.c:381)
==3059== by 0x402B9C: main (main.c:234)
When a new wiphy is added, the kernel usually adds a default STA
interface as well. This interface is currently not signaled over
nl80211 in any way.
This implements a selective dump of the wiphy interfaces in order to
obtain the newly added netdev. Selective dump is currently not
supported by the kernel, so all netdevs will be returned. A patch on
linux-wireless is pending that implements the selective dump
functionality.
==24934== 16 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24934== at 0x4C2C970: malloc (vg_replace_malloc.c:296)
==24934== by 0x41675D: l_malloc (util.c:62)
==24934== by 0x4033B3: netdev_set_linkmode_and_operstate
(netdev.c:149)
==24934== by 0x4042B9: netdev_free (netdev.c:221)
==24934== by 0x41735D: l_queue_clear (queue.c:107)
==24934== by 0x4173A8: l_queue_destroy (queue.c:82)
==24934== by 0x40543D: netdev_exit (netdev.c:1459)
==24934== by 0x402D6F: nl80211_vanished (main.c:126)
==24934== by 0x41E607: l_genl_family_unref (genl.c:1057)
==24934== by 0x402B50: main (main.c:237)
When setting operstate to dormant or down, give it a callback for debug
purposes. It looks like that operstate down message does not have a
chance to go out currently.
The eapol state machine parameters are now built inside device.c when
the network connection is attempted. The reason is that the device
object knows about network settings, wiphy constraints and should
contain the main 'management' logic.
netdev now manages the actual low-level process of building association
messages, detecting authentication events, etc.
Turn netdev watches into device watches. The intent is to refactor out
netdev specific details into its own class and move device specific
logic into device.c away from wiphy.c