The way that netdev_set_linkmode_and_operstate was used resulted in
potential crashes when the netdev was destroyed. This is because netdev
was given as data to l_netlink_send and could be destroyed between the
time of the call and the callback. Since the result of calls to
netdev_set_linkmode_and_operstate is inconsequential, it isn't really
worthwhile tracking these calls in order to cancel them.
This patch simplies the handling of these rtnl calls, makes sure that
netdev isn't passed as user data and rewrites the
netdev_set_linkmode_and_operstate signature to be more consistent with
rtnl_set_powered.
Since all netdevs share the rtnl l_netlink object, it was possible for
netdevs to be destroyed with outstanding commands still executing on the
rtnl object. This can lead to crashes and other nasty situations.
This patch makes sure that Powered requests are always tracked via
set_powered_cmd_id and the request is canceled when netdev is destroyed.
This also implies that netdev_set_powered can now return an -EBUSY error
in case a request is already outstanding.
This removes the authenticator bit in eapol_sm as well as unifies
eapol_register_authenticator and eapol_register. Taking advantage
of the handshake state authenticator bit we no longer have a need
for 2 separate register functions.
ap, and adhoc were also updated to set the authenticator bit in
the handshake and only use eapol_register to register their sm's.
netdev was updated to use the authenticator bit when choosing the
correct key address for adhoc.
In order to plug SAE into the existing connect mechanism the actual
CMD_CONNECT message is never sent, rather sae_register takes care
of sending out CMD_AUTHENTICATE. This required some shuffling of
code in order to handle both eapol and sae. In the case of non-SAE
authentication everything behaves as it did before. When using SAE
an sae_sm is created when a connection is attempted but the eapol_sm
is not. After SAE succeeds it will start association and then create
the eapol_sm and start the 4-way handshake.
This change also adds the handshake SAE events to device and
initializes SAE in main.
Our logic would set CONTROL_PORT_OVER_NL80211 even in cases where
CONTROL_PORT wasn't used (e.g. for open networks). While the kernel
ignored this attribute in this case, it is nicer to set this only if
CONTROL_PORT is intended to be used.
SAE will require some of the same CMD_ASSOCIATE building code that
FT currently uses. This breaks out the common code from FT into
netdev_build_cmd_associate_common.
These will issue a JOIN/LEAVE_IBSS to the kernel. There is
a TODO regarding network configuration. For now, only the
SSID is configurable. This configuration is also required
for AP, but needs to be thought out. Since the current
AP Dbus API has nothing related to configuration items
such as freq/channel or RSN elements they are hard coded,
and will be for Ad-Hoc as well (for now).
Now that the device mode can be changed, netdev must check that
the iftype is correct before starting a connection or disconnecting.
netdev_connect, netdev_connect_wsc, and netdev_disconnect now check
that the iftype is station before continuing.
With the introduction of Ad-Hoc, its not as simple as choosing
aa/spa addresses when setting the keys. Since Ad-Hoc acts as
both the authenticator and supplicant we must check how the netdev
address relates to the particular handshake object as well as
choose the correct key depending on the value of the AA/SPA address.
802.11 states that the higher of the two addresses is to be used
to set the key for the Ad-Hoc connection.
A simple helper was added to choose the correct addressed based on
netdev type and handshake state. netdev_set_tk also checks that
aa > spa in the handshake object when in Ad-Hoc mode. If this is
true then the keys from that handshake are used, otherwise return
and the other handshake key will be used (aa will be > spa).
The station/ap mode behaves exactly the same as before.
For Ad-Hoc networks, the kernel takes care of auth/assoc
and issues a NEW_STATION event when that is complete. This
provides a way to notify when NEW_STATION events occur as
well as forward the MAC of the station to Ad-Hoc.
The two new API's added:
- netdev_station_watch_add()
- netdev_station_watch_remove()
This removes the need for duplicate code in AP/netdev for issuing
a DEL_STATION command. Now AP can issue a DEL_STATION with
netdev_del_station, and specify to either disassociate or deauth
depending on state.
If netdev fails to set the keys, there was no way for device/ap to
know. A new handshake event was added for this. The key setting
failure function was also fixed to support both AP/station iftypes.
It will now automatically send either a disconnect or del_station
depending on the interface type.
In similar manner, netdev_handshake_failed was also modified to
support both AP/station iftypes. Now, any handshake event listeners
should call netdev_handshake_failed upon a handshake failure
event, including AP.
Right now iwd uses Control Port over NL80211 feature if the kernel /
driver supports it. On some kernels this feature is still buggy, so add
an iwd.conf entry to allow the user to override id.
For now the default is to disable this feature until it is more stable.
These checks allow both a station and authenticator to use
the same netdev key install functions. For NEW_KEY and
SET_STATION, the iftype is checked and either handshake->aa
or ->spa is used as the station address for the KEY/STATION
commands. Also, in the failure cases, a disconnect command
is issued only if the iftype is station as this doesn't
apply to AP.
Handshake related netdev events were removed in favor of
handshake events. Now events will be emitted on the handshake
object related to the 4-way handshake and key settings. Events
are:
HANDSHAKE_EVENT_STARTED
HANDSHAKE_EVENT_SETTING_KEYS
HANDSHAKE_EVENT_COMPLETE
HANDSHAKE_EVENT_FAILED
Right now, since netdev only operates in station mode, nothing
listens for COMPLETE/FAILED, as device/wsc gets notified by the
connect_cb when the connection was successful. The COMPLETE/
FAILED were added in preperation for AP moving into eapol/netdev.
When a wifi interface is added/removed to/from a bridge, a
RTM_NEW/DELLINK event is issued. This is the same event used to signal
when an interface is created/deleted.
For this reason the event generated by the bridge code has to be
properly distinguished and handled accordingly. Failing to do so will
result in inconsistencies in iwd which will think an interface has been
deleted when it was actually not.
Detect incoming NEW/DELLINK bridge events and reacts accordingly. For
now, this simply means printing a simple message, as there is no
special logic in iwd for this yet.
If Control Port over NL80211 is not supported, open up a PAE socket and
stuff it into an l_io on the netdev object. Install a read handler on
the l_io and call __eapol_rx_packet as needed.
With the introduction of Control Port Over NL80211 feature, the
transport details need to be moved out of eapol and into netdev.c.
Whether a given WiFi hardware supports transfer of Control Port packets
over NL80211 is Wiphy and kernel version related, so the transport
decisions need to be made elsewhere.
handshake_state_install_ptk triggers a call to
netdev_set_pairwise_key_cb which calls netdev_connect_ok, so don't call
netdev_connect_ok after handshake_state_install_ptk. This doesn't fix
any specific problem though.
SA Query procedure is used when an unprotected disassociate frame
is received (with frame protection enabled). There are two code
paths that can occur when this disassociate frame is received:
1. Send out SA Query and receive a response from the AP within a
timeout. This means that the disassociate frame was not sent
from the AP and can be ignored.
2. Send out SA Query and receive no response. In this case it is
assumed that the AP went down ungracefully and is now back up.
Since frame protection is enabled, you must re-associate with
the AP.
When the 4-Way Handshake is done eapol.c calls netdev_set_tk, then
optionally netdev_set_gtk and netdev_set_igtk. To support the no group
key option send the final SET STATION enabling the controlled port
inside the callback for the netdev_set_tk operation which always means
the end of a 4-Way Handshake rather than in the netdev_set_gtk callback.
The spec says exactly that the controlled port is enabled at the end of
the 4-Way Handshake.
The netlink operations will still be queued in the same order because
the netdev_set_tk/netdev_set_gtk/netdev_set_igtk calls happen in one
main loop iteration but even if the order changed it wouldn't matter.
On failure of any of the three operations netdev_setting_keys_failed
gets called and the remaining operations are cancelled.
The l_queue_find() to find other watches matching the new prefix
needs to be before the watchlist_link(), otherwise the prefix will
match itself and "registered" is always true.
Rename netdev_register_frame to netdev_frame_watch_add and expose to be
usable outside of netdev.c, add netdev_frame_watch_remove also. Update
the Neighbor Report handling which was the only user of
netdev_register_frame.
The handler is now simpler because we use a lookup list with all the
prefixes and individual frame handlers only see the frames matching the
right prefix. This is also useful for the future Access-Point mode.
Refactor management frame structures to take into account optional
presence of some parts of the header:
* drop the single structure for management header and body since
the body offset is variable.
* add mmpdu_get_body to locate the start of frame body.
* drop the union of different management frame type bodies.
* prefix names specific to management frames with "mmpdu" instead
of "mpdu" including any enums based on 802.11-2012 section 8.4.
* move the FC field to the mmpdu_header structure.
If the kernel device driver or the kernel nl80211 version doesn't
support the new RSSI threshold list CQM monitoring, implement similar
logic in iwd with periodic polling. This is only active when an RSSI
agent is registered to receive the events. I tested this with the same
testRSSIAgent autotests that tests the driver-side rssi monitoring
except with all timeouts multiplied by ~20.
Function to allow netdev.c to explicitly tell eapol.c whether to expect
EAP / 4-Way handshake. This is to potentially make the code more
descriptive, until now we'd look at sm->handshake->ptk_complete to see
if a new PTK was needed.
A 4-Way handshake is required on association to an AP except after FT.
Modify netdev_get_iftype, which was until now unused, and add
netdev_set_iftype. Don't skip interfaces with types other than STATION
on startup, instead reset the type to STATION in device.c.
netdev_get_iftype is modified to use our own interface type enum to
avoid forcing users to include "nl80211.h".
Note that setting an interface UP and DOWN wouldn't generally reset the
iftype to STATION. Another process may still change the type while iwd
is running and iwd would not detect this as it would detect another
interface setting interface DOWN, not sure how far we want to go in
monitoring all of the properties this way.
Allow attempts to connect to a new AP using the Reassociation frame even
if netdev->operational is false. This is needed if we want to continue
an ongoing roam attempt after the original connection broke and will be
needed when we start using cached PMKSAs in the future.
There are situations including after beacon loss and during FT where the
cfg80211 will detect we're now disconnected (in some cases will send a
Deauthenticate frame too) and generate this event, or the driver may do
this. For example in ieee80211_report_disconnect in net/mac80211/mlme.c
will (through cfg80211) generate a CMD_DEAUTHENTICATE followed by a
CMD_DISCONNECT.
The kernel doesn't reset the netdev's state to disconnected when it
sends us a beacon loss event so we can't either unless we automatically
send a disconnect command to the kernel.
It seems the handling of beacon loss depends on the driver. For example
in mac80211 only after N beacon loss events (default 7) a probe request is
sent to the AP and a deauthenticate packet is sent if no probe reply is
receiver within T (default 500ms).
CMD_DEAUTHENTICATE is not available for FullMAC based cards. We already
use CMD_CONNECT in the non-FT cases, which works on all cards. However,
for some reason we kept using CMD_DEAUTHENTICATE instead of CMD_DISCONNECT.
For FT (error) cases, keep using CMD_DEAUTHENTICATE.
Certain WiFi drivers do not support using CMD_SET_STATION (e.g.
mwifiex). It is not completely clear how such drivers handle the
AUTHORIZED state, but they don't seem to take it into account. So for
such drivers, ignore the -ENOTSUPP error return from CMD_SET_STATION.
These flags are documented in RFC2863 and kernel's
Documentation/networking/operstates.txt. Operstate doesn't have any
siginificant effect on normal connectivity or on our autotests because
it is not used by the kernel except in some rare cases but it is
supposed to affect some userspace daemons that watch for RTM_NEWLINK
events, so I believe we *should* set them according to this
documentation. Changes:
* There's no point setting link_mode or operstate of the netdev when
we're bringing the admin state DOWN as that overrides operstate.
* Instead of numerical values for link_mode use the if.h defines.
* Set IF_OPER_UP when association succeeds also in the Fast Transition
case. The driver will have set carrier off and then on so the
operstate should be IF_OPER_DORMANT at this point and needs to be
reset to UP.
Add an methods and an event using the new
NL80211_EXT_FEATURE_CQM_RSSI_LIST kernel feature to request RSSI
monitoring with notifications only when RSSI moves from one of the N
intervals requested to another.
device.c will call netdev_set_rssi_report_levels to request
NETDEV_EVENT_RSSI_LEVEL_NOTIFY events every time the RSSI level changes,
level meaning one of the intervals delimited by the threshold values
passed as argument. Inside the event handler it can call
netdev_get_rssi_level to read the new level.
There's no fallback to periodic polling implemented in this patch for
the case of older kernels and/or the driver not supporting
NL80211_EXT_FEATURE_CQM_RSSI_LIST.
netdev_reassociate transitions to another BSS without FT. Similar to
netdev_connect but uses reassociation instead of association and
requires and an existing connection.
Handle the changes of interface address in RTNL New Link messages
similarly to the name changes, emit a NETDEV_WATCH_EVENT_ADDRESS_CHANGE
event and a propety change on dbus.
Note this can only happen when the interface is down so it doesn't
break anything but we need to handle it anyway.
Right now the code checks for is_rsn to wait for the 4-way handshake and
sends the NETDEV_EVENT_4WAY_HANDSHAKE. However, is_rsn condition is not
true for WSC connections since they do not set an RSN field. Still,
they are EAP based handshakes and should be treated in the same manner.
We relax the is_rsn check to instead check for netdev->sm. Currently
netdev->sm is only non-NULL if handshake->own_ie field is not NULL or in
the case of eap-wsc connections.
Make sure that the Neighbor Report timeout is cancelled when connection
breaks or device is being destroyed, and call the callback. Add an
errno parameter to the callback to indicate the cause.
Validate the fourth message of the fast transition sequence and save the
new keys and state as current values in the netdev object. The
FT-specific IE validation that was already present in the initial MD
is moved to a new function.
Build and send the FT Authentication Request frame, the initial Fast
Transition message.
In this version the assumption is that once we start a transition attempt
there's no going back so the old handshake_state, scan_bss, etc. can be
replaced by the new objects immediately and there's no point at which both
the old and the new connection states are needed. Also the disconnect
event for the old connection is implicit. At netdev level the state
during a transition is almost the same with a new connection setup.
The first disconnect event on the netlink socket after the FT Authenticate
is assumed to be the one generated by the kernel for the old connection.
The disconnect event doesn't contain the AP bssid (unlike the
deauthenticate event preceding it), otherwise we could check to see if
the bssid is the one we are interested in or could check connect_cmd_id
assuming a disconnect doesn't happen before the connect command finishes.
Action Frames are sent by nl80211 as unicast data. We're not receiving
any other unicast packets in iwd at this time so let netdev directly
handle all unicast data on the genl socket.
There are situations when a CMD_DISCONNECT or deauthenticate will be
issued locally because of an error detected locally where netdev would
not be able to emit a event to the device object. The CMD_DISCONNECT
handler can only send an event if the disconnect is triggered by the AP
because we don't have an enum value defined for other diconnects. We
have these values defined for the connect callback but those errors may
happen when the connect callback is already NULL because a connection
has been estabilshed. So add an event type for local errors.
These situations may occur in a transition negotiation or in an eapol
handshake failure during rekeying resulting in a call to
netdev_handshake_failed.
The kernel parses NL80211_ATTR_USE_MFP to mean an enumeration
nl80211_mfp. So instead of using a boolean, we should be using the
value NL80211_MFP_REQUIRED.
Use the NLMSG_ALIGN macro on the family header size (struct ifinfomsg in
this case). The ascii graphics in include/net/netlink.h show that both
the netlink header and the family header should be padded. The netlink
header (nlmsghdr) is already padded in ell. To "document" this
requirementin ell what we could do is take two buffers, one for the
family header and one for the attributes.
This doesn't change anything for most people because ifinfomsg is
already 16-byte long on the usual architectures.
Remove the keys and other data from struct eapol_sm, update device.c,
netdev.c and wsc.c to use the handshake_state object instead of
eapol_sm. This also gets rid of eapol_cancel and the ifindex parameter
in some of the eapol functions where sm->handshake->ifindex can be
used instead.
If an MD IE is supplied to netdev_connect, pass that MD IE in the
associate request, then validate and handle the MD IE and FT IE in the
associate response from AP.
Split eapol_start into two calls, one to register the state machine so
that the PAE read handler knows not to discard frames for that ifindex,
and eapol_start to actually start processing the frames. This is needed
because, as per the comment in netdev.c, due to scheduling the PAE
socket read handler may trigger before the CMD_CONNECT event handler,
which needs to parse the FTE from the Associate Response frame and
supply it to the eapol SM before it can do anything with the message 1
of 4 of the FT handshake.
Another issue is that depending on the driver or timing, the underlying
link might not be marked as 'ready' by the kernel. In this case, our
response to Message 1 of the 4-way Handshake is written and accepted by
the kernel, but gets dropped on the floor internally. Which leads to
timeouts if the AP doesn't retransmit.
If the handshake fails, we trigger a deauthentication prior to reporting
NETDEV_RESULT_HANDSHAKE_FAILED. If a netdev_disconnect is invoked in
the meantime, then the caller will receive -ENOTCONN. This is
incorrect, since we are in fact logically connected until the connect_cb
is notified.
Tweak the behavior to keep the connected variable as true, but check
whether disconnect_cmd_id has been issued in the netdev_disconnect_event
callback.
We used to open a socket for each wireless interface. This patch uses a
single socket with an attached BPF to handle all EAPoL traffic via a
single file descriptor.
CMD_DISCONNECT fails on some occasions when CMD_CONNECT is still
running. When this happens the DBus disconnect command receives an
error reply but iwd's device state is left as disconnected even though
there's a connection at the kernel level which times out a few seconds
later. If the CMD_CONNECT is cancelled I couldn't reproduce this so far.
src/network.c:network_connect()
src/network.c:network_connect_psk()
src/network.c:network_connect_psk() psk:
69ae3f8b2f84a438cf6a44275913182dd2714510ccb8cbdf8da9dc8b61718560
src/network.c:network_connect_psk() len: 32
src/network.c:network_connect_psk() ask_psk: false
src/device.c:device_enter_state() Old State: disconnected, new state:
connecting
src/scan.c:scan_notify() Scan notification 33
src/device.c:device_netdev_event() Associating
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/device.c:device_dbus_disconnect()
src/device.c:device_connect_cb() 6, result: 5
src/device.c:device_enter_state() Old State: connecting, new state:
disconnecting
src/device.c:device_disconnect_cb() 6, success: 0
src/device.c:device_enter_state() Old State: disconnecting, new state:
disconnected
src/scan.c:scan_notify() Scan notification 34
src/netdev.c:netdev_mlme_notify() MLME notification 19
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 37
src/netdev.c:netdev_authenticate_event()
src/scan.c:get_scan_callback() get_scan_callback
src/scan.c:get_scan_done() get_scan_done
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 19
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 38
src/netdev.c:netdev_associate_event()
src/netdev.c:netdev_mlme_notify() MLME notification 46
src/netdev.c:netdev_connect_event()
<delay>
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 20
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 20
src/netdev.c:netdev_mlme_notify() MLME notification 39
src/netdev.c:netdev_deauthenticate_event()
This is to make sure device_remove and netdev_connect_free are called
early so we don't continue setting up a connection and don't let DBus
clients power device back up after we've called netdev_set_powered.
Prevents situations like this:
src/device.c:device_enter_state() Old State: connecting, new state:
connected
src/scan.c:scan_periodic_stop() Stopping periodic scan for ifindex: 3
src/device.c:device_dbus_disconnect()
src/device.c:device_connect_cb() 3
src/device.c:device_disassociated() 3
src/device.c:device_enter_state() Old State: connected, new state:
autoconnect
Try to make the connect and disconnect operations look more like a
transaction where the callback is always called eventually, also with a
clear indication if the operation is in profress. The connected state
lasts from the start of the connection attempt until the disconnect.
1. Non-null netdev->connected or disconnect_cb indicate that the operation
is active.
2. Every entry-point in netdev.c checks if connected is still set
before executing the next step of the connection setup. CMD_CONNECT and
the subsequent commands may succeed even if CMD_DISCONNECT is called
in the middle so they can't only rely on the error value for that.
3. netdev->connect_cb and other elements of the connection state are
reset by netdev_connect_free which groups the clean-up operations to
make sure we don't miss anything. Since the callback pointers are
reset device.c doesn't need to check that it receives a spurious
event in those callbacks for example after calling netdev_disconnect.
If initial bring up returns ERFKILL proceed and the inteface can be
explicitly brought up by the client once rfkill is disabled.
Also fix the error number returned to netdev_set_powered callback to be
negative as expected by netdev_initial_up_cb.
==3059== 7 bytes in 1 blocks are still reachable in loss record 1 of 2
==3059== at 0x4C2C970: malloc (vg_replace_malloc.c:296)
==3059== by 0x50BB319: strndup (in /lib64/libc-2.22.so)
==3059== by 0x417B4D: l_strndup (util.c:180)
==3059== by 0x417E1B: l_strsplit (util.c:311)
==3059== by 0x4057FC: netdev_init (netdev.c:1658)
==3059== by 0x402E26: nl80211_appeared (main.c:112)
==3059== by 0x41F577: get_family_callback (genl.c:1038)
==3059== by 0x41EE3F: process_unicast (genl.c:390)
==3059== by 0x41EE3F: received_data (genl.c:506)
==3059== by 0x41C6F4: io_callback (io.c:120)
==3059== by 0x41BAA9: l_main_run (main.c:381)
==3059== by 0x402B9C: main (main.c:234)
When a new wiphy is added, the kernel usually adds a default STA
interface as well. This interface is currently not signaled over
nl80211 in any way.
This implements a selective dump of the wiphy interfaces in order to
obtain the newly added netdev. Selective dump is currently not
supported by the kernel, so all netdevs will be returned. A patch on
linux-wireless is pending that implements the selective dump
functionality.
==24934== 16 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24934== at 0x4C2C970: malloc (vg_replace_malloc.c:296)
==24934== by 0x41675D: l_malloc (util.c:62)
==24934== by 0x4033B3: netdev_set_linkmode_and_operstate
(netdev.c:149)
==24934== by 0x4042B9: netdev_free (netdev.c:221)
==24934== by 0x41735D: l_queue_clear (queue.c:107)
==24934== by 0x4173A8: l_queue_destroy (queue.c:82)
==24934== by 0x40543D: netdev_exit (netdev.c:1459)
==24934== by 0x402D6F: nl80211_vanished (main.c:126)
==24934== by 0x41E607: l_genl_family_unref (genl.c:1057)
==24934== by 0x402B50: main (main.c:237)
When setting operstate to dormant or down, give it a callback for debug
purposes. It looks like that operstate down message does not have a
chance to go out currently.