This variable ended up being used only on the fast-transition path. On
the re-associate path it was never used, but memcpy-ied nevertheless.
Since its only use is by auth_proto based protocols, move it to the
auth_proto object directly.
Due to how prepare_ft works (we need prev_bssid from the handshake, but
the handshake is reset), have netdev_ft_* methods take an 'orig_bss'
parameter, similar to netdev_reassociate.
The RoamThreshold5G was never honored because it was being
set prior to any connections. This caused the logic inside
netdev_cqm_rssi_update to always choose the 2GHz threshold
(RoamThreshold) due to netdev->frequency being zero at this time.
Instead call netdev_cqm_rssi_update in all connect/transition
calls after netdev->frequency is updated. This will allow both
the 2G and 5G thresholds to be used depending on what frequency
the new BSS is.
The call to netdev_cqm_rssi_update in netdev_setup_interface
was also removed since it serves no purpose, at least now
that there are two thresholds to consider.
Instead of requiring each auth_proto to perform validation of the frames
received via rx_authenticate & rx_associate, have netdev itself perform
the mpdu validation. This is unlikely to happen anyway since the kernel
performs its own frame validation. Print a warning in case the
validation fails.
netdev_free relies on netdev->connected being set to detect whether a
connection is in progress. This variable is only set once the driver
has been connected however, so for situations where a CMD_CONNECT is
still 'in flight' or if the wiphy work is still pending, the ongoing
connection will not be canceled. Fix that by being more thorough when
trying to detect that a connection is in progress.
src/wiphy.c:wiphy_radio_work_next() Starting work item 2
Terminate
src/netdev.c:netdev_free() Freeing netdev wlan0[9]
src/device.c:device_free()
src/station.c:station_free()
src/netconfig.c:netconfig_destroy()
Removing scan context for wdev c
src/scan.c:scan_context_free() sc: 0x4a44c80
src/netdev.c:netdev_mlme_notify() MLME notification New Station(19)
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
==6356== Invalid write of size 4
==6356== at 0x40A253: netdev_cmd_connect_cb (netdev.c:2522)
==6356== by 0x4A8886: process_unicast (genl.c:986)
==6356== by 0x4A8C48: received_data (genl.c:1098)
==6356== by 0x4A3DFD: io_callback (io.c:120)
==6356== by 0x4A2799: l_main_iterate (main.c:478)
==6356== by 0x4A28DA: l_main_run (main.c:525)
==6356== by 0x4A2BF2: l_main_run_with_signal (main.c:647)
==6356== by 0x404D27: main (main.c:542)
==6356== Address 0x4a3e418 is 152 bytes inside a block of size 472 free'd
==6356== at 0x48399CB: free (vg_replace_malloc.c:538)
==6356== by 0x49996F: l_free (util.c:136)
==6356== by 0x406662: netdev_free (netdev.c:886)
==6356== by 0x4129C2: netdev_shutdown (netdev.c:5980)
==6356== by 0x403A14: iwd_shutdown (main.c:79)
==6356== by 0x403A7D: signal_handler (main.c:90)
==6356== by 0x4A2AFB: sigint_handler (main.c:612)
==6356== by 0x4A2F3B: handle_callback (signal.c:78)
==6356== by 0x4A3030: signalfd_read_cb (signal.c:104)
==6356== by 0x4A3DFD: io_callback (io.c:120)
==6356== by 0x4A2799: l_main_iterate (main.c:478)
==6356== by 0x4A28DA: l_main_run (main.c:525)
==6356== Block was alloc'd at
==6356== at 0x483879F: malloc (vg_replace_malloc.c:307)
==6356== by 0x49983B: l_malloc (util.c:62)
==6356== by 0x4121BD: netdev_create_from_genl (netdev.c:5776)
==6356== by 0x451F6F: manager_new_station_interface_cb (manager.c:173)
==6356== by 0x4A8886: process_unicast (genl.c:986)
==6356== by 0x4A8C48: received_data (genl.c:1098)
==6356== by 0x4A3DFD: io_callback (io.c:120)
==6356== by 0x4A2799: l_main_iterate (main.c:478)
==6356== by 0x4A28DA: l_main_run (main.c:525)
==6356== by 0x4A2BF2: l_main_run_with_signal (main.c:647)
==6356== by 0x404D27: main (main.c:542)
If the daemon is started and killed rapidly on startup, it is possible
for netdev_shutdown to be called prior to manager processing messages
that actually create the netdev itself. Since the netdev_list has
already been freed, the storage is lost. Fix that by destroying
netdev_list only when the module is unloaded.
If we're going down, make sure to notify any watches about EVENT_DEL
earlier. Not doing so might result in us not cleaning up requests that
might have been started as the result of this event.
If the connected BSS announces that it is switching operating channel,
the kernel may emit the NL80211_CMD_CH_SWTICH_NOTIFY event when the
switch is complete. Add a new netdev event NETDEV_EVENT_CHANNEL_SWITCHED
to signal to interested modules that the connected BSS has changed
channel. The event carries a pointer to the new channel's frequency.
Since netdev maintains the list of FT over DS info structs there is not
any need for station to get callbacks when the initial action frame
is received, or not. This removes the need for the callback handler,
user data, and response timeout.
The beauty of FT-over-DS is that a station can send and receive
action frames to many APs to prepare for a future roam. Each
AP authenticates the station and when a roam happens the station
can immediately move to reassociation.
To handle this a queue of netdev_ft_over_ds_info structs is used
instead of a single entry. Using the new ft.c parser APIs these
info structs can be looked up when responses come in. For now
the timeouts/callbacks are kept but these will be removed as it
really does not matter if the AP sends a response (keeps station
happy until the next patch).
This is to prepare for multiple concurrent FT-over-DS action frames.
A list will be kept in netdev and for lookup reasons it needs to
parse the start of the frame to grab the aa/spa addresses. In this
call the IEs are also returned and passed to the new
ft_over_ds_parse_action_response.
For now the address checks have been moved into netdev, but this will
eventually turn into a queue lookup.
This value sets the roaming threshold on 5GHz networks. The
threshold has been separated from 2.4GHz because in many cases
5GHz can perform much better at low RSSI than 2.4GHz.
In addition the BSS ranking logic was re-worked and now 5GHz is
much more preferred, even at low RSSI. This means we need a
lower floor for RSSI before roaming, otherwise IWD would end
up roaming immediately after connecting due to low RSSI CQM
events.
Commit 1fe5070 added a workaround for drivers which may send the
connect event prior to the connect callback/ack. This caused IWD
to fail to start eapol if reassociation was used due to
netdev_reassociate never setting netdev->connected = false.
netdev_reassociate uses the same code path as normal connections,
but when the connect callback came in connected was already set
to true which then prevents eapol from being registered. Then,
once the connect event comes in, there is no frame watch for
eapol and IWD doesn't respond to any handshake frames.
Prior to this, an error sending the FT Reassociation was treated
as fatal, which is correct for FT-over-Air but not for FT-over-DS.
If the actual l_genl_family_send call fails for FT-over-DS the
existing connection can be maintained and there is no need to
call netdev_connect_failed.
Adding a return to the tx_associate function works for both FT
types. In the FT-over-Air case this return will ultimately get
sent back up to auth_proto_rx_authenticate in which case will
call netdev_connect_failed. For FT-over-DS tx_associate is
actually called from the 'start' operation which can fail and
still maintain the existing connection.
FT-over-DS followed the same pattern as FT-over-Air which worked,
but really limited how the protocol could be used. FT-over-DS is
unique in that we can authenticate to many APs by sending out
FT action frames and parsing the results. Once parsed IWD can
immediately Reassociate, or do so at a later time.
To take advantage of this IWD need to separate FT-over-DS into
two stages: action frame and reassociation.
The initial action frame stage is started by netdev. The target
BSS is sent an FT action frame and a new cache entry is created
in ft.c. Once the response is received the entry is updated
with all the needed data to Reassociate. To limit the record
keeping on netdev each FT-over-DS entry holds a userdata pointer
so netdev doesn't need to maintain its own list of data for
callbacks.
Once the action response is parsed netdev will call back signalling
the action frame sequence was completed (either successfully or not).
At this point the 'normal' FT procedure can start using the
FT-over-DS auth-proto.
The info struct is on the stack which leads to the potential
for uninitialized data access. Zero out the info struct prior
to calling the get station callback:
==141137== Conditional jump or move depends on uninitialised value(s)
==141137== at 0x458A6F: diagnostic_info_to_dict (diagnostic.c:109)
==141137== by 0x41200B: station_get_diagnostic_cb (station.c:3620)
==141137== by 0x405BE1: netdev_get_station_cb (netdev.c:4783)
==141137== by 0x4722F9: process_unicast (genl.c:994)
==141137== by 0x4722F9: received_data (genl.c:1102)
==141137== by 0x46F28B: io_callback (io.c:120)
==141137== by 0x46E5AC: l_main_iterate (main.c:478)
==141137== by 0x46E65B: l_main_run (main.c:525)
==141137== by 0x46E65B: l_main_run (main.c:507)
==141137== by 0x46E86B: l_main_run_with_signal (main.c:647)
==141137== by 0x403EA8: main (main.c:490)
In case the netdev is brought down while we're trying to connect, try to
detect this and fail early instead of trying to send additional
commands.
src/station.c:station_enter_state() Old State: disconnected, new state: connecting
src/station.c:station_netdev_event() Associating
src/netdev.c:netdev_mlme_notify() MLME notification Connect(46)
src/netdev.c:netdev_connect_event()
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/eapol.c:eapol_handle_ptk_1_of_4() ifindex=4
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/eapol.c:eapol_handle_ptk_3_of_4() ifindex=4
src/netdev.c:netdev_set_gtk() 4
src/station.c:station_handshake_event() Setting keys
src/netdev.c:netdev_set_tk() 4
src/netdev.c:netdev_set_rekey_offload() 4
New Key for Group Key failed for ifindex: 4:Network is down
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/station.c:station_free()
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
src/netdev.c:netdev_disconnect_event()
src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is XX
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is DE
src/wiphy.c:wiphy_radio_work_done() Work item 14 done
src/station.c:station_connect_cb() 4, result: 4
Segmentation fault
If the iftype changes, kernel silently wipes out any frame registrations
we may have registered. Right now, frame registrations are only done when
the interface is created. This can result in frame watches not being
added if the interface type is changed between station mode to ap mode
and then back to station mode, e.g.:
device wlan0 set-property Mode ap
device wlan0 set-property Mode station
Make sure to re-add frame registrations according to the mode if the
interface type is changed.
Since netdev now keeps track of iftype changes, let it call
frame_watch_wdev_remove on netdevs that it manages to clear frame
registrations that should be cleared due to an iftype change.
Note that P2P_DEVICE wdevs are not managed by any netdev object, but
since their iftype cannot be changed, they should not be affected
by this change.
And set the interface type based on the event rather than the command
callback. This allows us to track interface type changes even if they
come from outside iwd (which shouldn't happen.)
The prepare_ft patch was an intermediate to a full patch
set and was not fully tested stand alone. Its placement
actually broke FT due to handshake->aa getting overwritten
prior to netdev->prev_bssid being copied out. This caused
FT to fail with "transport endpoint not connected (-107)"
Fix a regression where connection to an open network results in an
NotSupported error being returned.
Fixes: d79e883e93 ("netdev: Introduce connection types")
This makes conversions simpler. Also fixes a bug where P2P devices were
printed with an incorrect Mode value since dbus_iftype_to_string was
assuming that an iftype as defined in nl80211.h was being passed in,
while netdev was returning an enum value defined in netdev.h.
The 8021x offloading procedure still does EAP in userspace which
negotiates the PMK. The kernel then expects to obtain this PMK
from userspace by calling SET_PMK. This then allows the firmware
to begin the 4-way handshake.
Using __eapol_install_set_pmk_func to install netdev_set_pmk,
netdev now gets called into once EAP finishes and can begin
the final userspace actions prior to the firmware starting
the 4-way handshake:
- SET_PMK using PMK negotiated with EAP
- Emit SETTING_KEYS event
- netdev_connect_ok
One thing to note is that the kernel provides no way of knowing if
the 4-way handshake completed. Assuming SET_PMK/SET_STATION come
back with no errors, IWD assumes the PMK was valid. If not, or
due to some other issue in the 4-way, the kernel will send a
disconnect.
This adds a new type for 8021x offload as well as support in
building CMD_CONNECT.
As described in the comment, 8021x offloading is not particularly
similar to PSK as far as the code flow in IWD is concerned. There
still needs to be an eapol_sm due to EAP being done in userspace.
This throws somewhat of a wrench into our 'is_offload' cases. And
as such this connection type is handled specially.
The chances were extremely low, but using l_idle_oneshot
could end up causing a invalid memory access if the netdev
went down while waiting for the disconnect idle callback.
Instead netdev can keep track of the idle with l_idle_create
and remove it if the netdev goes down prior to the idle callback.
This crash was caused from the disconnect_cb being called
immediately in cases where send_disconnect was false. The
previous patch actually addressed this separately as this
flag was being set improperly which will, indirectly, fix
one of the two code paths that could cause this crash.
Still, there is a situation where send_disconnect could
be false and in this case IWD would still crash. If IWD
is waiting to queue the connect item and netdev_disconnect
is called it would result in the callback being called
immediately. Instead we can add an l_idle as to allow the
callback to happen out of scope, which is what station
expects.
Prior to this patch, the crashing behavior can be tested using
the following script (or some variant of it, your system timing
may not be the same as mine).
iwctl station wlan0 disconnect
iwctl station wlan0 connect <network1> &
sleep 0.02
iwctl station wlan0 connect <network2>
++++++++ backtrace ++++++++
0 0x7f4e1504e530 in /lib64/libc.so.6
1 0x432b54 in network_get_security() at src/network.c:253
2 0x416e92 in station_handshake_setup() at src/station.c:937
3 0x41a505 in __station_connect_network() at src/station.c:2551
4 0x41a683 in station_disconnect_onconnect_cb() at src/station.c:2581
5 0x40b4ae in netdev_disconnect() at src/netdev.c:3142
6 0x41a719 in station_disconnect_onconnect() at src/station.c:2603
7 0x41a89d in station_connect_network() at src/station.c:2652
8 0x433f1d in network_connect_psk() at src/network.c:886
9 0x43483a in network_connect() at src/network.c:1183
10 0x4add11 in _dbus_object_tree_dispatch() at ell/dbus-service.c:1802
11 0x49ff54 in message_read_handler() at ell/dbus.c:285
12 0x496d2f in io_callback() at ell/io.c:120
13 0x495894 in l_main_iterate() at ell/main.c:478
14 0x49599b in l_main_run() at ell/main.c:521
15 0x495cb3 in l_main_run_with_signal() at ell/main.c:647
16 0x404add in main() at src/main.c:490
17 0x7f4e15038b25 in /lib64/libc.so.6