It was observed that IWD's ranking for BSS's did not always
end up with the fastest being chosen. This was due to IWD's
heavy weight on signal strength. This is a decent way of ranking
but even better is calculating a theoretical data rate which
was also done and factored in. The problem is the data rate
factor was always outdone by the signal strength.
Intead remove signal strength entirely as this is already taken
into account with the data rate calculation. This also removes
the check for rate IEs. If no IEs are found the parser will
base the data rate soley on RSSI.
There were a few other factors removed which will be added back
when ranking *networks* rather than BSS's. WPA version (or open)
was removed as well as the privacy capability. These values really
should not differ between BSS's in the same SSID and as such
should be used for network ranking instead.
Both ext/supported rates IEs are obtained from scan results. These
IEs are passed to ie_tlv_init/ie_tlv_next, as well as direct length
checks (for supported rates at least, extended supported rates can
be as long as a single byte integer can hold, 1 - 255) which verifies
that the length in the IE matches the overall IE length that is
stored in scan_bss. Because of this, ie_parse_supported_rates_from_data
was doing double duty re-initializing a TLV iterator.
Intead, since we know the IE length is within bounds, the length/data
can simply be directly accessed out of the buffer. This avoids the need
for a wrapper function entirely.
The length parameters were also removed, since this is now obtained
directly from the IE.
Since netdev maintains the list of FT over DS info structs there is not
any need for station to get callbacks when the initial action frame
is received, or not. This removes the need for the callback handler,
user data, and response timeout.
Roam times can be slightly improved by sending out the FT-over-DS
action frames to any BSS in the mobility domain immediately after
connecting. This preauthenticates IWD to each AP which means
Reassociation can happen right away when a roam is needed.
When a roam is needed station_transition_start will first try
FT-over-DS (if supported) via netdev_fast_transtion_over_ds. The
return is checked and if netdev has no cached entries FT-over-Air
will be used instead.
The beauty of FT-over-DS is that a station can send and receive
action frames to many APs to prepare for a future roam. Each
AP authenticates the station and when a roam happens the station
can immediately move to reassociation.
To handle this a queue of netdev_ft_over_ds_info structs is used
instead of a single entry. Using the new ft.c parser APIs these
info structs can be looked up when responses come in. For now
the timeouts/callbacks are kept but these will be removed as it
really does not matter if the AP sends a response (keeps station
happy until the next patch).
This is to prepare for multiple concurrent FT-over-DS action frames.
A list will be kept in netdev and for lookup reasons it needs to
parse the start of the frame to grab the aa/spa addresses. In this
call the IEs are also returned and passed to the new
ft_over_ds_parse_action_response.
For now the address checks have been moved into netdev, but this will
eventually turn into a queue lookup.
This value sets the roaming threshold on 5GHz networks. The
threshold has been separated from 2.4GHz because in many cases
5GHz can perform much better at low RSSI than 2.4GHz.
In addition the BSS ranking logic was re-worked and now 5GHz is
much more preferred, even at low RSSI. This means we need a
lower floor for RSSI before roaming, otherwise IWD would end
up roaming immediately after connecting due to low RSSI CQM
events.
This is being added as a developer method and should not be used
in production. For testing purposes though, it is quite useful as
it forces IWD to roam to a provided BSS and bypasses IWD's roaming
and ranking logic for choosing a roam candidate.
To use this a BSSID is provided as the only parameter. If this
BSS is not in IWD's current scan results -EINVAL will be returned.
If IWD knows about the BSS it will attempt to roam to it whether
that is via FT, FT-over-DS, or Reassociation. These details are
still sorted out in IWDs station_transition_start() logic.
This will enable developer features to be used. Currently the
only user of this will be StationDiagnostics.Roam() method which
should only be exposed in this mode.
Expose the state directory/storage directory path on D-Bus because it
can't be known to clients until IWD runs, and client might need to
occasionally fiddle with the network config files. While there also
expose the IWD version string, similar to how some other D-Bus services
do.
Similar to 06aa84cca set the operstate when AdHoc is started and
stopped as it is no longer always set by netdev (only for station/p2p
interface types)
Previously resp was a simple array of bytes allocated on the stack.
This was changed to a dynamically allocated array, but the sizeof(resp)
argument to ap_build_beacon_pr_head() was never changed appropriately.
Fix this by introducing a new resp_len variable that holds the number of
bytes allocated for resp. Also, move the allocation after the basic
sanity checks have been performed to avoid allocating/freeing memory
unnecessarily.
Fixes: 18a63f91fd ("ap: Write extra frame IEs from the user")
Commit 1fe5070 added a workaround for drivers which may send the
connect event prior to the connect callback/ack. This caused IWD
to fail to start eapol if reassociation was used due to
netdev_reassociate never setting netdev->connected = false.
netdev_reassociate uses the same code path as normal connections,
but when the connect callback came in connected was already set
to true which then prevents eapol from being registered. Then,
once the connect event comes in, there is no frame watch for
eapol and IWD doesn't respond to any handshake frames.
Prior to this, an error sending the FT Reassociation was treated
as fatal, which is correct for FT-over-Air but not for FT-over-DS.
If the actual l_genl_family_send call fails for FT-over-DS the
existing connection can be maintained and there is no need to
call netdev_connect_failed.
Adding a return to the tx_associate function works for both FT
types. In the FT-over-Air case this return will ultimately get
sent back up to auth_proto_rx_authenticate in which case will
call netdev_connect_failed. For FT-over-DS tx_associate is
actually called from the 'start' operation which can fail and
still maintain the existing connection.
FT-over-DS was refactored to separate the FT action frame and
reassociation. From stations standpoint IWD needs to call
netdev_fast_transition_over_ds_action prior to actually roaming.
For now these two stages are being combined and the action
roam happens immediately after the action response callback.
FT-over-DS followed the same pattern as FT-over-Air which worked,
but really limited how the protocol could be used. FT-over-DS is
unique in that we can authenticate to many APs by sending out
FT action frames and parsing the results. Once parsed IWD can
immediately Reassociate, or do so at a later time.
To take advantage of this IWD need to separate FT-over-DS into
two stages: action frame and reassociation.
The initial action frame stage is started by netdev. The target
BSS is sent an FT action frame and a new cache entry is created
in ft.c. Once the response is received the entry is updated
with all the needed data to Reassociate. To limit the record
keeping on netdev each FT-over-DS entry holds a userdata pointer
so netdev doesn't need to maintain its own list of data for
callbacks.
Once the action response is parsed netdev will call back signalling
the action frame sequence was completed (either successfully or not).
At this point the 'normal' FT procedure can start using the
FT-over-DS auth-proto.
FT-over-DS is being separated into two independent stages. The
first of which is the processing of the action frame response.
This new class will hold all the parsed information from the action
frame and allowing it to be retrieved at a later time when IWD
needs to roam.
Initial info class should be created when the action frame is
being sent out. Once a response is received it can be parsed
with ft_over_ds_parse_action_response. This verifies the frame
and updates the ft_ds_info class with the parsed data.
ft_over_ds_prepare_handshake is the final step prior to
Reassociation. This sets all the stored IEs, anonce, and KH IDs
into the handshake and derives the new PTK.
This adds the RSNE verification to ft_parse_ies which will
be common between over-Air and over-DS. The MDE check was
also factored out into its own minimal function as to
retain the spec comment but allow reuse elsewhere.
Prior to this the diagnostic interface was taken down when station
transitioned to DISCONNECTED. This worked but once station is in
a DISCONNECTING state it then calls netdev_disconnect(). Trying to
get any diagnostic data during this time may not work as its
unknown what state exactly the kernel is in. To be safe take the
interface down when station is DISCONNECTING.
The building of the FT IEs for Action/Authenticate
frames will need to be shared between ft and netdev
once FT-over-DS is refactored.
The building was refactored to work off the callers
buffer rather than internal stack buffers. An argument
'new_snonce' was included as FT-over-DS will generate
a new snonce for the initial action frame, hence the
handshakes snonce cannot be used.
Break up the rather large code block which parses out IEs,
verifies, and sets into the handshake. FT-over-DS needs these
steps broken up in order to parse the action frame response
without modifying the handshake.
Under very rare circumstances the roaming scan triggered might not be
canceled properly. This is because we issue the roam scan recursively
from within a scan callback and re-use the id of the scan for the
subsequent request. The destroy callback is invoked right after the
callback and resets the id. This leads to the scan not being canceled
properly in roam_state_clear().
src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)
src/station.c:station_roam_trigger_cb() 37
src/station.c:station_roam_scan() ifindex: 37
src/station.c:station_roam_trigger_cb() Using cached neighbor report for roam
...
src/scan.c:get_scan_done() get_scan_done
src/station.c:station_roam_failed() 37
src/station.c:station_roam_scan() ifindex: 37
src/scan.c:scan_request_triggered() Active scan triggered for wdev 22
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan0[37]
src/device.c:device_free()
src/station.c:station_free()
...
Removing scan context for wdev 22
src/scan.c:scan_context_free() sc: 0x4a362a0
src/wiphy.c:wiphy_radio_work_done() Work item 14 done
==19542== Invalid write of size 4
==19542== at 0x411500: station_roam_scan_destroy (station.c:2010)
==19542== by 0x420B5B: scan_request_free (scan.c:156)
==19542== by 0x410BAC: destroy_work (wiphy.c:294)
==19542== by 0x410BAC: wiphy_radio_work_done (wiphy.c:1613)
==19542== by 0x46C66E: l_queue_clear (queue.c:107)
==19542== by 0x46C6B8: l_queue_destroy (queue.c:82)
==19542== by 0x420BAE: scan_context_free (scan.c:205)
==19542== by 0x424135: scan_wdev_remove (scan.c:2272)
==19542== by 0x408754: netdev_free (netdev.c:847)
==19542== by 0x40E18C: netdev_shutdown (netdev.c:5773)
==19542== by 0x404756: iwd_shutdown (main.c:78)
==19542== by 0x404756: iwd_shutdown (main.c:65)
==19542== by 0x470E21: handle_callback (signal.c:78)
==19542== by 0x470E21: signalfd_read_cb (signal.c:104)
==19542== by 0x47166B: io_callback (io.c:120)
==19542== Address 0x4d81f98 is 200 bytes inside a block of size 288 free'd
==19542== at 0x48399CB: free (vg_replace_malloc.c:538)
==19542== by 0x47F3E5: interface_instance_free (dbus-service.c:510)
==19542== by 0x481DEA: _dbus_object_tree_remove_interface (dbus-service.c:1694)
==19542== by 0x481F1C: _dbus_object_tree_object_destroy (dbus-service.c:795)
==19542== by 0x40894F: netdev_free (netdev.c:844)
==19542== by 0x40E18C: netdev_shutdown (netdev.c:5773)
==19542== by 0x404756: iwd_shutdown (main.c:78)
==19542== by 0x404756: iwd_shutdown (main.c:65)
==19542== by 0x470E21: handle_callback (signal.c:78)
==19542== by 0x470E21: signalfd_read_cb (signal.c:104)
==19542== by 0x47166B: io_callback (io.c:120)
==19542== by 0x47088C: l_main_iterate (main.c:478)
==19542== by 0x47095B: l_main_run (main.c:525)
==19542== by 0x47095B: l_main_run (main.c:507)
==19542== by 0x470B6B: l_main_run_with_signal (main.c:647)
==19542== Block was alloc'd at
==19542== at 0x483879F: malloc (vg_replace_malloc.c:307)
==19542== by 0x46AB2D: l_malloc (util.c:62)
==19542== by 0x416599: station_create (station.c:3448)
==19542== by 0x406D55: netdev_newlink_notify (netdev.c:5324)
==19542== by 0x46D4BC: l_hashmap_foreach (hashmap.c:612)
==19542== by 0x472F46: process_broadcast (netlink.c:158)
==19542== by 0x472F46: can_read_data (netlink.c:279)
==19542== by 0x47166B: io_callback (io.c:120)
==19542== by 0x47088C: l_main_iterate (main.c:478)
==19542== by 0x47095B: l_main_run (main.c:525)
==19542== by 0x47095B: l_main_run (main.c:507)
==19542== by 0x470B6B: l_main_run_with_signal (main.c:647)
==19542== by 0x403EDB: main (main.c:490)
==19542==
Prior to this netdev_connect_ok set setting this which really
only applies to station mode. In addition this happens for each
new station that connects to the AP. Instead set the operstate /
link mode when AP starts and stops.
Change ap_start to load all of the AP configuration from a struct
l_settings, moving the 6 or so parameters from struct ap_config members
to the l_settings groups and keys. This extends the ap profile concept
used for the DHCP settings. ap_start callers create the l_settings
object and fill the values in it or read the settings in from a file.
Since ap_setup_dhcp and ap_load_profile_and_dhcp no longer do the
settings file loading, they needed to be refactored and some issues were
fixed in their logic, e.g. l_dhcp_server_set_ip_address() was never
called when the "IP pool" was used. Also the IP pool was previously only
used if the ap->config->profile was NULL and this didn't match what the
docs said:
"If [IPv4].Address is not provided and no IP address is set on the
interface prior to calling StartProfile the IP pool will be used."
The info struct is on the stack which leads to the potential
for uninitialized data access. Zero out the info struct prior
to calling the get station callback:
==141137== Conditional jump or move depends on uninitialised value(s)
==141137== at 0x458A6F: diagnostic_info_to_dict (diagnostic.c:109)
==141137== by 0x41200B: station_get_diagnostic_cb (station.c:3620)
==141137== by 0x405BE1: netdev_get_station_cb (netdev.c:4783)
==141137== by 0x4722F9: process_unicast (genl.c:994)
==141137== by 0x4722F9: received_data (genl.c:1102)
==141137== by 0x46F28B: io_callback (io.c:120)
==141137== by 0x46E5AC: l_main_iterate (main.c:478)
==141137== by 0x46E65B: l_main_run (main.c:525)
==141137== by 0x46E65B: l_main_run (main.c:507)
==141137== by 0x46E86B: l_main_run_with_signal (main.c:647)
==141137== by 0x403EA8: main (main.c:490)
It isn't safe to return a NULL from diagnostic_akm_suite_to_security()
since the value is used directly. Also, if the AKM suite is 0, this
implies that the network is an Open network and not some unknown AKM.
==17982== Invalid read of size 1
==17982== at 0x483BC92: strlen (vg_replace_strmem.c:459)
==17982== by 0x47DE60: _dbus1_builder_append_basic (dbus-util.c:981)
==17982== by 0x41ACB2: dbus_append_dict_basic (dbus.c:197)
==17982== by 0x412050: station_get_diagnostic_cb (station.c:3614)
==17982== by 0x405B19: netdev_get_station_cb (netdev.c:4801)
==17982== by 0x47436E: process_unicast (genl.c:994)
==17982== by 0x47436E: received_data (genl.c:1102)
==17982== by 0x470FBB: io_callback (io.c:120)
==17982== by 0x4701DC: l_main_iterate (main.c:478)
==17982== by 0x4702AB: l_main_run (main.c:525)
==17982== by 0x4702AB: l_main_run (main.c:507)
==17982== by 0x4704BB: l_main_run_with_signal (main.c:647)
==17982== by 0x403EDB: main (main.c:490)
==17982== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==17982==
Aborting (signal 11) [/home/denkenz/iwd/src/iwd]
++++++++ backtrace ++++++++
0 0x488a550 in /lib64/libc.so.6
1 0x483bc92 in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so
2 0x47de61 in _dbus1_builder_append_basic() at ell/dbus-util.c:983
3 0x41acb3 in dbus_append_dict_basic() at src/dbus.c:197
4 0x412051 in station_get_diagnostic_cb() at src/station.c:3618
5 0x405b1a in netdev_get_station_cb() at src/netdev.c:4801
It is possible for the RTNL command callback to come after
netconfig_reset or netconfig_destroy has been called. Make sure that
any outstanding commands that might access the netconfig object are
canceled.
src/netconfig.c:netconfig_ipv4_dhcp_event_handler() DHCPv4 event 0
src/netconfig.c:netconfig_ifaddr_added() wlan0: ifaddr 192.168.1.55/24 broadcast 192.168.1.255
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan0[15]
src/device.c:device_free()
src/station.c:station_free()
src/netconfig.c:netconfig_destroy()
src/netconfig.c:netconfig_reset()
src/netconfig.c:netconfig_reset_v4() 16
src/netconfig.c:netconfig_reset_v4() Stopping client
Removing scan context for wdev c
src/scan.c:scan_context_free() sc: 0x4a3cc10
==12792== Invalid read of size 8
==12792== at 0x43BF5A: netconfig_route_add_cmd_cb (netconfig.c:600)
==12792== by 0x4727FA: process_message (netlink.c:181)
==12792== by 0x4727FA: can_read_data (netlink.c:289)
==12792== by 0x470F4B: io_callback (io.c:120)
==12792== by 0x47016C: l_main_iterate (main.c:478)
==12792== by 0x47023B: l_main_run (main.c:525)
==12792== by 0x47023B: l_main_run (main.c:507)
==12792== by 0x47044B: l_main_run_with_signal (main.c:647)
==12792== by 0x403EDB: main (main.c:490)
In case the netdev is brought down while we're trying to connect, try to
detect this and fail early instead of trying to send additional
commands.
src/station.c:station_enter_state() Old State: disconnected, new state: connecting
src/station.c:station_netdev_event() Associating
src/netdev.c:netdev_mlme_notify() MLME notification Connect(46)
src/netdev.c:netdev_connect_event()
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/eapol.c:eapol_handle_ptk_1_of_4() ifindex=4
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/eapol.c:eapol_handle_ptk_3_of_4() ifindex=4
src/netdev.c:netdev_set_gtk() 4
src/station.c:station_handshake_event() Setting keys
src/netdev.c:netdev_set_tk() 4
src/netdev.c:netdev_set_rekey_offload() 4
New Key for Group Key failed for ifindex: 4:Network is down
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/station.c:station_free()
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
src/netdev.c:netdev_disconnect_event()
src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is XX
src/netdev.c:netdev_link_notify() event 16 on ifindex 4
src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is DE
src/wiphy.c:wiphy_radio_work_done() Work item 14 done
src/station.c:station_connect_cb() 4, result: 4
Segmentation fault
A prior commit refactored the AKM selection in wiphy.c. This
ended up breaking FILS tests due to the hard coding of a
false fils_hint in wiphy_select_akm. Since our FILS tests
only advertise FILS AKMs wiphy_can_connect would return false
for these networks.
Similar to wiphy_select_akm, add a fils hint parameter to
wiphy_can_connect and pass that down directly to wiphy_select_akm.
If PreSharedKey is set, the current logic does not validate the
Passphrase beyond its existence. This can lead to strange situations
where an invalid WPA3-PSK passphrase might get used. This can of course
only happen if the user (as root) or NetworkManager-iwd-backend writes
such a file incorrectly.
Move the WSC Primary Device Type parsing from p2p.c and eap-wsc.c to a
common function in wscutil.c supporting both formats so that it can be
used in ap.c too.
Logically this frame watch belongs in station. It was kept in device.c
for the purported reason that the station object was removed with
ifdown/ifup changes and hence the frame watch might need to be removed
and re-added unnecessarily. Since the kernel does not actually allow to
unregister a frame watch (only when the netdev is removed or its iftype
changes), re-adding a frame watch might trigger a -EALREADY or similar
error.
Avoid this by registering the frame watch when a new netdev is detected
in STATION mode, or when the interface type changes to STATION.
If a netdev iftype is changed, all frame registrations are removed.
Make sure to re-register for the appropriate frame notifications in case
our iftype is switched back to 'station'. In any other iftype, no frame
watches are registered and rrm_state object is effectively dormant.
Right now, RRM is created when a new netdev is detected and its iftype
is of type station. That means that any devices that start their life
as any other iftype cannot be changed to a station and have RRM function
properly. Fix that by always creating the RRM state regardless of the
initial iftype.
In the case that a netdev is powered down, or an interface type change
occurs, the station object will be removed and any watches will be
freed.
Since rrm is created when the netdev is created and persists across
iftype and power up/down changes, it should provide a destroy callback
to station_add_state_watch so that it can be notified when the watch is
removed.
If the iftype changes, kernel silently wipes out any frame registrations
we may have registered. Right now, frame registrations are only done when
the interface is created. This can result in frame watches not being
added if the interface type is changed between station mode to ap mode
and then back to station mode, e.g.:
device wlan0 set-property Mode ap
device wlan0 set-property Mode station
Make sure to re-add frame registrations according to the mode if the
interface type is changed.
Since netdev now keeps track of iftype changes, let it call
frame_watch_wdev_remove on netdevs that it manages to clear frame
registrations that should be cleared due to an iftype change.
Note that P2P_DEVICE wdevs are not managed by any netdev object, but
since their iftype cannot be changed, they should not be affected
by this change.
And set the interface type based on the event rather than the command
callback. This allows us to track interface type changes even if they
come from outside iwd (which shouldn't happen.)
The prepare_ft patch was an intermediate to a full patch
set and was not fully tested stand alone. Its placement
actually broke FT due to handshake->aa getting overwritten
prior to netdev->prev_bssid being copied out. This caused
FT to fail with "transport endpoint not connected (-107)"
- Make sure to print the cookie information
- Don't print messages for frames we're not interested in. This is
particularly helpful when running auto-tests since frame acks from
hostapd pollute the iwd log.
Fix a regression where connection to an open network results in an
NotSupported error being returned.
Fixes: d79e883e93 ("netdev: Introduce connection types")
This makes conversions simpler. Also fixes a bug where P2P devices were
printed with an incorrect Mode value since dbus_iftype_to_string was
assuming that an iftype as defined in nl80211.h was being passed in,
while netdev was returning an enum value defined in netdev.h.
It was seen that some full mac cards/drivers do not include any
rate information with the NEW_STATION event. This was causing
the NEW_STATION event to be ignored, preventing AP mode from
working on these cards.
Since the full mac path does not even require sta->rates the
parsing can be removed completely.
It was found that if the user cancels/disconnects the agent prior to
entering credentials, IWD would get stuck and could no longer accept
any connect calls with the error "Operation already in progress".
For example exiting iwctl in the Password prompt would cause this:
iwctl
$ station wlan0 connect myssid
$ Password: <Ctrl-C>
This was due to the agent never calling the network callback in the
case of an agent disconnect. Network would wait indefinitely for the
credentials, and disallow any future connect attempts.
To fix this agent_finalize_pending can be called in agent_disconnect
with a NULL reply which behaves the same as if there was an
internal timeout and ultimately allows network to fail the connection
The 8021x offloading procedure still does EAP in userspace which
negotiates the PMK. The kernel then expects to obtain this PMK
from userspace by calling SET_PMK. This then allows the firmware
to begin the 4-way handshake.
Using __eapol_install_set_pmk_func to install netdev_set_pmk,
netdev now gets called into once EAP finishes and can begin
the final userspace actions prior to the firmware starting
the 4-way handshake:
- SET_PMK using PMK negotiated with EAP
- Emit SETTING_KEYS event
- netdev_connect_ok
One thing to note is that the kernel provides no way of knowing if
the 4-way handshake completed. Assuming SET_PMK/SET_STATION come
back with no errors, IWD assumes the PMK was valid. If not, or
due to some other issue in the 4-way, the kernel will send a
disconnect.
This adds a new type for 8021x offload as well as support in
building CMD_CONNECT.
As described in the comment, 8021x offloading is not particularly
similar to PSK as far as the code flow in IWD is concerned. There
still needs to be an eapol_sm due to EAP being done in userspace.
This throws somewhat of a wrench into our 'is_offload' cases. And
as such this connection type is handled specially.
802.1x offloading needs a way to call SET_PMK after EAP finishes.
In the same manner as set_tk/gtk/igtk a new 'install_pmk' function
was added which eapol can call into after EAP completes.
The chances were extremely low, but using l_idle_oneshot
could end up causing a invalid memory access if the netdev
went down while waiting for the disconnect idle callback.
Instead netdev can keep track of the idle with l_idle_create
and remove it if the netdev goes down prior to the idle callback.
This fixes an infinite loop issue when authenticate frames time
out. If the AP is not responding IWD ends up retrying indefinitely
due to how SAE was handling this timeout. Inside sae_auth_timeout
it was actually sending another authenticate frame to reject
the SAE handshake. This, again, resulted in a timeout which called
the SAE timeout handler and repeated indefinitely.
The kernel resend behavior was not taken into account when writing
the SAE timeout behavior and in practice there is actually no need
for SAE to do much of anything in response to a timeout. The
kernel automatically resends Authenticate frames 3 times which mirrors
IWDs SAE behavior anyways. Because of this the authenticate timeout
handler can be completely removed, which will cause the connection
to fail in the case of an autentication timeout.
This crash was caused from the disconnect_cb being called
immediately in cases where send_disconnect was false. The
previous patch actually addressed this separately as this
flag was being set improperly which will, indirectly, fix
one of the two code paths that could cause this crash.
Still, there is a situation where send_disconnect could
be false and in this case IWD would still crash. If IWD
is waiting to queue the connect item and netdev_disconnect
is called it would result in the callback being called
immediately. Instead we can add an l_idle as to allow the
callback to happen out of scope, which is what station
expects.
Prior to this patch, the crashing behavior can be tested using
the following script (or some variant of it, your system timing
may not be the same as mine).
iwctl station wlan0 disconnect
iwctl station wlan0 connect <network1> &
sleep 0.02
iwctl station wlan0 connect <network2>
++++++++ backtrace ++++++++
0 0x7f4e1504e530 in /lib64/libc.so.6
1 0x432b54 in network_get_security() at src/network.c:253
2 0x416e92 in station_handshake_setup() at src/station.c:937
3 0x41a505 in __station_connect_network() at src/station.c:2551
4 0x41a683 in station_disconnect_onconnect_cb() at src/station.c:2581
5 0x40b4ae in netdev_disconnect() at src/netdev.c:3142
6 0x41a719 in station_disconnect_onconnect() at src/station.c:2603
7 0x41a89d in station_connect_network() at src/station.c:2652
8 0x433f1d in network_connect_psk() at src/network.c:886
9 0x43483a in network_connect() at src/network.c:1183
10 0x4add11 in _dbus_object_tree_dispatch() at ell/dbus-service.c:1802
11 0x49ff54 in message_read_handler() at ell/dbus.c:285
12 0x496d2f in io_callback() at ell/io.c:120
13 0x495894 in l_main_iterate() at ell/main.c:478
14 0x49599b in l_main_run() at ell/main.c:521
15 0x495cb3 in l_main_run_with_signal() at ell/main.c:647
16 0x404add in main() at src/main.c:490
17 0x7f4e15038b25 in /lib64/libc.so.6
The send_disconnect flag was being improperly set based only
on connect_cmd_id being zero. This does not take into account
the case of CMD_CONNECT having finished but not EAPoL. In this
case we do need to send a disconnect.
This adds a new connection type, TYPE_PSK_OFFLOAD, which
allows the 4-way handshake to be offloaded by the firmware.
Offloading will be used if the driver advertises support.
The CMD_ROAM event path was also modified to take into account
handshake offloading. If the handshake is offloaded we still
must issue GET_SCAN, but not start eapol since the firmware
takes care of this.
Until now FT was only supported via Auth/Assoc commands which barred
any fullmac cards from using FT AKMs. With PSK offload support these
cards can do FT but only when offloading is used.
In the FW scan callback eapol was being stared unconditionally which
isn't correct as roaming on open networks is possible. Instead check
that a SM exists just like is done in netdev_connect_event.
This should have been updated along with the connect and roam
event separation. Since netdev_connect_event is not being
re-used for CMD_ROAM the comment did not make sense anymore.
Still, there needs to be a check to ensure we were not disconnected
while waiting for GET_SCAN to come back.