Since netdev maintains the list of FT over DS info structs there is not
any need for station to get callbacks when the initial action frame
is received, or not. This removes the need for the callback handler,
user data, and response timeout.
Roam times can be slightly improved by sending out the FT-over-DS
action frames to any BSS in the mobility domain immediately after
connecting. This preauthenticates IWD to each AP which means
Reassociation can happen right away when a roam is needed.
When a roam is needed station_transition_start will first try
FT-over-DS (if supported) via netdev_fast_transtion_over_ds. The
return is checked and if netdev has no cached entries FT-over-Air
will be used instead.
The beauty of FT-over-DS is that a station can send and receive
action frames to many APs to prepare for a future roam. Each
AP authenticates the station and when a roam happens the station
can immediately move to reassociation.
To handle this a queue of netdev_ft_over_ds_info structs is used
instead of a single entry. Using the new ft.c parser APIs these
info structs can be looked up when responses come in. For now
the timeouts/callbacks are kept but these will be removed as it
really does not matter if the AP sends a response (keeps station
happy until the next patch).
This is to prepare for multiple concurrent FT-over-DS action frames.
A list will be kept in netdev and for lookup reasons it needs to
parse the start of the frame to grab the aa/spa addresses. In this
call the IEs are also returned and passed to the new
ft_over_ds_parse_action_response.
For now the address checks have been moved into netdev, but this will
eventually turn into a queue lookup.
This value sets the roaming threshold on 5GHz networks. The
threshold has been separated from 2.4GHz because in many cases
5GHz can perform much better at low RSSI than 2.4GHz.
In addition the BSS ranking logic was re-worked and now 5GHz is
much more preferred, even at low RSSI. This means we need a
lower floor for RSSI before roaming, otherwise IWD would end
up roaming immediately after connecting due to low RSSI CQM
events.
This is being added as a developer method and should not be used
in production. For testing purposes though, it is quite useful as
it forces IWD to roam to a provided BSS and bypasses IWD's roaming
and ranking logic for choosing a roam candidate.
To use this a BSSID is provided as the only parameter. If this
BSS is not in IWD's current scan results -EINVAL will be returned.
If IWD knows about the BSS it will attempt to roam to it whether
that is via FT, FT-over-DS, or Reassociation. These details are
still sorted out in IWDs station_transition_start() logic.
This will enable developer features to be used. Currently the
only user of this will be StationDiagnostics.Roam() method which
should only be exposed in this mode.
Expose the state directory/storage directory path on D-Bus because it
can't be known to clients until IWD runs, and client might need to
occasionally fiddle with the network config files. While there also
expose the IWD version string, similar to how some other D-Bus services
do.
Similar to 06aa84cca set the operstate when AdHoc is started and
stopped as it is no longer always set by netdev (only for station/p2p
interface types)
Previously resp was a simple array of bytes allocated on the stack.
This was changed to a dynamically allocated array, but the sizeof(resp)
argument to ap_build_beacon_pr_head() was never changed appropriately.
Fix this by introducing a new resp_len variable that holds the number of
bytes allocated for resp. Also, move the allocation after the basic
sanity checks have been performed to avoid allocating/freeing memory
unnecessarily.
Fixes: 18a63f91fd ("ap: Write extra frame IEs from the user")
Commit 1fe5070 added a workaround for drivers which may send the
connect event prior to the connect callback/ack. This caused IWD
to fail to start eapol if reassociation was used due to
netdev_reassociate never setting netdev->connected = false.
netdev_reassociate uses the same code path as normal connections,
but when the connect callback came in connected was already set
to true which then prevents eapol from being registered. Then,
once the connect event comes in, there is no frame watch for
eapol and IWD doesn't respond to any handshake frames.
Prior to this, an error sending the FT Reassociation was treated
as fatal, which is correct for FT-over-Air but not for FT-over-DS.
If the actual l_genl_family_send call fails for FT-over-DS the
existing connection can be maintained and there is no need to
call netdev_connect_failed.
Adding a return to the tx_associate function works for both FT
types. In the FT-over-Air case this return will ultimately get
sent back up to auth_proto_rx_authenticate in which case will
call netdev_connect_failed. For FT-over-DS tx_associate is
actually called from the 'start' operation which can fail and
still maintain the existing connection.
FT-over-DS was refactored to separate the FT action frame and
reassociation. From stations standpoint IWD needs to call
netdev_fast_transition_over_ds_action prior to actually roaming.
For now these two stages are being combined and the action
roam happens immediately after the action response callback.
FT-over-DS followed the same pattern as FT-over-Air which worked,
but really limited how the protocol could be used. FT-over-DS is
unique in that we can authenticate to many APs by sending out
FT action frames and parsing the results. Once parsed IWD can
immediately Reassociate, or do so at a later time.
To take advantage of this IWD need to separate FT-over-DS into
two stages: action frame and reassociation.
The initial action frame stage is started by netdev. The target
BSS is sent an FT action frame and a new cache entry is created
in ft.c. Once the response is received the entry is updated
with all the needed data to Reassociate. To limit the record
keeping on netdev each FT-over-DS entry holds a userdata pointer
so netdev doesn't need to maintain its own list of data for
callbacks.
Once the action response is parsed netdev will call back signalling
the action frame sequence was completed (either successfully or not).
At this point the 'normal' FT procedure can start using the
FT-over-DS auth-proto.
FT-over-DS is being separated into two independent stages. The
first of which is the processing of the action frame response.
This new class will hold all the parsed information from the action
frame and allowing it to be retrieved at a later time when IWD
needs to roam.
Initial info class should be created when the action frame is
being sent out. Once a response is received it can be parsed
with ft_over_ds_parse_action_response. This verifies the frame
and updates the ft_ds_info class with the parsed data.
ft_over_ds_prepare_handshake is the final step prior to
Reassociation. This sets all the stored IEs, anonce, and KH IDs
into the handshake and derives the new PTK.
This adds the RSNE verification to ft_parse_ies which will
be common between over-Air and over-DS. The MDE check was
also factored out into its own minimal function as to
retain the spec comment but allow reuse elsewhere.
Prior to this the diagnostic interface was taken down when station
transitioned to DISCONNECTED. This worked but once station is in
a DISCONNECTING state it then calls netdev_disconnect(). Trying to
get any diagnostic data during this time may not work as its
unknown what state exactly the kernel is in. To be safe take the
interface down when station is DISCONNECTING.
The building of the FT IEs for Action/Authenticate
frames will need to be shared between ft and netdev
once FT-over-DS is refactored.
The building was refactored to work off the callers
buffer rather than internal stack buffers. An argument
'new_snonce' was included as FT-over-DS will generate
a new snonce for the initial action frame, hence the
handshakes snonce cannot be used.
Break up the rather large code block which parses out IEs,
verifies, and sets into the handshake. FT-over-DS needs these
steps broken up in order to parse the action frame response
without modifying the handshake.
Under very rare circumstances the roaming scan triggered might not be
canceled properly. This is because we issue the roam scan recursively
from within a scan callback and re-use the id of the scan for the
subsequent request. The destroy callback is invoked right after the
callback and resets the id. This leads to the scan not being canceled
properly in roam_state_clear().
src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)
src/station.c:station_roam_trigger_cb() 37
src/station.c:station_roam_scan() ifindex: 37
src/station.c:station_roam_trigger_cb() Using cached neighbor report for roam
...
src/scan.c:get_scan_done() get_scan_done
src/station.c:station_roam_failed() 37
src/station.c:station_roam_scan() ifindex: 37
src/scan.c:scan_request_triggered() Active scan triggered for wdev 22
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan0[37]
src/device.c:device_free()
src/station.c:station_free()
...
Removing scan context for wdev 22
src/scan.c:scan_context_free() sc: 0x4a362a0
src/wiphy.c:wiphy_radio_work_done() Work item 14 done
==19542== Invalid write of size 4
==19542== at 0x411500: station_roam_scan_destroy (station.c:2010)
==19542== by 0x420B5B: scan_request_free (scan.c:156)
==19542== by 0x410BAC: destroy_work (wiphy.c:294)
==19542== by 0x410BAC: wiphy_radio_work_done (wiphy.c:1613)
==19542== by 0x46C66E: l_queue_clear (queue.c:107)
==19542== by 0x46C6B8: l_queue_destroy (queue.c:82)
==19542== by 0x420BAE: scan_context_free (scan.c:205)
==19542== by 0x424135: scan_wdev_remove (scan.c:2272)
==19542== by 0x408754: netdev_free (netdev.c:847)
==19542== by 0x40E18C: netdev_shutdown (netdev.c:5773)
==19542== by 0x404756: iwd_shutdown (main.c:78)
==19542== by 0x404756: iwd_shutdown (main.c:65)
==19542== by 0x470E21: handle_callback (signal.c:78)
==19542== by 0x470E21: signalfd_read_cb (signal.c:104)
==19542== by 0x47166B: io_callback (io.c:120)
==19542== Address 0x4d81f98 is 200 bytes inside a block of size 288 free'd
==19542== at 0x48399CB: free (vg_replace_malloc.c:538)
==19542== by 0x47F3E5: interface_instance_free (dbus-service.c:510)
==19542== by 0x481DEA: _dbus_object_tree_remove_interface (dbus-service.c:1694)
==19542== by 0x481F1C: _dbus_object_tree_object_destroy (dbus-service.c:795)
==19542== by 0x40894F: netdev_free (netdev.c:844)
==19542== by 0x40E18C: netdev_shutdown (netdev.c:5773)
==19542== by 0x404756: iwd_shutdown (main.c:78)
==19542== by 0x404756: iwd_shutdown (main.c:65)
==19542== by 0x470E21: handle_callback (signal.c:78)
==19542== by 0x470E21: signalfd_read_cb (signal.c:104)
==19542== by 0x47166B: io_callback (io.c:120)
==19542== by 0x47088C: l_main_iterate (main.c:478)
==19542== by 0x47095B: l_main_run (main.c:525)
==19542== by 0x47095B: l_main_run (main.c:507)
==19542== by 0x470B6B: l_main_run_with_signal (main.c:647)
==19542== Block was alloc'd at
==19542== at 0x483879F: malloc (vg_replace_malloc.c:307)
==19542== by 0x46AB2D: l_malloc (util.c:62)
==19542== by 0x416599: station_create (station.c:3448)
==19542== by 0x406D55: netdev_newlink_notify (netdev.c:5324)
==19542== by 0x46D4BC: l_hashmap_foreach (hashmap.c:612)
==19542== by 0x472F46: process_broadcast (netlink.c:158)
==19542== by 0x472F46: can_read_data (netlink.c:279)
==19542== by 0x47166B: io_callback (io.c:120)
==19542== by 0x47088C: l_main_iterate (main.c:478)
==19542== by 0x47095B: l_main_run (main.c:525)
==19542== by 0x47095B: l_main_run (main.c:507)
==19542== by 0x470B6B: l_main_run_with_signal (main.c:647)
==19542== by 0x403EDB: main (main.c:490)
==19542==
Prior to this netdev_connect_ok set setting this which really
only applies to station mode. In addition this happens for each
new station that connects to the AP. Instead set the operstate /
link mode when AP starts and stops.
Change ap_start to load all of the AP configuration from a struct
l_settings, moving the 6 or so parameters from struct ap_config members
to the l_settings groups and keys. This extends the ap profile concept
used for the DHCP settings. ap_start callers create the l_settings
object and fill the values in it or read the settings in from a file.
Since ap_setup_dhcp and ap_load_profile_and_dhcp no longer do the
settings file loading, they needed to be refactored and some issues were
fixed in their logic, e.g. l_dhcp_server_set_ip_address() was never
called when the "IP pool" was used. Also the IP pool was previously only
used if the ap->config->profile was NULL and this didn't match what the
docs said:
"If [IPv4].Address is not provided and no IP address is set on the
interface prior to calling StartProfile the IP pool will be used."
The info struct is on the stack which leads to the potential
for uninitialized data access. Zero out the info struct prior
to calling the get station callback:
==141137== Conditional jump or move depends on uninitialised value(s)
==141137== at 0x458A6F: diagnostic_info_to_dict (diagnostic.c:109)
==141137== by 0x41200B: station_get_diagnostic_cb (station.c:3620)
==141137== by 0x405BE1: netdev_get_station_cb (netdev.c:4783)
==141137== by 0x4722F9: process_unicast (genl.c:994)
==141137== by 0x4722F9: received_data (genl.c:1102)
==141137== by 0x46F28B: io_callback (io.c:120)
==141137== by 0x46E5AC: l_main_iterate (main.c:478)
==141137== by 0x46E65B: l_main_run (main.c:525)
==141137== by 0x46E65B: l_main_run (main.c:507)
==141137== by 0x46E86B: l_main_run_with_signal (main.c:647)
==141137== by 0x403EA8: main (main.c:490)