AP roaming was structured such that any AP roam request would
force IWD to roam (assuming BSS's were found in scan results).
This isn't always the best behavior since IWD may be connected
to the best BSS in range.
Only force a roam if the AP includes one of the 3 disassociation/
termination bits. Otherwise attempt to roam but don't set the
ap_directed_roaming flag which will allows IWD to stay with the
current BSS if no better candidates are found.
There are a few checks that can be done prior to parsing the
request, in addition the explicit check for preparing_roam was
removed since this is taken care of by station_cannot_roam().
This both adds proper handling to the new roaming logic and fixes
a potential bug with firmware roams.
The new way roaming works doesn't use a connect callback. This
means that any disconnect event or call to netdev_connect_failed
will result in the event handler being called, where before the
connect callback would. This means we need to handle the ROAMING
state in the station disconnect event so IWD properly disassociates
and station goes out of ROAMING.
With firmware roams netdev gets an event which transitions station
into ROAMING. Then netdev issues GET_SCAN. During this time a
disconnect event could come in which would end up in
station_disconnect_event since there is no connect callback. This
needs to be handled the same and let IWD transition out of the
ROAMING state.
This converts station to using ft_action/ft_authenticate and
ft_associate and dropping the use of the netdev-only/auth-proto
logic.
Doing this allows for more flexibility if FT fails by letting
IWD try another roam candidate instead of disconnecting.
The current behavior is to only find the best roam candidate, which
generally is fine. But if for whatever reason IWD fails to roam it
would be nice having a few backup BSS's rather than having to
re-scan, or worse disassociate and reconnect entirely.
This patch doesn't change the roam behavior, just prepares for
using a roam candidate list. One difference though is any roam
candidates are added to station->bss_list, rather than just the
best BSS. This shouldn't effect any external behavior.
The candidate list is built based on scan_bss rank. First we establish
a base rank, the rank of the current BSS (or zero if AP roaming). Any
BSS in the results with a higher rank, excluding the current BSS, will
be added to the sorted station->roam_bss_list (as a new 'roam_bss'
entry) as well as stations overall BSS list. If the resulting list is
empty there were no better BSS's, otherwise station can now try to roam
starting with the best candidate (head of the roam list).
There may be situations (due to Multi-BSS operation) where an AP might
be advertising multiple SSIDs on the same BSSID. It is thus more
correct to lookup the preauthentication target on the network object
instead of the station bss_list. It used to be that the network list of
bsses was not updated when roam scan was performed. Hence the lookup
was always performed on the station bss_list. But this is no longer the
case, so it is safer to lookup on the network object directly on the
network.
FT is now driven (mostly) by station which removes the connect
callback. Instead once FT is completed, keys set, etc. netdev
will send an event to notify station.
This will make the debug API more robust as well as fix issues
certain drivers have when trying to roam. Some of these drivers
may flush scan results after CMD_CONNECT which results in -ENOENT
when trying to roam with CMD_AUTHENTICATE unless you rescan
explicitly.
Now this will be taken care of automatically and station will first
scan for the BSS (or full scan if not already in results) and
attempt to roam once the BSS is seen in a fresh scan.
The logic to replace the old BSS object was factored out into its
own function to be shared by the non-debug roam scan. It was also
simplified to just update the network since this will remove the
old BSS if it exists.
This adds a new netdev event for packet loss notifications from
the kernel. Depending on the scenario a station may see packet
loss events without any other indications like low RSSI. In these
cases IWD should still roam since there is no data flowing.
Some APs use an older hostapd OWE implementation which incorrectly
derives the PTK. To work around this group 19 should be used for
these APs. If there is a failure (reason=2) and the AKM is OWE
set force default group into network and retry. If this has been
done already the behavior is no different and the BSS will be
blacklisted.
The kernel handles setting the regulatory domain by receiving beacons
which set the country IE. Presumably since most regulatory domains
disallow 6GHz the default (world) domain also disables it. This means
until the country is set, 6GHz is disabled.
This poses a problem for IWD's quick scanning since it only scans a few
frequencies and this likely isn't enough beacons for the firmware to
update the country, leaving 6Ghz inaccessable to the user without manual
intervention (e.g. iw scan passive, or periodic scans by IWD).
To try and work around this limitation the quick scan logic has been
updated to check if a 6GHz AP has been connected to before and if that
frequency is disabled (but supported). If this is the case IWD will opt
for a full passive scan rather than scanning a limited set of
frequencies.
Provides useful information on why a roam might have failed, such as
failing to find the BSS or the BSS being ranked lower, and why that
might be.
The output format is the same as station_add_seen_bss for consistency.
Certain module dependencies were missing, which could cause a crash on
exit under (very unlikely) circumstances.
#0 l_queue_peek_head (queue=<optimized out>) at ../iwd-1.28/ell/queue.c:241
#1 0x0000aaaab752f2a0 in wiphy_radio_work_done (wiphy=0xaaaac3a129a0, id=6)
at ../iwd-1.28/src/wiphy.c:2013
#2 0x0000aaaab7523f50 in netdev_connect_free (netdev=netdev@entry=0xaaaac3a13db0)
at ../iwd-1.28/src/netdev.c:765
#3 0x0000aaaab7526208 in netdev_free (data=0xaaaac3a13db0) at ../iwd-1.28/src/netdev.c:909
#4 0x0000aaaab75a3924 in l_queue_clear (queue=queue@entry=0xaaaac3a0c800,
destroy=destroy@entry=0xaaaab7526190 <netdev_free>) at ../iwd-1.28/ell/queue.c:107
#5 0x0000aaaab75a3974 in l_queue_destroy (queue=0xaaaac3a0c800,
destroy=destroy@entry=0xaaaab7526190 <netdev_free>) at ../iwd-1.28/ell/queue.c:82
#6 0x0000aaaab7522050 in netdev_exit () at ../iwd-1.28/src/netdev.c:6653
#7 0x0000aaaab7579bb0 in iwd_modules_exit () at ../iwd-1.28/src/module.c:181
In this particular case, wiphy module was de-initialized prior to the
netdev module:
Jul 14 18:14:39 localhost iwd[2867]: ../iwd-1.28/src/wiphy.c:wiphy_free() Freeing wiphy phy0[0]
Jul 14 18:14:39 localhost iwd[2867]: ../iwd-1.28/src/netdev.c:netdev_free() Freeing netdev wlan0[45]
station_signal_agent_notify() has been refactored so that its usage is
simpler. station_rssi_level_changed() has been replaced by an inlined
call to station_signal_agent_notify().
ConnectHiddenNetwork creates a temporary network object and initiates a
connection with it. If the connection fails (due to an incorrect
passphrase or other reasons), then this temporary object is destroyed.
Delay its destruction until network_disconnected() since
network_connect_failed is called too early. Also, re-order the sequence
in station_reset_connection_state() in order to avoid using the network
object after it has been freed by network_disconnected().
Fixes: 85d9d6461f ("network: Hide hidden networks on connection error")
If a user connection fails on a freshly scanned psk or open hidden
network, during passphrase request or after, it shall be removed from
the network list. Otherwise, it would be possible to directly connect
to that known network, which will appear as not hidden.
The logic here assumed any BSS's in the roam scan were identical to
ones in station's bss_list with the same address. Usually this is true
but, for example, if the BSS changed frequency the one in station's
list is invalid.
Instead when a match is found remove the old BSS and re-insert the new
one.
This adds checks if MFP is set to 0 or 1:
0 - Always fail if the frequency is 6GHz
1 - Fail if MFPC=0 and the frequency is 6GHz.
If HW is capable set MFPR=1 for 6GHz
This debug print was before any checks which could bail out prior to
autoconnect starting. This was confusing because debug logs would
contain multiple "station_autoconnect_start()" prints making you think
autoconnect was started several times.
station_set_scan_results takes an autoconnect flag which was being
set true in both regular/quick autoconnect scans. Since OWE networks
are processed after setting the scan results IWD could end up
connecting to a network before all the OWE hidden networks are
populated.
To fix this regular/quick autoconnect results will set the flag to
false, then process OWE networks, then start autoconnect. If any
OWE network scans are pending station_autoconnect_start will fail
but will pick back up after the hidden OWE scan.
- Mostly problems with whitespace:
- Use of spaces instead of tabs
- Stray spaces before closing ')
- Missing spaces
- Missing 'void' from function declarations & definitions that
take no arguments.
- Wrong indentation level
There is an unchecked NULL pointer access in network_has_open_pair.
open_info can be NULL, when out of multiple APs in range that advertise
the same SSID some advertise OWE transition elments and some don't.
The Hotspot 2.0 spec has some requirements that IWD was missing depending
on a few bits in extended capabilities and the HS2.0 indication element.
These requirements correspond to a few sysfs options that can be set in
the kernel which are now set on CONNECTED and unset on DISCONNECTED.
Add netconfig_enabled() and use that in all places that want to know
whether network configuration is enabled. Drop the enable_network_config
deprecated setting, which was only being handled in one of these 5 or so
places.
It was seen during testing that several offload-capable cards
were not including the OCI in the 4-way handshake. This made
any OCV capable AP unconnectable.
To be safe disable OCV on any cards that support offloading.
netconfig_load_settings is called when establishing a new initial
association to a network. This function tries to update dhcp/dhcpv6
clients with the MAC address of the netdev being used. However, it is
too early to update the MAC here since netdev might need to powercycle
the underlying network device in order to update the MAC (i.e. when
AddressRandomization="network" is used).
If the MAC is set incorrectly, DHCP clients are unable to obtain the
lease properly and station is stuck in "connecting" mode indefinitely.
Fix this by delaying MAC address update until netconfig_configure() is
invoked.
Fixes: ad228461ab ("netconfig: Move loading settings to new method, refactor")
If the AP advertises FT-over-DS support it likely wants us to use
it. Additionally signal_low is probably going to be true since IWD
has started a roam attempt.
When netdev goes down so does station, but prior to netdev calling
the neighbor report callback. The way the logic was written station
is dereferenced prior to checking for any errors, causing a use
after free.
Since -ENODEV is used in this case check for that early before
accessing station.
This changes scan_bss from using separate members for each
OWE transition element data type (ssid, ssid_len, and bssid)
to a structure that holds them all.
This is being done because OWE transition has option operating
class and channel bytes which will soon be parsed. This would
end up needing 5 separate members in scan_bss which is a bit
much for a single IE that needs to be parsed.
This makes checking the presense of the IE more convenient
as well since it can be done with a simple NULL pointer check
rather than having to l_memeqzero the BSSID.
OWE Transition is described in the WiFi Alliance OWE Specification
version 1.1. The idea behind it is to support both legacy devices
without any concept of OWE as well as modern ones which support the
OWE protocol.
OWE is a somewhat special type of network. Where it advertises an
RSN element but is still "open". This apparently confuses older
devices so the OWE transition procedure was created.
The idea is simple: have two BSS's, one open, and one as a hidden
OWE network. Each network advertises a vendor IE which points to the
other. A device sees the open network and can connect (legacy) or
parse the IE, scan for the hidden OWE network, and connect to that
instead.
Care was taken to handle connections to hidden networks directly.
The policy is being set that any hidden network with the WFA OWE IE
is not connectable via ConnectHiddenNetwork(). These networks are
special, and can only be connected to via the network object for
the paired open network.
When scan results come in from any source (DBus, quick, autoconnect)
each BSS is checked for the OWE Transition IE. A few paths can be
taken here when the IE is found:
1. The BSS is open. The BSSID in the IE is checked against the
current scan results (excluding hidden networks). If a match is
found we should already have the hidden OWE BSS and nothing
else needs to be done (3).
2. The BSS is open. The BSSID in the IE is not found in the
current scan results, and the open network also has no OWE BSS
in it. This will be processed after scan results.
3. The BSS is not open and contains the OWE IE. This BSS will
automatically get added to the network object and nothing else
needs to be done.
After the scan results each network is checked for any non-paired
open BSS's. If found a scan is started for these BSS's per-network.
Once these scan results come in the network is notified.
From here network.c can detect that this is an OWE transition
network and connect to the OWE BSS rather than the open one.
DBus scan is performed in several subsets. In certain corner-case
circumstances it would be possible for autoconnect to run after each
subset scan. Instead, trigger autoconnect only after the dbus scan
completes.
This also works around a condition where ANQP results could trigger
autoconnect too early.
Several invocations of station_set_scan_results() base the
'add_to_autoconnect' parameter on station_is_autoconnecting(). Simplify
the code by having station_set_scan_results() invoke that itself.
'add_to_autoconnect' now becomes an 'intent' parameter, specifying
whether autoconnect path should be invoked as a result of these scan
results or not when station is in an appropriate state. Rename
'add_to_autoconnect' parameter to make this clearer.
If the frequency of the bss is not in the list of frequencies for the
current scan, then this is a cached bss. It was likely already
processed for ANQP before, so skip it.
IWD has restricted SSIDs to only utf8 so they can be displayed but
with the addition of OWE transition networks this is an unneeded
restriction (for these networks). The SSID of an OWE transition
network is never displayed to the user so limiting to utf8 isn't
required.
Allow non-utf8 SSIDs to be scanned for by including the length in
the scan parameters and not relying on strlen().
With the addition of OWE transition network needs to be notified
of the hidden OWE scan which is quite similar to how it is notified
of ANQP. The ANQP event watch can be made generic and reused to
allow other events besides ANQP.
This is being added to support OWE transition mode. For these
type of networks the OWE BSS may contain a different SSID than
that of the network, but the WFA spec requires this be hidden
from the user. This means we need to set the handshake SSID based
on the BSS rather than the network object.
Send and receive the FILS IP Address Assignment IEs during association.
As implemented this would work independently of FILS although the only
AP software handling this mechanism without FILS is likely IWD itself.
No support is added for handling the IP assignment information sent from
the server after the initial Association Request/Response frames, i.e.
the information is only used if it is received directly in the
Association Response without the "response pending" bit, otherwise the
DHCP client will be started.
Split loading settings out of network_configure into a new method,
network_load_settings. Make sure both consistently handle errors by
printing messages and informing the caller.
Add a handshake event for use by the AP side for mechanisms that
allocate client IPs during the handshake: P2P address allocation and
FILS address assignment. This is emitted only when EAPOL or the
auth_proto is actually about to send the network configuration data to
the client so that ap.c can skip allocating a DHCP leases altogether if
the client doesn't send the required KDE or IE.
This is meant to be used as a generic notification to autotests. For
now 'no-roam-candidates' is the only event being sent. The idea
is to extend these events to signal conditions that are otherwise
undiscoverable in autotesting.
This is to support the autotesting framework by allowing a smaller
scan subset. This will cut down on the amount of time spent scanning
via normal DBus scans (where the entire spectrum is scanned).
Most autotests do not want autoconnect behavior so it is being
turned off by default. There are a few tests where it is needed
and in these few cases the test can enable autoconnect through
the new station debug property.
This adds the property "AutoConnect" to the station debug interface
which can be read/written to disable or enable autoconnect globally.
As one would expect this property is only going to be used for testing
hence why it was put on the debug interface. Mosts tests disable
autoconnect (or they should) because it leads to unexpected connections.
This method will initiate a connection to a specific BSS rather
than relying on a network based connection (which the user has
no control over which specific BSS is selected).
The preparing_roam flag is expected to be set by a few roam
routines and normally this is done prior to the roam scan.
The Roam() developer option was not doing this and would
cause failed roams in some cases.
This variable ended up being used only on the fast-transition path. On
the re-associate path it was never used, but memcpy-ied nevertheless.
Since its only use is by auth_proto based protocols, move it to the
auth_proto object directly.
Due to how prepare_ft works (we need prev_bssid from the handshake, but
the handshake is reset), have netdev_ft_* methods take an 'orig_bss'
parameter, similar to netdev_reassociate.
This refactors some code to eliminate getting the ERP entry twice
by simply returning it from network_has_erp_identity (now renamed
to network_get_erp_cache). In addition this code was moved into
station_build_handshake_rsn and properly cleaned up in case there
was an error or if a FILS AKM was not chosen.
Transition Disable indications and information stored in the network
profile needs to be enforced. Since Transition Disable information is
now stored inside the network object, add a new method
'network_can_connect_bss' that will take this information into account.
wiphy_can_connect method is thus deprecated and removed.
Transition Disable can also result in certain AKMs and pairwise ciphers
being disabled, so wiphy_select_akm method's signature is changed and
takes the (possibly overriden) ie_rsn_info as input.
This indication can come in via EAPoL message 3 or during
FILS Association. It carries information as to whether certain
transition mode options should be disabled. See WPA3 Specification,
version 3 for more details.
Most parameters set into the handshake object are actually known by the
network object itself and not station. This includes address
randomization settings, EAPoL settings, passphrase/psk/8021x settings,
etc. Since the number of these settings will only keep growing, move
the handshake setup into network itself. This also helps keep network
internals better encapsulated.
There will be additional security-related settings that will be
introduced for settings files. In particular, Hash-to-Curve PT
elements, Transition Disable settings and potentially others in the
future. Since PSK is now not the only element that would require
update, rename this function to better reflect this.
If the idea is that the interface should only be present when connected
then don't do this in the DISCONNECTING state as there are various
possible transitions from CONNECTED or ROAMING directly to DISCONNECTED.
station_free() is invoked when one of two possibilities happen:
- Device has been powered down, and EVENT_DOWN has been emitted
- Device has been removed, and EVENT_DEL has been emitted
In both cases there is not much point for netdev_disconnect to be
invoked as that tries to cleanly shut down an existing connection. The
only thing the ABORTED error accomplishes in this case is to send a
dbus_aborted_error for the pending_connect message, if it exists.
There's already code for doing this in station_free().
src/station.c:station_enter_state() Old State: autoconnect_quick, new state: connecting (auto)
src/scan.c:scan_cancel() Trying to cancel scan id 1 for wdev 7
src/wiphy.c:wiphy_radio_work_done() Work item 1 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 2
Terminate
src/netdev.c:netdev_free() Freeing netdev wlan0[9]
src/device.c:device_free()
src/station.c:station_free()
src/wiphy.c:wiphy_radio_work_done() Work item 2 done
src/station.c:station_connect_cb() 9, result: 5
src/netconfig.c:netconfig_destroy()
Removing scan context for wdev 7
src/scan.c:scan_context_free() sc: 0x4a39490
src/netdev.c:netdev_mlme_notify() MLME notification New Station(19)
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
src/netdev.c:netdev_mlme_notify() MLME notification Authenticate(37)
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
src/netdev.c:netdev_mlme_notify() MLME notification Associate(38)
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
src/netdev.c:netdev_mlme_notify() MLME notification Connect(46)
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is US
src/netdev.c:netdev_link_notify() event 16 on ifindex 9
src/netdev.c:netdev_unicast_notify() Unicast notification 129
src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is XX
==20311== Invalid write of size 4
==20311== at 0x406E74: netdev_cmd_disconnect_cb (netdev.c:1130)
==20311== by 0x4A78A8: process_unicast (genl.c:986)
==20311== by 0x4A7C6A: received_data (genl.c:1098)
==20311== by 0x4A2E1F: io_callback (io.c:120)
==20311== by 0x4A17BB: l_main_iterate (main.c:478)
==20311== by 0x4A18FC: l_main_run (main.c:525)
==20311== by 0x4A1C14: l_main_run_with_signal (main.c:647)
==20311== by 0x404D27: main (main.c:542)
==20311== Address 0x4a37a0c is 156 bytes inside a block of size 472 free'd
==20311== at 0x48399CB: free (vg_replace_malloc.c:538)
==20311== by 0x498991: l_free (util.c:136)
==20311== by 0x406651: netdev_free (netdev.c:883)
==20311== by 0x412976: netdev_shutdown (netdev.c:5970)
==20311== by 0x403A14: iwd_shutdown (main.c:79)
==20311== by 0x403A7D: signal_handler (main.c:90)
==20311== by 0x4A1B1D: sigint_handler (main.c:612)
==20311== by 0x4A1F5D: handle_callback (signal.c:78)
==20311== by 0x4A2052: signalfd_read_cb (signal.c:104)
==20311== by 0x4A2E1F: io_callback (io.c:120)
==20311== by 0x4A17BB: l_main_iterate (main.c:478)
==20311== by 0x4A18FC: l_main_run (main.c:525)
If the connected BSS changes channel, netdev will emit an event with the
new channel's frequency. In response, have station change the frequency
of the connected scan_bss struct and inform network about the update.
Right now, if a connection to a network selected by auto-connect fails,
the entire autoconnect process is restarted. This means that scans are
kicked off again, auto-connect list is rebuilt, etc. This was due to
auto-connect reusing the same failure path as connections triggered via
D-Bus.
The above behavior can lead to weird situations in certain corner cases.
For example, a highly preferred network configured with the wrong
password would result in auto-connect entering an infinite loop.
Fix this by making sure that all auto-connect entries are tried and
exhausted prior to re-scanning again.
This will be effectively the same as the CONNECTING state, but can be
used to enable differing behavior, depending on whether connection was
triggered by autoconnect or via D-Bus.