The known frequency list may include frequencies that once were
allowed but are now disabled due to regulatory restrictions. Don't
include these frequencies in the roam scan.
These events are sent if IWD fails to authentiate
(ft-over-air-roam-failed) or if it falls back to over air after
failing to use FT-over-DS (try-ft-over-air)
If IPv4 setup fails and the netconfig logic gives up, continue as if the
connection had failed at earlier stages so that autoconnect can try the
next available network.
Certain drivers support/require probe response offloading which
IWD did not check for or properly handle. If probe response
offloading is required the probe response frame watch will not
be added and instead the ATTR_PROBE_RESP will be included with
START_AP.
The head/tail builders were reused but slightly modified to check
if the probe request frame is NULL, since it will be for use with
START_AP.
Parse the AP probe response offload attribute during the dump. If
set this indicates the driver expects the probe response attribute
to be included with START_AP.
Clearing all authentications during ft_authenticate was a very large
hammer and may remove cached authentications that could be used if
the current auth attempt fails.
For example the best BSS may have a problem and fail to authenticate
early with FT-over-DS, then fail with FT-over-Air. But another BSS
may have succeeded early with FT-over-DS. If ft_authenticate clears
all ft_infos that successful authentication will be lost.
AP roaming was structured such that any AP roam request would
force IWD to roam (assuming BSS's were found in scan results).
This isn't always the best behavior since IWD may be connected
to the best BSS in range.
Only force a roam if the AP includes one of the 3 disassociation/
termination bits. Otherwise attempt to roam but don't set the
ap_directed_roaming flag which will allows IWD to stay with the
current BSS if no better candidates are found.
There are a few checks that can be done prior to parsing the
request, in addition the explicit check for preparing_roam was
removed since this is taken care of by station_cannot_roam().
Once offchannel completes we can check if the info structure was
parsed, indicating authentication succeeded. If not there is no
reason to keep it around since IWD will either try another BSS or
fail.
This both adds proper handling to the new roaming logic and fixes
a potential bug with firmware roams.
The new way roaming works doesn't use a connect callback. This
means that any disconnect event or call to netdev_connect_failed
will result in the event handler being called, where before the
connect callback would. This means we need to handle the ROAMING
state in the station disconnect event so IWD properly disassociates
and station goes out of ROAMING.
With firmware roams netdev gets an event which transitions station
into ROAMING. Then netdev issues GET_SCAN. During this time a
disconnect event could come in which would end up in
station_disconnect_event since there is no connect callback. This
needs to be handled the same and let IWD transition out of the
ROAMING state.
This finalizes the refactor by moving all the handshake prep
into FT itself (most was already in there). The netdev-specific
flags and state were added into netdev_ft_tx_associate which
now avoids any need for a netdev API related to FT.
The NETDEV_EVENT_FT_ROAMED event is now emitted once FT completes
(netdev_connect_ok). This did require moving the 'in_ft' flag
setting until after the keys are set into the kernel otherwise
netdev_connect_ok has no context as to if this was FT or some
other connection attempt.
In addition the prev_snonce was removed from netdev. Restoring
the snonce has no value once association begins. If association
fails it will result in a disconnect regardless which requires
a new snonce to be generated
This converts station to using ft_action/ft_authenticate and
ft_associate and dropping the use of the netdev-only/auth-proto
logic.
Doing this allows for more flexibility if FT fails by letting
IWD try another roam candidate instead of disconnecting.
Now the full action frame including the header is provided to ft
which breaks the existing parser since it assumes the buffer starts
at the body of the message.
This forwards Action, Authentication and Association frames to
ft.c via their new hooks in netdev.
Note that this will break FT-over-Air temporarily since the
auth-proto still is in use.
The current behavior is to only find the best roam candidate, which
generally is fine. But if for whatever reason IWD fails to roam it
would be nice having a few backup BSS's rather than having to
re-scan, or worse disassociate and reconnect entirely.
This patch doesn't change the roam behavior, just prepares for
using a roam candidate list. One difference though is any roam
candidates are added to station->bss_list, rather than just the
best BSS. This shouldn't effect any external behavior.
The candidate list is built based on scan_bss rank. First we establish
a base rank, the rank of the current BSS (or zero if AP roaming). Any
BSS in the results with a higher rank, excluding the current BSS, will
be added to the sorted station->roam_bss_list (as a new 'roam_bss'
entry) as well as stations overall BSS list. If the resulting list is
empty there were no better BSS's, otherwise station can now try to roam
starting with the best candidate (head of the roam list).
A new API was added, ft_authenticate, which will send an
authentication frame offchannel via CMD_FRAME. This bypasses
the kernel's authentication state allowing multiple auth
attempts to take place without disconnecting.
Currently netdev handles caching FT auth information and uses FT
parsers/auth-proto to manage the protocol. This sets up to remove
this state machine from netdev and isolate it into ft.c.
This does not break the existing auth-proto (hence the slight
modifications, which will be removed soon).
Eventually the auth-proto will be removed from FT entirely, replaced
just by an FT state machine, similar to how EAPoL works (netdev hooks
to TX/RX frames).
There may be situations (due to Multi-BSS operation) where an AP might
be advertising multiple SSIDs on the same BSSID. It is thus more
correct to lookup the preauthentication target on the network object
instead of the station bss_list. It used to be that the network list of
bsses was not updated when roam scan was performed. Hence the lookup
was always performed on the station bss_list. But this is no longer the
case, so it is safer to lookup on the network object directly on the
network.
The warnings in the authenticate and connect events were identical
so it could be difficult knowing which print it was if IWD is not
in debug mode (to see more context). The prints were changed to
indicate which event it was and for the connect event the reason
attribute is also parsed.
Note the resp_ies_len is also initialized to zero now. After making
the changes gcc was throwing a warning.
FT is special in that it really should not be interrupted. Since
FRAME/OFFCHANNEL have the highest priority we run the risk of
DPP or some other offchannel operation interfering with FT.
FT is now driven (mostly) by station which removes the connect
callback. Instead once FT is completed, keys set, etc. netdev
will send an event to notify station.
Since l_netconfig's DHCPv6 client instance no longer sets parameters on
the l_icmp6_client instance, call l_icmp6_client_set_nodelay() and
l_icmp6_client_set_debug() directly. Also enable optimistic DAD to
speed up IPv6 setup if available.
All uses of frame-xchg were for action frames, and the frame type
was hard coded. Soon other frame types will be needed so the type
must now be specified in the frame_xchg_prefix structure.
This will make the debug API more robust as well as fix issues
certain drivers have when trying to roam. Some of these drivers
may flush scan results after CMD_CONNECT which results in -ENOENT
when trying to roam with CMD_AUTHENTICATE unless you rescan
explicitly.
Now this will be taken care of automatically and station will first
scan for the BSS (or full scan if not already in results) and
attempt to roam once the BSS is seen in a fresh scan.
The logic to replace the old BSS object was factored out into its
own function to be shared by the non-debug roam scan. It was also
simplified to just update the network since this will remove the
old BSS if it exists.
Add a second netconfig-commit backend which, if enabled, doesn't
directly send any of the network configuration to the kernel or system
files but delegates the operation to an interested client's D-Bus
method as described in doc/agent-api.txt. This backend is switched to
when a client registers a netconfig agent object and is swiched away
from when the client disconnects or unregisters the agent. Only one
netconfig agent can be registered any given time.
Add netconfig_event_handler() that responds to events emitted by
the l_netconfig object by calling netconfig_commit, tracking whether
we're connected for either address family and emitting
NETCONFIG_EVENT_CONNECTED or NETCONFIG_EVENT_FAILED as necessary.
NETCONFIG_EVENT_FAILED is a new event as until now failures would cause
the netconfig state machine to stop but no event emitted so that
station.c could take action. As before, these events are only
emitted based on the IPv4 configuration state, not IPv6.
Add netconfig-commit.c whose main method, netconfig_commit actually sets
the configuration obtained by l_netconfig to the system netdev,
specifically it sets local addresses on the interface, adds routes to the
routing table, sets DNS related data and may add entries to the neighbor
cache. netconfig-commit.c uses a backend-ops type structure to allow
for switching backends. In this commit there's only a default backend
that uses l_netconfig_rtnl_apply() and a struct resolve object to write
the configuration.
netconfig_gateway_to_arp is moved from netconfig.c to netconfig-commit.c
(and renamed.) The struct netconfig definition is moved to netconfig.h
so that both files can access the settings stored in the struct.
To avoid repeated lookups by ifindex, replace the ifindex member in
struct netconfig with a struct netdev pointer. A struct netconfig
always lives shorter than the struct netdev.
* make the error handling simpler,
* make error messages more consistent,
* validate address families,
* for IPv4 skip l_rtnl_address_set_noprefixroute()
as l_netconfig will do this internally as needed.
* for IPv6 set the default prefix length to 64 as that's going to be
used for the local prefix route's prefix length and is a more
practical value.
Drop all the struct netconfig members where we were keeping the parsed
netconfig settings and add a struct l_netconfig object. In
netconfig_load_settings load all of the settings once parsed directly
into the l_netconfig object. Only preserve the mdns configuration and
save some boolean values needed to properly handle static configuration
and FILS. Update functions to use the new set of struct netconfig
members.
These booleans mirroring the l_netconfig state could be replaced by
adding l_netconfig getters for settings which currently only have
setters.
In anticipation of switching to use the l_netconfig API, which
internally handles DHCPv4, DHCPv6, ACD, etc., drop pointers to
instances of l_dhcp_client, l_dhcp6_client and l_acd from struct
netconfig. Also drop all code used for handling events from these
APIs, including code to commit the received configurations to the
system. Committing the final settings to the system netdevs is going to
be handled by a new set of utilities in a new file.
The RRM module was blindly scanning using the requested
frequency which may or may not be possible given the hardware.
Instead check that the frequency will work and if not reject
the request.
This was reported by a user seeing the RRM scan fail which was
due to the AP requesting a scan on 5GHz when the adapter was
2.4GHz only.
Support for MAC address changes while powered was recently added to
mac80211. This avoids the need to power down the device which both
saves time as well as preserves any allowed frequencies which may
have been disabled if the device powered down.
The code path for changing the address was reused but now just the
'up' callback will be provided directly to l_rtnl_set_mac. Since
there aren't multiple stages of callbacks the rtnl_data structure
isn't strictly needed, but the code looks cleaner and more
consistent between the powered/non-powered code paths.
The comment/debug error print was also updated to be more general
between the two MAC change code paths.
Documentation for MulticastDNS setting suggests it should be part of the
main iwd configuration file. See man iwd.config. However, in reality
the setting was being pulled from the network provisioning file instead.
The latter actually makes more sense since systemd-resolved has its own
set of global defaults. Fix the documentation to reflect the actual
implementation.
netdev does not keep any pointers to struct scan_bss arguments that are
passed in. Make this explicitly clear by modifying the API definitions
and mark these as const.
This adds a new netdev event for packet loss notifications from
the kernel. Depending on the scenario a station may see packet
loss events without any other indications like low RSSI. In these
cases IWD should still roam since there is no data flowing.
Some APs use an older hostapd OWE implementation which incorrectly
derives the PTK. To work around this group 19 should be used for
these APs. If there is a failure (reason=2) and the AKM is OWE
set force default group into network and retry. If this has been
done already the behavior is no different and the BSS will be
blacklisted.
If a OWE network is buggy and requires the default group this info
needs to be stored in network in order for it to set this into the
handshake on future connect attempts.
This functionality works around the kernel's behavior of allowing
6GHz only after a regulatory domain update. If the regdom updates
scan.c needs to be aware in order to split up periodic scans, or
insert 6GHz frequencies into an ongoing periodic scan. Doing this
allows any 6GHz BSS's to show up in the scan results rather than
needing to issue an entirely new scan to see these BSS's.
The kernel's regulatory domain updates after some number of beacons
are processed. This triggers a regulatory domain update (and wiphy
dump) but only after a scan request. This means a full scan started
prior to the regdom being set will not include any 6Ghz BSS's even
if the regdom was unlocked during the scan.
This can be worked around by splitting up a large scan request into
multiple requests allowing one of the first commands to trigger a
regdom update. Once the regdom updates (and wiphy dumps) we are
hopefully still scanning and could append an additional request to
scan 6GHz.
In the case of an external scan, we won't have a scan_request object,
sr. Make sure to not crash in this case.
Also, since scan_request can no longer carry the frequency set in all
cases, add a new member to scan_results in order to do so.
Fixes: 27d8cf4ccc ("scan: track scanned frequencies for entire request")
The kernel handles setting the regulatory domain by receiving beacons
which set the country IE. Presumably since most regulatory domains
disallow 6GHz the default (world) domain also disables it. This means
until the country is set, 6GHz is disabled.
This poses a problem for IWD's quick scanning since it only scans a few
frequencies and this likely isn't enough beacons for the firmware to
update the country, leaving 6Ghz inaccessable to the user without manual
intervention (e.g. iw scan passive, or periodic scans by IWD).
To try and work around this limitation the quick scan logic has been
updated to check if a 6GHz AP has been connected to before and if that
frequency is disabled (but supported). If this is the case IWD will opt
for a full passive scan rather than scanning a limited set of
frequencies.
For whatever reason the kernel will send regdom updates even if
the regdom didn't change. This ends up causing wiphy to dump
which isn't needed since there should be no changes in disabled
frequencies.
Now the previous country is checked against the new one, and if
they match the wiphy is not dumped again.
A change in regulatory domain can result in frequencies being
enabled or disabled depending on the domain. This effects the
frequencies stored in wiphy which other modules depend on
such as scanning, offchannel work etc.
When the regulatory domain changes re-dump the wiphy in order
to update any frequency restrictions.
A helper to check whether the country code corresponds to a
real country, or some special code indicating the country isn't
yet set. For now, the special codes are OO (world roaming) and
XX (unknown entity).
Events to indicate when a regulatory domain wiphy dump has
started and ended. This is important because certain actions
such as scanning need to be delayed until the dump has finished.
The NEW_SCAN_RESULTS handling was written to only parse the frequency
list if there were no additional scan commands to send. This results in
the scan callback containing frequencies of only the last CMD_TRIGGER.
Until now this worked fine because a) the queue is only used for hidden
networks and b) frequencies were never defined by any callers scanning
for hidden networks (e.g. dbus/periodic scans).
Soon the scan command queue will be used to break up scan requests
meaning only the last scan request frequencies would be used in the
callback, breaking the logic in station.
Now the NEW_SCAN_RESULTS case will parse the frequencies for each scan
command rather than only the last.
The compiler treated the '1' as an int type which was not big enough
to hold a bit shift of 31:
runtime error: left shift of 1 by 31 places cannot be represented in
type 'int'
Instead of doing the iftype check manually, refactor
wiphy_get_supported_iftypes by adding a subroutine which just parses
out iftypes from a mask into a char** list. This removes the need to
case each iftype into a string.
Add extra logging around CQM events to help track wifi status. This is
useful for headless systems that can only be accessed over the network
and so information in the logs is invaluable for debugging outages.
Prior to this change, the only log for CQM messages is saying one was
received. This adds details to what attributes were set and the
associated data with them.
The signal strength log format was chosen to roughly match
wpa_supplicant's which looks like this:
CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-60 noise=-96 txrate=6000
Provides useful information on why a roam might have failed, such as
failing to find the BSS or the BSS being ranked lower, and why that
might be.
The output format is the same as station_add_seen_bss for consistency.
If a frequency is disabled IWD should keep track and disallow any
operations on that channel such as scanning. A new list has been added
which contains only disabled frequencies.
The scan_passive API wasn't using a const struct scan_freq_set as it
should be since it's not modifying the contents. Changing this to
const did require some additional changes like making the scan_parameters
'freqs' member const as well.
After changing scan_parameters, p2p needed updating since it was using
scan_parameters.freqs directly. This was changed to using a separate
scan_freq_set pointer, then setting to scan_parameters.freqs when needed.
Similar to the HT/VHT APIs, this estimates the data rate based on the
HE Capabilities element, in addition to our own capabilities. The
logic is much the same as HT/VHT. The major difference being that HE
uses several MCS tables depending on the channel width. Each width
MCS set is checked (if supported) and the highest estimated rate out
of all the MCS sets is used.
There appears to be a compiler bug with gcc 11.2 which thinks the vht_mcs_set
is a zero length array, and the memset of size 8 is out of bounds. This is only
seen once an element is added to 'struct band'.
In file included from /usr/include/string.h:519,
from src/wiphy.c:34:
In function ‘memset’,
inlined from ‘band_new_from_message’ at src/wiphy.c:1300:2,
inlined from ‘parse_supported_bands’ at src/wiphy.c:1423:11,
inlined from ‘wiphy_parse_attributes’ at src/wiphy.c:1596:5,
inlined from ‘wiphy_update_from_genl’ at src/wiphy.c:1773:2:
/usr/include/bits/string_fortified.h:59:10: error: ‘__builtin_memset’ offset [0, 7] is out of the bounds [0, 0] [-Werror=array-bounds]
59 | return __builtin___memset_chk (__dest, __ch, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60 | __glibc_objsize0 (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~