This adds support for rekeys to AP mode. A single timer is used and
reset to the next station needing a rekey. A default rekey timer of
600 seconds is used unless the profile sets a timeout.
The only changes required was to set the secure bit for message 1,
reset the frame retry counter, and change the 2/4 verifier to use
the rekey flag rather than ptk_complete. This is because we must
set ptk_complete false in order to detect retransmissions of the
4/4 frame.
Initiating a rekey can now be done by simply calling eapol_start().
If IWD ends up dumping wiphy's twice (because of NEW_WIPHY event
soon after initial dump) it will also try and dump interfaces
twice leading to multiple DEL_INTERFACE calls. The second attempt
will fail with -ENODEV (since the interface was already deleted).
Just silently fail with this case and let the other DEL_INTERFACE
path handle the re-creation.
With really badly timed events a wiphy can be registered twice. This
happens when IWD starts and requests a wiphy dump. Immediately after
a NEW_WIPHY event comes in (presumably when the driver loads) which
starts another dump. The NEW_WIPHY event can't simply be ignored
since it could be a hotplug (e.g. USB card) so to fix this we can
instead just prevent it from being registered.
This does mean both dumps will happen but the information will just
be added to the same wiphy object.
Past commits should address any potential problems of the timer
firing during FT, but its still good practice to cancel the timer
once it is no longer needed, i.e. once FT has started.
If station has already started FT ensure station_cannot_roam takes
that into account. Since the state has not yet changed it must also
check if the FT work ID is set.
Under the following conditions IWD can accidentally trigger a second
roam scan while one is already in progress:
- A low RSSI condition is met. This starts the roam rearm timer.
- A packet loss condition is met, which triggers a roam scan.
- The roam rearm timer fires and starts another roam scan while
also overwriting the first roam scan ID.
- Then, if IWD gets disconnected the overwritten roam scan gets
canceled, and the roam state is cleared which NULL's
station->connected_network.
- The initial roam scan results then come in with the assumption
that IWD is still connected which results in a crash trying to
reference station->connected_network.
This can be fixed by adding a station_cannot_roam check in the rearm
timer. If IWD is already doing a roam scan station->preparing_roam
should be set which will cause it to return true and stop any further
action.
Aborting (signal 11) [/usr/libexec/iwd]
iwd[426]: ++++++++ backtrace ++++++++
iwd[426]: #0 0x7f858d7b2090 in /lib/x86_64-linux-gnu/libc.so.6
iwd[426]: #1 0x443df7 in network_get_security() at ome/locus/workspace/iwd/src/network.c:287
iwd[426]: #2 0x421fbb in station_roam_scan_notify() at ome/locus/workspace/iwd/src/station.c:2516
iwd[426]: #3 0x43ebc1 in scan_finished() at ome/locus/workspace/iwd/src/scan.c:1861
iwd[426]: #4 0x43ecf2 in get_scan_done() at ome/locus/workspace/iwd/src/scan.c:1891
iwd[426]: #5 0x4cbfe9 in destroy_request() at ome/locus/workspace/iwd/ell/genl.c:676
iwd[426]: #6 0x4cc98b in process_unicast() at ome/locus/workspace/iwd/ell/genl.c:954
iwd[426]: #7 0x4ccd28 in received_data() at ome/locus/workspace/iwd/ell/genl.c:1052
iwd[426]: #8 0x4c79c9 in io_callback() at ome/locus/workspace/iwd/ell/io.c:120
iwd[426]: #9 0x4c62e3 in l_main_iterate() at ome/locus/workspace/iwd/ell/main.c:476
iwd[426]: #10 0x4c6426 in l_main_run() at ome/locus/workspace/iwd/ell/main.c:519
iwd[426]: #11 0x4c6752 in l_main_run_with_signal() at ome/locus/workspace/iwd/ell/main.c:645
iwd[426]: #12 0x405987 in main() at ome/locus/workspace/iwd/src/main.c:600
iwd[426]: #13 0x7f858d793083 in /lib/x86_64-linux-gnu/libc.so.6
iwd[426]: +++++++++++++++++++++++++++
If the authenticator has already set an snonce then the packet must
be a retransmit. Handle this by sending 3/4 again but making sure
to not reset the frame counter.
Old wpa_supplicant versions do not set the secure bit on 2/4 during
rekeys which causes IWD to reject the message and eventually time out.
Modern versions do set it correctly but even Android 13 (Pixel 5a)
still uses an ancient version of wpa_supplicant which does not set the
bit.
Relax this check and instead just print a warning but allow the message
to be processed.
In try_handshake_complete() we return early if all the keys had
been installed before (initial associations). For rekeys we can
now emit the REKEY_COMPLETE event which lets AP mode reset the
rekey timer for that station.
When the TK is installed the 'ptk_installed' flag was never set to
zero. For initial associations this was fine (already zero) but for
rekeys the flag needs to be unset so try_handshake_complete knows
if the key was installed. This is consistent with how gtk/igtk keys
work as well.
Rekeys for station mode don't need to know when complete since
there is nothing to do once done. AP mode on the other hand needs
to know if the rekey was successful in order to reset/set the next
rekey timer.
The second handshake message was hard coded with the secure bit as
zero but for rekeys the secure bit should be set to 1. Fix this by
changing the 2/4 builder to take a boolean which will set the bit
properly.
It should be noted that hostapd doesn't check this bit so EAPoL
worked just fine, but IWD's checks are more strict.
The PEAP RFC wants implementations to enforce that Phase2 methods have
been successfully completed prior to accepting a successful result TLV.
However, when TLS session resumption is used, some servers will skip
phase2 methods entirely and simply send a Result TLV with a success
code. This results in iwd (erroneously) rejecting the authentication
attempt.
Fix this by marking phase2 method as successful if session resumption is
being used.
This adds a builder which sets the country IE in probes/beacons.
The IE will use the 'single subband triplet sequence' meaning
dot11OperatingClassesRequired is false. This is much easier to
build and doesn't require knowing an operating class.
The IE itself is variable in length and potentially could grow
large if the hardware has a weird configuration (many different
power levels or segmentation in supported channels) so the
overall builder was changed to take the length of the buffer and
warnings will be printed if any space issues are encountered.
IWD's channel/frequency conversions use simple math to convert and
have very minimal checks to ensure the input is valid. This can
lead to some channels/frequencies being calculated which are not
in IWD's E-4 table, specifically in the 5GHz band.
This is especially noticable using mac80211_hwsim which includes
some obscure high 5ghz frequencies which are not part of the 802.11
spec.
To fix this calculate the frequency or channel then iterate E-4
operating classes to check that the value actually matches a class.
The 6GHz test was not incrementing the frequencies properly which
was resulting in invalid frequencies, but since the conversion
API was never linked to E-4 the test was still passing.
The country IE can sometimes have a zero pad byte at the end for
alignment. This was not being checked for which caused the loop
to go past the end of the IE and print an entry for channel 0
(the pad byte) plus some garbage data.
Fix this by checking for the pad byte explicitly which skips the
print and terminates the loop.
If supported this will include the HT capabilities and HT
operations elements in beacons/probes. Some shortcuts were taken
here since not all the information is currently parsed from the
hardware. Namely the HT operation element does not include the
basic MCS set. Still, this will at least show stations that the
AP is capable of more than just basic rates.
The builders themselves are structured similar to the basic rates
builder where they build only the contents and return the length.
The caller must set the type/length manually. This is to support
the two use cases of using with an IE builder vs direct pointer.
To include HT support a chandef needs to be created for whatever
frequency is being used. This allows IWD to provide a secondary
channel to the kernel in the case of 40MHz operation. Now the AP
will generate a chandef when starting based on the channel set
in the user profile (or default).
If HT is not supported the chandef width is set to 20MHz no-HT,
otherwise band_freq_to_ht_chandef is used.
The WMM parameter IE is expected by the linux kernel for any AP
supporting HT/VHT etc. IWD won't actually use WMM and its not
clear exactly why the kernel uses this restriction, but regardless
it must be included to support HT.
For AP mode its convenient for IWD to choose an appropriate
channel definition rather than require the user provide very
low level parameters such as channel width, center1 frequency
etc. For now only HT is supported as VHT/HE etc. require
additional secondary channel frequencies.
The HT API tries to find an operating class using 40Mhz which
complies with any hardware restrictions. If an operating class is
found that is supported/not restricted it is marked as 'best' until
a better one is found. In this case 'better' is a larger channel
width. Since this is HT only 20mhz and 40mhz widths are checked.
This adds some additional parsing to obtain the AMPDU parameter
byte as well as wiphy_get_ht_capabilities() which returns the
complete IE (combining the 3 separate kernel attributes).
The supported rates IE was being built in two places. This makes that
code common. Unfortunately it needs to support both an ie builder
and using a pointer directly which is why it only builds the contents
of the IE and the caller must set the type/length.
Move the l_netconfig_set_route_priority() and
l_netconfig_set_optimistic_dad_enabled() calls from netconfig_new, which
is called once for the l_netconfig object's lifetime, to
netconfig_load_settings, which is called before every connection attempt.
This is needed because we clean up the l_netconfig configuration by calling
l_netconfig_reset_config() at different points in connection setup and
teardown so we'd reset the route priority that we've set in netconfig_new,
back to 0 and never reload it.
The disabled_freqs list is being removed and replaced with a new
list in the band object. This completely removes the need for
the pending_freqs list as well since any regdom related dumps
can just overwrite the existing frequency list.
This adds two new APIs:
wiphy_get_frequency_info(): Used to get information about a given
frequency such as disabled/no-IR. This can also be used to check
if the frequency is supported (NULL return is unsupported).
wiphy_band_is_disabled(): Checks if a band is disabled. Note that
an unsupported band will also return true. Checking support should
be done with wiphy_get_supported_bands()
As additional frequency info is needed it doesn't make sense to
store a full list of frequencies for every attribute (i.e.
supported, disabled, no-IR, etc).
This changes nl80211_parse_supported_frequencies to take a list
of frequency attributes where each index corresponds to a channel,
and each value can be filled with flag bits to signal any
limitations on that frequency.
wiphy.c then had to be updated to use this rather than the existing
scan_freq_set lists. This, as-is, will break anything using
wiphy_get_disabled_freqs().
Currently the wiphy object keeps track of supported and disabled
frequencies as two separate scan_freq_set's. This is very expensive
and limiting since we have to add more sets in order to track
additional frequency flags (no-IR, no-HT, no-HE etc).
Instead we can refactor how frequencies are stored. They will now
be part of the band object and stored as a list of flag structures
where each index corresponds to a channel
IWD was optimizing FT-over-DS by authenticating to multiple BSS's
at the time of connecting which then made future roams slightly
faster since they could jump right into association. So far this
hasn't posed a problem but it was reported that some AP's actually
enforce a reassociation timeout (included in 4-way handshake).
Hostapd itself does no such enforcement but anything external to
hostapd could monitor FT events and clear the cache if any exceeded
this timeout.
For now remove the early action frames and treat FT-over-DS the
same as FT-over-Air. In the future we could parse the reassociation
timeout, batch out FT-Action frames and track responses but for the
time being this just fix the issue at a small performance cost.
Queue the FT action just like we do with FT Authenticate which makes
it able to be used the same way, i.e. call ft_action() then queue
the ft_associate work right away.
A timer was added to end the work item in case the target never
responds.