This exposes the [Rank].BandModifier* settings so other modules
can use then. Doing this will allow user-disabling of certain
bands by setting these modifier values to 0.0.
The loop iterating the frequency attributes list was not including
the entire channel set since it was stopping at i < band->freqs_len.
The freq_attrs array is allocated to include the last channel:
band->freq_attrs = l_new(struct band_freq_attrs, num_channels + 1);
band->freqs_len = num_channels;
So instead the for loop should use i <= band->freqs_len. (I also
changed this to start the loop at 1 since channel zero is invalid).
The auth/action status is now tracked in ft.c. If an AP rejects the
FT attempt with "Invalid PMKID" we can now assume this AP is either
mis-configured for FT or is lagging behind getting the proper keys
from neighboring APs (e.g. was just rebooted).
If we see this condition IWD can now fall back to reassociation in
an attempt to still roam to the best candidate. The fallback decision
is still rank based: if a BSS fails FT it is marked as such, its
ranking is reset removing the FT factor and it is inserted back
into the queue.
The motivation behind this isn't necessarily to always force a roam,
but instead to handle two cases where IWD can either make a bad roam
decision or get 'stuck' and never roam:
1. If there is one good roam candidate and other bad ones. For
example say BSS A is experiencing this FT key pull issue:
Current BSS: -85dbm
BSS A: -55dbm
BSS B: -80dbm
The current logic would fail A, and roam to B. In this case
reassociation would have likely succeeded so it makes more sense
to reassociate to A as a fallback.
2. If there is only one candidate, but its failing FT. IWD will
never try anything other than FT and repeatedly fail.
Both of the above have been seen on real network deployments and
result in either poor performance (1) or eventually lead to a full
disconnect due to never roaming (2).
Certain return codes, though failures, can indicate that the AP is
just confused or booting up and treating it as a full failure may
not be the best route.
For example in some production deployments if an AP is rebooted it
may take some time for neighboring APs to exchange keys for
current associations. If a client roams during that time it will
reject saying the PMKID is invalid.
Use the ft_associate call return to communicate the status (if any)
that was in the auth/action response. If there was a parsing error
or no response -ENOENT is still returned.
Removed several debug prints which are very verbose and provide
little to no important information.
The get_scan_{done,callback} prints are pointless since all the
parsed scan results are printed by station anyways.
Printing the BSS load is also not that useful since it doesn't
include the BSSID. If anything the BSS load should be included
when station prints out each individual BSS (along with frequency,
rank, etc).
The advertisement protocol print was just just left in there by
accident when debugging, and also provides basically no useful
information.
Some APs don't include the RSNE in the associate reply during
the OWE exchange. This causes IWD to be incompatible since it has
a hard requirement on the AKM being included.
This relaxes the requirement for the AKM and instead warns if it
is not included.
Below is an example of an association reply without the RSN element
IEEE 802.11 Association Response, Flags: ........
Type/Subtype: Association Response (0x0001)
Frame Control Field: 0x1000
.000 0000 0011 1100 = Duration: 60 microseconds
Receiver address: 64:c4:03:88:ff:26
Destination address: 64:c4:03:88:ff:26
Transmitter address: fc:34:97:2b:1b:48
Source address: fc:34:97:2b:1b:48
BSS Id: fc:34:97:2b:1b:48
.... .... .... 0000 = Fragment number: 0
0001 1100 1000 .... = Sequence number: 456
IEEE 802.11 wireless LAN
Fixed parameters (6 bytes)
Tagged parameters (196 bytes)
Tag: Supported Rates 6(B), 9, 12(B), 18, 24(B), 36, 48, 54, [Mbit/sec]
Tag: RM Enabled Capabilities (5 octets)
Tag: Extended Capabilities (11 octets)
Ext Tag: HE Capabilities (IEEE Std 802.11ax/D3.0)
Ext Tag: HE Operation (IEEE Std 802.11ax/D3.0)
Ext Tag: MU EDCA Parameter Set
Ext Tag: HE 6GHz Band Capabilities
Ext Tag: OWE Diffie-Hellman Parameter
Tag Number: Element ID Extension (255)
Ext Tag length: 51
Ext Tag Number: OWE Diffie-Hellman Parameter (32)
Group: 384-bit random ECP group (20)
Public Key: 14ba9d8abeb2ecd5d95e6c12491b16489d1bcc303e7a7fbd…
Tag: Vendor Specific: Broadcom
Tag: Vendor Specific: Microsoft Corp.: WMM/WME: Parameter Element
Reported-By: Wen Gong <quic_wgong@quicinc.com>
Tested-By: Wen Gong <quic_wgong@quicinc.com>
Hostapd commit b6d3fd05e3 changed the PMKID derivation in accordance
with 802.11-2020 which then breaks PMKID validation in IWD. This
breaks the FT-8021x AKM in IWD if the AP uses this hostapd version
since the PMKID doesn't validate during EAPoL.
This updates the PMKID derivation to use the correct SHA hash for
this AKM and adds SHA1 based PMKID checking for interoperability
with older hostapd versions.
The PMKID derivation has gotten messy due to the spec
updating/clarifying the hash size for the FT-8021X AKM. This
has led to hostapd updating the derivation which leaves older
hostapd versions using SHA1 and newer versions using SHA256.
To support this the checksum type is being fed to
handshake_state_get_pmkid so the caller can decide what sha to
use. In addition handshake_state_pmkid_matches is being added
which uses get_pmkid() but handles sorting out the hash type
automatically.
This lets preauthentication use handshake_state_get_pmkid where
there is the potential that a new PMKID is derived and eapol
can use handshake_state_pmkid_matches which only derives the
PMKID to compare against the peers.
The existing API was limited to SHA1 or SHA256 and assumed a key
length of 32 bytes. Since other AKMs plan to be added update
this to take the checksum/length directly for better flexibility.
This is consistent with the over-Air path, and makes it clear when
reading the logs if over-DS was used, if there was a response frame,
and if the frame failed to parse in some way.
Disable power save if the wiphy indicates its needed. Do this
before issuing GET_LINK so the netdev doesn't signal its up until
power save is disabled.
Certain drivers do not handle power save very well resulting in
missed frames, firmware crashes, or other bad behavior. Its easy
enough to disable power save via iw, iwconfig, etc but since IWD
removes and creates the interface on startup it blows away any
previous power save setting. The setting must be done *after* IWD
creates the interface which can be done, but needs to be via some
external daemon monitoring IWD's state. For minimal systems,
e.g. without NetworkManager, it becomes difficult and annoying to
persistently disable power save.
For this reason a new driver flag POWER_SAVE_DISABLE is being
added. This can then be referenced when creating the interfaces
and if set, disable power save.
The driver_infos list in wiphy.c is hard coded and, naturally,
not configurable from a user perspective. As drivers are updated
or added users may be left with their system being broken until the
driver is added, IWD released, and packaged.
This adds the ability to define driver flags inside main.conf under
the "DriverQuirks" group. Keys in this group correspond to values in
enum driver_flag and values are a list of glob matches for specific
drivers:
[DriverQuirks]
DefaultInterface=rtl81*,rtl87*,rtl88*,rtw_*,brcmfmac,bcmsdh_sdmmc
ForcePae=buggy_pae_*
Rather than keep a pointer to the driver_info entry copy the flags
into the wiphy object. This preps for supporting driver flags via
a configuration file, specifically allowing for entries that are a
subset of others. For example:
{ "rtl88*", DEFAULT_IF },
{ "rtl88x2bu", FORCE_PAE },
Before it was not possible to add entires like this since only the
last entry match would get set. Now DEFAULT_IF would get set to all
matches, and FORCE_PAE to only rtl88x2bu. This isn't especially
important for the static list since it could be modified to work
correctly, but will be needed when parsing flags from a
configuration file that may contain duplicates or subsets of the
static list.
If there was some problem during the FT authenticate stage
its nice to know more of what happened: whether the AP didn't
respond, rejected the attempt, or sent an invalid frame/IEs.
In some situations its convenient for the same work item to be
inserted (rescheduled) while its in progress. FT for example does
this now if a roam fails. The same ft_work item gets re-inserted
which, currently, is not safe to do since the item is modified
and removed once completed.
Fix this by introducing wiphy_radio_work_reschedule which is an
explicit API for re-inserting work items from within the do_work
callback.
The wiphy work logic was changed around slightly to remove the item
at the head of the queue prior to starting and note the ID going
into do_work. If do_work signaled done and ID changed we know it
was re-inserted and can skip the destroy logic and move onto the
next item. If the item is not done continue as normal but set the
priority to INT_MIN, as usual, to prevent other items from getting
to the head of the queue.
If IWD connects under bad RF conditions and netconfig takes
a while to complete (e.g. slow DHCP), the roam timeout
could fire before DHCP is done. Then, after the roam,
IWD would transition automatically to connected before
DHCP was finished. In theory DHCP could still complete after
this point but any process depending on IWD's connected
state would be uninformed and assume IP networking is up.
Fix this by stopping netconfig prior to a roam if IWD is not
in a connected state. Then, once the roam either failed or
succeeded, start netconfig again.
When acting as a configurator the enrollee can start on a different
channel than IWD is connected to. IWD will begin the auth process
on this channel but tell the enrollee to transition to the current
channel after the auth request. Since a configurator must be
connected (a requirement IWD enforces) we can assume a channel
transition will always be to the currently connected channel. This
allows us to simply cancel the offchannel request and wait for a
response (rather than start another offchannel).
Doing this improves the DPP performance and reduces the potential
for a lost frame during the channel transition.
This patch also addresses the comment that we should wait for the
auth request ACK before canceling the offchannel. Now a flag is
set and IWD will cancel the offchannel once the ACK is received.
If IWD gets a disconnect during FT the roaming state will be
cleared, as well as any ft_info's during ft_clear_authentications.
This includes canceling the offchannel operation which also
destroys any pending ft_info's if !info->parsed. This causes a
double free afterwards. In addition the l_queue_remove inside the
foreach callback is not a safe operation either.
To fix this don't remove the ft_info inside the offchannel
destroy callback. The info will get freed by ft_associate regardless
of the outcome (parsed or !parsed). This is also consistent with
how the onchannel logic works.
Log and crash backtrace below:
iwd[488]: src/station.c:station_try_next_transition() 5, target aa:46:8d:37:7c:87
iwd[488]: src/wiphy.c:wiphy_radio_work_insert() Inserting work item 16668
iwd[488]: src/wiphy.c:wiphy_radio_work_insert() Inserting work item 16669
iwd[488]: src/wiphy.c:wiphy_radio_work_done() Work item 16667 done
iwd[488]: src/wiphy.c:wiphy_radio_work_next() Starting work item 16668
iwd[488]: src/netdev.c:netdev_mlme_notify() MLME notification Remain on Channel(55)
iwd[488]: src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
iwd[488]: src/netdev.c:netdev_link_notify() event 16 on ifindex 5
iwd[488]: src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
iwd[488]: src/netdev.c:netdev_deauthenticate_event()
iwd[488]: src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
iwd[488]: src/netdev.c:netdev_disconnect_event()
iwd[488]: Received Deauthentication event, reason: 6, from_ap: true
iwd[488]: src/station.c:station_disconnect_event() 5
iwd[488]: src/station.c:station_disassociated() 5
iwd[488]: src/station.c:station_reset_connection_state() 5
iwd[488]: src/station.c:station_roam_state_clear() 5
iwd[488]: double free or corruption (fasttop)
5 0x0000555b3dbf44a4 in ft_info_destroy ()
6 0x0000555b3dbf45b3 in remove_ifindex ()
7 0x0000555b3dc4653c in l_queue_foreach_remove ()
8 0x0000555b3dbd0dd1 in station_reset_connection_state ()
9 0x0000555b3dbd37e5 in station_disassociated ()
10 0x0000555b3dbc8bb8 in netdev_mlme_notify ()
11 0x0000555b3dc4e80b in received_data ()
12 0x0000555b3dc4b430 in io_callback ()
13 0x0000555b3dc4a5ed in l_main_iterate ()
14 0x0000555b3dc4a6bc in l_main_run ()
15 0x0000555b3dc4a8e0 in l_main_run_with_signal ()
16 0x0000555b3dbbe888 in main ()
Hostapd commit bc36991791 now properly sets the secure bit on
message 1/4. This was addressed in an earlier IWD commit but
neglected to allow for backwards compatibility. The check is
fatal which now breaks earlier hostapd version (older than 2.10).
Instead warn on this condition rather than reject the rekey.
Fixes: 7fad6590bd ("eapol: allow 'secure' to be set on rekeys")
The HT40+/- flags were reversed when checking against the 802.11
behavior flags.
HT40+ means the secondary channel is above (+) the primary channel
therefore corresponds to the PRIMARY_CHANNEL_LOWER behavior. And
the opposite for HT40-.
Reported-By: Alagu Sankar <alagusankar@gmail.com>
Use a more appropriate printf conversion string in order to avoid
unnecessary implicit conversion which can lead to a buffer overflow.
Reasons similar to commit:
98b758f893 ("knownnetworks: fix printing SSID in hex")
In the case that the FT target is on the same channel as we're currently
operating on, use ft_authenticate_onchannel instead of ft_authenticate.
Going offchannel in this case can confuse some drivers.
Currently when we try FT-over-Air, the Authenticate frame is always
sent via offchannel infrastructure We request the driver to go
offchannel, then send the Authenticate frame. This works fine as long
as the target AP is on a different channel. On some networks some (or
all) APs might actually be located on the same channel. In this case
going offchannel will result in some drivers not actually sending the
Authenticate frame until after the offchannel operation completes.
Work around this by introducing a new ft_authenticate variant that will
not request an offchannel operation first.
Force conversion to unsigned char before printing to avoid sign
extension when printing SSID in hex. For example, if there are CJK
characters in SSID, it will generate a very long string like
/net/connman/iwd/ffffffe8ffffffaeffffffa1.
If a very long ssid was used (e.g. CJK characters in SSID), it might do
out of bounds write to static variable for lack of checking the position
before the last snprintf() call.
Seeing that some authenticators can't handle TLS session caching
properly, allow the EAP-TLS-based methods session caching support to be
disabled per-network using a method specific FastReauthentication setting.
Defaults to true.
With the previous commit, authentication should succeed at least every
other attempt. I'd also expect that EAP-TLS is not usually affected
because there's no phase2, unlike with EAP-PEAP/EAP-TTLS.
If we have a TLS session cached from this attempt or a previous
successful connection attempt but the overall EAP method fails, forget
the session to improve the chances that authentication succeeds on the
next attempt considering that some authenticators strangely allow
resumption but can't handle it all the way to EAP method success.
Logically the session resumption in the TLS layers on the server should
be transparent to the EAP layers so I guess those may be failed
attempts to further optimise phase 2 when the server thinks it can
already trust the client.
The extra IE length for the WMM IE was being set to 26 which is
the HT IE length, not WMM. Fix this and use the proper size for
the WMM IE of 50 bytes.
This shouldn't have caused any problems prior as the tail length
is always allocated with 256 or 512 extra bytes of headroom.
Since channels numbers are used as indexes into the array, and given
that channel numbers start at '1' instead of 0, make sure to allocate a
buffer large enough to not overflow when the max channel number for a
given band is accessed.
src/manager.c:manager_wiphy_dump_callback() New wiphy phy1 added (1)
==22290== Invalid write of size 2
==22290== at 0x4624B2: nl80211_parse_supported_frequencies (nl80211util.c:570)
==22290== by 0x417CA5: parse_supported_bands (wiphy.c:1636)
==22290== by 0x418594: wiphy_parse_attributes (wiphy.c:1805)
==22290== by 0x418E20: wiphy_update_from_genl (wiphy.c:1991)
==22290== by 0x464589: manager_wiphy_dump_callback (manager.c:564)
==22290== by 0x4CBDDA: process_unicast (genl.c:944)
==22290== by 0x4CC19C: received_data (genl.c:1056)
==22290== by 0x4C7140: io_callback (io.c:120)
==22290== by 0x4C5A97: l_main_iterate (main.c:476)
==22290== by 0x4C5BDC: l_main_run (main.c:523)
==22290== by 0x4C5F0F: l_main_run_with_signal (main.c:645)
==22290== by 0x40503B: main (main.c:600)
==22290== Address 0x4aa76ec is 0 bytes after a block of size 28 alloc'd
==22290== at 0x48417B5: malloc (vg_replace_malloc.c:393)
==22290== by 0x4BC4D1: l_malloc (util.c:62)
==22290== by 0x417BE4: parse_supported_bands (wiphy.c:1619)
==22290== by 0x418594: wiphy_parse_attributes (wiphy.c:1805)
==22290== by 0x418E20: wiphy_update_from_genl (wiphy.c:1991)
==22290== by 0x464589: manager_wiphy_dump_callback (manager.c:564)
==22290== by 0x4CBDDA: process_unicast (genl.c:944)
==22290== by 0x4CC19C: received_data (genl.c:1056)
==22290== by 0x4C7140: io_callback (io.c:120)
==22290== by 0x4C5A97: l_main_iterate (main.c:476)
==22290== by 0x4C5BDC: l_main_run (main.c:523)
==22290== by 0x4C5F0F: l_main_run_with_signal (main.c:645)
==22290==