With the addition of 6GHz '6000' is no longer the maximum frequency
that could be in .known_network.freq. For more robustness
band_freq_to_channel is used to validate the frequency.
Scanning while in AP mode is somewhat of an edge case, but it does
have some usefulness specifically with onboarding new devices, i.e.
a new device starts an AP, a station connects and provides the new
device with network credentials, the new device switches to station
mode and connects to the desired network.
In addition this could be used later for ACS (though this is a bit
overkill for IWD's access point needs).
Since AP performance is basically non-existant while scanning this
feature is meant to be used in a limited scope.
Two DBus API's were added which mirror the station interface: Scan and
GetOrderedNetworks.
Scan is no different than the station variant, and will perform an active
scan on all channels.
GetOrderedNetworks diverges from station and simply returns an array of
dictionaries containing basic information about networks:
{
Name: <ssid>
SignalStrength: <mBm>
Security: <psk, open, or 8021x>
}
Limitations:
- Hidden networks are not supported. This isn't really possible since
the SSID's are unknown from the AP perspective.
- Sharing scan results with station is not supported. This would be a
convenient improvement in the future with respect to onboarding new
devices. The scan could be performed in AP mode, then switch to
station and connect immediately without needing to rescan. A quick
hack could better this situation by not flushing scan results in
station (if the kernel retains these on an iftype change).
This was already implemented in station but with no dependency on
that module at all. AP will need this for a scanning API so its
being moved into scan.c.
The 802.11ax standards adds some restrictions for the 6GHz band. In short
stations must use SAE, OWE, or 8021x on this band and frame protection is
required.
All uses of this macro will work with a bitwise comparison which is
needed for 6GHz checks and somewhat more flexible since it can be
used to compare RSN info, not only single AKM values.
This adds checks if MFP is set to 0 or 1:
0 - Always fail if the frequency is 6GHz
1 - Fail if MFPC=0 and the frequency is 6GHz.
If HW is capable set MFPR=1 for 6GHz
This is a new band defined in the WiFi 6E (ax) amendment. A completely
new value is needed due to channel reuse between 2.4/5 and 6GHz.
util.c needed minimal updating to prevent compile errors which will
be fixed later to actually handle this band. WSC also needed a case
added for 6GHz but the spec does not outline any RF Band value for
6GHz so the 5GHz value will be returned in this case.
sae.c was failing to build on some platforms:
error: implicit declaration of function 'reallocarray'; did you mean 'realloc'?
[-Werror=implicit-function-declaration]
In certain rare cases IWD gets a link down event before nl80211 ever sends
a disconnect event. Netdev notifies station of the link down which causes
station to be freed, but netdev remains in the same state. Then later the
disconnect event arrives and netdev still thinks its connected, calls into
(the now freed) station object and causes a crash.
To fix this netdev_connect_free() is now called on any link down events
which will reset the netdev object to a proper state.
src/netdev.c:netdev_link_notify() event 16 on ifindex 16
src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
src/netdev.c:netdev_link_notify() event 16 on ifindex 16
src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
src/netdev.c:netdev_deauthenticate_event()
src/netdev.c:netdev_link_notify() event 16 on ifindex 16
src/station.c:station_free()
src/netconfig.c:netconfig_destroy()
src/resolve.c:resolve_systemd_revert() ifindex: 16
src/station.c:station_roam_state_clear() 16
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
src/netdev.c:netdev_disconnect_event()
Received Deauthentication event, reason: 3, from_ap: false
0 0x472fa4 in station_disconnect_event src/station.c:2916
1 0x472fa4 in station_netdev_event src/station.c:2954
2 0x43a262 in netdev_disconnect_event src/netdev.c:1213
3 0x43a262 in netdev_mlme_notify src/netdev.c:5471
4 0x6706eb in process_multicast ell/genl.c:1029
5 0x6706eb in received_data ell/genl.c:1096
6 0x65e630 in io_callback ell/io.c:120
7 0x65a94e in l_main_iterate ell/main.c:478
8 0x65b0b3 in l_main_run ell/main.c:525
9 0x65b0b3 in l_main_run ell/main.c:507
10 0x65b5cc in l_main_run_with_signal ell/main.c:647
11 0x4124d7 in main src/main.c:532
The difference between the existing code is that IWD will send the
authentication request, making it the initiator.
This handles the use case where IWD is provided a peers URI containing
its bootstrapping key rather than IWD always providing its own URI.
A new DBus API was added, ConfigureEnrollee().
Using ConfigureEnrollee() IWD will act as a configurator but begin by
traversing a channel list (URI provided or default) and waiting for
presence announcements (with one caveat). When an announcement is
received IWD will send an authentication request to the peer, receive
its reply, and send an authentication confirm.
As with being a responder, IWD only supports configuration to the
currently connected BSS and will request the enrollee switch to this
BSS's frequency to preserve network performance.
The caveat here is that only one driver (ath9k) supports multicast frame
registration which prevents presence frame from being received. In this
case it will be required the the peer URI contains a MAC and channel
information. This is because IWD will jump right into sending auth
requests rather than waiting for a presence announcement.
The frame watch which covers the presence procedure (and most
frames for that matter) needs to support multicast frames for
presence to work. Doing this in frame-xchg seems like the right
choice but only ath9k supports multicast frame registration.
Because of this limited support DPP will register for these frames
manually.
Parses K (key), M (mac), C (class/channels), and V (version) tokens
into a new structure dpp_uri_info. H/I are not parsed since there
currently isn't any use for them.
This was caught by static analysis. As is common this should never
happen in the real world since the only way this can fail (apart from
extreme circumstances like OOM) is if the key size is incorrect, which
it will never be.
Static analysis flagged that 'path' was never being checked (which
should not ever be NULL) but during that review I noticed stat()
was being called, then fstat afterwards.
Recently systemd added the ability to pass secret credentials to
services via LoadCredentialEncrypted/SetCredentialEncrypted. Once
set up the service is able to read the decrypted credentials from
a file. The file path is found in the environment variable
CREDENTIALS_DIRECTORY + an identifier. The value of SystemdEncrypt
should be set to the systemd key ID used when the credential was
created.
When SystemdEncrypt is set IWD will attempt to read the decrypted
secret from systemd. If at any point this fails warnings will be
printed but IWD will continue normally. Its expected that any failures
will result in the inability to connect to any networks which have
previously encrypted the passphrase/PSK without re-entering
the passphrase manually. This could happen, for example, if the
systemd secret was changed.
Once the secret is read in it is set into storage to be used for
profile encryption/decryption.
Using storage_decrypt() hotspot can also support profile encyption.
The hotspot consortium name is used as the 'ssid' since this stays
consistent between hotspot networks for any profile.
Some users don't like the idea of storing network credentials in
plaintext on the file system. This patch implements an option to
encrypt such profiles using a secret key. The origin of the key can in
theory be anything, but would typically be provided by systemd via
'LoadEncryptedCredential' setting in the iwd unit file.
The encryption operates on the entire [Security] group as well as all
embedded groups. Once encrypted the [Security] group will be replaced
with two key/values:
EncryptedSalt - A random string of bytes used for the encryption
EncryptedSecurity - A string of bytes containing the encrypted
[Security] group, as well as all embedded groups.
After the profile has been encrypted these values should not be
modified. Note that any values added to [Security] after encryption
has no effect. Once the profile is encrypted there is no way to modify
[Security] without manually decrypting first, or just re-creating it
entirely which effectively treated a 'new' profile.
The encryption/decryption is done using AES-SIV with a salt value and
the network SSID as the IV.
Once a key is set any profiles opened will automatically be encrypted
and re-written to disk. Modules using network_storage_open will be
provided the decrypted profile, and will be unaware it was ever
encrypted in the first place. Similarly when network_storage_sync is
called the profile will by automatically encrypted and written to disk
without the caller needing to do anything special.
A few private storage.c helpers were added to serve several purposes:
storage_init/exit():
This sets/cleans up the encryption key direct from systemd then uses
extract and expand to create a new fixed length key to perform
encryption/decryption.
__storage_decrypt():
Low level API to decrypt an l_settings object using a previously set
key and the SSID/name for the network. This returns a 'changed' out
parameter signifying that the settings need to be encrypted and
re-written to disk. The purpose of exposing this is for a standalone
decryption tool which does not re-write any settings.
storage_decrypt():
Wrapper around __storage_decrypt() that handles re-writing a new
profile to disk. This was exposed in order to support hotspot profiles.
__storage_encrypt():
Encrypts an l_settings object and returns the full profile as data
This got merged without a few additional fixes, in particular an
over 80 character line and incorrect length check.
Fixes: d8116e8828 ("dpp-util: add dpp_point_from_asn1()")
When we detect a new phy being added, we schedule a filtered dump of
the newly detected WIPHY and associated INTERFACEs. This code path and
related processing of the dumps was mostly shared with the un-filtered
dump of all WIPHYs and INTERFACEs which is performed when iwd starts.
This normally worked fine as long as a single WIPHY was created at a
time. However, if multiphy new phys were detected in a short amount of
time, the logic would get confused and try to process phys that have not
been probed yet. This resulted in iwd trying to create devices or not
detecting devices properly.
Fix this by only processing the target WIPHY and related INTERFACEs
when the filtered dump is performed, and not any additional ones that
might still be pending.
While here, remove a misleading comment:
manager_wiphy_check_setup_done() would succeed only if iwd decided to
keep the default interfaces created by the kernel.
This debug print was before any checks which could bail out prior to
autoconnect starting. This was confusing because debug logs would
contain multiple "station_autoconnect_start()" prints making you think
autoconnect was started several times.
The periodic scan code was refactored to make normal scans and
periodic scans consistent by keeping both in the same queue. But
that change left out the abort path where periodic scans were not
actually removed from the queue.
This fixes a rare crash when a periodic scan has been triggered and
the device goes down. This path never removes the request from the
queue but still frees it. Then when the scan context is removed the
stale request is freed again.
0 0x4bb65b in scan_request_cancel src/scan.c:202
1 0x64313c in l_queue_clear ell/queue.c:107
2 0x643348 in l_queue_destroy ell/queue.c:82
3 0x4bbfb7 in scan_context_free src/scan.c:209
4 0x4c9a78 in scan_wdev_remove src/scan.c:2115
5 0x42fecd in netdev_free src/netdev.c:965
6 0x445827 in netdev_destroy src/netdev.c:6507
7 0x52beb9 in manager_config_notify src/manager.c:765
8 0x67084b in process_multicast ell/genl.c:1029
9 0x67084b in received_data ell/genl.c:1096
10 0x65e790 in io_callback ell/io.c:120
11 0x65aaae in l_main_iterate ell/main.c:478
12 0x65b213 in l_main_run ell/main.c:525
13 0x65b213 in l_main_run ell/main.c:507
14 0x65b72c in l_main_run_with_signal ell/main.c:647
15 0x4124e7 in main src/main.c:532
If netdev_connect_failed is called before netdev_get_oci_cb() the
netdev's handshake will be destroyed and ultimately crash when the
callback is called.
This patch moves the cancelation into netdev_connect_free rather than
netdev_free.
++++++++ backtrace ++++++++
0 0x7f4e1787d320 in /lib64/libc.so.6
1 0x42634c in handshake_state_set_chandef() at src/handshake.c:1057
2 0x40a11b in netdev_get_oci_cb() at src/netdev.c:2387
3 0x483d7b in process_unicast() at ell/genl.c:986
4 0x480d3c in io_callback() at ell/io.c:120
5 0x48004d in l_main_iterate() at ell/main.c:472 (discriminator 2)
6 0x4800fc in l_main_run() at ell/main.c:521
7 0x48032c in l_main_run_with_signal() at ell/main.c:649
8 0x403e95 in main() at src/main.c:532
9 0x7f4e17867b75 in /lib64/libc.so.6
+++++++++++++++++++++++++++
Commit 4d2176df29 ("handshake: Allow event handler to free handshake")
introduced a re-entrancy guard so that handshake_state objects that are
destroyed as a result of the event do not cause a crash. It rightly
used a temporary object to store the passed in handshake. Unfortunately
this caused variable shadowing which resulted in crashes fixed by commit
d22b174a73 ("handshake: use _hs directly in handshake_event").
However, since the temporary was no longer used, this fix itself caused
a crash:
#0 0x00005555f0ba8b3d in eapol_handle_ptk_1_of_4 (sm=sm@entry=0x5555f2b4a920, ek=0x5555f2b62588, ek@entry=0x16, unencrypted=unencrypted@entry=false) at src/eapol.c:1236
1236 handshake_event(sm->handshake,
(gdb) bt
#0 0x00005555f0ba8b3d in eapol_handle_ptk_1_of_4 (sm=sm@entry=0x5555f2b4a920, ek=0x5555f2b62588, ek@entry=0x16, unencrypted=unencrypted@entry=false) at src/eapol.c:1236
#1 0x00005555f0bab118 in eapol_key_handle (unencrypted=<optimized out>, frame=<optimized out>, sm=0x5555f2b4a920) at src/eapol.c:2343
#2 eapol_rx_packet (proto=<optimized out>, from=<optimized out>, frame=<optimized out>, unencrypted=<optimized out>, user_data=0x5555f2b4a920) at src/eapol.c:2665
#3 0x00005555f0bac497 in __eapol_rx_packet (ifindex=62, src=src@entry=0x5555f2b62574 "x\212 J\207\267", proto=proto@entry=34958, frame=frame@entry=0x5555f2b62588 "\002\003",
len=len@entry=121, noencrypt=noencrypt@entry=false) at src/eapol.c:3017
#4 0x00005555f0b8c617 in netdev_control_port_frame_event (netdev=0x5555f2b64450, msg=0x5555f2b62588) at src/netdev.c:5574
#5 netdev_unicast_notify (msg=msg@entry=0x5555f2b619a0, user_data=<optimized out>) at src/netdev.c:5613
#6 0x00007f60084c9a51 in dispatch_unicast_watches (msg=0x5555f2b619a0, id=<optimized out>, genl=0x5555f2b3fc80) at ell/genl.c:954
#7 process_unicast (nlmsg=0x7fff61abeac0, genl=0x5555f2b3fc80) at ell/genl.c:973
#8 received_data (io=<optimized out>, user_data=0x5555f2b3fc80) at ell/genl.c:1098
#9 0x00007f60084c61bd in io_callback (fd=<optimized out>, events=1, user_data=0x5555f2b3fd20) at ell/io.c:120
#10 0x00007f60084c536d in l_main_iterate (timeout=<optimized out>) at ell/main.c:478
#11 0x00007f60084c543e in l_main_run () at ell/main.c:525
#12 l_main_run () at ell/main.c:507
#13 0x00007f60084c5670 in l_main_run_with_signal (callback=callback@entry=0x5555f0b89150 <signal_handler>, user_data=user_data@entry=0x0) at ell/main.c:647
#14 0x00005555f0b886a4 in main (argc=<optimized out>, argv=<optimized out>) at src/main.c:532
This happens when the driver does not support rekeying, which causes iwd to
attempt a disconnect and re-connect. The disconnect action is
taken during the event callback and destroys the underlying eapol state
machine. Since a temporary isn't used, attempting to dereference
sm->handshake results in a crash.
Fix this by introducing a UNIQUE_ID macro which should prevent shadowing
and using a temporary variable as originally intended.
Fixes: d22b174a73 ("handshake: use _hs directly in handshake_event")
Fixes: 4d2176df29 ("handshake: Allow event handler to free handshake")
Reported-By: Toke Høiland-Jørgensen <toke@toke.dk>
Tested-by: Toke Høiland-Jørgensen <toke@toke.dk>
There is no need to punch the holes for netdev/wheel groups to send to
the .Agent interface. This is only done by the iwd daemon itself and
the policy for user 'root' already takes care of this.
A select few drivers send this instead of SIGNAL_MBM. The docs say this
value is the signal 'in unspecified units, scaled to 0..100'. The range
for SIGNAL_MBM is -10000..0 so this can be scaled to the MBM range easy
enough...
Now, this isn't exactly correct because this value ultimately gets
returned from GetOrderedNetworks() and is documented as 100 * dBm where
in reality its just a unit-less signal strength value. Its not ideal, but
this patch at least will fix BSS ranking for these few drivers.
The 'at_console' D-Bus policy setting has been deprecated for more then
10 years and could be ignored at any time in the future. Moreover, while
the intend was to allow locally logged on users to interact with iwd, it
didn't actually do that.
More info at https://www.spinics.net/lists/linux-bluetooth/msg75267.html
and https://gitlab.freedesktop.org/dbus/dbus/-/issues/52
Therefor remove the 'at_console' setting block.
On Debian (based) systems, there is a standard defined group which is
allowed to manage network interfaces, and that is the 'netdev' group.
So add a D-Bus setting block to grant the 'netdev' group that access.
Building on GCC 8 resulted in this compiler error.
src/sae.c:107:25: error: implicit declaration of function 'reallocarray';
did you mean 'realloc'? [-Werror=implicit-function-declaration]
sm->rejected_groups = reallocarray(NULL, 2, sizeof(uint16_t));
src/erp.c:134:10: error: comparison of integer expressions of different
signedness: 'unsigned int' and 'int' [-Werror=sign-compare]
src/eap-ttls.c:378:10: error: comparison of integer expressions of different signedness: 'uint32_t' {aka 'unsigned int'} and 'int' [-Werror=sign-compare]
Fixes the following crash:
#0 0x000211c4 in netdev_connect_event (msg=<optimized out>, netdev=0x2016940) at src/netdev.c:2915
#1 0x76f11220 in process_multicast (nlmsg=0x7e8acafc, group=<optimized out>, genl=<optimized out>) at ell/genl.c:1029
#2 received_data (io=<optimized out>, user_data=<optimized out>) at ell/genl.c:1096
#3 0x76f0da08 in io_callback (fd=<optimized out>, events=1, user_data=0x200a560) at ell/io.c:120
#4 0x76f0ca78 in l_main_iterate (timeout=<optimized out>) at ell/main.c:478
#5 0x76f0cb74 in l_main_run () at ell/main.c:525
#6 l_main_run () at ell/main.c:507
#7 0x76f0cdd4 in l_main_run_with_signal (callback=callback@entry=0x18c94 <signal_handler>, user_data=user_data@entry=0x0)
at ell/main.c:647
#8 0x00018178 in main (argc=<optimized out>, argv=<optimized out>) at src/main.c:532
This crash was introduced in commit:
4d2176df29 ("handshake: Allow event handler to free handshake")
The culprit seems to be that 'hs' is being used both in the caller and
in the macro. Since the macro defines a variable 'hs' in local block
scope, it overrides 'hs' from function scope. Yet (_hs) still evaluates
to 'hs' leading the local variable to be initialized with itself. Only
the 'handshake_event(hs, HANDSHAKE_EVENT_SETTING_KEYS))' is affected
since it is the only macro invocation that uses 'hs' from function
scope. Thus, the crash would only happen on hardware supporting handshake
offload (brcmfmac).
Fix this by removing the local scope variable declaration and evaluate
(_hs) instead.
Fixes: 4d2176df29 ("handshake: Allow event handler to free handshake")
- Ensure that input isn't an empty string
- Ensure that EINVAL errno (which could be optionally returned by
strto{ul|l} is also checked.
- Since strtoul allows '+' and '-' characters in input, ensure that
input which is expected to be an unsigned number doesn't start with
'-'
Given an ASN1 blob of the right form, parse and create
an l_ecc_point object. The form used is specific to DPP
hence why this isn't general purpose and put into dpp-util.
Like in ap.c, allow the event callback to mark the handshake state as
destroyed, without causing invalid accesses after the callback has
returned. In this case the crash was because try_handshake_complete
needed to access members of handshake_state after emitting the event,
as well as access the netdev, which also has been destroyed:
==257707== Invalid read of size 8
==257707== at 0x408C85: try_handshake_complete (netdev.c:1487)
==257707== by 0x408C85: try_handshake_complete (netdev.c:1480)
(...)
==257707== Address 0x4e187e8 is 856 bytes inside a block of size 872 free'd
==257707== at 0x484621F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==257707== by 0x437887: ap_stop_handshake (ap.c:151)
==257707== by 0x439793: ap_del_station (ap.c:316)
==257707== by 0x43EA92: ap_station_disconnect (ap.c:3411)
==257707== by 0x43EA92: ap_station_disconnect (ap.c:3399)
==257707== by 0x454276: p2p_group_event (p2p.c:1006)
==257707== by 0x439147: ap_event (ap.c:281)
==257707== by 0x4393AB: ap_new_rsna (ap.c:390)
==257707== by 0x4393AB: ap_handshake_event (ap.c:1010)
==257707== by 0x408C7F: try_handshake_complete (netdev.c:1485)
==257707== by 0x408C7F: try_handshake_complete (netdev.c:1480)
(...)
Previously we added logic to defer doing anything in ap_free() to after
the AP event handler has returned so that ap_event() has a chance to
inform whoever called it that the ap_state has been freed. But there's
also a chance that the event handler is destroying both the AP and the
netdev it runs on, so after the handler has returned we can't even use
netdev_get_wdev_id or netdev_get_ifindex. The easiest solution seems to
be to call ap_reset() in ap_free() even if we're within an event handler
to ensure we no longer need any external objects. Also make sure
ap_reset() can be called multiple times.
Another option would be to watch for NETDEV_WATCH_EVENT_DEL and remove
our reference to the netdev (because there's no need actually call
l_rtnl_ifaddr_delete or frame_watch_wdev_remove if the netdev was
destroyed -- frame_watch already tracks netdev removals), or to save
just the ifindex and the wdev id...
The purpose of this was to have a single utility to both cancel an
existing offchannel operation (if one exists) and start a new one.
The problem was the previous offchannel operation was being canceled
first which opened up the radio work queue to other items. This is
not desireable as, for example, a scan would end up breaking the
DPP protocol most likely.
Starting the new offchannel then canceling is the correct order of
operations but to do this required saving the new ID, canceling, then
setting offchannel_id to the new ID so dpp_presence_timeout wouldn't
overwrite the new ID to zero.
This also removes an explicit call to offchannel_cancel which is
already done by dpp_offchannel_start.
Several members are named based on initiator/responder (i/r)
terminology. Eventually both initiator and responder will be
supported so rename these members to use own/peer naming
instead.
ASN1 parsing will soon be required which will need some utilities in
asn1-private.h. To avoid duplication include this private header and
replace the OID's with the defined structures as well as remove the
duplicated macros.
station_set_scan_results takes an autoconnect flag which was being
set true in both regular/quick autoconnect scans. Since OWE networks
are processed after setting the scan results IWD could end up
connecting to a network before all the OWE hidden networks are
populated.
To fix this regular/quick autoconnect results will set the flag to
false, then process OWE networks, then start autoconnect. If any
OWE network scans are pending station_autoconnect_start will fail
but will pick back up after the hidden OWE scan.
scan_request_failed and scan_finished remove the finished scan_request
from the request queue right away, before calling the callback. This
breaks those clients that rely on scan_cancel working on such requests
(i.e. to force the destroy callback to be invoked synchronously, see
a0911ca778 ("station: Make sure roam_scan_id is always canceled").
Fix this by removing the scan_request from the request queue after
invoking the callback. Also provide a re-entrancy guard that will make
sure that the scan_request isn't removed in scan_cancel itself.
There are similar operations being performed but with different
callbacks and userdata, depending on whether 'sr' is NULL or not.
Optimize the function flow slightly to make if-else unnecessary.
While here, update the comment. periodic scans are now scheduled only
based on the periodic timeout timer.
If periodic scan is active and we receive a SCAN_ABORTED event, we would
still invoke the periodic scan callback with an error. This is rather
pointless since the periodic scan callback cannot do anything useful
with this information. Fix that.
We should never reach a point where NEW_SCAN_RESULTS or SCAN_ABORTED are
received before a corresponding TRIGGER_SCAN is received. Even if this
does happen, there's no harm from processing the commands anyway.
This makes it a little easier to book-keep the started variable. Since
scan_request already has a 'passive' bit-field, there should be no
storage penalty.
If scan_cancel is called on a scan_request that is 'finished' but with
the GET_SCAN command still in flight, it will trigger a crash as
follows:
Received Deauthentication event, reason: 2, from_ap: true
src/station.c:station_disconnect_event() 11
src/station.c:station_disassociated() 11
src/station.c:station_reset_connection_state() 11
src/station.c:station_roam_state_clear() 11
src/scan.c:scan_cancel() Trying to cancel scan id 6 for wdev 200000002
src/scan.c:scan_cancel() Scan is at the top of the queue, but not triggered
src/scan.c:get_scan_done() get_scan_done
Aborting (signal 11) [/home/denkenz/iwd-master/src/iwd]
++++++++ backtrace ++++++++
#0 0x7f9871aef3f0 in /lib64/libc.so.6
#1 0x41f470 in station_roam_scan_notify() at /home/denkenz/iwd-master/src/station.c:2285
#2 0x43936a in scan_finished() at /home/denkenz/iwd-master/src/scan.c:1709
#3 0x439495 in get_scan_done() at /home/denkenz/iwd-master/src/scan.c:1739
#4 0x4bdef5 in destroy_request() at /home/denkenz/iwd-master/ell/genl.c:676
#5 0x4c070b in l_genl_family_cancel() at /home/denkenz/iwd-master/ell/genl.c:1960
#6 0x437069 in scan_cancel() at /home/denkenz/iwd-master/src/scan.c:842
#7 0x41dc2e in station_roam_state_clear() at /home/denkenz/iwd-master/src/station.c:1594
#8 0x41dd2b in station_reset_connection_state() at /home/denkenz/iwd-master/src/station.c:1619
#9 0x41dea4 in station_disassociated() at /home/denkenz/iwd-master/src/station.c:1644
The happens because get_scan_done callback is still called as a result of
l_genl_cancel. Add a re-entrancy guard in the form of 'canceled'
variable in struct scan_request. If set, get_scan_done will skip invoking
scan_finished.
It isn't clear what 'l_queue_peek_head() == results->sr' check was trying
to accomplish. If GET_SCAN dump was scheduled, then it should be
reported. Drop it.
results->sr is set to NULL for 'opportunistic' scans which were
triggered externally. See scan_notify() for details. However,
get_scan_done would only invoke scan_finished (and thus the periodic
scan callback sc->sp.callback) only if the scan queue was empty. It
should do so in all cases.
The point type was being hard coded to 0x3 (BIT1) which may have resulted
in the peer subtracting Y from P when reading in the point (depending on
if Y was odd or not).
Instead set the compressed type to whatever avoids the subtraction which
both saves IWD from needing to do it, as well as the peer.
The intent of this check is to make sure that at least 2 bytes are
available for reading. However, the unintended consequence is that tags
with a zero length at the end of input would be rejected.
While here, rework the check to be more resistant to potential
overflow conditions.
The DPP spec says nothing about how to handle re-transmits but it
was found in testing this can happen relatively easily for a few
reasons.
If the configurator requests a channel switch but does not get onto
the new channel quick enough the enrollee may have already sent the
authenticate response and it was missed. Also by nature of how the
kernel goes offchannel there are moments in time between ROC when
the card is idle and not receiving any frames.
Only frames where there was no ACK will be retransmitted. If the
peer received the frame and dropped it resending the same frame wont
do any good.
Now the result is sent immediately. Prior a connect attempt or
scan could have started, potentially losing this frame. In addition
the offchannel operation is cancelled after sending the result
which will allow the subsequent connect or scan to happen much
faster since it doesn't have to wait for ROC to expire.
The previous (incorrect) else was removed since it ended up
printing in most cases since the if clause returned. This should
have been an else if conditional from the start and only print if the
station device was not found.
IWD may be in the middle of some long operation, e.g. scanning.
If the URI is returned before IWD is ready, a configurator could
start sending frames and IWD either wont receive them, or will
be unable to respond quickly.
The offchannel priority was also changed to zero, which matches the
priority of frames. Currently there should be no interaction between
offchannel and connect (previous offchannel priority).
Periodic scans were handled specially where they were only
started if no other requests were pending in the scan queue.
This is fine, and what we want, but this can actually be
handled automatically by nature of the wiphy work queue rather
than needing to check the request queue explicitly.
Instead we can insert periodic scans at a lower priority than
other scans. This puts them at the end of the work queue, as
well as allows future requests to jump ahead if a periodic scan
has not yet started.
Eventually, once all pending scans are done, the peridoic scan
may begin. This is no different than the preivous behavior and
avoids the need for any special checks once scan requests
complete.
One check was added to address the problem of the periodic scan
timer firing before the scan could even start. Currently this
happened to be handled fine in scan_periodic_queue, as it checks
the queue length. Since this check was removed we must see check
for this condition inside scan_periodic_timeout.
This adds a priority argument to scan_common rather than hard
coding it when inserting the work item and uses the newly
defined wiphy priority for scanning.
Work priority was never explicitly defined anywhere, and a module
using wiphy_radio_work APIs needed to ensure it was not inserting
at a priority that would interfere with other work.
Now all the types of work have been defined with their own priority
and future priorities can easily be added before, after, or in
between existing priorities.