It was found that if the user cancels/disconnects the agent prior to
entering credentials, IWD would get stuck and could no longer accept
any connect calls with the error "Operation already in progress".
For example exiting iwctl in the Password prompt would cause this:
iwctl
$ station wlan0 connect myssid
$ Password: <Ctrl-C>
This was due to the agent never calling the network callback in the
case of an agent disconnect. Network would wait indefinitely for the
credentials, and disallow any future connect attempts.
To fix this, agent_finalize_pending can be called in agent_disconnect
with a NULL reply, which behaves the same as if there was an
internal timeout and ultimately allows network to fail the connection.
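The effect of the fix can be illustrated with a small model. This is
not IWD's C code: it is a hypothetical Python sketch, and every class
and method name in it is an assumption made for illustration. It only
shows why delivering an empty (NULL) reply on agent disconnect lets a
later connect attempt through.

```python
class Network:
    """Toy stand-in for the network object waiting on credentials."""

    def __init__(self):
        self.connect_pending = False
        self.last_error = None

    def connect(self, agent):
        if self.connect_pending:
            return "Operation already in progress"
        self.connect_pending = True
        agent.request_passphrase(self.on_passphrase)
        return "pending"

    def on_passphrase(self, reply):
        # A None reply models the NULL reply: the connection fails and
        # the pending state is cleared so a later connect can proceed.
        self.connect_pending = False
        if reply is None:
            self.last_error = "aborted"


class Agent:
    """Toy stand-in for the agent holding one pending request."""

    def __init__(self):
        self.pending_cb = None

    def request_passphrase(self, callback):
        self.pending_cb = callback

    def finalize_pending(self, reply):
        callback, self.pending_cb = self.pending_cb, None
        if callback is not None:
            callback(reply)

    def disconnect(self):
        # The fix: always notify the waiter, as a timeout would.
        self.finalize_pending(None)


network, agent = Network(), Agent()
first = network.connect(agent)    # user is shown the password prompt
agent.disconnect()                # user presses Ctrl-C
second = network.connect(agent)   # accepted instead of rejected
```

Without the call in disconnect(), connect_pending would stay set and
the second connect would return "Operation already in progress".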
The 8021x offloading procedure still does EAP in userspace which
negotiates the PMK. The kernel then expects to obtain this PMK
from userspace by calling SET_PMK. This then allows the firmware
to begin the 4-way handshake.
Using __eapol_install_set_pmk_func to install netdev_set_pmk,
netdev now gets called into once EAP finishes and can begin
the final userspace actions prior to the firmware starting
the 4-way handshake:
- SET_PMK using PMK negotiated with EAP
- Emit SETTING_KEYS event
- netdev_connect_ok
One thing to note is that the kernel provides no way of knowing if
the 4-way handshake completed. Assuming SET_PMK/SET_STATION come
back with no errors, IWD assumes the PMK was valid. If not, or
due to some other issue in the 4-way, the kernel will send a
disconnect.
This adds a new type for 8021x offload as well as support in
building CMD_CONNECT.
As described in the comment, 8021x offloading is not particularly
similar to PSK as far as the code flow in IWD is concerned. There
still needs to be an eapol_sm due to EAP being done in userspace.
This throws somewhat of a wrench into our 'is_offload' cases. And
as such this connection type is handled specially.
802.1x offloading needs a way to call SET_PMK after EAP finishes.
In the same manner as set_tk/gtk/igtk a new 'install_pmk' function
was added which eapol can call into after EAP completes.
The chances were extremely low, but using l_idle_oneshot
could end up causing an invalid memory access if the netdev
went down while waiting for the disconnect idle callback.
Instead netdev can keep track of the idle with l_idle_create
and remove it if the netdev goes down prior to the idle callback.
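A minimal sketch of the pattern, written as a Python model rather than
ELL/IWD C code. MainLoop, Netdev and their methods are hypothetical
names chosen to mirror the description; only the idea (keep a handle
to the idle and remove it when the device goes away) comes from the
text above.

```python
class MainLoop:
    """Toy main loop that runs idle callbacks once per iteration."""

    def __init__(self):
        self._idles = {}
        self._next_id = 1

    def idle_create(self, callback):
        idle_id = self._next_id
        self._next_id += 1
        self._idles[idle_id] = callback
        return idle_id

    def idle_remove(self, idle_id):
        self._idles.pop(idle_id, None)

    def iterate(self):
        pending, self._idles = self._idles, {}
        for callback in pending.values():
            callback()


class Netdev:
    def __init__(self, loop):
        self.loop = loop
        self.disconnect_idle = None
        self.callbacks_run = 0

    def disconnect(self):
        # Keep the handle so the idle can be cancelled later.
        self.disconnect_idle = self.loop.idle_create(self._disconnect_cb)

    def _disconnect_cb(self):
        self.disconnect_idle = None
        self.callbacks_run += 1

    def going_down(self):
        # Remove the idle so it never runs against a freed netdev.
        if self.disconnect_idle is not None:
            self.loop.idle_remove(self.disconnect_idle)
            self.disconnect_idle = None


loop = MainLoop()
removed = Netdev(loop)
kept = Netdev(loop)
removed.disconnect()
kept.disconnect()
removed.going_down()   # netdev disappears before the idle fires
loop.iterate()
```

With a oneshot idle there is no handle to remove, so the callback for
the removed device would still have run.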
This fixes an infinite loop issue when authenticate frames time
out. If the AP is not responding IWD ends up retrying indefinitely
due to how SAE was handling this timeout. Inside sae_auth_timeout
it was actually sending another authenticate frame to reject
the SAE handshake. This, again, resulted in a timeout which called
the SAE timeout handler and repeated indefinitely.
The kernel resend behavior was not taken into account when writing
the SAE timeout behavior and in practice there is actually no need
for SAE to do much of anything in response to a timeout. The
kernel automatically resends Authenticate frames 3 times which mirrors
IWD's SAE behavior anyways. Because of this the authenticate timeout
handler can be completely removed, which will cause the connection
to fail in the case of an authentication timeout.
This crash was caused by the disconnect_cb being called
immediately in cases where send_disconnect was false. The
previous patch actually addressed this separately as this
flag was being set improperly which will, indirectly, fix
one of the two code paths that could cause this crash.
Still, there is a situation where send_disconnect could
be false and in this case IWD would still crash. If IWD
is waiting to queue the connect item and netdev_disconnect
is called it would result in the callback being called
immediately. Instead we can add an l_idle to allow the
callback to happen out of scope, which is what station
expects.
Prior to this patch, the crashing behavior can be tested using
the following script (or some variant of it, your system timing
may not be the same as mine).
iwctl station wlan0 disconnect
iwctl station wlan0 connect <network1> &
sleep 0.02
iwctl station wlan0 connect <network2>
++++++++ backtrace ++++++++
0 0x7f4e1504e530 in /lib64/libc.so.6
1 0x432b54 in network_get_security() at src/network.c:253
2 0x416e92 in station_handshake_setup() at src/station.c:937
3 0x41a505 in __station_connect_network() at src/station.c:2551
4 0x41a683 in station_disconnect_onconnect_cb() at src/station.c:2581
5 0x40b4ae in netdev_disconnect() at src/netdev.c:3142
6 0x41a719 in station_disconnect_onconnect() at src/station.c:2603
7 0x41a89d in station_connect_network() at src/station.c:2652
8 0x433f1d in network_connect_psk() at src/network.c:886
9 0x43483a in network_connect() at src/network.c:1183
10 0x4add11 in _dbus_object_tree_dispatch() at ell/dbus-service.c:1802
11 0x49ff54 in message_read_handler() at ell/dbus.c:285
12 0x496d2f in io_callback() at ell/io.c:120
13 0x495894 in l_main_iterate() at ell/main.c:478
14 0x49599b in l_main_run() at ell/main.c:521
15 0x495cb3 in l_main_run_with_signal() at ell/main.c:647
16 0x404add in main() at src/main.c:490
17 0x7f4e15038b25 in /lib64/libc.so.6
The send_disconnect flag was being improperly set based only
on connect_cmd_id being zero. This does not take into account
the case of CMD_CONNECT having finished but not EAPoL. In this
case we do need to send a disconnect.
This adds a new connection type, TYPE_PSK_OFFLOAD, which
allows the 4-way handshake to be offloaded by the firmware.
Offloading will be used if the driver advertises support.
The CMD_ROAM event path was also modified to take into account
handshake offloading. If the handshake is offloaded we still
must issue GET_SCAN, but not start eapol since the firmware
takes care of this.
Until now FT was only supported via Auth/Assoc commands which barred
any fullmac cards from using FT AKMs. With PSK offload support these
cards can do FT but only when offloading is used.
In the FW scan callback eapol was being started unconditionally which
isn't correct as roaming on open networks is possible. Instead check
that a SM exists just like is done in netdev_connect_event.
This should have been updated along with the connect and roam
event separation. Since netdev_connect_event is not being
re-used for CMD_ROAM the comment did not make sense anymore.
Still, there needs to be a check to ensure we were not disconnected
while waiting for GET_SCAN to come back.
netdev_connect_event was being reused for parsing of CMD_ROAM
attributes which made some amount of sense since these events
are nearly identical, but due to the nature of firmware roaming
there really isn't much IWD needs to parse from CMD_ROAM. In
addition netdev_connect_event was getting rather complicated
since it had to handle both CMD_ROAM and CMD_CONNECT.
The only bits of information IWD needs to parse from CMD_ROAM
is the roamed BSSID, authenticator IEs, and supplicant IEs. Since
this is so limited it now makes little sense to reuse the entire
netdev_connect_event function, and instead only parse what is
needed for CMD_ROAM.
station should be isolated as much as possible from the details of the
driver type and how a particular AKM is handled under the hood. It will
be up to wiphy to pick the best AKM for a given bss. netdev in turn
will pick how to drive the particular AKM that was picked.
Currently netdev handles SoftMac and FullMac drivers mostly in the same
way, by building CMD_CONNECT nl80211 commands and letting the kernel
figure out the details. Exceptions to this are FILS/OWE/SAE AKMs which
are only supported on SoftMac drivers by using
CMD_AUTHENTICATE/CMD_ASSOCIATE.
Recently, basic support for SAE (WPA3-Personal) offload on FullMac cards
was introduced. When offloaded, the control flow is very different than
under typical conditions and required additional logic checks in several
places. The logic is now becoming quite complex.
Introduce a concept of a connection type in order to make it clearer
what driver and driver features are being used for this connection. In
the future, connection types can be expanded with 802.1X handshake
offload, PSK handshake offload and CMD_EXTERNAL_AUTH based SAE
connections.
Commit 6e8b76527 added a switch statement for AKM suites which
was not correct as this is a bitmask and may contain multiple
values. Instead we can rely on wiphy_select_akm which is a more
robust check anyways.
Fixes: 6e8b765278 ("wiphy: add check for CMD_AUTH/CMD_ASSOC support")
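Why a switch on a bitmask goes wrong can be shown with a short Python
sketch. The Akm flag values and both helper functions below are
hypothetical and are not the real ie_rsn_akm_suite values or IWD
functions. The sketch only demonstrates the failure mode; the actual
fix relies on wiphy_select_akm rather than on a mask test.

```python
from enum import IntFlag


class Akm(IntFlag):
    # Hypothetical bit values, for illustration only.
    PSK = 1 << 0
    SAE = 1 << 1
    OWE = 1 << 2


NEEDS_AUTH_ASSOC = Akm.SAE | Akm.OWE


def needs_auth_assoc_by_equality(akm_suites):
    # Switch-style comparison: matches only if exactly one bit is set.
    return akm_suites in (Akm.SAE, Akm.OWE)


def needs_auth_assoc_by_mask(akm_suites):
    # Bitmask test: matches if any of the relevant bits is set.
    return bool(akm_suites & NEEDS_AUTH_ASSOC)


# A BSS advertising both PSK and SAE (a transition-mode style setup).
mixed = Akm.PSK | Akm.SAE
```

Here needs_auth_assoc_by_equality(mixed) is False even though SAE is
advertised, while needs_auth_assoc_by_mask(mixed) is True.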
If there is an associate timeout, retry a few times in case
it was just a fluke. At this point SAE is fully negotiated
so it makes sense to attempt to save the connection.
Any auth proto which did not implement the assoc_timeout handler
could end up getting 'stuck' forever if there was an associate
timeout. This is because in the event of an associate timeout IWD
only sets a few flags and relies on the connect event to actually
handle the failure. The problem is a connect event never comes
if the failure was a timeout.
To fix this we can explicitly fail the connection if the auth
proto has not implemented assoc_timeout or if it returns false.
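The decision described above can be sketched as follows. This is a
Python model with hypothetical names (handle_assoc_timeout and the two
proto classes are assumptions), not the netdev code itself.

```python
def handle_assoc_timeout(auth_proto):
    """Return what happens to the connection on an associate timeout."""
    handler = getattr(auth_proto, "assoc_timeout", None)
    # Retry only if the proto implements the handler and accepts.
    if handler is not None and handler():
        return "retrying"
    # Otherwise fail explicitly; no connect event will ever arrive.
    return "connection failed"


class NoHandlerProto:
    """Auth proto that never implemented assoc_timeout."""


class RetryingProto:
    """Auth proto that retries a limited number of times."""

    def __init__(self, retries):
        self.retries = retries

    def assoc_timeout(self):
        if self.retries == 0:
            return False
        self.retries -= 1
        return True


proto = RetryingProto(retries=1)
results = [
    handle_assoc_timeout(NoHandlerProto()),
    handle_assoc_timeout(proto),
    handle_assoc_timeout(proto),
]
```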
In the same vein as requesting a neighbor report after
connecting for the first time, it should also be done
after a roam to obtain the latest neighbor information.
Converts ie_rsn_akm_suite values (and WPA1 hint) into a more
human readable security string such as:
WPA2-Personal, WPA3-Personal, WPA2-Personal + FT etc.
When we cancel a quick scan that has already been triggered, the
Scanning property is never reset to false. This doesn't fully reflect
the actual scanning state of the hardware since we don't (yet) abort
the scan, but at least corrects the public API behavior.
{Network} [/net/connman/iwd/0/7/73706733_psk] Connected = False
{Station} [/net/connman/iwd/0/7] Scanning = True
{Station} [/net/connman/iwd/0/7] State = connecting
{Station} [/net/connman/iwd/0/7] ConnectedNetwork =
/net/connman/iwd/0/7/73706733_psk
{Network} [/net/connman/iwd/0/7/73706733_psk] Connected = True
If IWD is connecting to a SAE/WPA3 BSS and Auth/Assoc commands
are not supported the only option is SAE offload. At this point
network_connect should have verified that the extended feature
for SAE offload exists so we can simply enable offload if these
commands are not supported.
SAE offload support requires some minor tweaks to CMD_CONNECT
as well as special checks once the connect event comes in, since
at this point we are fully connected.
After adding network_bss_update, network now has a match_addr
queue function which can be used to replace an unneeded
l_queue_get_entries loop with l_queue_find.
This will swap out a scan_bss object with a duplicate that may
exist in a network's bss_list. The duplicate will be removed, but
since the object is owned by station it is assumed that it will
be freed elsewhere.
If the hardware roams automatically we want to be sure to not
react to CQM events and attempt to roam/disconnect on our own.
Note: this is only important for very new kernels where CQM
events were recently added to brcmfmac.
Roaming on a full mac card is quite different than soft mac
and needs to be specially handled. The process starts with
the CMD_ROAM event, which tells us the driver is already
roamed and associated with a new AP. After this it expects
the 4-way handshake to be initiated. This in itself is quite
simple, the complexity comes with how this is piped into IWD.
After CMD_ROAM fires it's assumed that a scan result is
available in the kernel, which is obtained using a newly
added scan API scan_get_firmware_scan. The only special
bit of this is that it does not 'schedule' a scan but simply
calls GET_SCAN. This is treated special and will not be
queued behind any other pending scan requests. This lets us
reuse some parsing code paths in scan and initialize a
scan_bss object which ultimately gets handed to station so
it can update connected_bss/bss_list.
For consistency station must also transition to a roaming state.
Since this roam is all handled by netdev two new events were
added, NETDEV_EVENT_ROAMING and NETDEV_EVENT_ROAMED. Both allow
station to transition between roaming/connected states, and ROAMED
provides station with the new scan_bss to replace connected_bss.
Adds support for getting firmware scan results from the kernel.
This is intended to be used after the firmware roamed automatically
and the scan result is required for handshake initialization.
The scan 'request' is completely separate from the normal scan
queue, though scan_results, scan_request, and the scan_context
are all used for consistency and code reuse.
Register P2P group's vendor IE writers using the new API to build and
attach the necessary P2P IE and WFD IEs to the (Re)Association Response,
Probe Response and Beacon frames sent by the GO.
Roughly validate the IEs and save some information for use in our own
IEs. p2p_extract_wfd_properties and p2p_device_validate_conn_wfd are
being moved unchanged to be usable in p2p_group_event without forward
declarations and to be next to p2p_build_wfd_ie.
Make the WSC IE processing and writing more self-contained (i.e. so that
it can be more easily moved to a separate file if desired) by using the
new ap_write_extra_ies() mechanism.
Pass the string IEs from the incoming STA association frames to
the user in the AP event data. I drop
ap_event_station_added_data.rsn_ie because that probably wasn't
going to ever be useful and the RSN IE is included in the .assoc_ies
array in any case.
Since GET_STATION (and in turn GetDiagnostics) gets the most
current station info this attribute serves as a better indication
of the current signal strength. In addition full mac cards don't
appear to always have the average attribute.
No instances of this macro now exist. If future instances crop up, the
better approach would be to use pragma directives to quiet such warnings
and allow static analysis to catch any issues.
Expanded packets with a 0 vendor id need to be treated just like
non-expanded ones. This led to very nasty looking if statements
throughout this function. Fix that by introducing a nested function
to take care of the response type normalization. This also allows us to
drop uninitialized_var usage.
An Expanded Nak packet contains (possibly multiple) 8 byte chunks that
contain the type (1 byte, always '254'), vendor-id (3 bytes) and
vendor-type (4 bytes).
Unfortunately the current logic was reading the vendor-id at the wrong
offset (0 instead of 1) and so the extracted vendor-type was incorrect.
Fixes: 17c569ba4c ("eap: Add authenticator method logic and API")
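The chunk layout described above can be parsed as in the following
Python sketch. It is not the eap.c implementation; the function name
and the example payload are made up for illustration, and big-endian
byte order is assumed for the multi-byte fields.

```python
EXPANDED_TYPE = 254


def parse_expanded_nak(payload):
    """Split an Expanded Nak payload into (vendor_id, vendor_type)."""
    if len(payload) % 8 != 0:
        raise ValueError("payload is not a multiple of 8 bytes")

    methods = []
    for offset in range(0, len(payload), 8):
        chunk = payload[offset:offset + 8]
        if chunk[0] != EXPANDED_TYPE:
            raise ValueError("chunk does not start with type 254")
        # vendor-id starts at offset 1, right after the type byte;
        # reading it from offset 0 would pull in the type byte.
        vendor_id = int.from_bytes(chunk[1:4], "big")
        vendor_type = int.from_bytes(chunk[4:8], "big")
        methods.append((vendor_id, vendor_type))
    return methods


# Two example chunks with arbitrary illustrative values.
example = bytes([254, 0, 0, 0, 0, 0, 0, 13,
                 254, 0, 0x37, 0x2A, 0, 0, 0, 1])
parsed = parse_expanded_nak(example)
```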
If we received a Nak or an Expanded Nak packet, the intent was to print
our own method type. Instead we tried to print the Nak type contents.
Fix that by always passing in our method info to eap_type_to_str.
Fixes: 17c569ba4c ("eap: Add authenticator method logic and API")
The '__' prefix is meant for private, semi-private,
inner implementation or otherwise special APIs that
are typically exposed in a header. In the case of watchlist, these
functions were static and do not fit the above description. Remove the
__ prefix accordingly.
When using iwd.conf:[General].EnableNetworkConfiguration=true, it is not
possible to configure systemd.network:[Network].MulticastDNS= as
systemd-networkd considers the link to be unmanaged. This patch allows
iwd to configure that setting on systemd-resolved directly.
If the extended feature for CQM levels was not supported no CQM
registration would happen, not even for a single level. This
caused IWD to completely lose the ability to roam since it would
only get notified when the kernel was disconnecting, around -90
dBm, not giving IWD enough time to roam.
Instead if the extended feature is not supported we can still
register for the event, just without multiple signal levels.
There is no functional change here but checking the return
value makes static analysis much happier. Checking the
return and setting the default inside the if clause is also
consistent with how IWD does it in many other places.
Handle situations where the BSS we're trying to connect to is no longer
in the kernel scan result cache. Normally, the kernel will re-scan the
target frequency if this happens on the CMD_CONNECT path, and retry the
connection.
Unfortunately, CMD_AUTHENTICATE path used for WPA3, OWE and FILS does
not have this scanning behavior. CMD_AUTHENTICATE simply fails with
a -ENOENT error. Work around this by trying a limited scan of the
target frequency and re-trying CMD_AUTHENTICATE once.
An earlier patch fixed a problem where a queued quick scan would
be triggered and fail once already connected, resulting in a state
transition from connected --> autoconnect_full. This fixed the
Connect() path but this could also happen via autoconnect. Starting
from a connected state, the sequence goes:
- DBus scan is triggered
- AP disconnects IWD
- State transition from disconnected --> autoconnect_quick
- Queue quick scan
- DBus scan results come in and used to autoconnect
- A connect work item is inserted ahead of all others, transition
from autoconnect_quick --> connecting.
- Connect completes, transition from connecting --> connected
- Quick scan can finally get triggered, which the kernel fails to
do since IWD is connected, transition from connected -->
autoconnect_full.
This can be fixed by checking for a pending quick scan in the
autoconnect path.
Commit eac2410c83 ("station: Take scanned frequencies into account")
has made it unnecessary to explicitly invoke station_set_scan_results
with expire set to true in case a dbus scan finished prematurely or a
subset was not able to be started. Remove this no-longer needed logic.
Fixes: eac2410c83 ("station: Take scanned frequencies into account")
The diagnostic interface returns an error anyways if station is
not connected so it makes more sense to only bring the interface
up when it's actually usable. This also removes the interface
when station disconnects, which was never done before (the
interface stayed up indefinitely due to a forgotten remove call).
When we're auto-connecting and have hidden networks configured, use
active scans regardless of whether we see any hidden BSSes in our
existing scan results.
This allows us to more effectively see/connect to hidden networks
when first powering up or after suspend.
Kernel might report hidden BSSes that are reported from beacon frames
separately from ones reported due to probe responses. This may confuse
the station network collation logic since the scan_bss generated by the
probe response might be removed erroneously when processing the scan_bss
that was generated due to a beacon.
Make sure that bss_match also takes the SSID into account and only
matches scan_bss structures that have the same BSSID and SSID contents.
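A rough Python model of the stricter match. ScanBss here is a
simplified stand-in for scan_bss with only the two fields that matter,
and representing the hidden beacon entry with an empty SSID is an
assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class ScanBss:
    addr: bytes   # BSSID
    ssid: bytes   # SSID contents


def bss_match(a, b):
    # Both the BSSID and the SSID must agree for a match.
    return a.addr == b.addr and a.ssid == b.ssid


bssid = bytes.fromhex("02aabbccddee")
from_beacon = ScanBss(addr=bssid, ssid=b"")        # hidden, no SSID
from_probe = ScanBss(addr=bssid, ssid=b"myssid")   # probe response
same_again = ScanBss(addr=bssid, ssid=b"myssid")
```

With a BSSID-only comparison from_beacon and from_probe would match,
and the probe response entry could be dropped while processing the
beacon entry.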
Instead of manually managing whether to expire BSSes or not, use the
scanned frequency set instead. This makes the API slightly easier to
understand (dropping two boolean arguments in a row) and also a bit more
future-proof.
Commit d372d59bea checks whether a hidden network had a previous
connection attempt and re-tries. However, it inadvertently dropped
handling of a condition where a non-hidden network SSID is provided to
ConnectHiddenNetwork. Fix that.
Fixes: d372d59bea ("station: Allow ConnectHiddenNetwork to be retried")
The diagnostic interface serves no purpose until the AP has
been started. Any calls on it will return an error so instead
it makes more sense to bring it up when the AP is started, and
down when the AP is stopped.
It's useful being able to refer to the network Name/SSID once
an AP is started. For example opening an iwctl session with an
already started AP provides no way of obtaining the SSID.
In some cases the AP can send a deauthenticate frame right after
accepting our authentication. In this case the kernel never properly
sends a CMD_CONNECT event with a failure, even though CMD_CONNECT was
used to initiate the connection. Try to work around that by detecting
that a Deauthenticate event arrives prior to any Associate or Connect
events and handle this case as a connect failure.
Now that ConnectHiddenNetwork can be invoked while we're connected, set
the mac randomization hint parameter properly. The kernel will reject
requests if randomization is enabled while we're connected to a network.
If we forget a hidden network, then make sure to remove it from the
network list completely. Otherwise it would be possible to still
issue a Network.Connect to that particular object, but the fact that the
network is hidden would be lost.
==17639== 72 (16 direct, 56 indirect) bytes in 1 blocks are definitely
lost in loss record 3 of 3
==17639== at 0x4C2F0CF: malloc (vg_replace_malloc.c:299)
==17639== by 0x4670AD: l_malloc (util.c:61)
==17639== by 0x4215AA: scan_freq_set_new (scan.c:1906)
==17639== by 0x412A9C: parse_neighbor_report (station.c:1910)
==17639== by 0x407335: netdev_neighbor_report_frame_event
(netdev.c:3522)
==17639== by 0x44BBE6: frame_watch_unicast_notify (frame-xchg.c:233)
==17639== by 0x470C04: dispatch_unicast_watches (genl.c:961)
==17639== by 0x470C04: process_unicast (genl.c:980)
==17639== by 0x470C04: received_data (genl.c:1101)
==17639== by 0x46D9DB: io_callback (io.c:118)
==17639== by 0x46CC0C: l_main_iterate (main.c:477)
==17639== by 0x46CCDB: l_main_run (main.c:524)
==17639== by 0x46CF01: l_main_run_with_signal (main.c:656)
==17639== by 0x403EDE: main (main.c:490)
In the case that ConnectHiddenNetwork scans successfully, but fails for
some other reason, the network object is left in the scan results until
it expires. This will prevent subsequent attempts to use
ConnectHiddenNetwork with a .NotHidden error. Fix that by checking
whether a found network is hidden, and if so, allow the request to
proceed.
Rework the logic slightly so that this function returns an error message
on error and NULL on success, just like other D-Bus method
implementations. This also simplifies the code slightly.
We used to not allow connecting to a different network while already
connected. One had to disconnect first. This also applied to
ConnectHiddenNetwork calls.
This restriction can be dropped now. station will intelligently
disconnect from the current AP when a station_connect_network() is
issued.
If the disconnect fails and station_disconnect_onconnect_cb is called
with an error, we reply to the original message accordingly.
Unfortunately pending_connect is not unrefed or cleared in this case.
Fix that.
Fixes: d0ee923dda ("station: Disconnect, if needed, on a new connection attempt")
An invalid known_network.freq file containing several UUID
groups which have the same 'name' key results in memory leaks
in IWD. This is because the file is loaded and the groups
are iterated without detecting duplicates. This leads to the
same network_info's known_frequencies being set/overridden
multiple times.
To fix this we just check if the network_info already has a
UUID set. If so remove the stale entry.
There may be other old, invalid, or stale entries from previous
versions of IWD, or a user misconfiguring the file. These will
now also be removed during load.
netdev_shutdown calls queue_destroy on the netdev_list, which in turn
calls netdev_free. netdev_free invokes the watches to notify them about
the netdev being removed. Those clients, or anything downstream, can
still invoke netdev_find. Unfortunately queue_destroy is not re-entrant
safe, so netdev_find might return stale data. Fix that by using
l_queue_peek_head / l_queue_pop_head instead.
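The idea can be sketched in Python as below. This is a model, not the
netdev.c code: the dictionaries standing in for netdevs, the lookups
list, and the exact peek/free/pop ordering are assumptions. The point
is that the list is in a consistent state whenever a callback made
during the free looks it up.

```python
from collections import deque

netdev_list = deque()
lookups = []   # what netdev_find reported during each free


def netdev_find(ifindex):
    for netdev in netdev_list:
        if netdev["ifindex"] == ifindex:
            return netdev
    return None


def netdev_free(netdev):
    # Watches notified here may call back into netdev_find.
    lookups.append([netdev_find(i) is not None for i in (6, 48)])
    netdev["freed"] = True


def netdev_shutdown():
    while netdev_list:
        netdev = netdev_list[0]    # peek the head
        netdev_free(netdev)
        netdev_list.popleft()      # then unlink it


netdev_list.extend({"ifindex": i, "freed": False} for i in (6, 48))
netdev_shutdown()
```

While the second netdev is being freed, a lookup for the first one
returns nothing instead of a stale, already freed entry.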
src/station.c:station_enter_state() Old State: connecting, new state:
connected
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan1[6]
src/device.c:device_free()
Removing scan context for wdev 100000001
src/scan.c:scan_context_free() sc: 0x4ae9ca0
src/netdev.c:netdev_free() Freeing netdev wlan0[48]
src/device.c:device_free()
src/station.c:station_free()
src/netconfig.c:netconfig_destroy()
==103174== Invalid read of size 8
==103174== at 0x467AA9: l_queue_find (queue.c:346)
==103174== by 0x43ACFF: netconfig_reset (netconfig.c:1027)
==103174== by 0x43AFFC: netconfig_destroy (netconfig.c:1123)
==103174== by 0x414379: station_free (station.c:3369)
==103174== by 0x414379: station_destroy_interface (station.c:3466)
==103174== by 0x47C80C: interface_instance_free (dbus-service.c:510)
==103174== by 0x47C80C: _dbus_object_tree_remove_interface
(dbus-service.c:1694)
==103174== by 0x47C99C: _dbus_object_tree_object_destroy
(dbus-service.c:795)
==103174== by 0x409A87: netdev_free (netdev.c:770)
==103174== by 0x4677AE: l_queue_clear (queue.c:107)
==103174== by 0x4677F8: l_queue_destroy (queue.c:82)
==103174== by 0x40CDC1: netdev_shutdown (netdev.c:5089)
==103174== by 0x404736: iwd_shutdown (main.c:78)
==103174== by 0x404736: iwd_shutdown (main.c:65)
==103174== by 0x46BD61: handle_callback (signal.c:78)
==103174== by 0x46BD61: signalfd_read_cb (signal.c:104)
In the case of module_init failing due to a module that comes after
netdev, the netdev module doesn't clean up netdev_list properly.
==6254== 24 bytes in 1 blocks are still reachable in loss record 1 of 1
==6254== at 0x483777F: malloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6254== by 0x4675ED: l_malloc (util.c:61)
==6254== by 0x46909D: l_queue_new (queue.c:63)
==6254== by 0x406AE4: netdev_init (netdev.c:5038)
==6254== by 0x44A7B3: iwd_modules_init (module.c:152)
==6254== by 0x404713: nl80211_appeared (main.c:171)
==6254== by 0x4713DE: process_unicast (genl.c:993)
==6254== by 0x4713DE: received_data (genl.c:1101)
==6254== by 0x46E00B: io_callback (io.c:118)
==6254== by 0x46D20C: l_main_iterate (main.c:477)
==6254== by 0x46D2DB: l_main_run (main.c:524)
==6254== by 0x46D2DB: l_main_run (main.c:506)
==6254== by 0x46D502: l_main_run_with_signal (main.c:656)
==6254== by 0x403EDB: main (main.c:490)
Rather than the previous hack which disabled group traffic it
was found that the GTK RSC could be manually set to zero which
allows group traffic. This appears to fix AP mode on brcmfmac
along with the previous fixes. This is not documented in
nl80211, but appears to work with this driver.
This is how a fullmac card tells userspace that a station has
left. This fixes the issue where the same client cannot re-connect
to the same AP multiple times. ap_new_station was renamed to
ap_handle_new_station for consistency.
Some fullmac cards were found to be buggy with getting the GTK
where it returns a BIP key for the GTK index, even after creating
a GTK with NEW_KEY explicitly. In an effort to get these cards
semi-working we can treat this just as a warning and continue with
the handshake without a GTK set which disables group traffic. A
warning is printed in this case so the user is not completely in
the dark.
Fix an issue with the recent changes to signal monitoring from commit
f456501b ("station: retry roaming unless notified of a high RSSI"):
1. driver sends NL80211_CQM_RSSI_THRESHOLD_EVENT_LOW
2. netdev->cur_rssi_low changes from FALSE to TRUE
3. netdev sends NETDEV_EVENT_RSSI_THRESHOLD_LOW to station
4. on roam reassociation, cur_rssi_low is reset to FALSE
5. station still assumes RSSI is low, periodically roams
until netdev sends NETDEV_EVENT_RSSI_THRESHOLD_HIGH
6. driver sends NL80211_CQM_RSSI_THRESHOLD_EVENT_HIGH
7. netdev->cur_rssi_low doesn't change (still FALSE)
8. netdev never sends NETDEV_EVENT_RSSI_THRESHOLD_HIGH
9. station remains stuck in an infinite roaming loop
The commit in question introduced the logic in (5). Previously the
assumption in station was - like in netdev - that if the signal was
still low, the driver would send a duplicate LOW event after
reassociation. This change makes netdev follow the same new logic as
station, i.e. assume the same signal state (LOW/HIGH) until told
otherwise by the driver.
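The sequence above can be replayed with a small Python model. The
Netdev class and the run helper are hypothetical, and the duplicate
suppression in cqm_event is inferred from steps 7 and 8 rather than
copied from netdev.c.

```python
class Netdev:
    def __init__(self):
        self.cur_rssi_low = False
        self.events = []

    def cqm_event(self, low):
        # Only a change of state is forwarded to station.
        if low == self.cur_rssi_low:
            return
        self.cur_rssi_low = low
        self.events.append(
            "RSSI_THRESHOLD_LOW" if low else "RSSI_THRESHOLD_HIGH")

    def reassociated(self, reset_state):
        if reset_state:
            # Old behaviour: forget the signal state on roam.
            self.cur_rssi_low = False


def run(reset_state):
    netdev = Netdev()
    netdev.cqm_event(low=True)          # steps 1-3
    netdev.reassociated(reset_state)    # step 4
    netdev.cqm_event(low=False)         # steps 6-8
    return netdev.events


old_behaviour = run(reset_state=True)
new_behaviour = run(reset_state=False)
```

In the old model the HIGH event is swallowed as a duplicate; keeping
the state across reassociation lets it reach station.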
Since fullmac cards handle auth/assoc in firmware IWD must
react differently while in AP mode just as it does in station.
For fullmac cards a NEW_STATION event is emitted post association
and from here the 4-way handshake can begin. In this NEW_STATION
handler a new sta_state is created and the needed members are
set in order to inject us back into the normal code execution
for softmac post association (i.e. creating group keys and
starting the 4-way handshake). From here everything works the
same as softmac.
At some point the non-interactive client tests began failing.
This was due to a bug in station where it would transition from
'connected' to 'autoconnect' due to a failed scan request. This
happened because a quick scan got scheduled during an ongoing
scan, then a Connect() gets issued. The work queue treats the
Connect as a priority so it delays the quick scan until after the
connection succeeds. This results in a failed quick scan which
IWD does not expect to happen when in a 'connected' state. This
failed scan actually triggers a state transition which then
gets IWD into a strange state where it's connected from the
kernel point of view but does not think it is:
src/station.c:station_connect_cb() 13, result: 0
src/station.c:station_enter_state() Old State: connecting, new state: connected
src/wiphy.c:wiphy_radio_work_done() Work item 6 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 5
src/station.c:station_quick_scan_triggered() Quick scan trigger failed: -95
src/station.c:station_enter_state() Old State: connected, new state: autoconnect_full
To fix this IWD should simply cancel any pending quick scans
if/when a Connect() call comes in.
Switch EAP-TLS-ClientCert and EAP-TLS-ClientKey to use
l_cert_load_container_file for file loading so that the file format is
autodetected. Add new setting EAP-TLS-ClientKeyBundle for loading both
the client certificate and private key from one file.
As requested move the client certificate and private key loading from
eap-tls-common.c to eap-tls.c. No man page change needed because those
two settings weren't documented in it in the first place.
This adds a new AccessPointDiagnostic interface. This interface
provides similar low level functionality as StationDiagnostic, but
for when IWD is in AP mode. This uses netdev_get_all_stations
which will dump all stations, parse, and return each station in
an individual callback. Once the dump is complete the destroy is
called and all data is packaged as an array of dictionaries.
AP mode will use the same structure for its diagnostic interface
and mostly the same dictionary keys. Apart from ConnectedBss and
Address being different, the remainder are the same so the
diagnostic_station_info to DBus dictionary conversion has been made
common so both station and AP can use it to build their diagnostic
dictionaries.
With AP now getting its own diagnostic interface it made sense
to move the netdev_station_info struct definition into its own
header which eventually can be accompanied by utilities in
diagnostic.c. These utilities can then be shared with AP and
station as needed.
systemd specifies a special passive target unit 'network-pre.target'
which may be pulled in by services that want to run before any network
interface is brought up or configured. Correspondingly, network
management services such as iwd and ead should specify
After=network-pre.target to ensure a proper ordering with respect to
this special target. For more information on network-pre.target, see
systemd.special(7).
Two examples to explain the rationale of this change:
1. On one of our embedded systems running iwd, a oneshot service is
run on startup to configure - among other things - the MAC address of
the wireless network interface based on some data in an EEPROM.
Following the systemd documentation, the oneshot service specifies:
Before=network-pre.target
Wants=network-pre.target
... to ensure that it is run before any network management software
starts. In practice, before this change, iwd was starting up and
connecting to an AP before the service had finished. iwd would then
get kicked off by the AP when the MAC address got changed. By
specifying After=network-pre.target, systemd will take care to avoid
this situation.
2. An administrator may wish to use network-pre.target to ensure
firewall rules are applied before any network management software is
started. This use-case is described in the systemd documentation[1].
Since iwd can be used for IP configuration, it should also respect
the After=network-pre.target convention.
Note that network-pre.target is a passive unit that is only pulled in if
another unit specifies e.g. Wants=network-pre.target. If no such unit
exists, this change will have no effect on the order in which systemd
starts iwd or ead.
[1] https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
Following a successful roaming sequence, schedule another attempt unless
the driver has sent a high RSSI notification. This makes the behaviour
analogous to a failed roaming attempt where we remained connected to the
same BSS.
This makes iwd compatible with wireless drivers which do not necessarily
send out a duplicate low RSSI notification upon reassociation. Without
this change, iwd risks getting indefinitely stuck to a BSS with low
signal strength, even though a better BSS might later become available.
In the case of a high RSSI notification, the minimum roam time will also
be reset to zero. This preserves the original behaviour in the case
where a high RSSI notification is processed after station_roamed().
Doing so also gives a chance for faster roaming action in the following
example scenario:
1. RSSI LOW
2. schedule roam in 5 seconds
(5 seconds pass)
3. try roaming
4. roaming fails, same BSS
5. schedule roam in 60 seconds
(20 seconds pass)
6. RSSI HIGH
7. cancel scheduled roam
(20 seconds pass)
8. RSSI LOW
9. schedule roam in 5 seconds or 20 seconds?
By resetting the minimum roam time, we can avoid waiting 20 seconds when
the station may have moved considerably. And since the high/low RSSI
notifications are configured with a hysteresis, we should still be
protected against too frequent spurious roaming attempts.
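
The scenario above can be sketched in C as follows. This is only a
minimal illustration, not the code from station.c: struct roam_state,
the function names and the 5/60 second delays are assumptions taken
from the numbered example, and time is passed in as plain seconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical delays, taken from the scenario above (in seconds) */
#define ROAM_DELAY_INITIAL	5	/* after a low RSSI notification */
#define ROAM_DELAY_RETRY	60	/* after a roam attempt */

/* Hypothetical state, not IWD's struct station */
struct roam_state {
	bool signal_low;
	bool roam_scheduled;
	uint64_t roam_at;	/* absolute time of the scheduled attempt */
	uint64_t roam_min_time;	/* earliest time a new attempt may run */
};

void schedule_roam(struct roam_state *s, uint64_t now, uint64_t delay)
{
	uint64_t at = now + delay;

	/* Never schedule earlier than the minimum roam time */
	if (at < s->roam_min_time)
		at = s->roam_min_time;

	s->roam_at = at;
	s->roam_scheduled = true;
}

void on_rssi_low(struct roam_state *s, uint64_t now)
{
	s->signal_low = true;
	schedule_roam(s, now, ROAM_DELAY_INITIAL);
}

void on_rssi_high(struct roam_state *s)
{
	s->signal_low = false;
	s->roam_scheduled = false;	/* cancel the scheduled roam */
	s->roam_min_time = 0;		/* reset the minimum roam time */
}

/* Called after a roam attempt, whether it failed or succeeded */
void on_roam_attempt_done(struct roam_state *s, uint64_t now)
{
	s->roam_scheduled = false;
	s->roam_min_time = now + ROAM_DELAY_RETRY;

	/* Schedule another attempt unless a high RSSI event was seen */
	if (s->signal_low)
		schedule_roam(s, now, ROAM_DELAY_RETRY);
}
```

Replaying the scenario with this sketch, the low RSSI notification in
step 8 schedules the roam 5 seconds later because step 6 cleared
roam_min_time. Without that reset the attempt would be held back until
the 60 second window from step 5 had elapsed, i.e. another 20 seconds.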
This is an nl80211 dump version of netdev_get_station aimed at
AP mode. This will dump all stations, parse into
netdev_station_info structs, and call the callback for each
individual station found. Once the dump is completed the destroy
callback is called.
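
The callback/destroy pattern described above could look roughly like
the sketch below. It does not use nl80211 at all: the array stands in
for the parsed kernel replies, and struct station_info,
dump_all_stations and the collect_* helpers are hypothetical names
used only for illustration, not the netdev API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for netdev_station_info */
struct station_info {
	uint8_t addr[6];
	int8_t rssi;
};

typedef void (*station_cb_t)(const struct station_info *info,
				void *user_data);
typedef void (*destroy_cb_t)(void *user_data);

/*
 * Stand-in for the dump: each entry is delivered in its own callback,
 * then the destroy callback runs exactly once at the end.
 */
void dump_all_stations(const struct station_info *stations, size_t n,
			station_cb_t cb, void *user_data,
			destroy_cb_t destroy)
{
	size_t i;

	for (i = 0; i < n; i++)
		cb(&stations[i], user_data);

	if (destroy)
		destroy(user_data);
}

/* Example consumer: count stations, mark completion in destroy */
struct collect {
	size_t count;
	int done;
};

void collect_station(const struct station_info *info, void *user_data)
{
	struct collect *c = user_data;

	(void) info;
	c->count++;
}

void collect_done(void *user_data)
{
	struct collect *c = user_data;

	c->done = 1;
}
```

A consumer accumulates per-station data in the callback and only
builds its final reply in the destroy callback, since that is the
point where the dump is known to be complete.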
This adds a generalized API for GET_STATION. This API handles
calling and parsing the results into a new structure,
netdev_station_info. This results structure will hold any
data needed by consumers of netdev_get_station. A helper API
(netdev_get_current_station) was added as a convenience which
automatically passes handshake->aa as the MAC.
For now only the RSSI is parsed as this is already being
done for RSSI polling/events. Going forward, more info will
be added, such as rx/tx rates and estimated throughput.
Arrays of dictionaries are quite common, and for basic
types this API makes things much more convenient by
putting all the enter/append/leave calls in one place.
Add a parameter to station_set_scan_results to allow skipping the
removal of old BSSes. In the DBus-triggered scan only expire BSSes
after having gone through the full supported frequency set.
It should be safe to pass partial scan results to
station_set_scan_results() when not expiring BSSes, so with this new
parameter it could probably also be called for roam scan results.
A scan normally takes about 2 seconds on my dual-band wifi adapter when
connected. The drivers will normally probe on each supported channel in
some unspecified order and will have new partial results after each step
but the kernel sends NL80211_CMD_NEW_SCAN_RESULTS only when the full
scan request finishes, and for segmented scans we will wait for all
segments to finish before calling back from scan_active() or
scan_passive().
To improve user experience define our own channel order favouring the
2.4GHz channels 1, 6 and 11 and probe those as an individual scan
request so we can update most of our DBus org.connman.iwd.Network
objects more quickly, before continuing with 5GHz band channels,
updating DBus objects again and finally the other 2.4GHz band channels.
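
As a rough illustration of the ordering, the supported frequency list
can be split into three subsets that are then scanned one after
another. The sketch below is an assumption about how such a split
could be written, not the code in this patch: scan_group and
scan_subset are hypothetical names, and treating everything at or
above 5000 MHz as the second group is a simplification.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the scan group of a frequency in MHz; group 0 goes first */
int scan_group(uint32_t freq)
{
	if (freq == 2412 || freq == 2437 || freq == 2462)
		return 0;	/* 2.4GHz channels 1, 6 and 11 */

	if (freq >= 5000)
		return 1;	/* 5GHz band (simplified check) */

	return 2;		/* remaining 2.4GHz channels */
}

/*
 * Copies the frequencies from 'supported' that belong to 'group' into
 * 'out', which must have room for 'n' entries. Returns the number of
 * frequencies copied.
 */
size_t scan_subset(const uint32_t *supported, size_t n, int group,
			uint32_t *out)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (scan_group(supported[i]) == group)
			out[count++] = supported[i];

	return count;
}
```

Calling scan_subset() with group 0, 1 and 2 in turn yields the
frequency lists for the three scan requests, with the results handed
over after each one.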
The overall DBus-triggered scan on my wifi adapter takes about the same
time but my measurements were not very strict, and were not very
consistent with and without this change. With the change most Network
objects are updated after about 200ms though, meaning that I get most
of the network updates in the nm-applet UI 200ms from opening the
network list. The 5GHz band channels take another 1 to 1.5s to scan and
remaining 2.4GHz band channels another ~300ms.
Hopefully this is similar when using other drivers although I can easily
imagine a driver that parallelizes 2.4GHz and 5GHz channel probing using
two radios, or uses 2, 4 or another number of dual-band radios to probe
2, 4, ... channels simultaneously. We'd then lose some of the
performance benefit. The faster scan results may be worth the longer
overall scan time anyway.
I'm also assuming that the wiphy's supported frequency list is exactly
what was scanned when we passed no frequency list to
NL80211_CMD_TRIGGER_SCAN and we won't get errors for passing some
frequency that shouldn't have been scanned.
When the IP is configured to be static we can now use ACD in
order to check that the IP is available and not already in
use. If a conflict is found netconfig will be reset and no IP
will be set on the interface. The ACD client is left with
the default 'defend once' policy, and probes are not turned
off. This will increase connection time, but for static IPs
it is the best approach.
The docs only described what an IP prefix looks like, without an
actual example. Though it's not recommended to copy and paste
blindly, it's still useful to have a value in the man pages
that actually works if someone just wants to get a DHCP server
working.
In the strange case that the dns list or the domain list are empty and
openresolv is being used, delete the openresolv entry instance instead
of trying to set it to an empty value.
Make sure to erase the network_info of a known network that has been
removed before disconnecting any stations connected to it. This fixes
the following warning observed when forgetting a connected network:
WARNING: ../git/src/network.c:network_rank_update() condition n < 0 failed
This also fixes a bug where such a forgotten network would incorrectly
appear as the first element in the response to GetOrderedNetworks(). By
clearing the network_info, network_rank_update() properly negates the
rank of the now-unknown network.
==5279== 104 bytes in 2 blocks are definitely lost in loss record 1 of 1
==5279== at 0x4C2F0CF: malloc (vg_replace_malloc.c:299)
==5279== by 0x4655CD: l_malloc (util.c:61)
==5279== by 0x47116B: l_rtnl_address_new (rtnl.c:136)
==5279== by 0x438F4B: netconfig_get_dhcp4_address (netconfig.c:429)
==5279== by 0x438F4B: netconfig_ipv4_dhcp_event_handler (netconfig.c:735)
==5279== by 0x491C77: dhcp_client_event_notify (dhcp.c:332)
==5279== by 0x491C77: dhcp_client_rx_message (dhcp.c:810)
==5279== by 0x492A88: _dhcp_default_transport_read_handler (dhcp-transport.c:151)
==5279== by 0x46BECB: io_callback (io.c:118)
==5279== by 0x46B10C: l_main_iterate (main.c:477)
==5279== by 0x46B1DB: l_main_run (main.c:524)
==5279== by 0x46B3EA: l_main_run_with_signal (main.c:646)
==5279== by 0x403ECE: main (main.c:490)
Fix the AlwaysRandomizeAddress setting name.
Add the stricter specification of the extension syntax.
Clarify that GTC and MD5 can't be used as outer EAP methods with wifi.
Tracking of addresses that weren't set by us seemed a bit questionable.
Take this out for now. If this is ever needed, then a queue with
l_rtnl_address objects should be used.
Introduce a new v4_address member which will hold the currently
configured IPv4 address (static or obtained via DHCP). Use the new
l_rtnl_address class for this.
As a side-effect, lease expiration will now properly remove the
configured address.
This patch converts the code to use the new l_rtnl_address class. The
settings parsing code will now return an l_rtnl_address object which
can be installed directly.
Also, the address removal path for static addresses has been removed,
since netconfig_reset() sets the disable_ipv6 setting to '1', which
will remove all IPv6 addresses for the interface.
This patch converts the code to use the new l_rtnl_route class instead
of using l_rtnl_route6* utilities. The settings parsing code will now
return an l_rtnl_route object which can be installed directly.
Also, the route removal path has been removed since netconfig_reset()
sets the disable_ipv6 setting to '1', which will remove all IPv6 routes
and addresses for the interface.
This also changes the resolve API a little bit to act as a 'set' API
instead of an incremental 'add' API. This is actually easier to manage
in the resolve module since both systemd and resolvconf want changes
wholesale and not incrementally.
Waiting to request neighbor reports until we are in need of a roam
delays the roam time, and probably isn't as reliable since we are
most likely in a low RSSI state. Instead the neighbor report can
be requested immediately after connecting, saved, and used if/when
a roam is needed. If the early neighbor report fails, the existing
behavior is maintained: a neighbor report is requested at the time
of the roam.
The code which parses the reports was factored out and shared
between the existing (late) neighbor report callback and the early
neighbor report callback.
handshake_state_set_authenticator_ie must be called to set group_cipher
in struct handshake_state before handshake_set_gtk_state, otherwise
handshake_set_gtk_state is unable to determine the key length when
setting the GTK in the handshake state.
Fixes: 4bc20a0979 ("ap: Start EAP-WSC authentication with WSC enrollees")
For now the RA client is run automatically when the DHCPv6 client
starts. RA takes care of installing / deleting prefix routes and
installing the default gateway. If Router Advertisements indicate
support for DHCPv6, then DHCPv6 transactions are kicked off and the
address is set / removed automatically.
Stateless configuration is not yet supported.
Modern kernels (~5.4+) have changed the way lost beacons are
reported and effectively make the lost beacon event useless
because it is immediately followed by a disconnect event. This
does not allow IWD enough time to do much of anything before
the disconnect comes in and we are forced to fully re-connect
to a different AP.
If EnableNetworkConfiguration was enabled ap.c required that
APRanges also be set. This prevented IWD from starting, which
affected a perfectly valid station configuration. Instead, if
APRanges is not provided IWD now allows ap_init to pass but
DHCP will simply not be enabled.