An invalid known_network.freq file containing several UUID
groups which have the same 'name' key results in memory leaks
in IWD. This is because the file is loaded and the groups
are iterated without detecting duplicates. This leads to the
same network_info's known_frequencies being set/overridden
multiple times.
To fix this we just check if the network_info already has a
UUID set. If so remove the stale entry.
There may be other old, invalid, or stale entries from previous
versions of IWD, or a user misconfiguring the file. These will
now also be removed during load.
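The duplicate check described above can be sketched roughly as follows.
This is a Python model for illustration only, not IWD's C code. The
function name load_known_frequencies, the 'list' key, and the sample
paths are assumptions. Which of two duplicates counts as stale (here:
the one seen first) is also an assumption.

```python
import configparser

def load_known_frequencies(text):
    """Return (name -> uuid, stale_uuids), keeping one UUID group per name."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    uuid_by_name = {}   # stands in for the UUID stored on network_info
    stale = []          # groups that would be removed from the file

    for uuid in parser.sections():
        name = parser[uuid].get("name")
        if name is None:
            # Invalid entry without a 'name' key: drop it during load
            stale.append(uuid)
            continue
        if name in uuid_by_name:
            # A UUID is already set for this network: the old one is stale
            stale.append(uuid_by_name[name])
        uuid_by_name[name] = uuid

    return uuid_by_name, stale

# Hypothetical file content with two groups sharing one 'name'
SAMPLE = """
[aaaa]
name=/var/lib/iwd/home.psk
list=2412 5180

[bbbb]
name=/var/lib/iwd/home.psk
list=2437
"""
```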
netdev_shutdown calls queue_destroy on the netdev_list, which in turn
calls netdev_free. netdev_free invokes the watches to notify them about
the netdev being removed. Those clients, or anything downstream, can
still invoke netdev_find. Unfortunately queue_destroy is not re-entrant
safe, so netdev_find might return stale data. Fix that by using
l_queue_peek_head / l_queue_pop_head instead.
src/station.c:station_enter_state() Old State: connecting, new state:
connected
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan1[6]
src/device.c:device_free()
Removing scan context for wdev 100000001
src/scan.c:scan_context_free() sc: 0x4ae9ca0
src/netdev.c:netdev_free() Freeing netdev wlan0[48]
src/device.c:device_free()
src/station.c:station_free()
src/netconfig.c:netconfig_destroy()
==103174== Invalid read of size 8
==103174== at 0x467AA9: l_queue_find (queue.c:346)
==103174== by 0x43ACFF: netconfig_reset (netconfig.c:1027)
==103174== by 0x43AFFC: netconfig_destroy (netconfig.c:1123)
==103174== by 0x414379: station_free (station.c:3369)
==103174== by 0x414379: station_destroy_interface (station.c:3466)
==103174== by 0x47C80C: interface_instance_free (dbus-service.c:510)
==103174== by 0x47C80C: _dbus_object_tree_remove_interface
(dbus-service.c:1694)
==103174== by 0x47C99C: _dbus_object_tree_object_destroy
(dbus-service.c:795)
==103174== by 0x409A87: netdev_free (netdev.c:770)
==103174== by 0x4677AE: l_queue_clear (queue.c:107)
==103174== by 0x4677F8: l_queue_destroy (queue.c:82)
==103174== by 0x40CDC1: netdev_shutdown (netdev.c:5089)
==103174== by 0x404736: iwd_shutdown (main.c:78)
==103174== by 0x404736: iwd_shutdown (main.c:65)
==103174== by 0x46BD61: handle_callback (signal.c:78)
==103174== by 0x46BD61: signalfd_read_cb (signal.c:104)
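The peek/pop teardown in netdev_shutdown can be illustrated with a small
Python model. This is a sketch, not the C implementation: the deque
stands in for netdev_list, the dictionaries stand in for struct netdev,
and the exact order of free and pop is assumed. The point is that the
list stays consistent every time a watch calls back into netdev_find.

```python
from collections import deque

netdev_list = deque()
seen_during_free = []   # what netdev_find returned from inside a watch

def netdev_find(ifindex):
    for netdev in netdev_list:
        if netdev["ifindex"] == ifindex:
            return netdev
    return None

def netdev_free(netdev):
    # Watches are notified here and may call back into netdev_find()
    seen_during_free.append(netdev_find(netdev["ifindex"]))

def netdev_shutdown():
    while netdev_list:
        netdev = netdev_list[0]   # like l_queue_peek_head
        netdev_free(netdev)
        netdev_list.popleft()     # like l_queue_pop_head
```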
In the case of module_init failing due to a module that comes after
netdev, the netdev module doesn't clean up netdev_list properly.
==6254== 24 bytes in 1 blocks are still reachable in loss record 1 of 1
==6254== at 0x483777F: malloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6254== by 0x4675ED: l_malloc (util.c:61)
==6254== by 0x46909D: l_queue_new (queue.c:63)
==6254== by 0x406AE4: netdev_init (netdev.c:5038)
==6254== by 0x44A7B3: iwd_modules_init (module.c:152)
==6254== by 0x404713: nl80211_appeared (main.c:171)
==6254== by 0x4713DE: process_unicast (genl.c:993)
==6254== by 0x4713DE: received_data (genl.c:1101)
==6254== by 0x46E00B: io_callback (io.c:118)
==6254== by 0x46D20C: l_main_iterate (main.c:477)
==6254== by 0x46D2DB: l_main_run (main.c:524)
==6254== by 0x46D2DB: l_main_run (main.c:506)
==6254== by 0x46D502: l_main_run_with_signal (main.c:656)
==6254== by 0x403EDB: main (main.c:490)
Rather than the previous hack which disabled group traffic it
was found that the GTK RSC could be manually set to zero which
allows group traffic. This appears to fix AP mode on brcmfmac
along with the previous fixes. This is not documented in
nl80211, but appears to work with this driver.
This is how a fullmac card tells userspace that a station has
left. This fixes the issue where the same client cannot re-connect
to the same AP multiple times. ap_new_station was renamed to
ap_handle_new_station for consistency.
Some fullmac cards were found to be buggy when the GTK is queried:
they return a BIP key for the GTK index, even after a GTK was created
explicitly with NEW_KEY. In an effort to get these cards
semi-working we can treat this just as a warning and continue with
the handshake without a GTK set which disables group traffic. A
warning is printed in this case so the user is not completely in
the dark.
Fix an issue with the recent changes to signal monitoring from commit
f456501b ("station: retry roaming unless notified of a high RSSI"):
1. driver sends NL80211_CQM_RSSI_THRESHOLD_EVENT_LOW
2. netdev->cur_rssi_low changes from FALSE to TRUE
3. netdev sends NETDEV_EVENT_RSSI_THRESHOLD_LOW to station
4. on roam reassociation, cur_rssi_low is reset to FALSE
5. station still assumes RSSI is low, periodically roams
until netdev sends NETDEV_EVENT_RSSI_THRESHOLD_HIGH
6. driver sends NL80211_CQM_RSSI_THRESHOLD_EVENT_HIGH
7. netdev->cur_rssi_low doesn't change (still FALSE)
8. netdev never sends NETDEV_EVENT_RSSI_THRESHOLD_HIGH
9. station remains stuck in an infinite roaming loop
The commit in question introduced the logic in (5). Previously the
assumption in station was - like in netdev - that if the signal was
still low, the driver would send a duplicate LOW event after
reassociation. This change makes netdev follow the same new logic as
station, i.e. assume the same signal state (LOW/HIGH) until told
otherwise by the driver.
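The sequence above can be replayed with a small Python model. This is a
sketch only: the class and method names are hypothetical, and
reset_on_roam=True stands for the old netdev behaviour in step 4.

```python
class Netdev:
    def __init__(self, reset_on_roam):
        self.reset_on_roam = reset_on_roam
        self.cur_rssi_low = False
        self.sent = []            # events forwarded to station

    def driver_event(self, low):
        if low == self.cur_rssi_low:
            return                # no state change, nothing is forwarded
        self.cur_rssi_low = low
        self.sent.append("LOW" if low else "HIGH")

    def reassociated(self):
        if self.reset_on_roam:
            self.cur_rssi_low = False

def replay(reset_on_roam):
    netdev = Netdev(reset_on_roam)
    netdev.driver_event(True)     # steps 1-3: LOW reaches station
    netdev.reassociated()         # step 4
    netdev.driver_event(False)    # steps 6-8: driver reports HIGH
    return netdev.sent
```

With the reset in place the HIGH event is swallowed; keeping the state
across the roam lets it through.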
The testAPRoam autotest was silently failing on my machine until I
realized that my distribution hostapd (Arch Linux) is not built with
CONFIG_WNM_AP=y. Indeed, it is also disabled by default in upstream
hostapd. This resulted in the send_bss_transition() function of
hostapd.py silently failing. With this change, throw an exception in
case the BSS_TM_REQ command does not succeed to hopefully save others
the time of debugging this problem.
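A minimal sketch of the check, assuming the control interface answers
with a reply containing "OK" on success. The ctrl_request callable and
the bare command string are hypothetical stand-ins, not the actual
hostapd.py code.

```python
def send_bss_transition(ctrl_request, station_addr):
    """Send BSS_TM_REQ and fail loudly instead of silently."""
    reply = ctrl_request("BSS_TM_REQ " + station_addr)
    if "OK" not in reply:
        raise Exception("BSS_TM_REQ failed, is hostapd built with "
                        "CONFIG_WNM_AP=y?")
```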
Since fullmac cards handle auth/assoc in firmware, IWD must
react differently while in AP mode, just as it does in station.
For fullmac cards a NEW_STATION event is emitted post association
and from here the 4-way handshake can begin. In this NEW_STATION
handler a new sta_state is created and the needed members are
set in order to inject us back into the normal code execution
for softmac post association (i.e. creating group keys and
starting the 4-way handshake). From here everything works the
same as softmac.
After the test-runner re-write many tests were left with
stale options that are no longer used at all. These were
periodically getting removed as changes were made to
individual tests, but it's apparent now that a tree wide
removal was needed.
The kvmguest shorthand was removed after the release of Linux 5.10. It
was just shorthand for kvm_guest.config anyway, so update the
test-runner documentation accordingly.
At some point the non-interactive client tests began failing.
This was due to a bug in station where it would transition from
'connected' to 'autoconnect' due to a failed scan request. This
happened because a quick scan got scheduled during an ongoing
scan, then a Connect() gets issued. The work queue treats the
Connect as a priority so it delays the quick scan until after the
connection succeeds. This results in a failed quick scan which
IWD does not expect to happen when in a 'connected' state. This
failed scan actually triggers a state transition which then
gets IWD into a strange state where it's connected from the
kernel's point of view but does not think it is:
src/station.c:station_connect_cb() 13, result: 0
src/station.c:station_enter_state() Old State: connecting, new state: connected
src/wiphy.c:wiphy_radio_work_done() Work item 6 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 5
src/station.c:station_quick_scan_triggered() Quick scan trigger failed: -95
src/station.c:station_enter_state() Old State: connected, new state: autoconnect_full
To fix this IWD should simply cancel any pending quick scans
if/when a Connect() call comes in.
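The fix can be modelled in a few lines of Python. This is a sketch under
assumptions: the Station class, its work list, and the method names are
hypothetical and only mimic the ordering behaviour of the work queue.

```python
class Station:
    def __init__(self):
        self.state = "autoconnect_quick"
        self.pending_quick_scan = None
        self.work = []                     # stand-in for the radio work queue

    def schedule_quick_scan(self):
        self.pending_quick_scan = "quick-scan"
        self.work.append(self.pending_quick_scan)

    def connect(self):
        # Cancel a quick scan that has not started yet, so it cannot
        # fail later while the station is already connected
        if self.pending_quick_scan is not None:
            self.work.remove(self.pending_quick_scan)
            self.pending_quick_scan = None
        self.work.insert(0, "connect")     # Connect is treated as a priority
        self.state = "connecting"
```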
There were some major problems related to logging and process
output. Tests which required output from start_process would
break if used with '--log/--verbose'. This is because we relied
on 'communicate' to retrieve the process output, but Popen does
not store process output when stdout/stderr are anything other
than PIPE.
Instead, in the case of logging or outfiles, we can simply read
from the file we just wrote to.
For an explicit --verbose application we must handle things
slightly differently. A keyword argument was added to Process,
'need_out' which will ensure the process output is kept
regardless of --log or --verbose.
Now a user should be able to use --log/--verbose without any
tests failing.
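The Popen behaviour behind this can be shown with a short sketch. The
helper names run_and_capture and demo are hypothetical and the
'need_out' keyword is not modelled here. It only shows that output
redirected to a file has to be read back from that file.

```python
import os
import subprocess
import sys
import tempfile

def run_and_capture(args, log_path=None):
    """Run args and return its output as text."""
    if log_path is None:
        # stdout is a PIPE, so communicate() returns the output
        proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        out, _ = proc.communicate()
        return out.decode()

    # stdout is a file: communicate() would return None for the output,
    # so wait for the process and read the file that was just written
    with open(log_path, "w") as log:
        proc = subprocess.Popen(args, stdout=log, stderr=subprocess.STDOUT)
        proc.wait()

    with open(log_path) as log:
        return log.read()

def demo():
    cmd = [sys.executable, "-c", "print('hello')"]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return run_and_capture(cmd), run_and_capture(cmd, log_path=path)
    finally:
        os.remove(path)
```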
The verbose arguments come in from the QEMU command line as a
single string. This should have been split into an array immediately
but was not. This led to issues like hostapd debug being enabled
when "-v hostapd_cli" was passed in.
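A short sketch of why the unsplit string misbehaves, assuming the
verbose list is comma-separated. The helper name parse_verbose is
hypothetical.

```python
def parse_verbose(arg):
    """Split the verbose argument into a list of application names."""
    return [name for name in arg.split(",") if name]

verbose = "hostapd_cli"

# On the raw string, 'in' is a substring test and matches by accident
substring_match = "hostapd" in verbose

# On the split list, 'in' compares whole names
list_match = "hostapd" in parse_verbose(verbose)
```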
Since the list of files copied to /tmp was part of the return value from
pre_test(), if an exception occurred inside pre_test(), "copied" would
be undefined and the post_test(ctx, copied) call in the finally clause
causes another exception:
Traceback (most recent call last):
File "/home/balrog/repos/iwd/tools/test-runner", line 1508, in <module>
run_tests()
File "/home/balrog/repos/iwd/tools/test-runner", line 1242, in run_tests
run_auto_tests(config.ctx, args)
File "/home/balrog/repos/iwd/tools/test-runner", line 1166, in run_auto_tests
post_test(ctx, copied)
UnboundLocalError: local variable 'copied' referenced before assignment
(apart from not being able to clean up the files). Pass "copied" as a
parameter to pre_test instead.
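The pattern can be sketched as below. The function bodies, the file
path, and run_one_test are made up for illustration. Only the idea of
creating "copied" before the try block and passing it in follows the
description above.

```python
def pre_test(ctx, copied):
    # Record each file as soon as it is copied, then fail part way through
    copied.append("/tmp/hw.conf")
    raise RuntimeError("setup failed")

def post_test(ctx, copied):
    # Clean up whatever pre_test managed to copy
    ctx["cleaned"] = list(copied)
    copied.clear()

def run_one_test(ctx):
    copied = []                  # defined before anything can raise
    try:
        pre_test(ctx, copied)
    except RuntimeError as e:
        ctx["error"] = str(e)
    finally:
        post_test(ctx, copied)   # always has a valid list to work with
```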
Switch EAP-TLS-ClientCert and EAP-TLS-ClientKey to use
l_cert_load_container_file for file loading so that the file format is
autodetected. Add new setting EAP-TLS-ClientKeyBundle for loading both
the client certificate and private key from one file.
As requested move the client certificate and private key loading from
eap-tls-common.c to eap-tls.c. No man page change needed because those
two settings weren't documented in it in the first place.
After the re-write this was broken and not noticed until
recently. The issue appeared to be that the GLib timeout
callback retained no context of local variables. Previously
_wait_timed_out was set as a class variable, but this was
removed so multiple IWD instances could work. Without
_wait_timed_out being a class variable the GLib timeout
setting it had no effect on the wait loop.
To fix this we can set _wait_timed_out on the object being
passed in. This is preserved in the GLib timeout callback
and setting it gets honored in the wait loop.
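A sketch of the idea without a GLib dependency: FakeLoop is a
hypothetical stand-in for the main loop and its timeout source, and
wait_for is an illustrative name. The flag lives on the object handed to
the callback, so the wait loop sees the change.

```python
class FakeLoop:
    """Tiny stand-in for a main loop with one-shot timeout callbacks."""
    def __init__(self):
        self.pending = []

    def timeout_add(self, callback, data):
        self.pending.append((callback, data))

    def iteration(self):
        pending, self.pending = self.pending, []
        for callback, data in pending:
            callback(data)

def wait_timed_out_cb(obj):
    obj._wait_timed_out = True   # set on the object, not on a local
    return False                 # one-shot: do not reschedule

def wait_for(loop, obj, condition):
    obj._wait_timed_out = False
    loop.timeout_add(wait_timed_out_cb, obj)
    while not condition():
        loop.iteration()
        if obj._wait_timed_out:
            raise TimeoutError("condition not met before timeout")

class Device:
    pass
```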
This command uses GetDiagnostics to show a list of connected
clients and some information about them. The information
contained for each connected station nearly maps 1:1 with the
station diagnostics information shown in "station <wlan> show"
apart from "ConnectedBss" which is now "Address".
For now this module serves as a helper for printing diagnostic
dictionary values. The new API (diagnostic_display) takes a
Dbus iterator which has been entered into a dictionary and
prints out each key and value. A mapping struct was defined
which maps keys to types and units. For simple cases the mapping
will consist of a dbus type character and a units string,
e.g. dBm, Kbit/s etc. For more complex printing which requires
processing the value the 'units' void* can be set to a
function which can be custom written to handle the value.
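The mapping idea can be sketched in Python, where a callable takes the
place of the function stored in the 'units' pointer. The keys, the
display_rate handler, and the output format are illustrative
assumptions, not the client's actual table.

```python
def display_rate(value):
    # Hypothetical custom handler: render a Kbit/s value as Mbit/s
    return "%.1f Mbit/s" % (value / 1000)

# key -> (D-Bus type character, units string or custom function)
MAPPING = {
    "Address": ("s", None),
    "RSSI": ("n", "dBm"),
    "TxBitrate": ("u", display_rate),
}

def diagnostic_display(entries, mapping):
    """Return one printable line per known key in entries."""
    lines = []
    for key, value in entries.items():
        if key not in mapping:
            continue
        _type, units = mapping[key]
        if callable(units):
            text = units(value)          # complex case: custom function
        elif units:
            text = "%s %s" % (value, units)
        else:
            text = str(value)
        lines.append("%s: %s" % (key, text))
    return lines
```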
This adds a new AccessPointDiagnostic interface. This interface
provides similar low level functionality as StationDiagnostic, but
for when IWD is in AP mode. This uses netdev_get_all_stations
which will dump all stations, parse, and return each station in
an individual callback. Once the dump is complete the destroy
callback is called and all data is packaged as an array of dictionaries.
AP mode will use the same structure for its diagnostic interface
and mostly the same dictionary keys. Apart from ConnectedBss and
Address being different, the remainder are the same so the
diagnostic_station_info to DBus dictionary conversion has been made
common so both station and AP can use it to build their diagnostic
dictionaries.
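A rough Python model of the shared conversion, assuming plain
dictionaries in place of diagnostic_station_info and the D-Bus
dictionary builder. The function names and the "addr" field are
hypothetical.

```python
def station_info_to_dict(info):
    """Shared part: every field except the address."""
    return {key: value for key, value in info.items() if key != "addr"}

def station_diagnostics(info):
    # Station mode reports the peer address as ConnectedBss
    result = {"ConnectedBss": info["addr"]}
    result.update(station_info_to_dict(info))
    return result

def ap_diagnostics(stations):
    # AP mode returns an array of dictionaries, one per client
    result = []
    for info in stations:
        entry = {"Address": info["addr"]}
        entry.update(station_info_to_dict(info))
        result.append(entry)
    return result
```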
With AP now getting its own diagnostic interface it made sense
to move the netdev_station_info struct definition into its own
header which eventually can be accompanied by utilities in
diagnostic.c. These utilities can then be shared with AP and
station as needed.
systemd specifies a special passive target unit 'network-pre.target'
which may be pulled in by services that want to run before any network
interface is brought up or configured. Correspondingly, network
management services such as iwd and ead should specify
After=network-pre.target to ensure a proper ordering with respect to
this special target. For more information on network-pre.target, see
systemd.special(7).
Two examples to explain the rationale of this change:
1. On one of our embedded systems running iwd, a oneshot service is
run on startup to configure - among other things - the MAC address of
the wireless network interface based on some data in an EEPROM.
Following the systemd documentation, the oneshot service specifies:
Before=network-pre.target
Wants=network-pre.target
... to ensure that it is run before any network management software
starts. In practice, before this change, iwd was starting up and
connecting to an AP before the service had finished. iwd would then
get kicked off by the AP when the MAC address got changed. By
specifying After=network-pre.target, systemd will take care to avoid
this situation.
2. An administrator may wish to use network-pre.target to ensure
firewall rules are applied before any network management software is
started. This use-case is described in the systemd documentation[1].
Since iwd can be used for IP configuration, it should also respect
the After=network-pre.target convention.
Note that network-pre.target is a passive unit that is only pulled in if
another unit specifies e.g. Wants=network-pre.target. If no such unit
exists, this change will have no effect on the order in which systemd
starts iwd or ead.
[1] https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
Following a successful roaming sequence, schedule another attempt unless
the driver has sent a high RSSI notification. This makes the behaviour
analogous to a failed roaming attempt where we remained connected to the
same BSS.
This makes iwd compatible with wireless drivers which do not necessarily
send out a duplicate low RSSI notification upon reassociation. Without
this change, iwd risks getting indefinitely stuck to a BSS with low
signal strength, even though a better BSS might later become available.
In the case of a high RSSI notification, the minimum roam time will also
be reset to zero. This preserves the original behaviour in the case
where a high RSSI notification is processed after station_roamed().
Doing so also gives a chance for faster roaming action in the following
example scenario:
1. RSSI LOW
2. schedule roam in 5 seconds
(5 seconds pass)
3. try roaming
4. roaming fails, same BSS
5. schedule roam in 60 seconds
(20 seconds pass)
6. RSSI HIGH
7. cancel scheduled roam
(20 seconds pass)
8. RSSI LOW
9. schedule roam in 5 seconds or 20 seconds?
By resetting the minimum roam time, we can avoid waiting 20 seconds when
the station may have moved considerably. And since the high/low RSSI
notifications are configured with a hysteresis, we should still be
protected against too frequent spurious roaming attempts.
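The scenario above can be replayed with a small model. This is a sketch,
not station.c: the class, the method names, and the use of plain
integers for time are assumptions. The 5 and 60 second delays are taken
from the example.

```python
ROAM_RETRY_DELAY = 5    # seconds after a LOW notification
ROAM_BACKOFF = 60       # seconds after a roam attempt

class Station:
    def __init__(self, reset_on_high=True):
        self.reset_on_high = reset_on_high
        self.min_roam_time = 0
        self.roam_at = None

    def rssi_low(self, now):
        # Never roam earlier than the minimum roam time
        self.roam_at = max(now + ROAM_RETRY_DELAY, self.min_roam_time)

    def roam_attempt_done(self, now):
        self.min_roam_time = now + ROAM_BACKOFF
        self.roam_at = self.min_roam_time

    def rssi_high(self):
        self.roam_at = None
        if self.reset_on_high:
            self.min_roam_time = 0

def scenario(reset_on_high):
    s = Station(reset_on_high)
    s.rssi_low(0)             # steps 1-2
    s.roam_attempt_done(5)    # steps 3-5
    s.rssi_high()             # steps 6-7, at t=25
    s.rssi_low(45)            # steps 8-9
    return s.roam_at - 45     # delay until the next roam attempt
```

With the reset the answer to step 9 is 5 seconds; without it the
station would wait out the remaining 20 seconds of the backoff.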
This takes a Dbus iterator which has been entered into a
dictionary and prints out each key and value. It requires
a mapping which maps keys to types and units. For simple
cases the mapping will consist of a dbus type character
and a units string, e.g. dBm, Kbit/s etc. For more complex
printing which requires processing the value the 'units'
void* can be set to a function which can be custom written
to handle the value.