The packet loss test had a few problems. First being that the RSSI for
the original BSS was not low enough to change the rank. This meant any
roam was just lucky that the intended BSS was first in the results.
The second problem is timing related, and only happens on UML. Disabling
the rules after the roaming condition sometimes allows IWD to fully
roam and connect before the next state change checks.
A new test which blocks all data frames once connected, then tries
to send 100 packets. This should result in the kernel sending a
packet loss event to userspace, which IWD should react to and roam.
Test that the DHCPv4 lease got renewed after the T1 timer runs out.
Then also simulate the DHCPREQUEST during renew being lost and
retransmitted and the lease eventually getting renewed T1 + 60s later.
The main downside is that this test will inevitably take a while if
running in Qemu without the time travel ability.
Update the test and some utility code to run hostapd in an isolated net
namespace for connection_test.py. We now need a second hostapd
instance though because in static_test.py we test ACD and we need to
produce an IP conflict. Moving the hostapd instance unexpectedly fixes
dhcpd's internal mechanism to avoid IP conflicts and it would no longer
assign 192.168.1.10 to the second client, it'd notice that address was
already in use and assign the next free address, or fail if there was
none. So add a second hostapd instance that runs in the main namespace
together with the statically-configured client, it turns out the test
relies on the kernel being unable to deliver IP traffic to interfaces on
the same system.
testAgent had a few tests which weren't reliable, and one was not
actually testing anything, or at least not what the name implied it
should be testing.
The first issue was using iwctl in the first place. There is not a
reliable way to know when iwctl has registered its agent so relying on
that with a sleep, or waiting for the service to become available isn't
100% fool proof. To fix this use the updated PSKAgent which allows
multiple to be registered. This ensures the agent is ready for requests.
This test was also renamed to be consistent with what its actually
testing: that IWD uses the first agent registered.
This removes test_connection_with_other_agent as well because this test
case is covered by the client test itself. There is no need to re-test
iwctl's agent functionality here.
By creating a new bus connection for each agent we can register multiple
with IWD. This did mean the agent interface needs to be unique for each
agent (removing _agent_manager_if) as well as tracking multiple agents
in a list.
The test here is verifying that a DBus Connect() call will still
work with 'other' agents registered. In this case it uses iwctl to
set a password, then call Connect() manually.
The problem here is that we have no way of knowing when iwctl fully
starts and registers its agent. There was a sleep in there but that
is unreliable and we occationally were still getting past that without
iwctl having started fully.
To fix this properly we need to wait for iwctl's agent service to appear
on the bus. Since the bus name is unknown we must first find all names,
then cross reference their PID's against the iwctl PID. This is done
using ListNames, and GetConnectionUnixProcessID APIs.
The rekey/reauth logic was broken in a few different ways.
For rekeys the event list was not being reset so any past 4-way
handshake would allow the call to pass. This actually removes
the need for the sleep in the extended key ID test because the
actual handshake event is waited for correctly.
For both rekeys and reauths, just waiting for the EAP/handshake
events was not enough. Without checking if the client got
disconnected we essentially allow a full disconnect and reconnect,
meaning the rekey/reauth failed.
Now a 'disallow' array can be passed to wait_for_event which will
throw an exception if any events in that array are encountered
while waiting for the target event.
Yet another weird UML quirk. The intent of this tests was to ensure
the profile gets encrypted, and to check this both the mtime and
contents of the profile were checked.
But under UML the profile is copied, IWD started, and the profile
is encrypted all without any time passing. The (same) mtime was
then updated without any changes which fails the mtime check.
This puts a sleep after copying the profile to ensure the system
time differs once IWD encrypts the profile.
In UML if any process dies while test-runner is waiting for the DBus
service or some socket to be available it will block forever. This
is due to the way the non_block_wait works.
Its not optimal but it essentially polls over some input function
until the conditions are met. And, depending on the input function,
this can cause UML to hang since it never has a chance to go idle
and advance the time clock.
This can be fixed, at least for services/sockets, by sleeping in
the input function allowing time to pass. This will then allow
test-runner to bail out with an exception.
This patch adds a new wait_for_service function which handles this
automatically, and wait_for_socket was refactored to behave
similarly.
test_decryption_failure is quite simple and only verifies that a known
network exists after starting. This causes the test to end before IWD can
fully start up leaving the DBus utilities in limbo having not fully
initialized.
Then, on the next test, stale InterfaceAdded signals arrive (for Station
and P2P) which throw exceptions when trying to get the bus (since IWD is
long gone). In addition the next IWD instance has started so any paths
included in the InterfaceAdded signals are bogus and cause additional
exceptions.
At the end of this test we can call list_devices() which will wait for
the InterfaceAdded signal, and cleanly exit afterwards.
This was copy pasted from the autoconnect test, and depending on
how the python module cache is ordered can incorrectly use the
wrong test class. This should nothappen because we insert
the paths to the head of the list but for consistency the class
should be named something that reflects what the test is doing.
Add a fake resolvconf executable to verify that the right nameserver
addresses were actually committed by iwd. Again use unique nameserver
addresses to reduce the possibility that the test succeeds by pure luck.
In static_test.py add IPv6. Add comments on what we're actually testing
since it wasn't very clear. After the expected ACD conflict detection,
succeed if either the lost address was removed or the client disconnected
from the AP since this seems like a correct action for netconfig to
implement.
In iwd.py make sure all the static methods that touch IWD storage take the
storage_dir parameter instead of hardcoding IWD_STORAGE_DIR, and make
sure that parameter is actually used.
Create the directory if it doesn't exist before copying files into it.
This fixes a problem in testNetconfig where
`IWD.copy_to_storage('ssidTKIP.psk', '/tmp/storage')`
would result in /tmp/storage being created as a file, rather than a
directory containing a file, and resulting in IWD failing to start with:
`Failed to create /tmp/storage`
runner.py creates /tmp/iwd but that doesn't account for IWD sessions
with a custom storage dir path.
Extend test_ip_address_match to support IPv6 and to test the
netmask/prefix length while it reads the local address since those are
retrieved using the same API.
Modify testNetconfig to validate the prefix lengths, change the prefix
lengths to be less common values (not 24 bits for IPv4 or 64 for IPv6),
minor cleanup.
Since commit 922fa099721903b106a7bc1ccd1ffe8c4a7bce69 in hostap, our
setting of config_methods on P2P-client interface was ignored. Work
around that commit, in addition to the previous workaround we have in
this test, to again ensure the correct config_methods value is used.
Similarly to ofono/phonesim allow tests to be skipped if wpa_supplicant
is not found on the system.
This required some changes to DPP/P2P where Wpas() should be called first
since this can now throw a SkipTest exception.
The Wpas class was also made to allow __del__ to be called without
throwing additional exceptions in case wpa_supplicant was not found.
This allows the EAP tests to pass, but the fix really needs to be in
hostapd itself. Hostapd currently tries to lookup the EAP session
immediately after receiving EAPOL_REAUTH. This uses the identity
it has stored which, in the case of PEAP/TTLS, will always be a phase2
identity. During this initial lookup hostapd hard codes the identity
to be phase1 which is not true for PEAP/TTLS, and the lookup fails.
The current way this was being done was to import collections and
use collections.Mapping. This has been deprecated since python 3.3
but has worked up until python 3.10. After python 3.10 this will
no longer work, and Mapping must be imported from collections.abc.
This was passing IFNAME= along with EAPOL_REAUTH which does not work
in the context of a hostapd socket where the iface is already implied.
This fixes that issue as well as resets the events array and actually
waits for the required events afterwards.
The signal agent notifications were changed which breaks this test.
Specifically commit ce227e7b94 sends a notification when connected
which breaks the 'agent.calls' check. Since this check is done both
after connecting and once already connected the initial value may
be 1 or 0. Because of this that check was removed entirely.
This test was just piping the PSK files into /tmp/iwd/ssidCCMP.psk
which is a bit fragile if the storage dir was ever to change. Instead
use copy_to_storage and the 'name' keyword to copy the file.
Use scapy library which allows one to easily construct and fudge various
network packets. This makes constructing spoofed packets much easier
and more readable compared to hex-encoded, hand-crafted frames.
The TA/BSSID addresses of spoofed disassociate frames were set
incorrectly. They should be using the 02:00:00:XX:XX:XX address, but
instead were being converted over to 42:00:00:XX:XX:XX address
update_config=1 lets wpa_supplicant write config changes
to the config file. In the real world this is what you want
so your DPP credentials are persistant. But for testing this
is not correct since multiple tests use the same config file
and expect it to be pristine.
Occationally wpa_supplicant was connecting to the AP without
running DPP because the config already had the network
credentials.
There is really no reason to have hwsim create interfaces automatically
for test-runner. test-runner already does this for wpa_supplicant and
hostapd, and IWD can create the interface itself.
The test was rekeying in a loop which ends up confusing hostapd
depending on the timing of when it gets the REKEY command and any
responses from IWD. UML seemed to handle this fine but not QEMU.
Instead delay the rekey a bit to allow it to fully complete before
sending another.
Similarly to hostapd.wait_for_event, IWD's variant needed to act on
an IO watch because events were being received prior to even calling
wait_for_event.
With how fast UML is hostapd events were being sent out prior to
ever calling wait_for_event. Instead set an IO watch on the control
socket and cache all events as they come. Then, when wait_for_event
is called, it can reference this list. If the event is found any
older events are purged from the list.
The AP-ENABLED event needed a special case because hostapd gets
started before the IO watch can be registered. To fix this an
enabled property was added which queries the state directly. This
is checked first, and if not enabled wait_for_event continues normally.
This removes prints which were never supposed to make it upstream as
well as changes sleep() to wd.wait() as well as increase the wait
period to fix issues with how fast UML runs the tests.
Any test using assertTrue(hostapd.list_sta()) improperly has been
replaced with wait_for_event(). There were a few places where this
was actually ok (i.e. IWD is already connected) but most needed to
be changed since the check was just after IWD connected and hostapd's
list_sta() API may not return a fully updated list.
- Setting the IP address was resulting in an error:
Error: any valid prefix is expected rather than "wln58".
This is fixed by reordering the arguments with the IP address first
- Remove the sleep, and use non_block_wait to wait for the IPv6 address
to be set.
Before setting the address, wait for the interface to go down. This
fixes somewhat rare cases where setting the address returns -EBUSY
and ultimately breaks the neighbor reports.
All tests which could avoid calling scan() directly have been
changed to use the 'full_scan' argument to get_ordered_network.
This was done because of unreliable scanning behavior on slower
systems, like VMs. If we get unlucky with the scheduler some beacons
are not received in time and in turn scan results are missing.
Using full_scan=True works around this issue by repeatedly scanning
until the SSID is found.
When configuring wpa_supplicant all we care about is that it
received the configuration object. wpa_supplicant takes quite a bit
of time to connect in some cases so waiting for that is unneeded.
This also increases the DPP timeout which may be required on slower
systems or if the timing is particularly unlucky when receiving
frames.
Change a few critical checks that were failing sometimes:
- A few asserts were changed to wait_for_object_condition
- A 15 second timeout was removed (default used instead)
- Do a full scan at beginning of each test to clear any
cached BSS's. The second test run was getting stale results
and the RSSI values were not expected.
This was not being properly honored when existing networks were
already populated. This poses an issue for any test which uses
full_scan after setting radio values such as signal strength.
If an event is in response to some command which is returning an
unexpected value (unexpected with respect to wpas.py) handle_eow
would raise an exception.
Specifically with DPP this was being hit when the URI was being
returned.
Adds a new wait argument which, if false, will call the DBus method
and return immediately. This allows the caller to create multiple
radios very quickly, simulating (as close as we can) a wifi card
with dual phy's which appear in the kernel simultaneously.
The name argument was also changed to be mandatory, which is now
required by hwsim.
This simulates the conditions that trigger a free-after-use which was
fixed with:
2c355db7 ("scan: remove periodic scans from queue on abort")
This behavior can be reproduced reliably using this test with the above
patch reverted.
During investigation another separate crash was found. The original is
caused by a disconnect event coming in after a neighbor report scan
was completed (roam failed) during the full roam scan.
The second crash is caused by a disconnect coming in during a full
roam scan when no neighbor report scan was ever issued.
First disconnect wpa_supplicant to make sure it wont miss frames if
it decides to connect. Also alter the order of things for the
configurator test so autoconnect doesn't start until after hostapd
is up (avoids additional scanning and delays)
Controlling wpa_supplicant/hostapd from a text based interface is
problematic in that there is no way of knowing if an event corresponds
to a request. In certain cases if wpa_s/hostapd is sending out multiple
events and we make a request, a random event may come back after the
request, but before the actual result.
To fix this, at least for this specific case, we can continue to read
from the socket until the result is numeric.
Some wpa_cli utilities return some result which isn't possible to
get with wait_for_event unless you know what the result will be.
This adds wait_for_result which just returns the first event that
comes in.
wait_for_event was checking the event string presence in the rx_data
array which meant the event string had to match perfectly to any
received events. This poses problems with events that include additional
information which the caller may not be able to know or does not care
about. For example:
DPP-RX src=02:00:00:00:02:00 freq=2437 type=11
Waiting for this event previously would require the caller know src, freq,
and type. If the caller only wants to wait for DPP-RX, it can now do that.
Since a Device class can represent multiple modes (AP, AdHoc, station)
move StationDebug out of the init and only create this class when it
is used (presumably only when the device is in station mode).
The StationDebug class is now created in a property method consistent
with 'station_if'. If Device is not in station mode it is automatically
switched if the test tries any StationDebug methods.
If the Device mode is changed from 'station' the StationDebug class
instance is destroyed.
Passing the full argument list to StationDebug was removed
because any existing properties (for Device) were being
included and causing incorrect behavior.
This neglected to handle namespaces which should also be
passed to StationDebug. Unfortunately the arguments are not
named when Device() is initialized so they cannot easily be
sorted. Instead just define Device() arguments to match the
DBus abstraction and pass only the path and namespace to
StationDebug
Make sure we wipe the leases file both for server and client, so that
dhclient doesn't try to re-use leases from previous tests (should really
happen) and waste time waiting for a reply. Extend the timeout from 1s
to 5s, sometimes it takes dhclient 1s just to start. Disable verbose
mode if not needed to avoid dhclient stalling if the pipe is not being
read.
Passing *args, **kwargs into StationDebug ended up initializing the
class with Station properties since devices can be initialized from
existing property dictionaries. Since the object path is all
StationDebug needs, pass args[0] instead.