There is a common block of code in nearly every test which is incorrect,
most likely a copy-paste from long ago. It goes something like:
wd.wait_for_object_condition(device, 'not obj.scanning')
device.scan()
wd.wait_for_object_condition(device, 'not obj.scanning')
network = device.get_ordered_network("ssid")
The problem here is that sometimes the scanning property does not get
updated fast enough before device.scan() returns, meaning get_ordered_network
comes up with nothing. Some tests pass scan_if_needed=True which 'fixes'
this but ends up re-scanning after the original scan finishes.
To put this to rest scan_if_needed is now defaulted to True, and no
explicit scan should be needed.
This will use the Roam() developer method to force a roam to
a certain BSS. This is particularly useful for any test requiring
roams that are not testing IWD's BSS selection logic. Rather than
creating hwsim rules, setting low RSSI values, and waiting for the
roam logic/scan to happen Roam() can be used to force the roam
logic immediately.
Several tests tests for connectivity with the expectation that it
will fail. This ends up taking 30+ seconds because testutil retries
3 times, each with a 10 second timeout. By passing expect_fail=True
this lowers the timeout to zero, and skips any retries.
Break up the SAE tests into two parts: testSAE and testSAE-AntiClogging
testSAE is simplified to only use two radios and a single phy managed
by hostapd. hostapd configurations are changed via the new 'set_value'
method added to hostapd utils. This allows forcing hostapd to use a
particular sae group set, or force hostapd to use SAE H2E/Hunting and
Pecking Loop for key derivation. A separate test for IKE Group 20 is no
longer required and is folded into connection_test.py
testSAE-AntiClogging is added with an environment for 5 radios instead
of 7, again with hostapd running on a single phy. 'sae_pwe' is used to
force hostapd to use SAE H2E or Hunting and Pecking for key derivation.
Both Anti-Clogging protocol variants are thus tested.
main.conf is added to both directories to force scan randomization off.
This seems to be required for hostapd to work properly on hwsim.
This is similar to wait_for_object_condition, but will not allow
any intermediate state changes between the initial and expected
conditions. This is useful for roaming tests when the expected
state change is 'connected' --> 'roaming' with no changes in
between.
Sometimes scan results can come in with a MAC address which
should be in the first index of addrs[] (42:xx:xx:xx:xx:xx).
This causes a failure to lookup the radio path.
There was also a failure path added if the radio cannot be
found rather than rely on DBus to fail with a None path.
The arguments to SendFrame were also changed to use the
ByteArray DBus type rather than python's internal bytearray.
This shouldn't have any effect, but its more consistent with
how DBus arguments should be used.
After recent changes fixing wait_for_object_condition it was accidentally
made to only work with classes, not other types of objects. Instead
create a minimal class to hold _wait_timed_out so it doesnt rely on
'obj' holding the boolean.
The testAPRoam autotest was silently failing on my machine until I
realized that my distribution hostapd (Arch Linux) is not built with
CONFIG_WNM_AP=y. Indeed, it is also disabled by default in upstream
hostapd. This resulted in the send_bss_transition() function of
hostapd.py silently failing. With this change, throw an exception in
case the BSS_TM_REQ command does not succeed to hopefully save others
the time of debugging this problem.
There were some major problems related to logging and process
output. Tests which required output from start_process would
break if used with '--log/--verbose'. This is because we relied
on 'communicate' to retrieve the process output, but Popen does
not store process output when stdout/stderr are anything other
than PIPE.
Intead, in the case of logging or outfiles, we can simply read
from the file we just wrote to.
For an explicit --verbose application we must handle things
slightly different. A keyword argument was added to Process,
'need_out' which will ensure the process output is kept
regardless of --log or --verbose.
Now a user should be able to use --log/--verbose without any
tests failing.
After the re-write this was broken and not noticed until
recently. The issue appeared to be that the GLib timeout
callback retained no context of local variables. Previously
_wait_timed_out was set as a class variable, but this was
removed so multiple IWD instances could work. Without
_wait_timed_out being a class variable the GLib timeout
setting it had no effect on the wait loop.
To fix this we can set _wait_timed_out on the object being
passed in. This is preserved in the GLib timeout callback
and setting it gets honored in the wait loop.
Certain classes were still using the default namespace. This
didn't matter yet since testAP was the only test using namespaces,
and the AP interface was the only one being used.
For an IWD station on a separate namespace all objects need to
be accessable, so the namespace is passed along to those as needed.
When network namespaces are introduced there may be multiple
IWD class instances. This makes IWD.get_instance ambiguous
when namespaces are involved. iwd.py has been refactored to
not use IWD.get_instance, but testutil still needs it since
its purely based off interface names. Rather than remove it
and modify every test to pass the IWD object we can just
maintain the existing behavior for only the root namespace.
The agent path was generated based on the current time which
sometimes yielded duplicate paths if agents were created quickly
after one another. Instead a simple iterator removes any chance
of a duplicate path.
If the caller specifies the number of devices only return that many.
Some sub-tests may only need a subset of the total number of devices
for the test. If the number of devices expected is less than the total
being returned, python would throw an exception.
If a test does not need any hostapd instances but still loads
hostapd.py for some reason we want to gracefully throw an
exception rather than fail in some other manor.
Add the new wpas.Wpas class roughly based on hostapd.HostapdCLI but only
adding methods for the P2P-related stuff.
Adding "wpa_supplicant" to -v will enable output from the wpa_supplicant
process to be printed and "wpa_supplicant-dbg" will make it more verbose
("wpa_supplicant" is not needed because it seems to be automatically
enabled because of the glob matching in ctx.is_verbose)
The host systems configuration directories for IWD/EAD were
being mounted in the virtual machine. This required that the
host create these directories before hand. Instead we can
just set up the system and IWD/EAD to use directories in /tmp
that we create when we start the VM. This avoids the need for
any host configuration.
This module is essentially a heavily stripped down version of iwd.py
to work with EAD. Class names were changed to match EAD but basically
the EAD, Adapter, and AdapterList classes map 1:1 to IWD, Device, and
DeviceList.
This is somewhat of a hack, but the IWDDBusAbstract is a very
convenient abstraction to DBus objects. The only piece that restricts
it to IWD is the hardcoded IWD_SERVICE. Instead we can pass in a
keyword argument which defaults to IWD_SERVICE. That way other modules
(like EAD) can utilize this abstraction with their own service simply
by changing that service argument.
The AdHoc functionality in iwd.py was not consistent at all with
how all the other classes worked (my bad). Instead we can create
a very simple AdHocDevice class which inherits all the DBus magic
in the IWDDBusAbstract class.
This got added in the re-write but a __del__ method was also
added to the Rule class as well. This caused problems if hwsim
cleaned up since it removed the rules, which caused each rule
to call __del__. Since the rule had already been removed there
was no longer a DBus interface which raised an exception.
Before the re-write there was interesting escapes being used for
set_neighbor. Curiously now hostapd fails to set the neighbor due
to these escapes so they have been removed.
Slower systems may not be able to make some timeouts that tests
mandated. All timeouts were increased significantly to allow tests
to pass on slow systems.
It is not safe to assume that the python dbus implementation will
wait for a method to return. The documentation says this with
respect to reply_handler/error_handler:
"If both are None, the implementation may request that no reply is sent"
To stay on the safe side we should always include the error/reply
handlers and wait for the operation to complete.
iwd.py was updated to use the TestContext APIs to start/stop
IWD. This makes the process managment consistent between starting
IWD from test-runner or from the IWD() constructor.
The psk agent is now tracked, and destroyed upon __del__. This is
to fix issues where a test throws an exception and never
unregisters the agent, causing future tests to fail.
The configuration directory was also chaged to /tmp by
default. This was done since all tests which used this used /tmp
anyways.
The GLib mainloop was removed, and instead put into test-runner
itself. Now any mainloop operations can use ctx.mainloop instead
Before hostapd was initialized using the wiphy_map which has now
gone away. Instead we have a global config module which contains
a single 'ctx'. This is the centeral store for all test information.
This patch converts hostapd.py to lookup instances by already
initialized Hostapd object. The interface parameter was removed
since all tests have been converted to use config= instead.
In addition HostapdCLI was changed to allow no parameters if there
is only a single hostapd instance.
This patch completely re-writes test-runner in Python. This was done
because the existing C test-runner had some clunky work arounds and
maintaining or adding new features was starting to become a huge pain.
There were a few aspects of test-runner which continually had to
be dealt with when adding any new functionality:
* Argument parsing: Adding new arguments to test-runner wasn't so
bad, but if you wanted those arguments passed into the VM it
became a huge pain. Arguments needed to be parsed, then re-formatted
into the qemu command line, then re-parsed in a special order
(backwards) once in the VM. The burden for adding new arguments was
quite high so it was avoided (at least by me) at all costs.
* The separation between C and Python: The tests are all written in
python, but the executables, radios, and interfaces were all created
from C. The way we solved this was by encoding the require info as
environment variables, then parsing those from Python. It worked,
but it was, again, a huge pain.
* Process management: It started with all processes being launched
from C, but eventually tests required the ability to start IWD, or
kill hostapd ungracefully in order to test certain functionality.
Since the processes were tracked in C, Python had no way of
signalling that it killed a process and when it started one C had
no idea. This was mitigated (basically by killall), but it was
no where close to an elegant solution.
Re-writing test-runner in python solves all these problems and will
be much easier to maintain.
* Argument parsing: Now all arguments are forwarded automatically
to the VM. The ArgParse library takes care of parsing and each
argument is stored in a dictionary.
* Separation between C and Python: No more C, so no more separation.
* Process management: Python will now manage all processes. This
allows a test to kill, restart, or start a new process and not
have to remember the PID or to kill it after the test.
There are a few more important aspects of the python implementation
that should now be considered when writing new tests:
* The IWD constructor now has different default arugments. IWD
will always be started unless specified and the configuration
directory will always be /tmp
* Any non *.py file in the test directory will be copied to /tmp.
This avoids the need for 'tmpfs_extra_stuff' completely.
* ctrl_interface will automatically be appended to every hostapd
config. There is no need to include this in a config file from
now on.
* Test cleanup is extremely important. All tests get run in the
same interpreter now and the tests themselves are actually loaded
as python modules. This means e.g. if you somehow kept a reference
to IWD() any subsequent tests would not start since IWD is still
running.
* For debugging, the test context can be printed which shows running
processes, radios, and interfaces.
Three non-native python modules were used: PrettyTable, colored, and
pyroute2
$ pip3 install prettytable
$ pip3 install termcolor
$ pip3 install pyroute2
Restarting hostapd from python was actually leaking memory and
causing the hostapd object to stay referenced in python. The
GLib timeout in wait_for_event was the ultimate cause, but this
had no come to light because no tests restarted hostapd then
used wait_for_event.
In addition, any use of wait_for_event after a restart would
cause an exception because the event socket was never re-attached
after hostapd restarted.
Now we properly clean up the timeout in wait_for_event and
re-initialize the hostapd object on restart.
There is a very common block of code inside many autotests
which goes something like:
device.scan()
condition = 'obj.scanning'
wd.wait_for_object_condition(device, condition)
condition = 'not obj.scanning'
wd.wait_for_object_condition(device, condition)
network = device.get_ordered_network('an-ssid')
When you see the same pattern in nearly all the tests this shows
we need a helper. Basic autotests which merely check that a
connection succeeded should not need to write the same code again
and again. This code ends up being copy-pasted which can lead to
bugs.
There is also a code pattern which attempts to get ordered
networks, and if this fails it scans and tries again. This, while
not optimal, does prevent unneeded scanning by first checking if
any networks already exist.
This patch solves both the code reuse issue as well as the recovery
if get_ordered_network(s) fails. A new optional parameter was
added to get_ordered_network(s) which is False by default. If True
get_ordered_network(s) will perform a scan if the initial call
yields no networks. Tests will now be able to simply call
get_ordered_network(s) without performing a scan before hand.
Hostapd has a feature where you can connect to its control socket and
receive events it generates. Currently we only send commands via this
socket.
First we open the socket (/var/run/hostapd/<iface>) and send the
ATTACH command. This tells hostapd we are ready and after this any
events will be sent over this socket.
A new API, wait_for_event, was added which takes an event string and
waits for some timeout. The glib event loop has been integrated into
this, though its not technically async since we are selecting over a
socket which blocks. To mitigate this a small timeout was chosen for
each select call and then wrapped in a while loop which waits for the
full timeout.
If no networks are found, return None rather than an empty
array. This is easier to check by the caller (and was assumed
in some cases). Also add an exception to get_ordered_network
if no network is found.
If the config file passed in is not found we would continue and
eventually something else would fail. Instead immediately raise an
exception to be more clear on what is actually failing.
Initially the solution to copying files to .hotspot was to use the
existing copy_to_storage, but allow full directory copying. Doing it
this way does not allow us to copy single files into .hotspot which
makes it difficult to test single configurations in several consecutive
tests.
This adds a new API, copy_to_hotspot, where a single hotspot config
can be provided. clear_storage was also modified to clear out the
.hotspot directory in addition to the regular storage directory.
There is a common interface lookup in many tests in order to initialize
the HostapdCLI object e.g.:
for intf in hostapd_map.values():
if intf.config == 'ssidOWE.conf':
hapd = HostapdCLI(intf)
break
Instead of having to do this in every test, HostapdCLI will now
optionally take a config file (config=<file>). The interface object
will still be prefered (i.e. supplying an interface will not even
check the config file) as to not break existing tests. But if only
a config file is supplied the lookup is done internally.
There are some tests that do still need the interface, as they do
an interface lookup to initialize both hostapd and hwsim at the
same time.
The start_ap method was raising potential dbus errors before converting
them to an IWD error type. This is due to dbus.Set() not taking an error
handler. The only way to address this is to catch the error, convert it
and raise the converted error.
For the interface connectivity tests obtain the lists of interfaces in
use directly from the IWD class, which has the current list from DBus
properties.
The hostapd_map dictionary is indexed by the interface name so there's
no point iterating over it to find that entry whose name matches, we can
look up by the name directly. Simplify code.
In the test utilties updated the wiphy_map struct built from the
TEST_WIPHY_LIST variable to parse the new format and to use a new
structure where each wiphy is a namedtuple and each interface under it
also contains a reference to that wiphy. The 'use' field is now
assigned to the wiphy instead of to the interface.
The AdHoc methods used to miss the change in properties
on AdHoc interface. To address the race condition, we
subscribe 'PropertiesChanged' signal first and then do
GetAll properties call. This way we are not missing
'PropertiesChanged' signal in between these calls.
When using --valgrind, you must also use --verbose iwd, and, depending
on the tests you may also need to include pytests in the verbose flag.
Since anyone using --valgrind definitely wants to see valgrind info
printed they should not need to enable verbose printing. Also, manually
parsing valgrind prints with IWD prints mixed throughout is a nightmare
even for a single test.
This patch uses valgrind's --log-file flag, which is directed to
/tmp/valgrind.log. After the tests runs we can print out this file.
This is a helper/shortcut to get_ordered_networks (plural). In nearly
all the autotests we had (roughly) the same block of code:
ordered_network = get_ordered_networks()[0]
self.assertNotEqual(ordered_network, None)
self.assertEqual(ordered_network.name, "someSsid")
Rather than having to do this, we can simplify and just have a single
call to get_ordered_network, which takes the SSID. If the SSID is not
found, we raise an exception. This avoids needing both asserts since
we are guarenteed that the return is valid and the SSID matches.
This also avoids possible issues with multiple networks showing up in
the GetOrderedNetworks call. Eventually test-runner will support running
tests on real wireless hardware, so its possible we could pick up
unexpected networks in the scan.
At some point a stray ';' got added into an autotest in a section
of code that is heavily copy pasted. So in turn nearly all the autotests
have this stray ';' after list_devices (and a few in other places).
This is a temporary fix to address the recent split of
the Device interface. This patch contains a workaround that
re-enables the auto-tests while the test framework is being
reworked to satisfy the need of the new API and should not
be considered as a permanent solution.
The default behavior of NetworkObject.connect() is to wait for the
Connect dbus method to reply before returning back to the test. This
change makes it possible to connect, but not wait for a reply and
continue on with the test (by specifying wait=False). This is
specifically required to test SAE anti-clogging, where the AP needs
to have several simultaneous connections at once for the anti-clogging
logic to trigger. This change also adds Device.wait_for_connected()
which waits for the device interface State variable to be "connected".
The list_devices API has a race condition where sometimes it will
return zero or less than the expected number of devices and fail
the test. A fix is in place for when only a single devices is
expected, but some tests expect more than one device. This changes
wait_to_appear to an integer, and the caller can specify the number
of devices they expect to get back. The default stays as it was,
zero or "return cached devices".
Previously, we had to wait for an arbitrary amount of time after
iwd was started form the python scripts to make sure that the
radio objects are available on the D-bus. This patch allows to
wait inside of list_devices() method and get back a list of the
devices once they are available.
Allow passing a list of passphrases for subsequent agent requests to the
PSKAgent constructor. This also makes existing tests stricter because
a spurious agent request will not receive the same passphrase.
If --gdb is used with test-runner, all the timeouts in the
IWD class must be turned off otherwise the test will fail.
Inside test-runner, a environment variable (IWD_TEST_TIMEOUTS)
is set to either 'on' or 'off'. Then the IWD class (and any
others) can handle the timeouts accordingly. Note that this
does not turn off dbus timeouts, rather it ignores timeout
failures. This does mean that ultimately the test will most
likely fail due to a dbus timeout, but it at least gives you
unlimited debugging time.
Changed disassociate reason to 0x07 when spoofing a disassociate
frame. This along with 0x06 are the only two reason codes that
should be accepted in an unprotected disassociate frame.
Using the hwsim dbus interface ".Interface" under the radios
object you can now send an arbitrary frame out from that radio.
Two methods have been added, spoof_frame and spoof_disassociate.
The hwsim SendFrame method requires the radio frequency which
is obtained from the hostapd config file. This adds a generic
API to get any config value from the hostapd config, as well
as a get_freq API that converts the channel number to a
frequency.
For testing SA Query, the autotest needs the ablility to force
kill (and restart) hostapd without giving it time to deauth its
stations gracefully. A method was added to the HostapdCLI class
which does a killall -9 hostapd, resets the wlnX interface,
and restarts hostapd with the same arguments as it had before.
The AuthCenter will now wait for the RX thread to start before
continuing with the test.
Also removed the non blocking option and fixed the loop to
handle a blocking recvfrom call.
If the peer detects a sync error, it sends back AUTS. The
authentication center must then re-synchronize and update
the SQN it has saved for the given ISMI.
For testing purposes, it is useful to run hlrauc.py by itself
not including it from another python script like autotests do.
Better error checking was also added as testing can result in
badly formatted data.
We need to reset self._exception after _wait_for_async_op raises an
exception, otherwise _wait_for_async_op will report that exception for
every future operation (this wasn't an issue when an exception always
meant that the test was failing and objects were torn down anyway)
In the beacon loss test try to simulate a periodic communication problem
because we don't support roaming if the AP goes away completely.
2 seconds seems to be enough to consistently trigger the beacon_loss
event without triggering a disconnect by the linux kernel or hiding the AP
from the roam scan. Also set the RSSI for that AP lower so that it is
not reselected by iwd.
Implemented milenage algorithm in hlrauc.py. Unlike EAP-SIM, the
authentication center must compute several values to give back
to the server (hostapd). This was already done by IWD as the peer
in EAP-AKA, but was also needed on the server side (HLR AuC).
Test that the AP interface and the station interface managed by iwd
can actually send and receive ethernet traffic when iwd is in the
connected state. Due to linux routing none of the high level utilities
like ping or arping can be easily used to test communication between
two interfaces of the same machine so use a method based on the
mac80211_hwsim/tools/hwsim_test.c utility in the wpa_supplicant tree
that uses a raw socket to inject unicast and broadcast frames.
Add this check in three tests of different security type connections
that simulate a single AP, and the two roaming tests with two APs.
Check that the station can't communicate with the other AP's interface.
Sometimes iwd will take a while to register its dbus name. The python
class already waits for the name to appear on dbus if iwd is being
launched from python, do this also if iwd was already launched by the
test-runner. My use case was when running iwd under valgrind in which
case it runs slower.
Modify AsyncOpAbstract._wait_for_async_op and
IWD.wait_for_object_condition to call context.iteration() in the
blocking mode instead of calling context.pending() every 0.01s. This
gets rid of busy-waiting and also ensures that the condition is checked
after every single dbus (or other) event. This way a condition that
potentially occurs for less than 0.01s can be reliably waited for.
Make the HostapdCLI class non-static so that each objects corresponds
to a hostapd instance on one network interface (this is independent of
whether the AP instances use one or separate hostapd processes)