This patch completely re-writes test-runner in Python. This was done
because the existing C test-runner had some clunky work arounds and
maintaining or adding new features was starting to become a huge pain.
There were a few aspects of test-runner which continually had to
be dealt with when adding any new functionality:
* Argument parsing: Adding new arguments to test-runner wasn't so
bad, but if you wanted those arguments passed into the VM it
became a huge pain. Arguments needed to be parsed, then re-formatted
into the qemu command line, then re-parsed in a special order
(backwards) once in the VM. The burden for adding new arguments was
quite high so it was avoided (at least by me) at all costs.
* The separation between C and Python: The tests are all written in
python, but the executables, radios, and interfaces were all created
from C. The way we solved this was by encoding the required info as
environment variables, then parsing those from Python. It worked,
but it was, again, a huge pain.
* Process management: It started with all processes being launched
from C, but eventually tests required the ability to start IWD, or
kill hostapd ungracefully in order to test certain functionality.
Since the processes were tracked in C, Python had no way of
signalling that it killed a process, and when it started one, C had
no idea. This was mitigated (basically by killall), but it was
nowhere close to an elegant solution.
Re-writing test-runner in python solves all these problems and will
be much easier to maintain.
* Argument parsing: Now all arguments are forwarded automatically
to the VM. The ArgParse library takes care of parsing and each
argument is stored in a dictionary.
* Separation between C and Python: No more C, so no more separation.
* Process management: Python will now manage all processes. This
allows a test to kill, restart, or start a new process and not
have to remember the PID or to kill it after the test.
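The automatic argument forwarding can be pictured with a minimal sketch like the one below. It is only an illustration: the two options and the build_vm_args helper are hypothetical and are not taken from the actual test-runner code.

```python
import argparse

def make_parser():
    # Hypothetical subset of options; the real test-runner defines its own.
    parser = argparse.ArgumentParser(prog='test-runner')
    parser.add_argument('--verbose', default='')
    parser.add_argument('--valgrind', action='store_true')
    return parser

def build_vm_args(args):
    # vars() turns the parsed Namespace into a plain dictionary. Each entry
    # is re-emitted in a form the same parser can read again inside the VM,
    # so no special ordering or re-formatting step is needed.
    forwarded = []
    for name, value in sorted(vars(args).items()):
        if value is True:
            forwarded.append('--' + name)
        elif value not in (False, None, ''):
            forwarded.append('--%s=%s' % (name, value))
    return forwarded

parser = make_parser()
host_args = parser.parse_args(['--verbose', 'iwd', '--valgrind'])
vm_args = parser.parse_args(build_vm_args(host_args))
```

Because both sides use the same parser, adding an option in one place is enough for it to reach the VM.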
There are a few more important aspects of the python implementation
that should now be considered when writing new tests:
* The IWD constructor now has different default arguments. IWD
will always be started unless specified and the configuration
directory will always be /tmp.
* Any non *.py file in the test directory will be copied to /tmp.
This avoids the need for 'tmpfs_extra_stuff' completely.
* ctrl_interface will automatically be appended to every hostapd
config. There is no need to include this in a config file from
now on.
* Test cleanup is extremely important. All tests get run in the
same interpreter now and the tests themselves are actually loaded
as python modules. This means e.g. if you somehow kept a reference
to IWD() any subsequent tests would not start since IWD is still
running.
* For debugging, the test context can be printed which shows running
processes, radios, and interfaces.
Three non-native python modules were used: PrettyTable, termcolor, and
pyroute2:
$ pip3 install prettytable
$ pip3 install termcolor
$ pip3 install pyroute2
The tests basically remained the same with a few minor changes.
The wiphy_map and in turn hostapd_map are no longer used. This
was already partially converted a long time ago when the 'config'
parameter was added to HostapdCLI. This patch fully converts all
autotests to use 'config' rather than looking up by interface.
Some test scripts were named 'test.py' which was fine before but
the new rewrite actually loads each python test as a module. The
name 'test' is too ambiguous and causes issues due to a native
python module with the same name. All of these files were
renamed to 'connection_test.py'.
First, looking for DeviceState.connected gives a much better indication
if we are actually connected vs the connected property on the network
object. Second, it's good practice to also check that hostapd sees that
the station is connected.
Restarting hostapd from python was actually leaking memory and
causing the hostapd object to stay referenced in python. The
GLib timeout in wait_for_event was the ultimate cause, but this
had not come to light because no tests restarted hostapd then
used wait_for_event.
In addition, any use of wait_for_event after a restart would
cause an exception because the event socket was never re-attached
after hostapd restarted.
Now we properly clean up the timeout in wait_for_event and
re-initialize the hostapd object on restart.
Many tests force a reauth after the initial connection. When the tests
were written there was no way of ensuring the reauth completed except
waiting (IWD.wait()). Now we can wait for hostapd events in the tests,
which is faster and more reliable than busy waiting.
This test was not reliably passing. Busy waiting is not really reliable,
but in this specific case it's really the only option as the blacklist
must expire based on time.
In certain cases the autoconnect portion of each subtest was connecting
to the network so fast that the check for obj.scanning was never successful
since IWD was already connected (and in turn not scanning). Since the
autoconnect path will wait for the device to be connected there really isn't
a reason to wait for any scanning conditions. The normal connect path does
need to wait for scanning though, and for this we can now use the new
scan_if_needed parameter to get_ordered_networks.
There is a very common block of code inside many autotests
which goes something like:
    device.scan()
    condition = 'obj.scanning'
    wd.wait_for_object_condition(device, condition)
    condition = 'not obj.scanning'
    wd.wait_for_object_condition(device, condition)
    network = device.get_ordered_network('an-ssid')
When you see the same pattern in nearly all the tests this shows
we need a helper. Basic autotests which merely check that a
connection succeeded should not need to write the same code again
and again. This code ends up being copy-pasted which can lead to
bugs.
There is also a code pattern which attempts to get ordered
networks, and if this fails it scans and tries again. This, while
not optimal, does prevent unneeded scanning by first checking if
any networks already exist.
This patch solves both the code reuse issue as well as the recovery
if get_ordered_network(s) fails. A new optional parameter was
added to get_ordered_network(s) which is False by default. If True,
get_ordered_network(s) will perform a scan if the initial call
yields no networks. Tests will now be able to simply call
get_ordered_network(s) without performing a scan beforehand.
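The scan-on-demand behaviour can be sketched as follows. FakeDevice is a hypothetical stand-in for the real Device class; networks are plain strings here, and the real scan would wait on 'obj.scanning' and 'not obj.scanning' as in the block quoted above.

```python
class FakeDevice:
    """Hypothetical stand-in for the Device class used by the autotests."""

    def __init__(self, visible_after_scan):
        self._visible_after_scan = list(visible_after_scan)
        self._networks = []
        self.scan_count = 0

    def scan(self):
        # A real device would trigger a scan and wait for it to finish.
        self.scan_count += 1
        self._networks = list(self._visible_after_scan)

    def get_ordered_networks(self, scan_if_needed=False):
        # Only scan when nothing is known yet and the caller asked for it.
        if not self._networks and scan_if_needed:
            self.scan()
        return list(self._networks) or None

    def get_ordered_network(self, ssid, scan_if_needed=False):
        for name in self.get_ordered_networks(scan_if_needed) or []:
            if name == ssid:
                return name
        raise Exception('Network %s not found' % ssid)

device = FakeDevice(['an-ssid'])
network = device.get_ordered_network('an-ssid', scan_if_needed=True)
```

A second call finds the cached result and does not scan again, which keeps the "check first, scan only if empty" property of the older pattern.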
These values were meant only to force IWD's BSS preference but
since the RSSI's were so low in some cases this caused a roam
immediately after connecting. This patch changes the RSSI values
to prevent a roam from happening.
'Connected' property of the network object is set before the connection
attempt is made and does not indicate a connection success. Therefore,
use device status property to identify the connection status of the device.
This test made it past the initial refactor to use HostapdCLI with the
'config' parameter. This avoids the need to iterate the hostapd map in
the actual test.
This test merely verifies hostapd received our measurement reports
and verified they were valid. Hostapd does not verify the actual
beacon report body. Really, the only way to test this is on an
actual network which makes these requests.
Hostapd has a feature where you can connect to its control socket and
receive events it generates. Currently we only send commands via this
socket.
First we open the socket (/var/run/hostapd/<iface>) and send the
ATTACH command. This tells hostapd we are ready and after this any
events will be sent over this socket.
A new API, wait_for_event, was added which takes an event string and
waits for some timeout. The glib event loop has been integrated into
this, though it's not technically async since we are selecting over a
socket which blocks. To mitigate this a small timeout was chosen for
each select call and then wrapped in a while loop which waits for the
full timeout.
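A rough sketch of that select loop is shown below. It is not the actual implementation: the GLib main context iteration between select calls is left out, the poll interval is an arbitrary choice, and a local socket pair stands in for the hostapd control socket.

```python
import select
import socket
import time

def wait_for_event(sock, event, timeout=10.0, poll_interval=0.1):
    """Wait until `event` shows up in data read from `sock`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Short select timeout so a single call never blocks for long;
        # the surrounding loop provides the full timeout.
        readable, _, _ = select.select([sock], [], [], poll_interval)
        if not readable:
            continue
        data = sock.recv(4096)
        if not data:
            raise ConnectionError('event socket closed')
        text = data.decode('utf-8', errors='replace')
        if event in text:
            return text
    raise TimeoutError('waiting for event %r timed out' % event)

# Demonstration: one end of a socket pair plays the role of hostapd.
hostapd_side, test_side = socket.socketpair()
hostapd_side.send(b'<3>AP-STA-CONNECTED 02:00:00:00:00:01')
result = wait_for_event(test_side, 'AP-STA-CONNECTED', timeout=1.0)
```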
It's difficult to know for certain, but these random test failures appeared
to be caused by two issues. One was that get_ordered_network was being
checked for None, when it was returning a zero length array. Because
of this the scanning block was never executed. This was
fixed in the previous commit. The other issue was the disconnect at
the start of the tests. The disconnect will cause all pending scans
to cancel, which appeared to cause the scanning block below to be
skipped over quickly if the timing was right. Then, afterwards,
getting a single network failed because scanning was not complete.
If no networks are found, return None rather than an empty
array. This is easier to check by the caller (and was assumed
in some cases). Also add an exception to get_ordered_network
if no network is found.
If the config file passed in is not found we would continue and
eventually something else would fail. Instead immediately raise an
exception to be more clear on what is actually failing.
This autotest was manually creating the .known_network.freq file so
the UUID needed to be manually generated and updated for the test
to function correctly.
This is merely an empty test that can act as a sandbox for the new
--shell command. It was not named with 'test' so that autotesting
will skip it.
This test is not very useful for virtual hardware testing
(mac80211_hwsim), but very useful for USB/PCI passthrough. When
setup correctly, you can now pass through a single device and test
against real networks with a minimal kernel.
Doing this scan causes issues in the test. Like with other autoconnect
tests we can just use the fact that IWD will always be doing a periodic
scan during start up, so we only need to wait for that to finish before
querying the network list.
Initially the solution to copying files to .hotspot was to use the
existing copy_to_storage, but allow full directory copying. Doing it
this way does not allow us to copy single files into .hotspot which
makes it difficult to test single configurations in several consecutive
tests.
This adds a new API, copy_to_hotspot, where a single hotspot config
can be provided. clear_storage was also modified to clear out the
.hotspot directory in addition to the regular storage directory.
This removes all the duplicated code where the interfaces are iterated
and the radio/hostapd instances are created. Instead the two new APIs
are used to get each instance, e.g.:
hapd = HostapdCLI(config='ssid.conf')
radio = hwsim.get_radio('radX')
There is a common interface lookup in many tests in order to initialize
the HostapdCLI object e.g.:
    for intf in hostapd_map.values():
        if intf.config == 'ssidOWE.conf':
            hapd = HostapdCLI(intf)
            break
Instead of having to do this in every test, HostapdCLI will now
optionally take a config file (config=<file>). The interface object
will still be preferred (i.e. supplying an interface will not even
check the config file) as to not break existing tests. But if only
a config file is supplied the lookup is done internally.
There are some tests that do still need the interface, as they do
an interface lookup to initialize both hostapd and hwsim at the
same time.
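The internal lookup could look roughly like this. The Interface tuple, the example hostapd_map contents, and the _find_by_config helper are assumptions made for the sketch; only the constructor behaviour (interface preferred, config as fallback) comes from the description above.

```python
from collections import namedtuple

# Hypothetical shape of a hostapd_map entry; only the fields used here.
Interface = namedtuple('Interface', ['name', 'config'])

hostapd_map = {
    'wlan0': Interface('wlan0', 'ssidOWE.conf'),
    'wlan1': Interface('wlan1', 'ssidOpen.conf'),
}

class HostapdCLI:
    def __init__(self, interface=None, config=None):
        # An explicit interface wins; the config file is only consulted
        # when no interface was supplied.
        if interface is None and config is not None:
            interface = self._find_by_config(config)
        if interface is None:
            raise Exception('No hostapd interface found')
        self.interface = interface

    @staticmethod
    def _find_by_config(config):
        for intf in hostapd_map.values():
            if intf.config == config:
                return intf
        return None

hapd = HostapdCLI(config='ssidOWE.conf')
```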
The start_ap method was raising potential dbus errors before converting
them to an IWD error type. This is due to dbus.Set() not taking an error
handler. The only way to address this is to catch the error, convert it
and raise the converted error.
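The catch-convert-raise pattern is sketched below with stand-in classes so that it runs without dbus-python installed. The DBusException stand-in, the IWDError hierarchy, and the error map are assumptions for illustration, not the real iwd.py definitions.

```python
class DBusException(Exception):
    """Stand-in for dbus.exceptions.DBusException (sketch only)."""

    def __init__(self, name, message=''):
        super().__init__(message)
        self._name = name

    def get_dbus_name(self):
        return self._name

class IWDError(Exception):
    """Hypothetical base class for converted IWD errors."""

class InProgressEx(IWDError):
    pass

# Hypothetical mapping from D-Bus error names to IWD error types.
_ERROR_MAP = {'net.connman.iwd.InProgress': InProgressEx}

def convert_error(exc):
    error_type = _ERROR_MAP.get(exc.get_dbus_name(), IWDError)
    return error_type(str(exc))

def set_property(setter, *args):
    # The setter takes no error handler, so the only option is to catch
    # the raw D-Bus error here and re-raise it as the converted type.
    try:
        setter(*args)
    except DBusException as e:
        raise convert_error(e) from e
```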
Running autotests with native hardware will not work on tests which
depend on the hwsim python API (since hwsim will not be running).
For these tests, it will now be required that they specify:
needs_hwsim=1
This allows the test to be skipped when running with native hardware
rather than the test failing with a python exception.
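As a sketch of how such a flag might be honoured, the snippet below assumes needs_hwsim lives in a [SETUP] section of a per-test configuration file. The section name and the should_skip helper are assumptions; the text above only gives the key itself.

```python
import configparser

def should_skip(hw_conf_text, native_hw):
    """Return True if the test must be skipped on native hardware."""
    cfg = configparser.ConfigParser()
    cfg.read_string(hw_conf_text)
    # getboolean() accepts '1' as true; a missing key means hwsim is
    # not required.
    needs_hwsim = cfg.getboolean('SETUP', 'needs_hwsim', fallback=False)
    return native_hw and needs_hwsim
```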
This new test was merged during the time when testutil was not working
properly, so it was never verified to work with respect to testutil
(testing for 'connected' has always worked).
Since testFILS has 2 hostapd interfaces test_interface_connected was
defaulting to the incorrect interface for the SHA384 test. Now, the
explicit interfaces are passed in when checking for connectivity.
Don't use del wd to dereference the IWD instance at the end of the function
where it has been defined in the first place as at this point wd is about
to have its reference count decreased anyway (the variable's scope is
ending) so it's pointless (but didn't hurt).
Relying on the __del__ destructor to kill the IWD process in those tests
where it was started in the constructor is a bit of a hack in the first
place, because the destructor is called on garbage collection and even
though CPython does this on the refcount reaching 0, this is not
documented and there's no guideline on when it should happen or if it
should happen at all. So it could be argued that we should keep the del
wd statements to be able to easily replace all of them with a call to a
new method. But most of them are not placed so that they're guaranteed
to happen on test success or failure. It would probably be easier to do
this and other housekeeping in a base class and make the tests its
subclasses. Also some of these tests don't really need to launch iwd
themselves, since IWD now tracks changes in the known network files I
think IWD only really needs to be killed between tests when main.conf
changes.
In the tests that only want to iterate over the hostapd interfaces,
simplify the pattern of walking through the whole wiphy_map tree by
instead using the hostapd_map variable which is already filtered to only
contain hostapd interfaces.
For the interface connectivity tests obtain the lists of interfaces in
use directly from the IWD class, which has the current list from DBus
properties.
The hostapd_map dictionary is indexed by the interface name so there's
no point iterating over it to find that entry whose name matches, we can
look up by the name directly. Simplify code.
In the test utilities, update the wiphy_map struct built from the
TEST_WIPHY_LIST variable to parse the new format and to use a new
structure where each wiphy is a namedtuple and each interface under it
also contains a reference to that wiphy. The 'use' field is now
assigned to the wiphy instead of to the interface.
The AdHoc methods used to miss the change in properties
on AdHoc interface. To address the race condition, we
subscribe 'PropertiesChanged' signal first and then do
GetAll properties call. This way we are not missing
'PropertiesChanged' signal in between these calls.
Previously, the WPS tests have shared a single instance of iwd
among themselves. This approach didn’t allow us to identify which
tests have passed and which failed. The new solution makes WPS
tests independent from each other by creating a new instance
of iwd for each one of them.
The simplest way to test this was to create a new AP, where
max_num_sta=1. This only allows a single STA to connect to this AP.
We connect a device to this AP, then try and connect with another.
This results in hostapd failing with DENIED_NO_MORE_STAS, which will
cause a temporary blacklist. We can then disconnect both devices,
and reconnect the device that previously got denied. If it connects
then we know the blacklist only persisted for that earlier connection.
This is a VERY simple test for HT/VHT. Since there are so many potential
options in the IE this really just tests that drops in RSSI will cause
IWD to choose a different BSS, even if that means choosing HT over VHT,
or even basic rates over HT/VHT.
SAE has a clogging test which requires 4 radios to all simultaneously
connect. All the other tests are only using one of these radios, so
in these tests we explicitly disconnect these devices preventing them
from autoconnecting.
Since the EAP-PWD fragmentation test uses group 19 there is test
coverage there for that group. This changes connection_test to use
group 20 instead of 19.
When using --valgrind, you must also use --verbose iwd, and, depending
on the tests you may also need to include pytests in the verbose flag.
Since anyone using --valgrind definitely wants to see valgrind info
printed they should not need to enable verbose printing. Also, manually
parsing valgrind prints with IWD prints mixed throughout is a nightmare
even for a single test.
This patch uses valgrind's --log-file flag, which is directed to
/tmp/valgrind.log. After the tests run we can print out this file.
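A minimal sketch of the idea follows. The helper names, the iwd binary name, and the extra --leak-check=full option are assumptions; only --log-file and the /tmp/valgrind.log path come from the description above.

```python
import os

VALGRIND_LOG = '/tmp/valgrind.log'

def iwd_command(use_valgrind, iwd_path='iwd'):
    """Build the argv used to launch IWD, optionally under valgrind."""
    args = []
    if use_valgrind:
        # --log-file keeps valgrind's report out of the IWD output.
        args += ['valgrind', '--leak-check=full',
                 '--log-file=' + VALGRIND_LOG]
    args.append(iwd_path)
    return args

def print_valgrind_log(path=VALGRIND_LOG):
    """Print the log after the tests ran; return False if there is none."""
    if not os.path.exists(path):
        return False
    with open(path) as f:
        print(f.read())
    return True
```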
This is a helper/shortcut to get_ordered_networks (plural). In nearly
all the autotests we had (roughly) the same block of code:
ordered_network = get_ordered_networks()[0]
self.assertNotEqual(ordered_network, None)
self.assertEqual(ordered_network.name, "someSsid")
Rather than having to do this, we can simplify and just have a single
call to get_ordered_network, which takes the SSID. If the SSID is not
found, we raise an exception. This avoids needing both asserts since
we are guaranteed that the return is valid and the SSID matches.
This also avoids possible issues with multiple networks showing up in
the GetOrderedNetworks call. Eventually test-runner will support running
tests on real wireless hardware, so it's possible we could pick up
unexpected networks in the scan.
At some point a stray ';' got added into an autotest in a section
of code that is heavily copy pasted. So in turn nearly all the autotests
have this stray ';' after list_devices (and a few in other places).
testWPA was not verifying connectivity between the two interfaces. Funny
enough, doing this resulted in the same problems that adhoc had where
we were setting the connection as complete before the gtk/igtk were set.
This is fixed now so we can now use testutil in this test.
Curiously this test started failing. The problem was incorrect KC/SRES
values in the sim.db file. I noticed no direct changes to this file,
but changes inside ofono, phonesim, and hostapd could have potentially
caused this.
This test was copied from testFT-PSK-roam, but for SAE. The test behaves
as follows:
- Connect to SAE network (full authentication)
- Fast transition to another SAE AP
- Fast transition to a PSK/WPA2 AP
This is a temporary fix to address the recent split of
the Device interface. This patch contains a workaround that
re-enables the auto-tests while the test framework is being
reworked to satisfy the need of the new API and should not
be considered as a permanent solution.
Fixed two issues:
1. There is no longer a dbus exception when switching to AP mode when
connected in station mode so that assert was removed.
2. After the device/station change the timing must have changed, causing
autoconnect to take over before an explicit connect call. Added a
psk provisioning file that disables autoconnect.
Make sure stop_ap is called on success and on failure in both tests so
that one can succeed after the other has failed. Also make sure to move
both interfaces out of autoconnect state.
The default behavior of NetworkObject.connect() is to wait for the
Connect dbus method to reply before returning back to the test. This
change makes it possible to connect, but not wait for a reply and
continue on with the test (by specifying wait=False). This is
specifically required to test SAE anti-clogging, where the AP needs
to have several simultaneous connections at once for the anti-clogging
logic to trigger. This change also adds Device.wait_for_connected()
which waits for the device interface State variable to be "connected".
1) wait for a device to become available
2) add try, except block for the clean termination of iwd in
the case of a failure
3) increase the max execution time to help with valgrind
1) wait for a device to become available
2) add try, except block for the clean termination of iwd in
the case of a failure
3) remove waits
4) eliminate a race condition on get_ordered_networks()
list_devices() was updated to take an integer rather than a bool
for the wait_to_appear argument. This updates any tests that
explicitly passed True/False as the argument to list_devices.
The list_devices API has a race condition where sometimes it will
return zero or less than the expected number of devices and fail
the test. A fix is in place for when only a single device is
expected, but some tests expect more than one device. This changes
wait_to_appear to an integer, and the caller can specify the number
of devices they expect to get back. The default stays as it was,
zero or "return cached devices".
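The integer form of wait_to_appear can be sketched like this. FakeIWD and its snapshot list are hypothetical and only simulate devices showing up over time; the max_wait parameter is likewise an assumption.

```python
import time

class FakeIWD:
    """Hypothetical stand-in that reveals devices over successive polls."""

    def __init__(self, snapshots):
        self._snapshots = list(snapshots)
        self._devices = []

    def _refresh(self):
        # A real implementation would re-read the D-Bus object list.
        if self._snapshots:
            self._devices = self._snapshots.pop(0)

    def list_devices(self, wait_to_appear=0, max_wait=5.0):
        if wait_to_appear == 0:
            # Default: return whatever is cached, without waiting.
            return list(self._devices)
        deadline = time.monotonic() + max_wait
        while len(self._devices) < wait_to_appear:
            if time.monotonic() > deadline:
                raise TimeoutError('expected %d devices' % wait_to_appear)
            self._refresh()
            time.sleep(0.01)
        return list(self._devices)

wd = FakeIWD([[], ['wlan0'], ['wlan0', 'wlan1']])
cached = wd.list_devices()
devices = wd.list_devices(2)
```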
The single AP test worked fine, but adding a failure test caused some
problems. Since the kernel is never restarted between tests it maintains
old stale scan results from the previous test. This was causing an
assert to sometimes fail in the second test being run because it was
returning > 1 ordered networks. This change iterates through the ordered
network list and chooses the appropriate network rather than assuming
get_ordered_networks() will always return only one network object.
1) Renamed the test to reflect the usage of PEAP
2) Prevented the creation of an extra instance of iwd
3) Refactored to start catching the exceptions and properly
dispose an instance of iwd
4) Switched to list_devices with wait option
1) Removed duplicated entries from .conf
2) Refactored to start catching the exceptions and properly
dispose an instance of iwd
3) Switched to list_devices with wait option
Previously, we had to wait for an arbitrary amount of time after
iwd was started from the python scripts to make sure that the
radio objects are available on the D-bus. This patch makes it possible
to wait inside of the list_devices() method and get back a list of the
devices once they are available.
These tests were failing (both with/without ofono) because iwd
was trying to autoconnect before the autotest had issued a
connect request (causing iwd to return a busy response). To fix
this, autoconnect was explicitly disabled in the config file.
Update the expected DBus exception in the manual connect case, affected
by recent EAP changes. Also slightly improve the comment in the file
although it's still not 100% correct.
This also tests multiple agent requests for one network connection
because the TTLS client private key is not in the config file and the
MSCHAPV2 password is not in the config file.
Make 3 connections in test EAP-TLS, one with an unencrypted private key,
one with the private key passphrase provided in the provisioning file
and one with the passphrase provided through the agent. Also improve
the scanning logic at the beginning.
Allow passing a list of passphrases for subsequent agent requests to the
PSKAgent constructor. This also makes existing tests stricter because
a spurious agent request will not receive the same passphrase.
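The stricter behaviour can be illustrated with the sketch below. It leaves out the D-Bus plumbing entirely, and the request_passphrase method name is a placeholder for whatever the real agent method is called.

```python
class PSKAgent:
    """Sketch of an agent that answers requests from a passphrase list."""

    def __init__(self, passphrases=None):
        if passphrases is None:
            passphrases = []
        elif isinstance(passphrases, str):
            # Keep accepting a single passphrase as before.
            passphrases = [passphrases]
        self._pending = list(passphrases)

    def request_passphrase(self, network):
        # Each request consumes one entry, so an unexpected extra request
        # fails instead of silently getting the same passphrase again.
        if not self._pending:
            raise Exception('Unexpected agent request for %s' % network)
        return self._pending.pop(0)

agent = PSKAgent(['key-passphrase', 'mschapv2-password'])
first = agent.request_passphrase('ssidEAP')
second = agent.request_passphrase('ssidEAP')
```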
If --gdb is used with test-runner, all the timeouts in the
IWD class must be turned off otherwise the test will fail.
Inside test-runner, an environment variable (IWD_TEST_TIMEOUTS)
is set to either 'on' or 'off'. Then the IWD class (and any
others) can handle the timeouts accordingly. Note that this
does not turn off dbus timeouts, rather it ignores timeout
failures. This does mean that ultimately the test will most
likely fail due to a dbus timeout, but it at least gives you
unlimited debugging time.
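One way a wait helper might honour the variable is sketched below. The wait_for function and its parameters are hypothetical; only the variable name and its 'on'/'off' values come from the description above.

```python
import os
import time

def timeouts_enabled(environ=os.environ):
    # Anything other than 'off' (including an unset variable) keeps
    # the timeouts active.
    return environ.get('IWD_TEST_TIMEOUTS', 'on') != 'off'

def wait_for(condition, max_wait, environ=os.environ, poll=0.01):
    """Poll condition(); a timeout failure is ignored when disabled."""
    deadline = time.monotonic() + max_wait
    while not condition():
        if timeouts_enabled(environ) and time.monotonic() > deadline:
            raise TimeoutError('condition not met in %ss' % max_wait)
        time.sleep(poll)
```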
Two autotests:
1. Tests SA Query procedure when the AP goes down. In this case the AP
goes down ungracefully, not allowing it to send out any deauth
frames. When the AP comes back up, IWD still thinks it's connected.
The AP will then send unprotected disassociate frames so the client
can re-connect. This kicks off the SA Query procedure, which the AP
will not respond to. At this point we can deauth and reconnect to
the AP.
2. Test SA Query procedure when a disassociate frame has been spoofed.
In this case we receive an unprotected disassociate frame and start
SA Query. The AP should then respond to the SA query within the
timeout. We then know the frame was spoofed and can remain
connected.
Changed disassociate reason to 0x07 when spoofing a disassociate
frame. This along with 0x06 are the only two reason codes that
should be accepted in an unprotected disassociate frame.
Using the hwsim dbus interface ".Interface" under the radios
object you can now send an arbitrary frame out from that radio.
Two methods have been added, spoof_frame and spoof_disassociate.
The hwsim SendFrame method requires the radio frequency which
is obtained from the hostapd config file. This adds a generic
API to get any config value from the hostapd config, as well
as a get_freq API that converts the channel number to a
frequency.
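The two helpers might look roughly like the sketch below. The function names are placeholders, and the conversion only covers the 2.4 GHz band, which is an assumption about what the tests need.

```python
def parse_hostapd_config(text):
    """Parse 'key=value' lines of a hostapd config into a dictionary."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, value = line.split('=', 1)
        values[key.strip()] = value.strip()
    return values

def channel_to_freq(channel):
    """Convert a 2.4 GHz channel number to its centre frequency in MHz."""
    if channel == 14:
        return 2484
    if 1 <= channel <= 13:
        return 2407 + 5 * channel
    raise ValueError('unsupported channel %r' % channel)

config = parse_hostapd_config('# test AP\nssid=ssidOpen\nchannel=6\n')
freq = channel_to_freq(int(config['channel']))
```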
For testing SA Query, the autotest needs the ability to force
kill (and restart) hostapd without giving it time to deauth its
stations gracefully. A method was added to the HostapdCLI class
which does a killall -9 hostapd, resets the wlanX interface,
and restarts hostapd with the same arguments as it had before.
The AuthCenter will now wait for the RX thread to start before
continuing with the test.
Also removed the non blocking option and fixed the loop to
handle a blocking recvfrom call.
If the peer detects a sync error, it sends back AUTS. The
authentication center must then re-synchronize and update
the SQN it has saved for the given IMSI.
For testing purposes, it is useful to run hlrauc.py by itself
rather than importing it from another python script as the autotests do.
Better error checking was also added as testing can result in
badly formatted data.
We need to reset self._exception after _wait_for_async_op raises an
exception, otherwise _wait_for_async_op will report that exception for
every future operation (this wasn't an issue when an exception always
meant that the test was failing and objects were torn down anyway)
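The fix amounts to clearing the stored exception before raising it. The class below is a stripped-down sketch: the real _wait_for_async_op first runs the main loop until the operation completes, which is omitted here, and the _failure callback name is an assumption.

```python
class AsyncOp:
    """Sketch of the exception bookkeeping only."""

    def __init__(self):
        self._exception = None

    def _failure(self, exc):
        # Error callback of an asynchronous call stores the exception.
        self._exception = exc

    def _wait_for_async_op(self):
        if self._exception is not None:
            exc = self._exception
            # Reset so that later operations start from a clean state.
            self._exception = None
            raise exc

op = AsyncOp()
op._failure(RuntimeError('busy'))
```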
In the beacon loss test try to simulate a periodic communication problem
because we don't support roaming if the AP goes away completely.
2 seconds seems to be enough to consistently trigger the beacon_loss
event without triggering a disconnect by the linux kernel or hiding the AP
from the roam scan. Also set the RSSI for that AP lower so that it is
not reselected by iwd.
Implemented milenage algorithm in hlrauc.py. Unlike EAP-SIM, the
authentication center must compute several values to give back
to the server (hostapd). This was already done by IWD as the peer
in EAP-AKA, but was also needed on the server side (HLR AuC).
Test that the AP interface and the station interface managed by iwd
can actually send and receive ethernet traffic when iwd is in the
connected state. Due to linux routing none of the high level utilities
like ping or arping can be easily used to test communication between
two interfaces of the same machine so use a method based on the
mac80211_hwsim/tools/hwsim_test.c utility in the wpa_supplicant tree
that uses a raw socket to inject unicast and broadcast frames.
Add this check in three tests of different security type connections
that simulate a single AP, and the two roaming tests with two APs.
Check that the station can't communicate with the other AP's interface.
Unfortunately this doesn't currently ensure that the preauthentication
has succeeded and that later the PMKSA from the preauthentication was
used in the transition, only that the preauthentication process doesn't
break the transition. For now this can be confirmed by looking at the
testrunner -v output to see that the line "EAP completed with eapSuccess"
appears before the following line, and not after:
src/device.c:device_enter_state() Old State: connected, new state: roaming
Sometimes iwd will take a while to register its dbus name. The python
class already waits for the name to appear on dbus if iwd is being
launched from python, do this also if iwd was already launched by the
test-runner. My use case was when running iwd under valgrind in which
case it runs slower.
Modify AsyncOpAbstract._wait_for_async_op and
IWD.wait_for_object_condition to call context.iteration() in the
blocking mode instead of calling context.pending() every 0.01s. This
gets rid of busy-waiting and also ensures that the condition is checked
after every single dbus (or other) event. This way a condition that
potentially occurs for less than 0.01s can be reliably waited for.
Make the HostapdCLI class non-static so that each object corresponds
to a hostapd instance on one network interface (this is independent of
whether the AP instances use one or separate hostapd processes)
APShutdownTest - shutdown the AP after network connection. This
will be replaced when hwsim supports signal
strength modification
InvalidPassphraseTest - Provide an invalid passphrase.
Where the code just needs to find all of the network objects, don't look
for the device objects first because this only adds overhead. With
the structure now having three levels it can be even more confusing,
especially in getCurrentlyConnectedNetworkName where the outer loop
didn't check for the Device interface.
In getNetworkList don't return after the first device's networks are
listed.