Dbus should be started as a multi-test process from the
TestContext, which leaves the dbus address file around for
the full test run. For Namespaces dbus-daemon should be
closed when the Namespace closes.
There were some major problems related to logging and process
output. Tests which required output from start_process would
break if used with '--log/--verbose'. This is because we relied
on 'communicate' to retrieve the process output, but Popen does
not store process output when stdout/stderr are anything other
than PIPE.
Intead, in the case of logging or outfiles, we can simply read
from the file we just wrote to.
For an explicit --verbose application we must handle things
slightly different. A keyword argument was added to Process,
'need_out' which will ensure the process output is kept
regardless of --log or --verbose.
Now a user should be able to use --log/--verbose without any
tests failing.
The verbose arguments come in from the QEMU command line as a
single string. This should have been split into an array immediately
but was not. This led to issues like hostapd debug being enabled
when "-v hostapd_cli" was passed in.
Since the list of files copied to /tmp was part of the return value from
pre_test(), if an exception occurred inside pre_test(), "copied" would
be undefined and the post_test(ctx, copied) call in the finally clause
cause another exception:
raceback (most recent call last):
File "/home/balrog/repos/iwd/tools/test-runner", line 1508, in <module>
run_tests()
File "/home/balrog/repos/iwd/tools/test-runner", line 1242, in run_tests
run_auto_tests(config.ctx, args)
File "/home/balrog/repos/iwd/tools/test-runner", line 1166, in run_auto_tests
post_test(ctx, copied)
UnboundLocalError: local variable 'copied' referenced before assignment
(apart from not being able to clean up the files). Pass "copied" as a
paremeter to pre_test instead.
Use the hwsim DBus API rather than command line. This both is
faster and more dynamic than doing so with the command line.
This also avoids tracking the radio ID since we can just hang
on to the radio Dbus object directly.
The Create() API was limited to only taking a Name and boolean
(for p2p enabling). The actual hwsim nl80211 API can take more
attributes than this (which are actually utilized when creating
from the command line). To get the DBus API up to the same
functionality the two arguments in Create were replaced with
a single dictionary. This allows for extending later if more
arguments are needed.
In the NEW_RADIO callback hwsim was assuming that DBus had no
yet replied to the Create() method. In some cases the NEW_RADIO
event fires before the actual callback which will respond to
DBus. This causes a crash in the create callback.
Starts hwsim but does not register to mac80211_hwsim. This is to
allow autotests to disable hwsim, while still having the ability
to create/destroy radios over DBus.
For better reliability the processor count is now set to qemu.
In cases of low CPU count (< 2) hosts the processor count is
limited to 1. Otherwise half of the host cores will be used for
the VM.
Allow the storage directory (default /tmp/iwd) to be configured
just like the state directory. This is in order to support multiple
IWD instances which require separate storage directories for network
provisioning files.
Our simulated environment was really only meant to test air-to-air
communication by using mac80211_hwsim. Protocols like DHCP use IP
communication which starts to fall apart when using hwsim radios.
Mainly unicast sockets do not work since there is no underlying
network infrastructure.
In order to simulate a more realistic environment network namespaces
are introduced in this patch. This allows wireless phy's to be added
to a network namespace and unique IWD instances manage those phys.
This is done automatically when 'NameSpaces' entries are configured
in hw.conf:
[SETUP]
num_radios=2
[NameSpaces]
ns0=rad1,...
This will create a namespace named ns0, and add rad1 to that
namespace. rad1 will not appear as a phy in what's being called the
'root' namespace (the default namespace).
As far as a test is concerned you can create a new IWD() class and
pass the namespace in. This will start a new IWD instance in that
namespace:
ns0 = ctx.get_namespace('ns0')
wd_ns0 = IWD(start_iwd=True, namespace=ns0)
'wd_ns0' can now be used to interact with IWD in that namespace, just
like any other IWD class object.
Sometimes improperly written tests can end up causing future tests
to fail. For faster debugging you can now add a '+' after a given
autotest which will start that test and run all tests which come
alphabetically after it (as if you are running a full autotest suite).
Example:
./test-runner -A testWPA+
This will run testWPA, testWPA2, testWPA2-no-CCMP, testWPA2-SHA256,
and testWPA2withMFP.
This can result in strange test results since there was no less
than zero checks before subtracting the total tests from failed
tests. In case of an internal exception we can just set all values
to zero. This will be handled specially as we do for timeout
errors.
You can now specify a limited list of subtests to run out of a
full auto-test using --sub-tests,-S. This option is limited in
that it is only meant to be used with a single autotest (since
it doesn't make much sense otherwise).
The subtest can be specified both with or without the file
extension.
Example usage:
./test-runner -A testAP -S failure_test,dhcp_test.py
This will only run the two subtests and exclude any other *.py
tests present in the test directory.
Code was added with commit 04487f575b which passes a radio object
to the Interface class constructor and stores it in the Interface
object. The radio class also stores each Interface object which
creates a circular reference and causes the Radio to stick around
long after the tests finishes.
I cannot see why the Interface needs to keep track of the Radio
object. None of the wpa_supplicant utilities use this so it has
been removed.
Add support for a WPA_SUPPLICANT section in hw.conf where
'radN=<config_path>' lines will only reserve radios and create
interfaces for the autotest to be able to start wpa_supplicant on them,
i.e. this prevents iwd or hostapd from being started on them but doesn't
start a wpa_supplicant instance by itself.
The host systems configuration directories for IWD/EAD were
being mounted in the virtual machine. This required that the
host create these directories before hand. Instead we can
just set up the system and IWD/EAD to use directories in /tmp
that we create when we start the VM. This avoids the need for
any host configuration.
Allow the "hwsim_medium=no" setting in hw.conf's SETUP section to
disable starting hwsim. It looks like the packets going through
userspace add enough latency that active scans don't work, probe
responses don't arrive within the "dwell time" or probe requests are not
ACKed on time. I've tried modifying tools/hwsim.c to respond with the
HWSIM_CMD_TX_INFO_FRAME cmd as the first thing after receiving a
HWSIM_CMD_FRAME and even skipping the queue in ell/genl.c by writing the
command synchronously, but neither helped enough to make the scans work.
This does not rule out that hwsim or the way our scans are done can be
fixed and that would obviously be better than what I did in this patch.
This extends test-runner to also use iwmon if --log is enabled.
For this case the iwmon log will be found inside each test
log directory.
A new option, --monitor <file> was added in case full logging isn't
desired (potentially for timing issues) but a iwmon log is needed.
Be aware that when --monitor is used test-runner will mount the
entire parent directory. test-runner itself will only write to the
file specified, but just know that the parent directory is available
as read-write inside the VM.
--log takes precedence over --monitor, meaning the iwmon log will
be written to <logdir>/<test>/iwmon instead of the file specified
with --monitor if both options are provided.
The virtual environment changed slightly adding two network adatpers
which are connected to the same backend so they can communicate with
each other (basically connected to a switch). The hostapd command
line was modified to allow no interfaces to be passed in which lets
us create zero radios but still specify a radius_config file.
This is just a more concise/pythonic way of doing function arguments.
Since Process/start_process have basically the same argument names
we can simplify and use **kwargs which will pass the named arguments
directly to Process(). This also allows us to add arguments to Process
without touching start_process if we need.
Slower systems may not be able to make some timeouts that tests
mandated. All timeouts were increased significantly to allow tests
to pass on slow systems.
Removed test-runner.c, and renamed py_runner to test-runner. Removed
tools/test-runner from .gitignore.
This was done as a separate commit to avoid a nasty diff between the
existing test runner, and the new python version
This patch completely re-writes test-runner in Python. This was done
because the existing C test-runner had some clunky work arounds and
maintaining or adding new features was starting to become a huge pain.
There were a few aspects of test-runner which continually had to
be dealt with when adding any new functionality:
* Argument parsing: Adding new arguments to test-runner wasn't so
bad, but if you wanted those arguments passed into the VM it
became a huge pain. Arguments needed to be parsed, then re-formatted
into the qemu command line, then re-parsed in a special order
(backwards) once in the VM. The burden for adding new arguments was
quite high so it was avoided (at least by me) at all costs.
* The separation between C and Python: The tests are all written in
python, but the executables, radios, and interfaces were all created
from C. The way we solved this was by encoding the require info as
environment variables, then parsing those from Python. It worked,
but it was, again, a huge pain.
* Process management: It started with all processes being launched
from C, but eventually tests required the ability to start IWD, or
kill hostapd ungracefully in order to test certain functionality.
Since the processes were tracked in C, Python had no way of
signalling that it killed a process and when it started one C had
no idea. This was mitigated (basically by killall), but it was
no where close to an elegant solution.
Re-writing test-runner in python solves all these problems and will
be much easier to maintain.
* Argument parsing: Now all arguments are forwarded automatically
to the VM. The ArgParse library takes care of parsing and each
argument is stored in a dictionary.
* Separation between C and Python: No more C, so no more separation.
* Process management: Python will now manage all processes. This
allows a test to kill, restart, or start a new process and not
have to remember the PID or to kill it after the test.
There are a few more important aspects of the python implementation
that should now be considered when writing new tests:
* The IWD constructor now has different default arugments. IWD
will always be started unless specified and the configuration
directory will always be /tmp
* Any non *.py file in the test directory will be copied to /tmp.
This avoids the need for 'tmpfs_extra_stuff' completely.
* ctrl_interface will automatically be appended to every hostapd
config. There is no need to include this in a config file from
now on.
* Test cleanup is extremely important. All tests get run in the
same interpreter now and the tests themselves are actually loaded
as python modules. This means e.g. if you somehow kept a reference
to IWD() any subsequent tests would not start since IWD is still
running.
* For debugging, the test context can be printed which shows running
processes, radios, and interfaces.
Three non-native python modules were used: PrettyTable, colored, and
pyroute2
$ pip3 install prettytable
$ pip3 install termcolor
$ pip3 install pyroute2
Besides being undefined behaviour, signed integer overflow can cause
unexpected comparison results. In the case of network_rank_compare(),
a connected network with rank INT_MAX would cause newly inserted
networks with negative rank to be inserted earlier in the ordered
network list. This is reflected in the GetOrderedMethods() DBus method
as can be seen in the following iwctl output:
[iwd]# station wlan0 get-networks
Network name Security Signal
----------------------------------------------------
BEOLAN 8021x **** }
BeoBlue psk *** } all unknown,
UI_Test_Network psk *** } hence assigned
deneb_2G psk *** } negative rank
BEOGUEST open **** }
> titan psk ****
Linksys05274_5GHz_dmt psk ****
Lyngby-4G-4 5GHz psk ****
If an application has a bug and hangs on SIGTERM this causes
test-runner to hang as well. This is obviously an issue with
the application in question, but test-runner should have a way
of continuing onto the next test rather than hanging.
Instead we can use WNOHANG and a sleep to allow applications
some amount of time to exit, and if they haven't use SIGKILL
instead as well as print an error. Similar to how
wait_for_socket works. The timeout is hard coded to 2 seconds
(100ms sleep + 20 iterations).
Previously iwmon was running per-test, which would jumble any subtests
together into the same log file making it hard to parse. Now create
a separate directory for each subtest and put the monitor log and
pcap there.
Using mac80211_hwsim can sometimes result in out of order messages
coming from the kernel. Since mac80211_hwsim immediately sends out
frames and the kernel keeps command responses in a separate queue,
bad scheduling can result in these messages being out of order.
In some cases we receive Auth/Assoc frames before the response to
our original CMD_CONNECT. This causes autotests to fail randomly,
some more often than others.
To fix this we can introduce a small delay into hwsim. Just a 1ms
delay makes the random failures disappear in the tests. This delay
is also makes hwsim more realistic since actual hardware will always
introduce some kind of delay when sending or receiving frames.