3
0
mirror of https://git.kernel.org/pub/scm/network/wireless/iwd.git synced 2024-12-29 13:12:50 +01:00
Commit Graph

119 Commits

Author SHA1 Message Date
James Prestwood
2e173d4523 test-runner: fix OOM issues (hopefully)
For quite a while test-runner has run into frequent OOM exceptions when
running many tests in a row. Its not completely known exactly why, but
seems to point to the 9p driver which is used for sharing the root fs
between the test-runner VM and the host.

With debugging enabled (-d) one can see the available memory available
relatively stable. If a test fails it may spike ~3-4kb but this quickly
recovers as python garbage collects.

At some point the kernel faults failing to allocate which (usually) is
shown by a python OOM exception. At this point there is plenty of
available memory.

Dumping the kernel trace its seen that the 9p driver is involved:

[  248.962949] test-runner: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[  248.962958] CPU: 2 PID: 477 Comm: test-runner Not tainted 5.16.0 #91
[  248.962960] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
[  248.962961] Call Trace:
[  248.962964]  <TASK>
[  248.962965]  dump_stack_lvl+0x34/0x44
[  248.962971]  warn_alloc.cold+0x78/0xdc
[  248.962975]  ? __alloc_pages_direct_compact+0x14c/0x1e0
[  248.962979]  __alloc_pages_slowpath.constprop.0+0xbfe/0xc60
[  248.962982]  __alloc_pages+0x2d5/0x2f0
[  248.962984]  kmalloc_order+0x23/0x80
[  248.962988]  kmalloc_order_trace+0x14/0x80
[  248.962990]  v9fs_alloc_rdir_buf.isra.0+0x1f/0x30
[  248.962994]  v9fs_dir_readdir+0x51/0x1d0
[  248.962996]  ? __handle_mm_fault+0x6e0/0xb40
[  248.962999]  ? inode_security+0x1d/0x50
[  248.963009]  ? selinux_file_permission+0xff/0x140
[  248.963011]  iterate_dir+0x16f/0x1c0
[  248.963014]  __x64_sys_getdents64+0x7b/0x120
[  248.963016]  ? compat_fillonedir+0x150/0x150
[  248.963019]  do_syscall_64+0x3b/0x90
[  248.963021]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  248.963024] RIP: 0033:0x7fedd7c6d8c7
[  248.963026] Code: 00 00 0f 05 eb b7 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 81 a5 0f 00 f7 d8 64 89 02 48
[  248.963028] RSP: 002b:00007ffd06cd87e8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
[  248.963031] RAX: ffffffffffffffda RBX: 000056090d87dd20 RCX: 00007fedd7c6d8c7
[  248.963032] RDX: 0000000000080000 RSI: 000056090d87dd50 RDI: 000000000000000f
[  248.963033] RBP: 000056090d87dd50 R08: 0000000000000030 R09: 00007fedc7d37af0
[  248.963035] R10: 00007fedc7d7d730 R11: 0000000000000293 R12: ffffffffffffff88
[  248.963038] R13: 000056090d87dd24 R14: 0000000000000000 R15: 000056090d0485e8

Here its seen an allocation of 512k is being requested (order:7), but faults.
In this run it there was ~35MB of available memory on the system.

Available Memory: 35268 kB
Last Test Delta: -2624 kB
Per-test Usage:
[  0] **        			37016
[  1] ********* 			41584
[  2] *         			36280
[  3] ********* 			41452
[  4] ********  			40940
[  5] ******    			39284
[  6] ****      			38348
[  7] ***       			37496
[  8] ****      			37892
[  9]           			35268

This can be reproduced by running all autotests (changing the ram down to
~128MB helps trigger it faster):

./tools/test-runner -k <kernel> -d

After many attempts to fix this it was finally found that simply removing the
explicit 9p2000.u version from the kernel command line 'fixed' the problem.
This even allows decreasing the RAM down to 256MB from 384MB and so far no
OOM's have been seen.
2022-03-28 12:38:15 -05:00
James Prestwood
6ada150026 test-runner: add memory usage for debugging
In debug mode the test context is printed before each test. This
adds some additional information in there:

Available Memory: /proc/meminfo: MemAvailable
Last Test Delta: Change in usage between current and last test
Per-test Usage: Graph of usage relative to all past tests. This is
                useful for seeing a trend down/up of usage.
2022-03-28 12:38:15 -05:00
James Prestwood
064b98e27f test-runner: add option to write final status to file
Running the tests inside a VM makes it difficult for the host to figure
out if the test actually failed or succeeded. For a human its easy to
read the results table, but for an automated system parsing this would
be fragile. This adds a new option --result <file> which writes PASS/FAIL
to the provided file once all tests are completed. Any failures results in
'FAIL' being written to the file.
2022-03-16 17:50:01 -05:00
James Prestwood
69a5ccbe5c test-runner: start iwmon first
This aids in debugging if iwd/hostapd/etc fail to start correctly.
2022-02-25 13:11:37 -06:00
James Prestwood
86cfa25910 test-runner: allow IWD to start with no radios
This is useful for testing hotplug scenarios
2022-02-14 16:02:14 -06:00
James Prestwood
e500511490 test-runner: set --show-leak-kinds=all
This enables leak checks starting in main() which were previously
ignored.
2022-01-19 17:17:26 -06:00
James Prestwood
eb84b2a6e8 test-runner: don't copy __pycache__ in tests
This is created by the python interpreter for speed optimization
but poses problems if copied to /tmp since previous tests may
have already copied it leading to an exception.
2022-01-04 11:40:52 -06:00
James Prestwood
d6d481210e test-runner: only include comitted tests for full test runs
If specific tests are not specified with -A, only run tests tracked by
git for full test runs.
2022-01-04 11:40:52 -06:00
James Prestwood
9fc53cfa7b test-runner: catch exception on test file removal
Without catching this can result in a fatal error, ending the
test run.
2021-12-22 19:10:43 -06:00
Torsten Schmitz
22c77cc037 auto-t: replace ifconfig with ip commands
ifconfig has long been deprecated in favor of ip from iproute2.
It is usually no longer installed by default.
2021-11-11 14:29:54 -06:00
Torsten Schmitz
d9cd657135 auto-t: fix testP2P
testP2P was failing with
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/wpa_supplicant.conf'
2021-11-09 14:26:17 -06:00
James Prestwood
2c7998e8db test-runner: increase test timeout maximum
The OWE transition tests takes quite a while and sometimes hits the
maximum timeout.
2021-11-02 16:15:08 -05:00
James Prestwood
762f9f2533 test-runner: fix qemu warning for newer versions
warning: short-form boolean option 'readonly' deprecated
Please use readonly=on instead
2021-10-28 12:25:52 -05:00
James Prestwood
89407089cd test-runner: handle pyroute2 IW() between versions
It appears different versions of pyroute2 may or may not have
iwutil, and instead use pyroute2.IW() directly. Try the iwutil
way first, then pyroute2.IW()
2021-10-28 12:25:52 -05:00
James Prestwood
4d66e11b0c test-runner: add support for PCI passthrough
Adding back in PCI passthrough support
2021-10-28 12:25:52 -05:00
James Prestwood
39ba0a9ebd test-runner: allow wpa_supplicant to be used in --hw mode
This lets test-runner's physical adapter pass-through to be used with
wpa_supplicant.
2021-10-26 17:16:34 -05:00
James Prestwood
cf0f6ebddf test-runner: set DBUS_SYSTEM_BUS_ADDRESS for --shell
After namespaces were added, the dbus address was customized to
be /tmp/dbus{0..N}. This prevented any dbus applications started
in the shell from working properly.

Set DBUS_SYSTEM_BUS_ADDRESS to the environment prior to entering
the shell.
2021-09-23 17:46:26 -05:00
James Prestwood
11271cd967 test-runner: move process tracking out of Namespace
Since Process.processes is a weak reference dictionary any process
put in this dict will disappear if all references are lost. This
is much better than keeping a list in the Namespace which will hold
the references forever until test-runner manually kills them all at
the end of the test. This does still need to be done for daemon
processes but everything else can just go away when it is no longer
needed.
2021-09-07 12:45:26 -05:00
James Prestwood
92a3d8f498 test-runner: write out separators in log files
The test-runner logging is very basic and just dumps everything into files
per-test. This means any subtests are just appended to existing log files
which can be difficult to parse after the fact. This is especially hard
when IWD/Hostapd runs once for the entirety of the test (as opposed to
killing between tests).

This patch writes out a separator between each subtests in the form:
===== <file>:<function> =====

To do this all processes are now kept as weak references inside the
Process class itself. Process.write_separators() can be called which
will iterate through all running processes and write the provided
separator.

This also paves the way to remove the ctx.processes array which is more
trouble than its worth due to reference issues.

Note: For tests which start IWD this will have no effect as the separator
is written prior to the test running. For these tests though, it is
much easier to read the log files because you can clearly see when
IWD starts and exits.
2021-09-07 12:45:26 -05:00
James Prestwood
ac395525c8 test-runner: use Process to start hostapd
Since the hostapd process object is tracked by the Hostapd class there
is no sense of keeping it in the process list as well.
2021-09-07 12:45:26 -05:00
James Prestwood
165557070e test-runner: fix process cleanup
Processes which were not explicitly killed ended up staying around
forever because they internally held references to other objects
such as GLib IO watches or write FDs.

This shuffles some code so these objects get cleaned up both when
explititly killed and after being waited for.
2021-09-07 12:45:26 -05:00
James Prestwood
920dc5b087 test-runner: don't use start_process for transient processes
Any process which is short lived and  waited for should just use
Process directly as to not add to the process queue.
2021-09-07 12:45:26 -05:00
James Prestwood
7123f47f84 test-runner: remove special case for "root" namespace
This was a placeholder at one point but modules grew to depend on it
being a string. Fix these dependencies and set the root namespace
name to None so there is no more special case needed to handle both
a named namespace and the original 'root' namespace.
2021-09-07 12:45:26 -05:00
James Prestwood
1b44527511 test-runner: fix --monitor option
Two issues:
 - log-gid/uid was not being set
 - the log file was being appended if it already existed
2021-09-03 13:19:46 -05:00
Andrew Zaborowski
c79f583e0c test-runner: Fix verbose check for processes in namespaces
Check whether verbose output is enabled for process name arg[0] before
prepending the "ip netns exec" part to arg since arg[0] is going to be
"ip" after that.
2021-08-31 10:11:01 -05:00
James Prestwood
5de6af9f4a test-runner: only import Hwsim/HostapdCLI once
These modules only needed to be imported a single time for the entire
run of tests. This is significantly cheaper in terms of memory and
should prevent random OOM exceptions.
2021-08-27 20:07:26 -05:00
James Prestwood
cb7e9b5885 auto-t: hostapd.py: properly implement singleton (per-config)
This shouldn't change any functionality but it is much more convenient
in the cleanup path i.e. nothing special needs to be done.
2021-08-27 20:07:26 -05:00
James Prestwood
7d94aee5b6 test-runner: inherit Popen by Process class
The Procss class was doing quite a bit of what Popen already does like
storing the return code and process arguments. In addition the Process
class ended up storing a Popen object which was frequently accessed.

For both simplicity and memory savings have Process inherit Popen and
add the additional functionality test-runner needs like stdout
processing to output files and the console.

To do this Popen.wait() needed to be overridden to to prevent blocking
as well as wait for the HUP signal so we are sure all the process
output was written. kill() was also overritten to perform cleanup.

The most intrusive change was removing wait as a kwarg, and instead
requiring the caller to call wait(). This doesn't change much in
terms of complexity to the caller, but simplifies the __init__
routine of Process.

Some convenient improvements:
 - Separate multiple process instance output (Terminate: <args> will
   be written to outfiles each time a process dies.)
 - Append to outfile if the same process is started again
 - Wait for HUP before returning from wait(). This allows any remaining
   output to be written without the need to manually call process_io.
 - Store ctx as a class variable so callers don't need to pass it in
   (e.g. when using Process directly rather than start_process)
2021-08-26 08:52:48 -05:00
James Prestwood
6f9dd97f50 test-runner: fix subtest option
This was not working correctly and would run all subtests when only a single
was requested
2021-08-18 16:57:38 -05:00
James Prestwood
81cf11b0f5 test-runner: fix exception in Hostapd __del__
If Hostapd did not start there was no CLI instance created and
that member should be be accessed.
2021-08-17 11:44:36 -05:00
James Prestwood
fd43a3938f test-runner: extend -S option
The -S/--sub-tests option allows the user to specify a test file
from inside an autotest. Inside this file there may also be many
test functions. This option is being extended to allow running
a single test function inside a test file. For example:

* Runs all test functions inside connection_test.py *
./test-runner -A some_test -S connection_test

* Runs only connection_test.py test_connect_success() *
./test-runner -A some_test -S connection_test.test_connect_success
2021-08-17 11:44:36 -05:00
James Prestwood
7b98a6ed9c test-runner: only remove GLib timeout if it exists
There was a race condition here where the GLib timeout could have
fired but the test function returned successfully prior to the
end of the while loop. This would end up causing source_remove to
print a warning that the source did not exist.

Instead check if the timeout fired prior to removing it.
2021-08-13 20:39:13 -05:00
James Prestwood
973ee269d3 auto-t: return existing instance from HostapdCLI
This addresses the TODO where HostapdCLI was creating separate
objects each time HostapdCLI was called. This was worked around
by manually setting the important members but instead the class
can be re-worked to act as somewhat of a singleton, per-config
at least.

If there is no HostapdCLI instance for a given config one is
created and initialized. Subsequent HostapdCLI calls (for the
same config) will be returned the same object rather than a
new one.
2021-08-13 20:39:13 -05:00
James Prestwood
dbd8842d51 test-runner: fix test skipping
Tests that called skipTest would result in an exception which would
hault execution as it was uncaught. In addition this wouldn't result
in an skipped test.

Now the actual test run is surrounded in a try/except block, skipped
exceptions are handled specifically, and a stack trace is printed if
some other exception occurs.
2021-08-13 16:44:14 -05:00
Denis Kenzior
6fee25af9b test-runner: Make run 2021-08-13 15:19:28 -05:00
James Prestwood
790bb44785 test-runner: don't always print failure to remove ctrl interface
This can happen, particularly with SAQuery when hostapd is forcefully
killed. Better to only print when running in debug mode.
2021-08-13 15:18:40 -05:00
James Prestwood
c03afda96f test-runner: start dmesg early with --follow
dmesg was being called at the very end of testing and dumped into
a log file. If many tests were run this could take quite a long
time and was timing out the default process wait. Instead --follow
can be used (basically like 'tail') which prints messages as they
come and avoids the time consuming full dump at the end.
2021-08-13 15:18:40 -05:00
James Prestwood
4e3cedc901 test-runner: wait for individual hostapd control sockets 2021-08-13 11:16:19 -05:00
James Prestwood
23cf6cb2e4 test-runner: implement non_block_wait
There was a common bit of code all over test-runner and utilities
which would wait for 'something' in a loop. At best these loops
would do the right thing and use the GLib.iteration call as to not
block the main loop, and at worst would not use it and just busy
wait.

Namespace.non_block_wait unifies all these into a single API to
a) do the wait correctly and b) prevent duplicate code.
2021-08-13 11:16:19 -05:00
James Prestwood
57933425ca test-runner: start HostapdCLI from test-runner
This gives the benefit of test-runner itself having access to
the CLI, e.g. for getting status information.
2021-08-13 10:46:13 -05:00
James Prestwood
ba6c8174d2 test-runner: run individual test functions manually
While losing the convenience of unittest this patch breaks out
each individual test function in order to run it manually and
get results. This vastly improves the user experience by seeing
which test file and function is being executed rather than simply
seeing "PASSED" for the entire test set.

In addition exceptions/failures are printed out as they happen
rather than at the end.
2021-08-13 10:43:47 -05:00
James Prestwood
89bddf551a test-runner: pass **kwargs in dbg() 2021-08-13 10:43:47 -05:00
James Prestwood
c23c37c07c test-runner: print uncaught test exceptions always 2021-08-13 10:43:47 -05:00
James Prestwood
89c1ec3d79 auto-t: make hwsim registration off by default
With the addition of connect_bssid/roam very few tests actually
require hwsim. Since hwsim can lead to problems with scan results
its best to have it off by default and have each test that needs
it explicitly turn it on.

Tests which previously turned it off have had that option removed.

Tests that do require hwsim still are vulnerable to scan result
problems, so for these tests beacon_int was added to the hostapd
config which seems to help with reliability somewhat.
2021-08-12 16:59:01 -05:00
James Prestwood
30158fe89c test-runner: don't run duplicate tests
If the user provides duplicate tests in the list only append one
of them to the list. This can happen accidentally when using
glob matches.
2021-08-12 15:14:07 -05:00
James Prestwood
9a15a46df5 test-runner: don't fatally exit on bad test configuration
If a test has no hw.conf file test-runner was fully exiting and not
running any additional tests. This shouldn't happen in practice
since all upstreamed tests should run, but if any locally created
tests existed like this, it would cause the entire test run to exit
early.

Instead raise an exception which bails out of only that test, and
allows the rest to continue.
2021-08-12 15:14:07 -05:00
James Prestwood
8203b1d542 test-runner: start IWD in developer mode
This is being done to utilize the Roam() developer method
2021-08-12 15:14:07 -05:00
James Prestwood
e2e625fa18 test-runner: refactor process output code
The Process class requires the ability to write out any processes
output to stdout, logging, or an explicit file, as well as store
it inside python for processing by test utilities. To accomplish
this each process was given a temporary file to write to, and that
file had an IO watch set on it. Any data that was written was then
read, and re-written out to where it needed to go. This ended up
being very buggy and quite complex due to needing to mess with
read/write pointers inside the file.

Popen already creates pipes to stdout if told, and they are accessable
via the p.stdout. Its then as simple as setting an IO watch on that
pipe and keeping the same code for reading out new data and writing
it to any files we want. This greatly reduces the complexity.
2021-07-29 17:51:25 -05:00
James Prestwood
52189d99f6 test-runner: don't reset dbus object on reset()
Doing this is redundant since the dbus object is initialized
anyways when a new test starts.
2021-07-29 17:51:25 -05:00
James Prestwood
b5c3fc1222 test-runner: increase RAM to 384MB
Occationally python will fatally terminate trying to load a test
using importlib with an out of memory exception. Increasing RAM
allows reliable exection of all tests.
2021-07-29 17:51:25 -05:00