mirror of https://git.kernel.org/pub/scm/network/wireless/iwd.git synced 2024-12-23 06:02:37 +01:00
435 Commits

Author SHA1 Message Date
James Prestwood
25db380833 test-runner: fix kernel panic on exit for UML
UML requires RB_POWER_OFF rather than RB_AUTOBOOT (Qemu) in order
to avoid a kernel panic from killing init.
2022-04-04 09:12:50 -05:00
James Prestwood
31b5275c1f auto-t: hostapd.py: use IO watch for hostapd events
With how fast UML is, hostapd events were being sent out prior to
ever calling wait_for_event. Instead, set an IO watch on the control
socket and cache all events as they come. Then, when wait_for_event
is called, it can reference this list. If the event is found, any
older events are purged from the list.

The AP-ENABLED event needed a special case because hostapd gets
started before the IO watch can be registered. To fix this an
enabled property was added which queries the state directly. This
is checked first, and if not enabled wait_for_event continues normally.
2022-03-31 18:12:59 -05:00
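The event caching described in 31b5275c1f can be sketched roughly as below. This is not the hostapd.py code: the class name, the substring match, and the choice to drop the matched event along with the older ones are assumptions made for illustration, and the real wait_for_event would keep running the mainloop until a timeout instead of returning None.

```python
class EventCache:
    """Hypothetical cache of control-socket events (not the real hostapd.py class)."""

    def __init__(self):
        self.events = []

    def on_event(self, event):
        # In the real code this would run from an IO watch on the control socket.
        self.events.append(event)

    def wait_for_event(self, name):
        # Search cached events oldest-first. On a hit, purge the older events
        # (and, as an assumption, the matched event itself) and return the match.
        for i, event in enumerate(self.events):
            if name in event:
                del self.events[:i + 1]
                return event
        return None
```

Events that arrive before anyone is waiting are no longer lost, which is the race the commit describes.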
James Prestwood
9e2b0e75b1 test-runner: add time-travel to kernel config
This lets UML work with time-travel[=inf-cpu] options.
2022-03-31 18:12:46 -05:00
James Prestwood
b342dfd8d5 test-runner: don't kill dmesg after individual tests
This prevents any kernel logging from being available after the first
test is finished.
2022-03-31 18:12:43 -05:00
James Prestwood
5a14daf9b8 test-runner: use may_block=True for context iteration (and move location)
This allows the caller's condition to be checked immediately without
the mainloop running. In addition may_block=True allows the mainloop
to poll/sleep rather than immediately return back to the caller. This
handles async IO much better than may_block=False, at least for our
use-case.
2022-03-31 18:12:40 -05:00
James Prestwood
54552db7ba test-runner: fix logging for namespaces and pre-test processes
Namespace process logs were appearing under 'ip' (and also overwriting
actual 'ip' logs) since they were executed with 'ip netns exec <namespace>'.
Instead special case this and append '-<namespace>' to the log file name.

In addition processes executed prior to any tests were being put under
a folder (name of testhome directory). Now this case is detected and these
logs are put at the top level log directory.
2022-03-31 18:12:37 -05:00
James Prestwood
b5df2e27be test-runner: add initial UmlRunner implementation
This allows test-runner to run inside a UML binary which has some
advantages, specifically time-travel/infinite CPU speed. This should
fix any scheduler related failures we have on slower systems.

Currently this runner does not support the same features as the Qemu
runner, specifically:

 - No hardware passthrough
 - No logging/monitor (UML -> host mounting isn't implemented yet)
2022-03-31 18:12:34 -05:00
James Prestwood
2894f2e3eb test-runner: rename test-runner, add run-tests
In order to keep all test-runner dev scripts working and to work with
the new runner.py system some file renaming was required.

test-runner was renamed to run-tests
A new test-runner was added which only creates the Runner() class.
2022-03-31 18:12:31 -05:00
James Prestwood
8fa2b7de45 test-runner: remove environment specific code
This removes all the Qemu/environment related code as this has been
moved into runner.py.
2022-03-31 18:11:13 -05:00
James Prestwood
e753e867f3 test-runner: Move environment setup into own module
This (as well as subsequent commits) will separate test-runner into two
parts:

1. Environment setup
2. Running tests

Spurred by interest in adding UML/host support, test-runner was in need
of a refactor to separate out the environment setup and actually running
the tests.

The environment (currently only Qemu) requires quite a bit of special
handling (ctypes mounting/reboot, 9p mounts, tons of kernel options etc)
which nobody writing tests should need to see or care about. This has all
been moved into 'runner.py'.

Running the tests (inside test-runner) won't change much.

The new 'runner.py' module adds an abstraction class which allows different
Runners to be implemented and to set up their own environment as they see
fit. This is in preparation for UML and Host runners.
2022-03-31 18:11:09 -05:00
James Prestwood
f97b53608d tools: add UML specific options to the kernel config 2022-03-30 15:25:53 -05:00
James Prestwood
8d5e64e90d tools: add some required options to kernel config
It looks like some architectures' defconfigs were adding these in
automatically, but not others. Explicitly add these to make sure
the kernel is built correctly.
2022-03-30 15:25:51 -05:00
Andrew Zaborowski
45f86d7148 test-runner: Replace exit with sys.exit
exit comes from the site module which is "useful for the interactive
interpreter shell and should not be used in programs."
(https://docs.python.org/3/library/constants.html#constants-added-by-the-site-module)
Replace with sys.exit().  I got an undefined error for exit in
exit_vm().
2022-03-30 14:43:49 -05:00
Andrew Zaborowski
83299ef6aa test-runner: Don't require SUDO_GID to be set for logs
Base the root user check on os.getuid() instead of SUDO_GID so as not to
implicitly require sudo.  SUDO_GID being set doesn't guarantee that the
effective user is root either since you can sudo to non-root accounts.
2022-03-30 14:43:46 -05:00
Andrew Zaborowski
0201cde7ce test-runner: Fix checks in exit_vm
We check that config is not None but then access config.ctx outside of
that if block anyway.  Then we do the same for config.ctx and
config.ctx.args.  Nest the if blocks for the checks to be useful.
2022-03-30 14:43:44 -05:00
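The nesting fix in 0201cde7ce follows a common pattern: each attribute is only touched once its parent is known to be non-None. A minimal sketch, with a hypothetical helper name and a stand-in config object (the real exit_vm does more than this):

```python
from types import SimpleNamespace


def args_or_none(config):
    """Return config.ctx.args if every level exists, else None (hypothetical helper)."""
    if config is not None:
        if config.ctx is not None:
            if config.ctx.args is not None:
                return config.ctx.args
    return None


# Stand-in objects for illustration only.
full = SimpleNamespace(ctx=SimpleNamespace(args=SimpleNamespace(result="out.txt")))
no_ctx = SimpleNamespace(ctx=None)
```

With the checks written side by side instead of nested, `no_ctx` would raise AttributeError on `config.ctx.args`.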
James Prestwood
2e173d4523 test-runner: fix OOM issues (hopefully)
For quite a while test-runner has run into frequent OOM exceptions when
running many tests in a row. It's not completely known exactly why, but
it seems to point to the 9p driver which is used for sharing the root fs
between the test-runner VM and the host.

With debugging enabled (-d) one can see the available memory stay
relatively stable. If a test fails it may spike ~3-4kb but this quickly
recovers as python garbage collects.

At some point the kernel faults failing to allocate which (usually) is
shown by a python OOM exception. At this point there is plenty of
available memory.

Dumping the kernel trace, it's seen that the 9p driver is involved:

[  248.962949] test-runner: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[  248.962958] CPU: 2 PID: 477 Comm: test-runner Not tainted 5.16.0 #91
[  248.962960] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
[  248.962961] Call Trace:
[  248.962964]  <TASK>
[  248.962965]  dump_stack_lvl+0x34/0x44
[  248.962971]  warn_alloc.cold+0x78/0xdc
[  248.962975]  ? __alloc_pages_direct_compact+0x14c/0x1e0
[  248.962979]  __alloc_pages_slowpath.constprop.0+0xbfe/0xc60
[  248.962982]  __alloc_pages+0x2d5/0x2f0
[  248.962984]  kmalloc_order+0x23/0x80
[  248.962988]  kmalloc_order_trace+0x14/0x80
[  248.962990]  v9fs_alloc_rdir_buf.isra.0+0x1f/0x30
[  248.962994]  v9fs_dir_readdir+0x51/0x1d0
[  248.962996]  ? __handle_mm_fault+0x6e0/0xb40
[  248.962999]  ? inode_security+0x1d/0x50
[  248.963009]  ? selinux_file_permission+0xff/0x140
[  248.963011]  iterate_dir+0x16f/0x1c0
[  248.963014]  __x64_sys_getdents64+0x7b/0x120
[  248.963016]  ? compat_fillonedir+0x150/0x150
[  248.963019]  do_syscall_64+0x3b/0x90
[  248.963021]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  248.963024] RIP: 0033:0x7fedd7c6d8c7
[  248.963026] Code: 00 00 0f 05 eb b7 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 81 a5 0f 00 f7 d8 64 89 02 48
[  248.963028] RSP: 002b:00007ffd06cd87e8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
[  248.963031] RAX: ffffffffffffffda RBX: 000056090d87dd20 RCX: 00007fedd7c6d8c7
[  248.963032] RDX: 0000000000080000 RSI: 000056090d87dd50 RDI: 000000000000000f
[  248.963033] RBP: 000056090d87dd50 R08: 0000000000000030 R09: 00007fedc7d37af0
[  248.963035] R10: 00007fedc7d7d730 R11: 0000000000000293 R12: ffffffffffffff88
[  248.963038] R13: 000056090d87dd24 R14: 0000000000000000 R15: 000056090d0485e8

Here it's seen that an allocation of 512k is being requested (order:7), but it faults.
In this run there was ~35MB of available memory on the system.

Available Memory: 35268 kB
Last Test Delta: -2624 kB
Per-test Usage:
[  0] **        			37016
[  1] ********* 			41584
[  2] *         			36280
[  3] ********* 			41452
[  4] ********  			40940
[  5] ******    			39284
[  6] ****      			38348
[  7] ***       			37496
[  8] ****      			37892
[  9]           			35268

This can be reproduced by running all autotests (changing the ram down to
~128MB helps trigger it faster):

./tools/test-runner -k <kernel> -d

After many attempts to fix this it was finally found that simply removing the
explicit 9p2000.u version from the kernel command line 'fixed' the problem.
This even allows decreasing the RAM down to 256MB from 384MB and so far no
OOMs have been seen.
2022-03-28 12:38:15 -05:00
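The 512k figure quoted in 2e173d4523 follows from the allocation order: an order-n request is 2^n physically contiguous pages. A quick check, assuming 4 KiB pages (the usual x86-64 page size; the commit does not state it):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of a page allocation of the given order."""
    return (1 << order) * page_size


size = order_to_bytes(7)  # 128 pages
```

The same number appears as RDX (0x80000) in the register dump. Because the pages must be contiguous, a request like this can fail from fragmentation even when the total free memory is far larger, which matches the ~35MB still available in the report.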
James Prestwood
6ada150026 test-runner: add memory usage for debugging
In debug mode the test context is printed before each test. This
adds some additional information in there:

Available Memory: /proc/meminfo: MemAvailable
Last Test Delta: Change in usage between current and last test
Per-test Usage: Graph of usage relative to all past tests. This is
                useful for seeing a trend down/up of usage.
2022-03-28 12:38:15 -05:00
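A rough sketch of the debug output added in 6ada150026. The function names and the bar scaling are assumptions and are not meant to reproduce test-runner's exact graph; only the MemAvailable field of /proc/meminfo comes from the commit message.

```python
def parse_mem_available(meminfo_text):
    """Return MemAvailable in kB from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def usage_graph(samples, width=10):
    """One row per test: index, a bar scaled between min and max, and the raw value."""
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1  # avoid dividing by zero when all samples are equal
    rows = []
    for i, value in enumerate(samples):
        stars = int((value - lo) / span * width)
        rows.append("[%3d] %-*s %d" % (i, width, "*" * stars, value))
    return rows


def last_delta(samples):
    """Change in available memory between the last two tests (0 if fewer than two)."""
    return samples[-1] - samples[-2] if len(samples) > 1 else 0
```

On a live system the text would come from `open("/proc/meminfo").read()`, sampled once per test.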
James Prestwood
064b98e27f test-runner: add option to write final status to file
Running the tests inside a VM makes it difficult for the host to figure
out if the test actually failed or succeeded. For a human it's easy to
read the results table, but for an automated system parsing this would
be fragile. This adds a new option --result <file> which writes PASS/FAIL
to the provided file once all tests are completed. Any failure results in
'FAIL' being written to the file.
2022-03-16 17:50:01 -05:00
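The --result behavior from 064b98e27f reduces to something like the sketch below. The function name and the list-of-booleans input are hypothetical, and whether the real file ends with a newline is not stated in the commit.

```python
def write_result(path, results):
    """Write PASS if every entry in results is truthy, otherwise FAIL."""
    status = "PASS" if all(results) else "FAIL"
    with open(path, "w") as f:
        f.write(status)
    return status
```

A host-side script can then read a single word from the file instead of parsing the results table.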
James Prestwood
69a5ccbe5c test-runner: start iwmon first
This aids in debugging if iwd/hostapd/etc fail to start correctly.
2022-02-25 13:11:37 -06:00
James Prestwood
1de7ef0afd tools: change print to %zd for ssize_t
iwd-decrypt-profile was using %ld which isn't portable.
2022-02-24 12:14:42 -06:00
James Prestwood
cd3857f810 hwsim: check if radio name was already set
This was caught by static analysis and shouldn't ever happen.
2022-02-18 14:44:12 -06:00
James Prestwood
4ebc79c466 hwsim: allow concurrent radio creations
Currently CreateRadio only allows a single outstanding DBus message
until the radio is fully created. 99% of the time this is just fine
but in order to test dual phy cards there needs to be support for
phy's appearing at the same time.

This required storing the pending DBus message inside the radio object
rather than a single static variable.

The code was refactored to handle the internal radio info objects better
for the various cases:
 - Creation from CreateRadio()
 - Radio already existed before hwsim started, or created externally
 - Existing radio changed name, address, etc.

First, Name is now a required option to CreateRadio(). This allows
the radio info to be pushed to the queue immediately (also allowing the
pending DBus message to be tracked). Then, when the NEW_RADIO event
fires the pending radio can be looked up (by name) and filled with the
remaining info.

If the radio was not found by name but a matching ID was found this is
the 'changed' case and the radio is re-initialized with the changed
values.

If neither name nor ID matches, the radio was created externally, or
prior to hwsim starting. A radio info object is created at this time
and initialized.

The ID was changed to a signed integer in order to initialize it to an
invalid number -1. Doing this was required since a pending uninitialized
radio ID (0) could match an existing radio ID. This required some
bounds checks in case the kernel's counter reaches an extremely high value.
This isn't likely to ever happen in practice.
2022-02-16 16:20:43 -06:00
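hwsim itself is written in C, but the lookup order described in 4ebc79c466 (name first, then ID, otherwise a new object) can be illustrated in a few lines of Python. All names here are hypothetical and the pending DBus message handling is left out.

```python
class RadioInfo:
    def __init__(self, name, radio_id=-1):
        self.name = name
        self.id = radio_id  # -1 marks a pending radio; 0 would collide with a real ID


def on_new_radio(radios, name, radio_id):
    """Classify a NEW_RADIO event and update the radio list accordingly."""
    by_name = next((r for r in radios if r.name == name), None)
    if by_name is not None:
        # Pending radio queued by CreateRadio(): fill in the remaining info.
        by_name.id = radio_id
        return "pending", by_name

    by_id = next((r for r in radios if r.id == radio_id), None)
    if by_id is not None:
        # Known ID under a new name: the 'changed' case, re-initialize.
        by_id.name = name
        return "changed", by_id

    # Neither matched: created externally or before hwsim started.
    radio = RadioInfo(name, radio_id)
    radios.append(radio)
    return "external", radio
```

Because several pending entries can sit in the list at once, each keyed by its required Name, concurrent CreateRadio() calls no longer need a single static pending message.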
James Prestwood
15b5385e71 tools: add decrypt-profile tool
This tool will decrypt an IWD network profile which was previously
encrypted using a systemd provided key. Either a text passphrase
can be provided (--pass) or a file containing the secret (--file).

This can be useful for debugging, or recovering an encrypted
profile after enabling SystemdEncrypt.
2022-02-16 16:10:55 -06:00
James Prestwood
b1c4a505b2 hwsim: don't print when send frame fails
This happens quite often and spams the console with this error.
2022-02-14 16:03:51 -06:00
James Prestwood
86cfa25910 test-runner: allow IWD to start with no radios
This is useful for testing hotplug scenarios
2022-02-14 16:02:14 -06:00
James Prestwood
e500511490 test-runner: set --show-leak-kinds=all
This enables leak checks starting in main() which were previously
ignored.
2022-01-19 17:17:26 -06:00
James Prestwood
eb84b2a6e8 test-runner: don't copy __pycache__ in tests
This is created by the python interpreter for speed optimization
but poses problems if copied to /tmp since previous tests may
have already copied it leading to an exception.
2022-01-04 11:40:52 -06:00
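One way to skip `__pycache__` when copying a test directory, as eb84b2a6e8 describes, is the `ignore` callback of `shutil.copytree`. This is a sketch of the idea, not necessarily how test-runner does it; the function name is hypothetical.

```python
import shutil


def copy_test_dir(src, dst):
    """Copy a test directory tree, leaving out any __pycache__ directories."""
    return shutil.copytree(src, dst, ignore=shutil.ignore_patterns("__pycache__"))
```

`ignore_patterns` is applied at every directory level, so nested `__pycache__` directories are skipped as well.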
James Prestwood
d6d481210e test-runner: only include committed tests for full test runs
If specific tests are not specified with -A, only run tests tracked by
git for full test runs.
2022-01-04 11:40:52 -06:00
Denis Kenzior
b7f873bbbc hwsim: Optimize frame forwarding
Right now hwsim blindly tries to forward broadcast/multicast frames to
all interfaces it knows about and relies on the kernel to reject the
forwarding attempt if the frequency does not match.  This results in
multiple copies of the same message being added to the genl transmit
queue.

On slower systems this can cause a run-away memory consumption effect
where the queued messages are not processed in time prior to a new
message being received for forwarding.  The likelihood of this effect
manifesting itself is directly related to the number of hostapd
instances that are created and are beaconing simultaneously.

Try to optimize frame forwarding by not sending beacon frames
to those interfaces that are in AP mode (i.e. pure hostapd instances)
since such interfaces are going to be operating on a different frequency
and would not be interested in processing beacon frames anyway.

This optimization cuts down peak memory use during certain tests by 30x
or more (~33mb to ~1mb) when profiled with 'valgrind --tool=massif'
2021-12-27 23:25:24 -06:00
Denis Kenzior
ea3fd01ebb hwsim: Use nl80211_parse_attrs
Simplify the code by using nl80211_parse_attrs utility instead of open
coding the attribute parsing.
2021-12-27 23:25:24 -06:00
Denis Kenzior
5333207207 hwsim: Pretty-print command name
Instead of just printing the command id, print the human readable name.
2021-12-27 23:25:24 -06:00
Denis Kenzior
1dcab170b6 hwsim: Keep track of interface types 2021-12-27 23:25:24 -06:00
Denis Kenzior
d676f159d3 hwsim: Enable debug output 2021-12-27 23:25:24 -06:00
James Prestwood
9fc53cfa7b test-runner: catch exception on test file removal
Without catching this can result in a fatal error, ending the
test run.
2021-12-22 19:10:43 -06:00
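The fix in 9fc53cfa7b amounts to treating a failed removal as non-fatal. A minimal sketch with a hypothetical helper name; which exception the real code catches is not stated in the commit, so OSError is an assumption.

```python
import os


def remove_test_file(path):
    """Try to remove a file; report failure instead of aborting the test run."""
    try:
        os.remove(path)
        return True
    except OSError as e:
        print("Could not remove %s: %s" % (path, e))
        return False
```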
Torsten Schmitz
22c77cc037 auto-t: replace ifconfig with ip commands
ifconfig has long been deprecated in favor of ip from iproute2.
It is usually no longer installed by default.
2021-11-11 14:29:54 -06:00
Torsten Schmitz
d9cd657135 auto-t: fix testP2P
testP2P was failing with
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/wpa_supplicant.conf'
2021-11-09 14:26:17 -06:00
James Prestwood
2c7998e8db test-runner: increase test timeout maximum
The OWE transition tests take quite a while and sometimes hit the
maximum timeout.
2021-11-02 16:15:08 -05:00
James Prestwood
762f9f2533 test-runner: fix qemu warning for newer versions
warning: short-form boolean option 'readonly' deprecated
Please use readonly=on instead
2021-10-28 12:25:52 -05:00
James Prestwood
89407089cd test-runner: handle pyroute2 IW() between versions
It appears different versions of pyroute2 may or may not have
iwutil, and instead use pyroute2.IW() directly. Try the iwutil
way first, then pyroute2.IW()
2021-10-28 12:25:52 -05:00
James Prestwood
4d66e11b0c test-runner: add support for PCI passthrough
Adding back in PCI passthrough support
2021-10-28 12:25:52 -05:00
James Prestwood
39ba0a9ebd test-runner: allow wpa_supplicant to be used in --hw mode
This lets test-runner's physical adapter pass-through be used with
wpa_supplicant.
2021-10-26 17:16:34 -05:00
James Prestwood
cf0f6ebddf test-runner: set DBUS_SYSTEM_BUS_ADDRESS for --shell
After namespaces were added, the dbus address was customized to
be /tmp/dbus{0..N}. This prevented any dbus applications started
in the shell from working properly.

Set DBUS_SYSTEM_BUS_ADDRESS in the environment prior to entering
the shell.
2021-09-23 17:46:26 -05:00
James Prestwood
3f4cafe135 hwsim: add MatchBytes/MatchBytesOffset rule properties
If set, a rule will start matching 'MatchBytes' some number of bytes
into the frame (MatchBytesOffset). This is useful since header
information, addresses, and sequence numbers may be unpredictable
between test runs.

To avoid unintended matches the Prefix property is left unchanged
and will match starting at the beginning of the frame.
2021-09-09 16:57:38 -05:00
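The matching semantics described in 3f4cafe135 can be illustrated as follows. hwsim is C and its actual code is not shown here; the function and parameter names below are stand-ins for the Prefix, MatchBytes and MatchBytesOffset properties.

```python
def rule_matches(frame, prefix=b"", match_bytes=b"", match_offset=0):
    """Return True if the frame satisfies both the prefix and the offset match."""
    # Prefix is unchanged: it always matches from the start of the frame.
    if prefix and not frame.startswith(prefix):
        return False

    # MatchBytes is compared starting match_offset bytes into the frame.
    if match_bytes:
        end = match_offset + len(match_bytes)
        if len(frame) < end or frame[match_offset:end] != match_bytes:
            return False

    return True
```

A test can then key on a stable part of the payload while ignoring headers, addresses, and sequence numbers that vary between runs.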
James Prestwood
bedd20b08e hwsim: add DropAck rule property
The hwsim rules did not treat frames and ACKs any differently which
can mislead the developer especially when setting a rule prefix.
If a prefix was used the frame ACK was actually being matched against
the original frame payload which seems wrong because the ACK is not
the original frame.

Though strange, matching the frame prefix on an ACK has its place if
the developer wants to block just the ACK rather than the frame. To
make this case clearer, 'DropAck' was added as a rule property.
Only if this is true will an ACK be checked and potentially
dropped.

To maintain the current hwsim behavior DropAck will default to true.
2021-09-07 19:02:54 -05:00
James Prestwood
f2197fa06b hwsim: add MatchTimes property
This integer property can be set to only match a rule a number of
times rather than all packets. This is useful for testing behavior
of a single dropped frame or ack. Once the rule has been matched
'MatchTimes' times, it will no longer be applied (unless set again
to some integer greater than zero).
2021-09-07 16:32:42 -05:00
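The MatchTimes counter from f2197fa06b behaves like a countdown. A small sketch, assuming a negative value stands for "match every time" (how hwsim represents the unlimited default is not stated in the commit):

```python
class Rule:
    def __init__(self, match_times=-1):
        # Assumption: negative means unlimited; positive is the remaining match count.
        self.match_times = match_times

    def try_match(self):
        """Return True if the rule still applies, consuming one match if limited."""
        if self.match_times == 0:
            return False
        if self.match_times > 0:
            self.match_times -= 1
        return True
```

Setting `match_times` back to a positive integer re-arms the rule, which is handy for dropping exactly one frame or ACK in a test.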
James Prestwood
11271cd967 test-runner: move process tracking out of Namespace
Since Process.processes is a weak reference dictionary any process
put in this dict will disappear if all references are lost. This
is much better than keeping a list in the Namespace which will hold
the references forever until test-runner manually kills them all at
the end of the test. This does still need to be done for daemon
processes but everything else can just go away when it is no longer
needed.
2021-09-07 12:45:26 -05:00
James Prestwood
92a3d8f498 test-runner: write out separators in log files
The test-runner logging is very basic and just dumps everything into files
per-test. This means any subtests are just appended to existing log files
which can be difficult to parse after the fact. This is especially hard
when IWD/Hostapd runs once for the entirety of the test (as opposed to
killing between tests).

This patch writes out a separator between each subtest in the form:
===== <file>:<function> =====

To do this all processes are now kept as weak references inside the
Process class itself. Process.write_separators() can be called which
will iterate through all running processes and write the provided
separator.

This also paves the way to remove the ctx.processes array which is more
trouble than it's worth due to reference issues.

Note: For tests which start IWD this will have no effect as the separator
is written prior to the test running. For these tests though, it is
much easier to read the log files because you can clearly see when
IWD starts and exits.
2021-09-07 12:45:26 -05:00
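The separator format from 92a3d8f498 is simple to reproduce. In this sketch the log targets are passed in as file-like objects, whereas the commit describes Process.write_separators() iterating over running processes; the trailing newline is also an assumption.

```python
import io


def write_separators(logs, test_file, function):
    """Write '===== <file>:<function> =====' to every open log."""
    sep = "===== %s:%s =====\n" % (test_file, function)
    for log in logs:
        log.write(sep)
    return sep


# Two in-memory logs standing in for per-process log files.
logs = [io.StringIO("earlier output\n"), io.StringIO()]
for log in logs:
    log.seek(0, io.SEEK_END)  # append after any existing content
write_separators(logs, "connection_test.py", "test_connection_success")
```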
James Prestwood
ac395525c8 test-runner: use Process to start hostapd
Since the hostapd process object is tracked by the Hostapd class there
is no sense in keeping it in the process list as well.
2021-09-07 12:45:26 -05:00
James Prestwood
165557070e test-runner: fix process cleanup
Processes which were not explicitly killed ended up staying around
forever because they internally held references to other objects
such as GLib IO watches or write FDs.

This shuffles some code so these objects get cleaned up both when
explicitly killed and after being waited for.
2021-09-07 12:45:26 -05:00
James Prestwood
920dc5b087 test-runner: don't use start_process for transient processes
Any process which is short lived and waited for should just use
Process directly so as not to add to the process queue.
2021-09-07 12:45:26 -05:00