Handle situations where the BSS we're trying to connect to is no longer
in the kernel scan result cache. Normally, the kernel will re-scan the
target frequency if this happens on the CMD_CONNECT path, and retry the
connection.
Unfortunately, CMD_AUTHENTICATE path used for WPA3, OWE and FILS does
not have this scanning behavior. CMD_AUTHENTICATE simply fails with
a -ENOENT error. Work around this by trying a limited scan of the
target frequency and re-trying CMD_AUTHENTICATE once.
In some cases the AP can send a deauthenticate frame right after
accepting our authentication. In this case the kernel never properly
sends a CMD_CONNECT event with a failure, even though CMD_COONNECT was
used to initiate the connection. Try to work around that by detecting
that a Deauthenticate event arrives prior to any Associte or Connect
events and handle this case as a connect failure.
netdev_shutdown calls queue_destroy on the netdev_list, which in turn
calls netdev_free. netdev_free invokes the watches to notify them about
the netdev being removed. Those clients, or anything downstream can
still invoke netdev_find. Unfortunately queue_destroy is not re-entrant
safe, so netdev_find might return stale data. Fix that by using
l_queue_peek_head / l_queue_pop_head instead.
src/station.c:station_enter_state() Old State: connecting, new state:
connected
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan1[6]
src/device.c:device_free()
Removing scan context for wdev 100000001
src/scan.c:scan_context_free() sc: 0x4ae9ca0
src/netdev.c:netdev_free() Freeing netdev wlan0[48]
src/device.c:device_free()
src/station.c:station_free()
src/netconfig.c:netconfig_destroy()
==103174== Invalid read of size 8
==103174== at 0x467AA9: l_queue_find (queue.c:346)
==103174== by 0x43ACFF: netconfig_reset (netconfig.c:1027)
==103174== by 0x43AFFC: netconfig_destroy (netconfig.c:1123)
==103174== by 0x414379: station_free (station.c:3369)
==103174== by 0x414379: station_destroy_interface (station.c:3466)
==103174== by 0x47C80C: interface_instance_free (dbus-service.c:510)
==103174== by 0x47C80C: _dbus_object_tree_remove_interface
(dbus-service.c:1694)
==103174== by 0x47C99C: _dbus_object_tree_object_destroy
(dbus-service.c:795)
==103174== by 0x409A87: netdev_free (netdev.c:770)
==103174== by 0x4677AE: l_queue_clear (queue.c:107)
==103174== by 0x4677F8: l_queue_destroy (queue.c:82)
==103174== by 0x40CDC1: netdev_shutdown (netdev.c:5089)
==103174== by 0x404736: iwd_shutdown (main.c:78)
==103174== by 0x404736: iwd_shutdown (main.c:65)
==103174== by 0x46BD61: handle_callback (signal.c:78)
==103174== by 0x46BD61: signalfd_read_cb (signal.c:104)
In the case of module_init failing due to a module that comes after
netdev, the netdev module doesn't clean up netdev_list properly.
==6254== 24 bytes in 1 blocks are still reachable in loss record 1 of 1
==6254== at 0x483777F: malloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6254== by 0x4675ED: l_malloc (util.c:61)
==6254== by 0x46909D: l_queue_new (queue.c:63)
==6254== by 0x406AE4: netdev_init (netdev.c:5038)
==6254== by 0x44A7B3: iwd_modules_init (module.c:152)
==6254== by 0x404713: nl80211_appeared (main.c:171)
==6254== by 0x4713DE: process_unicast (genl.c:993)
==6254== by 0x4713DE: received_data (genl.c:1101)
==6254== by 0x46E00B: io_callback (io.c:118)
==6254== by 0x46D20C: l_main_iterate (main.c:477)
==6254== by 0x46D2DB: l_main_run (main.c:524)
==6254== by 0x46D2DB: l_main_run (main.c:506)
==6254== by 0x46D502: l_main_run_with_signal (main.c:656)
==6254== by 0x403EDB: main (main.c:490)
Fix an issue with the recent changes to signal monitoring from commit
f456501b ("station: retry roaming unless notified of a high RSSI"):
1. driver sends NL80211_CQM_RSSI_THRESHOLD_EVENT_LOW
2. netdev->cur_rssi_low changes from FALSE to TRUE
3. netdev sends NETDEV_EVENT_RSSI_THRESHOLD_LOW to station
4. on roam reassociation, cur_rssi_low is reset to FALSE
5. station still assumes RSSI is low, periodically roams
until netdev sends NETDEV_EVENT_RSSI_THRESHOLD_HIGH
6. driver sends NL80211_CQM_RSSI_THRESHOLD_EVENT_HIGH
7. netdev->cur_rssi_low doesn't change (still FALSE)
8. netdev never sends NETDEV_EVENT_RSSI_THRESHOLD_HIGH
9. station remains stuck in an infinite roaming loop
The commit in question introduced the logic in (5). Previously the
assumption in station was - like in netdev - that if the signal was
still low, the driver would send a duplicate LOW event after
reassociation. This change makes netdev follow the same new logic as
station, i.e. assume the same signal state (LOW/HIGH) until told
otherwise by the driver.
With AP now getting its own diagnostic interface it made sense
to move the netdev_station_info struct definition into its own
header which eventually can be accompanied by utilities in
diagnostic.c. These utilities can then be shared with AP and
station as needed.
This is a nl80211 dump version of netdev_get_station aimed at
AP mode. This will dump all stations, parse into
netdev_station_info structs, and call the callback for each
individual station found. Once the dump is completed the destroy
callback is called.
This adds a generalized API for GET_STATION. This API handles
calling and parsing the results into a new structure,
netdev_station_info. This results structure will hold any
data needed by consumers of netdev_get_station. A helper API
(netdev_get_current_station) was added as a convenience which
automatically passes handshake->aa as the MAC.
For now only the RSSI is parsed as this is already being
done for RSSI polling/events. Looking further more info will
be added such as rx/tx rates and estimated throughput.
In both FT or FILS EAPoL isn't used for the initial handshake and only
for the later re-keys. For FT we added the
eapol_sm_set_require_handshake mechanism to tell EAPoL to not require
the initial handshake and we can re-use it for FILS.
build_cmd_ft_authenticate and build_cmd_authenticate were virtually
identical. These have been unified into a single builder.
We were also incorrectly including ATTR_IE to every authenticate
command, which violates the spec for certain protocols, This was
removed and any auth protocols will now add any IEs that they require.
In this situation the kernel is sending a low RSSI event which netdev
picks up, but since we set netdev->connected so early the event is
forwarded to station before IWD has fully connected. Station then
tries to get a neighbor report, which may fail and cause a known
frequency scan. If this is a new network the frequency scan tries to
get any known frequencies in network_info which will be unset and
cause a segfault.
This can be avoided by only sending RSSI events when netdev->operational
is set rather than netdev->connected.
It seems some APs send the IGTK key in big endian format (it is a
uin16). The kernel rightly reports an -EINVAL error when iwd issues a
NEW_KEY with such a value, resulting in the connection being aborted.
Work around this by trying to detect big-endian key indexes and 'fixing'
them up.
In order to support AlwaysRandomizeAddress and AddressOverride, station will
set the desired address into the handshake object. Then, netdev checks if
this was done and will use that address rather than generate one.
For privacy reasons its advantageous to randomize or mask
the MAC address when connecting to networks, especially public
networks.
This patch allows netdev to generate a new MAC address on a
per-network basis. The generated MAC will remain the same when
connecting to the same network. This allows reauthentications
or roaming to work, and not have to fully re-connect (which would
be required if the MAC changed on every connection).
Changing the MAC requires bringing the interface down. This does
lead to potential race conditions with respect to external
processes. There are two potential conditions which are explained
in a TODO comment in this patch.
This API was updated to take an extra boolean which will
automatically power up the device while changing the MAC
address. Since this is what IWD does anyways we can avoid
the need for an intermediate callback and go right into
netdev_initial_up_cb.
mac80211 drivers seem to send the disconnect event which is triggered by
CMD_DISCONNECT prior to the CMD_DISCONNECT response. However, some
drivers, namely brcmfmac, send the response first and then send the
disconnect event. This confused iwd when a connection was immediately
triggered after a disconnection (network switch operation).
Fix this by making sure that connected variable isn't set until the
connect event is actually processed, and ignore disconnect events which
come after CMD_DISCONNECT has alredy succeeded.
Allow netdev_create_from_genl callers to draw a random or non-random MAC
and pass it in the parameter instead of a bool to tell us to generating
the MAC locally. In P2P we are generating the MAC some time before
creating the netdev in order to pass it to the peer during negotiation.
It seems that the kernel uses -EOPNOTSUPP if the change_station
operation is not implemented by the driver. However, some drivers do
implement change_station and choose to report -ENOTSUPP instead of
-EOPNOTSUPP.
To add to the confusion, EOPNOTSUPP and -ENOTSUPP are the same on some
systems (e.g. Gentoo). Be paranoid and allow both errors to be ignored
when sending CMD_SET_STATION.
Fixes: 0238ffb8d9 ("netdev: Use -EOPNOTSUPP instead of -ENOTSUPP")
The kernel uses -EOPNOTSUPP in the case of change_station operation not
being provided. On most systems -EOPNOTSUPP is defined to be the same
as -ENOTSUPP, but seemingly not all systems.
Commit 1057d8aa74 changed the device interface creation logic
from being unconditional inside netdev.c to instead use NETDEV_WATCH_*
events. However, this broke the assumption that the device interface
was created before all others. The effect is that the scan_wdev_add
might no longer be called prior to station interface being created. Fix
this by moving scan_wdev_add/remove calls to netdev.c instead.
Fixes: 1057d8aa74 ("device: Move device creation from netdev.c to event watch")
Create and destroy the device state struct and the DBus interfaces in a
way more similar to the Station, AdHoc and AP interfaces. Drop
netdev_get_device() and the device specific code in netdev that as far
as I can tell wasn't needed.
netdev_connect can achieve the same effect as netdev_connect_wsc but is
more flexible as it allows us to supply additional association IEs. We
will need this capability to make P2P connections. This way we're also
moving the WSC-specific bits to wsc.c from the crowded netdev.c.
Convert the handshake event callback type to use variable argument
list to allow for more flexibility in event-specific arguments
passed to the callbacks.
Note the uint16_t reason code is promoted to an int when using variable
arguments so va_arg(args, int) has to be used.
Since iwd_modules_init is now defered until nl80211_appeared, we can
assume the nl80211 object is available. This removes the need for
netdev_set_nl80211 completely.
The QoS Map can come in either as a management frame or via the
Associate Response. In either case this IE simply needs to be
forwarded back to the kernel.
When performing a fast transition to another OPEN network the RSN
element won't be there and therefore the bss->rsne is gonna be NULL.
Fix crash by not accessing the rsne member when performing a fast
transition to an AP that doe snot advertise any RSN IE.
Crash caught with gdb:
src/station.c:station_transition_start() 186, target 34:8f:27:2f:b8:fc
Program received signal SIGSEGV, Segmentation fault.
handshake_state_set_authenticator_ie (s=0x555555626eb0, ie=0x0) at src/handshake.c:163
163 s->authenticator_ie = l_memdup(ie, ie[1] + 2u);
(gdb) bt
#0 handshake_state_set_authenticator_ie (s=0x555555626eb0, ie=0x0) at src/handshake.c:163
#1 0x0000555555561a98 in fast_transition (netdev=0x55555562fbe0, target_bss=0x55555561f4a0,
over_air=over_air@entry=true, cb=0x55555556d5b0 <station_fast_transition_cb>) at src/netdev.c:3164
#2 0x0000555555565dfd in netdev_fast_transition (netdev=<optimized out>, target_bss=<optimized out>,
cb=<optimized out>) at src/netdev.c:3232
#3 0x000055555556ccbd in station_transition_start (bss=0x55555561f4a0, station=0x555555617da0)
at src/station.c:1261
#4 station_roam_scan_notify (err=<optimized out>, bss_list=<optimized out>, userdata=0x555555617da0)
at src/station.c:1444
#5 0x0000555555579560 in scan_finished (sc=0x55555562bf80, err=err@entry=0, bss_list=0x55555561bd90,
sr=0x555555626b30, wiphy=<optimized out>) at src/scan.c:1234
#6 0x0000555555579620 in get_scan_done (user=0x555555618920) at src/scan.c:1264
#7 0x00005555555abd23 in destroy_request (data=0x55555561b000) at ell/genl.c:673
#8 0x00005555555ac129 in process_unicast (nlmsg=0x7fffffffc310, genl=0x55555560b7a0) at ell/genl.c:940
#9 received_data (io=<optimized out>, user_data=0x55555560b7a0) at ell/genl.c:1039
#10 0x00005555555a8aa3 in io_callback (fd=<optimized out>, events=1, user_data=0x55555560b840)
at ell/io.c:126
#11 0x00005555555a7ccd in l_main_iterate (timeout=<optimized out>) at ell/main.c:473
#12 0x00005555555a7d9c in l_main_run () at ell/main.c:520
#13 l_main_run () at ell/main.c:502
#14 0x00005555555a7fac in l_main_run_with_signal (callback=<optimized out>, user_data=0x0)
at ell/main.c:642
#15 0x000055555555e5b8 in main (argc=<optimized out>, argv=<optimized out>) at src/main.c:519
A not-yet-merged kernel patch will enable the FRAME_WAIT_CANCEL
event to be emitted when a CMD_FRAME duration expires. This can
shortcut the ridiculously long timeout that is required making
GAS requests with no response drastically quicker to handle.
This adds a new API netdev_anqp_request which will send out a GAS
request, parses the GAS portion of the response and forwards the
ANQP response to the callers callback.
The handshake object had 4 setters for authenticator/supplicant IE.
Since the IE ultimately gets put into the same buffer, there really
only needs to be a single setter for authenticator/supplicant. The
handshake object can deal with parsing to decide what kind of IE it
is (WPA or RSN).
This adds some checks for the FT_OVER_FILS AKMs in station and netdev
allowing the FILS-FT AKMs to be selected during a connection.
Inside netdev_connect_event we actually have to skip parsing the IEs
because FILS itself takes care of this (needs to handle them specially)
FT over FILS-SHA384 uses a 24 byte FT MIC rather than the 16 byte MIC
used for all other AKMs. This change allows both the FT builder/parser
to handle both lengths of MIC. The mic length is now passed directly
into ie_parse_fast_bss_transition and ie_build_fast_bss_transition
ifaddr is not guaranteed to be initialized, I'm not sure why there was
no compiler warning. Also replace a | with a || for boolean conditions
and merge the wiphy check with that line.
FT-over-DS is a way to do a Fast BSS Transition using action frames for
the authenticate step. This allows a station to start a fast transition
to a target AP while still being connected to the original AP. This,
in theory, can result in less carrier downtime.
The existing ft_sm_new was removed, and two new constructors were added;
one for over-air, and another for over-ds. The internals of ft.c mostly
remain the same. A flag to distinguish between air/ds was added along
with a new parser to parse the action frames rather than authenticate
frames. The IE parsing is identical.
Netdev now just initializes the auth-proto differently depending on if
its doing over-air or over-ds. A new TX authenticate function was added
and used for over-ds. This will send out the IEs from ft.c with an
FT Request action frame.
The FT Response action frame is then recieved from the AP and fed into
the auth-proto state machine. After this point ft-over-ds behaves the
same as ft-over-air (associate to the target AP).
Some simple code was added in station.c to determine if over-air or
over-ds should be used. FT-over-DS can be beneficial in cases where the
AP is directing us to roam, or if the RSSI falls below a threshold.
It should not be used if we have lost communication to the AP all
(beacon lost) as it only works while we can still talk to the original
AP.
To support FT-over-DS this API needed some slight modifications:
- Instead of setting the DA to netdev->handshake->aa, it is just set to
the same address as the 'to' parameter. The kernel actually requires
and checks for these addresses to match. All occurences were passing
the handshake->aa anyways so this change should have no adverse
affects; and its actually required by ft-over-ds to pass in the
previous BSSID, so hard coding handshake->aa will not work.
- The frequency is is also passed in now, as ft-over-ds needs to use
the frequency of the currently connected AP (netdev->frequency get
set to the new target in netdev_fast_transition. Previous frequency
is also saved now).
- A new vector variant (netdev_send_action_framev) was added as well
to support sending out the FT Request action frame since the FT
TX authenticate function provides an iovec of the IEs. The existing
function was already having to prepend the action frame header to
the body, so its not any more or less copying to do the same thing
with an iovec instead.
Since FT already handles processing the FT IE's (and building for
associate) it didn't make sense to have all the IE building inside
netdev_build_cmd_ft_authenticate. Instead this logic was moved into
ft.c, and an iovec is now passed from FT into
netdev_ft_tx_authenticate. This leaves the netdev command builder
unburdened by the details of FT, as well as prepares for FT-over-DS.
In both netdev_{authenticate,associate}_event there is no need to check
for in_ft at the start since netdev->ap will always be set if in_ft is
set.
There was also no need to set eapol_sm_set_use_eapol_start, as setting
require_handshake implies this and achieves the same result when starting
the SM.
Since FT operates over Authenticate/Associate, it makes the most sense
for it to behave like the other auth-protos.
This change moves all the FT specific processing out of netdev and into
ft.c. The bulk of the changes were strait copy-pastes from netdev into
ft.c with minor API changes (e.g. remove struct netdev).
The 'in_ft' boolean unforunately is still required for a few reasons:
- netdev_disconnect_event relies on this flag so it can ignore the
disconnect which comes in when doing a fast transition. We cannot
simply check netdev->ap because this would cause the other auth-protos
to not handle a disconnect correctly.
- netdev_associate_event needs to correctly setup the eapol_sm when
in FT mode by setting require_handshake and use_eapol_start to false.
This cannot be handled inside eapol by checking the AKM because an AP
may only advertise a FT AKM, and the initial mobility association
does require the 4-way handshake.
Now the 'ft' module, previously ftutil, will be used to drive FT via
the auth-proto virtual class. This renaming is in preparation as
ftutil will become obsolete since all the IE building/processing is
going to be moved out of netdev. The new ft.c module will utilize
the existing ftutil functionality, but since this is now a full blown
auth protocol naming it 'ft' is better suited.
The duplicate/similar code in netdev_associate_event and
netdev_connect_event leads to very hard to follow code, especially
when you throw OWE/SAE/FILS or full mac cards into the mix.
Currently these protocols finish the connection inside
netdev_associate_event, and set ignore_connect_event. But for full
mac cards we must finish the connection in netdev_connect_event.
In attempt to simplify this, all connections will be completed
and/or the 4-way started in netdev_connect_event. This satisfies
both soft/full mac cards as well as simplifies the FT processing
in netdev_associate_event. Since the FT IEs can be processed in
netdev_connect_event (as they already are to support full mac)
we can assume that any FT processing inside netdev_associate_event
is for a fast transition, not initial mobility association. This
simplifies netdev_ft_process_associate by removing all the blocks
that would get hit if transition == false.
Handling FT this way also fixes FT-SAE which was broken after the
auth-proto changes since the initial mobility association was
never processed if there was an auth-proto running.
SAE was a bit trickier than OWE/FILS because the initial implementation
for SAE did not include parsing raw authenticate frames (netdev skipped
the header and passed just the authentication data). OWE/FILS did not
do this and parse the entire frame in the RX callbacks. Because of this
it was not as simple as just setting some RX callbacks. In addition,
the TX functions include some of the authentication header/data, but
not all (thanks NL80211), so this will require an overhaul to test-sae
since the unit test passes frames from one SM to another to test the
protocol end-to-end (essentially the header needs to be prepended to
any data coming from the TX functions for the end-to-end tests).
An unexpected Associate event would cause iwd to crash when accessing
netdev->handshake->mde. netdev->handshake is only set if we're
attempting to connect or connected somewhere so check netdev->connected
first.
SAE was behaving inconsitently with respect to freeing the state.
It was freeing the SM internally on failure, but requiring netdev
free it on success.
This removes the call to sae_sm_free in sae.c upon failure, and
instead netdev frees the SM in the complete callback in all cases
regardless of success or failure.
From netdev's prospective FILS works the same as OWE/SAE where we create
a fils_sm and forward all auth/assoc frames into the FILS module. The
only real difference is we do not start EAPoL once FILS completes.
src/netdev.c:netdev_create_from_genl() Skipping duplicate netdev wlp2s0[3]
Aborting (signal 11) [/home/denkenz/iwd/src/iwd]
++++++++ backtrace ++++++++
#0 0x7fc4c7a4e930 in /lib64/libc.so.6
#1 0x40ea13 in netdev_getlink_cb() at src/netdev.c:4654
#2 0x468cab in process_message() at ell/netlink.c:183
#3 0x4690a3 in can_read_data() at ell/netlink.c:289
#4 0x46681d in io_callback() at ell/io.c:126
#5 0x4651cd in l_main_iterate() at ell/main.c:473
#6 0x46530e in l_main_run() at ell/main.c:516
#7 0x465626 in l_main_run_with_signal() at ell/main.c:642
#8 0x403df8 in main() at src/main.c:513
#9 0x7fc4c7a39bde in /lib64/libc.so.6
The latest refactoring ended up assuming that FT related elements would
be handled in netdev_associate_event. However, FullMac cards (that do
not generate netdev_associate_event) could still connect using FT AKMs
and perform the Initial mobility association. In such cases the FTE
element was required but ended up not being set into the handshake.
This caused the handshake to fail during PTK 1_of_4 processing.
Fix this by making sure that FTE + related info is set into the
handshake, albeit with a lower sanity checking level since the
elements have been processed by the firmware already.
Note that it is currently impossible for actual FTs to be performed on
FullMac cards, so the extra logic and sanity checking to handle these
can be skipped.
Make netdev_create_from_genl public and change signature to return the
created netdev or NULL. Also add netdev_destroy that destroys and
unregisters the created netdevs. Both will be used to move the
whole interface management to a new file.
The associate event is only important for OWE and FT. If neither of
these conditions (or FT initial association) are happening we do
not need to continue further processing the associate event.
In netdev_associate_event the ignore_connect_event was getting set true,
but afterwards there were still potential failure paths. Now, once in
assoc_failed we explicitly set ignore_connect_event to false so the
the failure can be handled properly inside netdev_connect_event