If an application initiates the Connect() operation and
that application has an agent registered, then that
application's agent will be called. Otherwise, the default
agent is called.
==40686== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==40686== at 0x5147037: sendmsg (in /usr/lib64/libc-2.24.so)
==40686== by 0x43957C: operate_cipher (cipher.c:354)
==40686== by 0x439C18: l_cipher_decrypt (cipher.c:415)
==40686== by 0x40FAB8: arc4_skip (crypto.c:181)
Initialize the skip buffer to 0s. This isn't strictly necessary, but
hides the above valgrind warning.
The aim of arc4 skip is simply to seed some data into the RC4 cipher so
it makes it harder for the attacker to decrypt. This 'initialization'
doesn't really care what data is fed.
CMD_DEAUTHENTICATE is not available for FullMAC based cards. We already
use CMD_CONNECT in the non-FT cases, which works on all cards. However,
for some reason we kept using CMD_DEAUTHENTICATE instead of CMD_DISCONNECT.
For FT (error) cases, keep using CMD_DEAUTHENTICATE.
Certain WiFi drivers do not support using CMD_SET_STATION (e.g.
mwifiex). It is not completely clear how such drivers handle the
AUTHORIZED state, but they don't seem to take it into account. So for
such drivers, ignore the -ENOTSUPP error return from CMD_SET_STATION.
These flags are documented in RFC2863 and kernel's
Documentation/networking/operstates.txt. Operstate doesn't have any
siginificant effect on normal connectivity or on our autotests because
it is not used by the kernel except in some rare cases but it is
supposed to affect some userspace daemons that watch for RTM_NEWLINK
events, so I believe we *should* set them according to this
documentation. Changes:
* There's no point setting link_mode or operstate of the netdev when
we're bringing the admin state DOWN as that overrides operstate.
* Instead of numerical values for link_mode use the if.h defines.
* Set IF_OPER_UP when association succeeds also in the Fast Transition
case. The driver will have set carrier off and then on so the
operstate should be IF_OPER_DORMANT at this point and needs to be
reset to UP.
Allow registering and unregistering agent object to receive RSSI level
notifications. The methods are similar to the ones related to the
password agent, including a Release method for the agent.
Add an methods and an event using the new
NL80211_EXT_FEATURE_CQM_RSSI_LIST kernel feature to request RSSI
monitoring with notifications only when RSSI moves from one of the N
intervals requested to another.
device.c will call netdev_set_rssi_report_levels to request
NETDEV_EVENT_RSSI_LEVEL_NOTIFY events every time the RSSI level changes,
level meaning one of the intervals delimited by the threshold values
passed as argument. Inside the event handler it can call
netdev_get_rssi_level to read the new level.
There's no fallback to periodic polling implemented in this patch for
the case of older kernels and/or the driver not supporting
NL80211_EXT_FEATURE_CQM_RSSI_LIST.
==27901== Conditional jump or move depends on uninitialised value(s)
==27901== at 0x41157A: handshake_util_find_pmkid_kde
(handshake.c:537)
==27901== by 0x40E03A: eapol_handle_ptk_1_of_4 (eapol.c:852)
==27901== by 0x40F3CD: eapol_key_handle (eapol.c:1417)
==27901== by 0x40F955: eapol_rx_packet (eapol.c:1607)
==27901== by 0x410321: __eapol_rx_packet (eapol.c:1915)
Agent implementation inside agent.c takes a reference of the trigger
message associated with the request. When the callback is called, the
message is passed as an argument. The callback is responsible for
taking the message reference if necessary. Once the callback returns,
agent releases its reference.
For error paths, our code was using dbus_pending_reply which in turn
uses dbus_message_unref. This caused the agent to try an unref
operation on an already freed object.
Move the calling of the *_shutdown functions from the signal handler to
a new public function, and use that function inside the DBus disconnect
handler to make sure resources are cleanly released.
Skip the matching of the PMKID KDE to the PMKID list in the RSNE if
we've seen a new EAP authentication before the step 1/4 was received.
That would mean that the server had not accepted the PMKIDs we submitted
and we performed a new 8021X authentication, producing a new PMKSA which
won't be on the list in the RSNE.
Currently we'd send EAPOL-Start whenever EAP was configured and we
received an EAPOL-Key before EAP negotiation. Instead only do that if
we know we can't respond to the 4-Way handshake because we don't have
a PMK yet or the PMKID doesn't match. Require a PMKID in step 1/4 if
we'd sent a list of PMKIDs in our RSNE.
Modify the packet filter to also accept frames with ethertype of 0x88c7
and pass the ethertype value to __eapol_rx_packet so it can filter out
the frames where this value doesn't match the sm->preauth flag.
Add a wrapper for eapol_start that sets the sm->preauth flag and sends
the EAPOL-Start frame immediately to skip the timeout since we know
that the supplicant has to initiate the authentication.
Use netdev_reassociate if FT is not available. device_select_akm_suite
is only moved up in the file and the reused code from device_connect is
moved to a separate function.
netdev_reassociate transitions to another BSS without FT. Similar to
netdev_connect but uses reassociation instead of association and
requires and an existing connection.
Pass an additional parameter to the scan results notify functions to
tell them whether the scan was successful. If it wasn't don't bother
passing an empty bss_list queue, pass NULL as bss_list. This way the
callbacks can tell whether the scan indicates there are no BSSes in
range or simply was aborted and the old scan results should be kept.
++++++++ backtrace ++++++++
0 0x7fc0b20ca370 in /lib64/libc.so.6
1 0x4497d5 in l_dbus_message_new_error_valist() at /home/denkenz/iwd/ell/dbus-message.c:372
2 0x44994d in l_dbus_message_new_error() at /home/denkenz/iwd/ell/dbus-message.c:394
3 0x41369b in dbus_error_not_supported() at /home/denkenz/iwd/src/dbus.c:148
4 0x40eaf5 in device_connect_network() at /home/denkenz/iwd/src/device.c:1282
5 0x41f61c in network_autoconnect() at /home/denkenz/iwd/src/network.c:424
6 0x40c1c1 in device_autoconnect_next() at /home/denkenz/iwd/src/device.c:172
7 0x40cabf in device_set_scan_results() at /home/denkenz/iwd/src/device.c:368
8 0x40cb06 in new_scan_results() at /home/denkenz/iwd/src/device.c:376
9 0x41be8a in scan_finished() at /home/denkenz/iwd/src/scan.c:1021
10 0x41bf9e in get_scan_done() at /home/denkenz/iwd/src/scan.c:1048
11 0x43d5ce in destroy_request() at /home/denkenz/iwd/ell/genl.c:136
12 0x43ded1 in process_unicast() at /home/denkenz/iwd/ell/genl.c:395
13 0x43e295 in received_data() at /home/denkenz/iwd/ell/genl.c:502
14 0x43aa62 in io_callback() at /home/denkenz/iwd/ell/io.c:120
15 0x439632 in l_main_run() at /home/denkenz/iwd/ell/main.c:375 (discriminator 2)
16 0x403074 in main() at /home/denkenz/iwd/src/main.c:261
17 0x7fc0b20b7620 in /lib64/libc.so.6
Also handle the case of a periodic scan when handling a
NL80211_CMD_SCAN_ABORTED. The goal is to make sure the supplied callback
is always called if .trigger was called before, but this should also fix
some other corner cases.
* I add a sp.triggered field for periodic scans since sc->state doesn't
tell us whether the scan in progress was triggered by ourselved o
someone else (in that case .trigger has not been called)
* Since the NL80211_CMD_SCAN_ABORTED becomes similar to get_scan_done I
move the common code to scan_finished
* I believe this fixes a situation where we weren't updating sc->state
if we'd not triggered the scan, because both get_scan_done and the
NL80211_CMD_SCAN_ABORTED would return directly.
If the current request is not freed when we receive the
NL80211_CMD_SCAN_ABORTED event, device.c will keep thinking that
we're still scanning and the scan.c logic also gets confused and may
resend the current request at some point and call sr->trigger again
causing a segfault in device.c.
I pass an empty bss_list to the callback, another possibility would be
to pass NULL to let the callback know not to replace old results yet.
The callbacks would need to handle a NULL first.
Handle the changes of interface address in RTNL New Link messages
similarly to the name changes, emit a NETDEV_WATCH_EVENT_ADDRESS_CHANGE
event and a propety change on dbus.
Note this can only happen when the interface is down so it doesn't
break anything but we need to handle it anyway.
DBus has certain rules on what constitutes a valid path. Since the
wiphy name is freeform, it is possible to set it such that the contents
do not contain a valid path.
We fall back to simply using the wiphy index as the path.
DBus strings must be valid utf8. The kernel only enforces that the
wiphy name is null terminated string. It does not validate or otherwise
check the contents in any way. Thus it is possible to have
non-printable or non-utf8 characters inside.
NL80211_CMD_SET_WIPHY can be used to set various attributes on the wiphy
object in the kernel. This includes ATTR_WIPHY_NAME among others. iwd
currently does not parse or store any of the other attributes, so we
react to changes in WIPHY_NAME only.
The wiphy attribute should never be repeated by the kernel, so this
check is ultimately not needed. This condition can also be easily
checked by looking at the iwmon output in case things do go terribly
wrong.
Fix 1a64c4b771 by setting use_eapol_start
by default only when 8021x authentication is configured. Otherwise we'd
be sending EAPOL-Start even for WPA2 Personal possibly after the 4-Way
Handshake success.
This implements very initial support of WPS PIN based connections. The
scanning logic attempts to find an AP in PIN mode and tries to connect
to that AP. We currently do not try multiple APs if available or
implement the WSC 1.0 connection logic.
Right now the code checks for is_rsn to wait for the 4-way handshake and
sends the NETDEV_EVENT_4WAY_HANDSHAKE. However, is_rsn condition is not
true for WSC connections since they do not set an RSN field. Still,
they are EAP based handshakes and should be treated in the same manner.
We relax the is_rsn check to instead check for netdev->sm. Currently
netdev->sm is only non-NULL if handshake->own_ie field is not NULL or in
the case of eap-wsc connections.
Define minimum delay between roam attempts and add automatic retries.
This handles a few situations:
* roam attempt failing, then RSSI going above the threshold and below
again -- in that case we don't want to reattempt too soon, we'll only
reattempt after 60s.
* roam attempt failing then RSSI staying low for longer than 60 -- in
that case we want to reattempt after 60s too.
* signal being low from the moment we connected -- in that case we also
want to attempt a roam every some time.
Fix a leak of the MDE buffer. It is now only needed for the single call
to handshake_state_set_mde which copies the bytes anyway so use a buffer
on stack.
Since caab23f192085e6c8e47c41fc1ae9f795d1cbe86 hostapd is going to set
this bit to zero for RSN networks but both values will obviously be in
use. Only check the value if is_wpa is true - in this case check the
value is exactly 16, see hostapd commit:
commit caab23f192085e6c8e47c41fc1ae9f795d1cbe86
Author: Jouni Malinen <j@w1.fi>
Date: Sun Feb 5 13:52:43 2017 +0200
Set EAPOL-Key Key Length field to 0 for group message 1/2 in RSN
P802.11i/D3.0 described the Key Length as having value 16 for the group
key handshake. However, this was changed to 0 in the published IEEE Std
802.11i-2004 amendment (and still remains 0 in the current standard IEEE
Std 802.11-2016). We need to maintain the non-zero value for WPA (v1)
cases, but the RSN case can be changed to 0 to be closer to the current
standard.
Add sr NULL check before accessing sr->id. Call scan_request_free on
request structure and call the destroy callback. Cancel the netlink
TRIGGER_SCAN command if still running and try starting the next scan
in the queue. It'll probably still fail with EBUSY but it'll be
reattempted later.
Always call start_next_scan_request when a scan request has finished,
with a success or a failure, including a periodic scan attempt. Inside
that function check if there's any work to be done, either for one-off
scan requests or periodic scan, instead of having this check only inside
get_scan_done. Call start_next_scan_request in scan_periodic_start and
scan_periodic_timeout.
Also call the trigger callback with an error code when sending the
netlink command fails after the scan request has been queued because
another scan was in progress when the scan was requested.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000419d38 in scan_done (msg=0x692580, userdata=0x688250)
at src/scan.c:250
250 sc->state = sr->passive ? SCAN_STATE_PASSIVE : SCAN_STATE_ACTIVE;
(gdb) bt
0 0x0000000000419d38 in scan_done (msg=0x692580, userdata=0x688250)
at src/scan.c:250
1 0x000000000043cac0 in process_unicast (genl=0x686d60, nlmsg=0x7fffffffc3b0)
at ell/genl.c:390
2 0x000000000043ceb0 in received_data (io=0x686e60, user_data=0x686d60)
at ell/genl.c:506
3 0x000000000043967d in io_callback (fd=6, events=1, user_data=0x686e60)
at ell/io.c:120
4 0x000000000043824d in l_main_run () at ell/main.c:381
5 0x000000000040303c in main (argc=1, argv=0x7fffffffe668) at src/main.c:259
The reasoning is that the logic inside scan_common is reversed. Instead
of freeing the scan request on error, we always do it. This causes the
trigger_scan callback to receive invalid userdata.
Save the ids of the netlink trigger scan commands that we send and
cancel them in scan_ifindex_remove to fix a race leading to a
segfault. The segfault would happen every time if scan_ifindex_remove
was called in the same main loop iteration in which we sent the
command, on shutdown:
^CTerminate
src/netdev.c:netdev_free() Freeing netdev wlan3[6]
src/device.c:device_disassociated() 6
src/device.c:device_enter_state() Old State: connected, new state:
disconnected
src/device.c:device_enter_state() Old State: disconnected, new state:
autoconnect
src/scan.c:scan_periodic_start() Starting periodic scan for ifindex: 6
src/device.c:device_free()
src/device.c:bss_free() Freeing BSS 02:00:00:00:00:00
src/device.c:bss_free() Freeing BSS 02:00:00:00:01:00
Removing scan context for ifindex: 6
src/scan.c:scan_context_free() sc: 0x5555557ca290
src/scan.c:scan_notify() Scan notification 33
src/netdev.c:netdev_operstate_down_cb() netdev: 6, success: 1
src/scan.c:scan_periodic_done()
src/scan.c:scan_periodic_done() Periodic scan triggered for ifindex:
1434209520
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000064 in ?? ()
(gdb) bt
#0 0x0000000000000064 in ?? ()
#1 0x0000555555583560 in process_unicast (nlmsg=0x7fffffffc1a0,
genl=0x5555557c1d60) at ell/genl.c:390
#2 received_data (io=<optimized out>, user_data=0x5555557c1d60)
at ell/genl.c:506
#3 0x0000555555580d45 in io_callback (fd=<optimized out>,
events=1, user_data=0x5555557c1e60) at ell/io.c:120
#4 0x000055555558005f in l_main_run () at ell/main.c:381
#5 0x00005555555599c1 in main (argc=<optimized out>, argv=<optimized out>)
at src/main.c:259
Parse the contents of the GTK and IGTK subelements in an FT IE instead
of working with buffers containing the whole subelement. Some more
validation of the subelement contents. Drop support for GTK / IGTK when
building the FTE (unused).
Don't start the handshake timeout in eapol_start if either
handshake->ptk_complete is set (handshake already done) or
handshake->have_snonce is set (steps 1&2 done). This accounts for
eapol_start being called after a Fast Transition when a 4-Way handshake
is not expected.
Add a flush flag to scan_parameters to tell the kernel to flush the
cache of scan results before the new scan. Use this flag in the
active scan during roaming.
Split the igtk parameter to handshake_state_install_igtk into one
parameter for the actual IGTK buffer and one for the IPN buffer instead
of requiring the caller to have them both in one continuous buffer.
With FT protocol, one is received encrypted and the other in plain text.
Make sure that the Neighbor Report timeout is cancelled when connection
breaks or device is being destroyed, and call the callback. Add an
errno parameter to the callback to indicate the cause.
With this patch an actual fast transition should happen when the signal
strength goes low but there are still various details to be fixed before
this becomes useful:
* the kernel tends to return cached scan results and won't update the
rssi values,
* there's no timer to prevent too frequent transition attempts or to
retry after some time if the signal is still low,
* no candidate other than the top ranked BSS is tried. With FT it
may be impossible to try another BSS anyway although there isn't
anything in the spec to imply this. It would require keeping the
handshake_state around after netdev gives up on the transition
attempt.
Trigger a scan of the selected channels or all channels if no useful
neighbor list was obtained, then process the scan results to select the
final target BSS.
The actual transition to the new BSS is not included in this patch for
readability.
Trigger a roam attempt when the RSSI level has been low for at least 5
seconds using the netdev RSSI LOW/HIGH events. See if neighbor reports
are supported and if so, request and process the neighbor reports list
to restrict the number of channels to be scanned. The scanning part is
not included in this patch for readability.
Validate the fourth message of the fast transition sequence and save the
new keys and state as current values in the netdev object. The
FT-specific IE validation that was already present in the initial MD
is moved to a new function.
Build and send the FT Authentication Request frame, the initial Fast
Transition message.
In this version the assumption is that once we start a transition attempt
there's no going back so the old handshake_state, scan_bss, etc. can be
replaced by the new objects immediately and there's no point at which both
the old and the new connection states are needed. Also the disconnect
event for the old connection is implicit. At netdev level the state
during a transition is almost the same with a new connection setup.
The first disconnect event on the netlink socket after the FT Authenticate
is assumed to be the one generated by the kernel for the old connection.
The disconnect event doesn't contain the AP bssid (unlike the
deauthenticate event preceding it), otherwise we could check to see if
the bssid is the one we are interested in or could check connect_cmd_id
assuming a disconnect doesn't happen before the connect command finishes.
This adds support for iwd.conf 'ManagementFrameProtection' setting.
This setting has the following semantics, with '1' being the default:
0 - MFP off, even if hardware is capable
1 - Use MFP if available
2 - MFP required. If the hardware is not capable, no connections will
be possible. Use at your own risk.
Despite RFC3748 mandating MSKs to be at least 256 bits some EAP methods
return shorter MSKs. Since we call handshake_failed when the MSK is too
short, EAP methods have to be careful with their calls to set_key_material
because it may result in a call to the method's .remove method.
EAP-TLS and EAP-TTLS can't handle that currently and would be difficult to
adapt because of the TLS internals but they always set msk_len to 64 so
handshake_failed will not be called.