Trigger a roam attempt when the RSSI level has been low for at least 5
seconds using the netdev RSSI LOW/HIGH events. See if neighbor reports
are supported and if so, request and process the neighbor reports list
to restrict the number of channels to be scanned. The scanning part is
not included in this patch for readability.
Validate the fourth message of the fast transition sequence and save the
new keys and state as current values in the netdev object. The
FT-specific IE validation that was already present in the initial MD
is moved to a new function.
Build and send the FT Authentication Request frame, the initial Fast
Transition message.
In this version the assumption is that once we start a transition attempt
there's no going back so the old handshake_state, scan_bss, etc. can be
replaced by the new objects immediately and there's no point at which both
the old and the new connection states are needed. Also the disconnect
event for the old connection is implicit. At netdev level the state
during a transition is almost the same with a new connection setup.
The first disconnect event on the netlink socket after the FT Authenticate
is assumed to be the one generated by the kernel for the old connection.
The disconnect event doesn't contain the AP bssid (unlike the
deauthenticate event preceding it), otherwise we could check to see if
the bssid is the one we are interested in or could check connect_cmd_id
assuming a disconnect doesn't happen before the connect command finishes.
This adds support for iwd.conf 'ManagementFrameProtection' setting.
This setting has the following semantics, with '1' being the default:
0 - MFP off, even if hardware is capable
1 - Use MFP if available
2 - MFP required. If the hardware is not capable, no connections will
be possible. Use at your own risk.
Despite RFC3748 mandating MSKs to be at least 256 bits some EAP methods
return shorter MSKs. Since we call handshake_failed when the MSK is too
short, EAP methods have to be careful with their calls to set_key_material
because it may result in a call to the method's .remove method.
EAP-TLS and EAP-TTLS can't handle that currently and would be difficult to
adapt because of the TLS internals but they always set msk_len to 64 so
handshake_failed will not be called.
Make sure that eap_set_key_material can free the whole EAP method and
EAP state machine before returning, by calling that function last. This
relies on eap_mschapv2_handle_success being the last call in about 5
stack frames above it too.
Action Frames are sent by nl80211 as unicast data. We're not receiving
any other unicast packets in iwd at this time so let netdev directly
handle all unicast data on the genl socket.
Add a version of scan_active that accepts a struct with the scan
parameters so we can more easily add new parameters. Since the genl
message is now built within scan_active_start the extra_ie memory
can be freed by the caller at any time.
clang complains about enum as var_arg type
because of the argument standard conversion.
In a small test I did neither clang nor gcc can
properly warn about out of range values, so it's
purely for documentation either way.
There are situations when a CMD_DISCONNECT or deauthenticate will be
issued locally because of an error detected locally where netdev would
not be able to emit a event to the device object. The CMD_DISCONNECT
handler can only send an event if the disconnect is triggered by the AP
because we don't have an enum value defined for other diconnects. We
have these values defined for the connect callback but those errors may
happen when the connect callback is already NULL because a connection
has been estabilshed. So add an event type for local errors.
These situations may occur in a transition negotiation or in an eapol
handshake failure during rekeying resulting in a call to
netdev_handshake_failed.
The kernel parses NL80211_ATTR_USE_MFP to mean an enumeration
nl80211_mfp. So instead of using a boolean, we should be using the
value NL80211_MFP_REQUIRED.
Make the use of EAPOL-Start the default and send it when configured for
8021x and either we receive no EAPOL-EAP from from the AP before
timeout, or if the AP tries to start a 4-Way Handshake.
On certain routers, the 4-Way handshake message 3 of 4 contains a key iv
field which is not zero as it is supposed to. This causes us to fail
the handshake.
Since the iv field is not utilized in this particular case, it is safe
to simply warn rather than fail the handshake outright.
Use the NLMSG_ALIGN macro on the family header size (struct ifinfomsg in
this case). The ascii graphics in include/net/netlink.h show that both
the netlink header and the family header should be padded. The netlink
header (nlmsghdr) is already padded in ell. To "document" this
requirementin ell what we could do is take two buffers, one for the
family header and one for the attributes.
This doesn't change anything for most people because ifinfomsg is
already 16-byte long on the usual architectures.
Remove the keys and other data from struct eapol_sm, update device.c,
netdev.c and wsc.c to use the handshake_state object instead of
eapol_sm. This also gets rid of eapol_cancel and the ifindex parameter
in some of the eapol functions where sm->handshake->ifindex can be
used instead.
struct handshake_state is an object that stores all the key data and other
authentication state and does the low level operations on the keys. Together
with the next patch this mostly just splits eapol.c into two layers
so that the key operations can also be used in Fast Transitions which don't
use eapol.
If device_select_akm_suite selects Fast Transition association then pass
the MD IE and other bits needed for eapol and netdev to do an FT
association and 4-Way Handshake.
If an MD IE is supplied to netdev_connect, pass that MD IE in the
associate request, then validate and handle the MD IE and FT IE in the
associate response from AP.
Add space in the eapol_sm struct for the pieces of information required
for the FT 4-Way Handshake and add setters for device.c and netdev.c to
be able to provide the data.
Don't decide on the AKM suite to use when the bss entries are received
and processed, instead select the suite when the connection is triggered
using a new function device_select_akm_suite, similar to
wiphy_select_cipher(). Describing the AKM suite through flags will be
more difficult when more than 2 suites per security type are supported.
Also handle the wiphy_select_cipher 0 return value when no cipher can be
selected.
The len parameter was only used so it could be validated against ie[1],
but since it was not checked to be > 2, it must have been validated
already, the check was redundant. In any case all users directly
passed ie[1] as len anyway. This makes it consistent with the ie
parsers and builders which didn't require a length.
In many cases the pairwise and group cipher information is not the only
information needed from the BSS RSN/WPA elements in order to make a
decision. For example, th MFPC/MFPR bits might be needed, or
pre-authentication capability bits, group management ciphers, etc.
This patch refactors bss_get_supported_ciphers into the more general
scan_bss_get_rsn_info function
Split eapol_start into two calls, one to register the state machine so
that the PAE read handler knows not to discard frames for that ifindex,
and eapol_start to actually start processing the frames. This is needed
because, as per the comment in netdev.c, due to scheduling the PAE
socket read handler may trigger before the CMD_CONNECT event handler,
which needs to parse the FTE from the Associate Response frame and
supply it to the eapol SM before it can do anything with the message 1
of 4 of the FT handshake.
Another issue is that depending on the driver or timing, the underlying
link might not be marked as 'ready' by the kernel. In this case, our
response to Message 1 of the 4-way Handshake is written and accepted by
the kernel, but gets dropped on the floor internally. Which leads to
timeouts if the AP doesn't retransmit.
This function takes an Operating Channel and a Country String to convert
it into a band. Using scan_oper_class_to_band and scan_channel_to_freq,
an Operating Channel, a Country String and a Channel Number together can
be converted into an actual frequency. EU and US country codes based on
wpa_supplicant's tables.
It doesn't matter for crypto_derive_pairwise_ptk in non-SHA256 mode
but in the FT PTK derivation function, as well as in SHA256 mode all
bytes of the output do actually change with the PTK size.
Fix autoconnect trying to connect to networks never used before as found
by Tim Kourt. Update the comments to be consistent with the use of the
is_known field and the docs, in that a Known Network is any network that
has a config file in the iwd storage, and an autoconnect candidate is a
network that has been connected to before.
If the handshake fails, we trigger a deauthentication prior to reporting
NETDEV_RESULT_HANDSHAKE_FAILED. If a netdev_disconnect is invoked in
the meantime, then the caller will receive -ENOTCONN. This is
incorrect, since we are in fact logically connected until the connect_cb
is notified.
Tweak the behavior to keep the connected variable as true, but check
whether disconnect_cmd_id has been issued in the netdev_disconnect_event
callback.
If the device is currently connected, we will initiate a disconnection
(or wait for the disconnection to complete) prior to starting the
WSC-EAP association.
Use the org.freedesktop.DBus.Properties interfaces on objects with
properties and drop the old style GetProperty/SetProperty methods on
individual interfaces. Agent and KnownNetworks have no properties at
this time so don't add org.freedesktop.DBus.Properties interfaces.
We send the scan results where we obtained a PushButton target over to
device object. If EAP-WSC transaction is successful, then the scan
results are searched to find a network/bss combination found in the
credentials obtained. If found, the network is connected to
automatically.
This also fixes a potential buffer overflow since the ssid was cast to a
string inside network_create. However, ssid is a buffer of 32 bytes,
and would not be null-terminated in the case of a 32-byte SSID.
==5362== Conditional jump or move depends on uninitialised value(s)
==5362== at 0x419B62: wsc_wfa_ext_iter_next (wscutil.c:52)
==5362== by 0x41B869: wsc_parse_probe_response (wscutil.c:1016)
==5362== by 0x41FD77: scan_results (wsc.c:218)
==5362== by 0x415669: get_scan_done (scan.c:892)
==5362== by 0x432932: destroy_request (genl.c:134)
==5362== by 0x433245: process_unicast (genl.c:394)
==5362== by 0x43361A: received_data (genl.c:506)
==5362== by 0x42FDC2: io_callback (io.c:120)
==5362== by 0x42EABE: l_main_run (main.c:381)
==5362== by 0x402F90: main (main.c:234)
This is used to get arbitrary information out of the EAP method. Needed
for EAP-WSC to signal credential information obtained from the peer.
Other uses include signaling why EAP-WSC failed (e.g. invalid PIN, etc)
and processing of M2D discovery messages. The information in M2Ds might
be useful to external clients.
We used to open a socket for each wireless interface. This patch uses a
single socket with an attached BPF to handle all EAPoL traffic via a
single file descriptor.
When parsing the EAPoL-Key key data field we don't strip the 0xdd /
0x00 padding from the decrypted data so there may be trailing padding
after the IE sequence and valgrind will report an invalid read of the
length byte. Same thing may happen if we're sent garbage.
When we send M5 & M7, we need to generate a random IV. For testing
purposes, the IV can be provided in settings, otherwise it will be
generated randomly.
We need quite a bit of attributes of M2 for the duration of the WSC
handshake. Most importantly, we need to use the peer's public key when
processing M4 and M6. RegistrarNonce is also needed for generating any
ACK/NACK messages as needed.
Also, peer's device attributes such as Model, Manufacturer, etc might be
useful to report upon successful handshake.
AuthKey is already uploaded into auth_key_hmac. KeyWrapKey is now
uploaded into the AES-CBC(128) cipher. We currently have no use for
EMSK.
So we no longer need to keep the wsc_session_key structure around.
Encrypted Settings TLVs are structured similarly to the various WSC
messages. However, they lack a version2 extension field and use a Key
Wrap Authenticator element instead of Authenticator.
DevicePassword is the PIN, either static, dynamically generated or
entered by the user. For PushButton mode, DevicePassword is set to
'00000000'. It can also be provided via external means, such as NFC.
This patch allows DevicePassword to be externally configured into the
EAP-WSC layer. Optionally, the secret nonce values can also be
provided for testing purposes. If omitted, they will be generated using
l_getrandom.
We use the load_settings method to bootstrap the internal state of the
EAP WSC state machine. We require certain information to be provided by
the higher layers, namely:
Global Device parameters
- Manufacturer
- Model Name
- Model Number
- Serial Number
- Device Name
- Primary Device Type
- OS Version
Session specific parameters
- MAC Address
- Configuration Methods
- RF Bands
The following parameters are auto-generated for each new session, but
can be over-ridden if desired
- Private Key
- Enrollee Nonce
Expanded EAP methods should get their packets for handling starting at
the op-code field. They're not really interested in
type/vendor-id/vendor-type fields.
Instead of one global protocol_version, we now store it inside eapol_sm.
This allows us to use the same protocol version for our response as the
request from the authenticator.
For unit tests where we had protocol version mismatches, a new method is
introduced to explicitly set the protocol version to use.
CMD_DISCONNECT fails on some occasions when CMD_CONNECT is still
running. When this happens the DBus disconnect command receives an
error reply but iwd's device state is left as disconnected even though
there's a connection at the kernel level which times out a few seconds
later. If the CMD_CONNECT is cancelled I couldn't reproduce this so far.
src/network.c:network_connect()
src/network.c:network_connect_psk()
src/network.c:network_connect_psk() psk:
69ae3f8b2f84a438cf6a44275913182dd2714510ccb8cbdf8da9dc8b61718560
src/network.c:network_connect_psk() len: 32
src/network.c:network_connect_psk() ask_psk: false
src/device.c:device_enter_state() Old State: disconnected, new state:
connecting
src/scan.c:scan_notify() Scan notification 33
src/device.c:device_netdev_event() Associating
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/device.c:device_dbus_disconnect()
src/device.c:device_connect_cb() 6, result: 5
src/device.c:device_enter_state() Old State: connecting, new state:
disconnecting
src/device.c:device_disconnect_cb() 6, success: 0
src/device.c:device_enter_state() Old State: disconnecting, new state:
disconnected
src/scan.c:scan_notify() Scan notification 34
src/netdev.c:netdev_mlme_notify() MLME notification 19
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 37
src/netdev.c:netdev_authenticate_event()
src/scan.c:get_scan_callback() get_scan_callback
src/scan.c:get_scan_done() get_scan_done
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 19
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 38
src/netdev.c:netdev_associate_event()
src/netdev.c:netdev_mlme_notify() MLME notification 46
src/netdev.c:netdev_connect_event()
<delay>
src/netdev.c:netdev_mlme_notify() MLME notification 60
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 20
MLME notification is missing ifindex attribute
src/netdev.c:netdev_mlme_notify() MLME notification 20
src/netdev.c:netdev_mlme_notify() MLME notification 39
src/netdev.c:netdev_deauthenticate_event()
This is to make sure device_remove and netdev_connect_free are called
early so we don't continue setting up a connection and don't let DBus
clients power device back up after we've called netdev_set_powered.
Calling device_disassociated inside disconnect_cb was mostly pointless.
Most attributes were already cleared by device_disconnect() when
initiating the disconnection procedure.
This patch also modifies the logic for triggering the autoconnect. If
the user initiated the disconnect call, then autoconnect should not be
triggered. If the disconnect was triggered by other means, then iwd
will still enter autoconnect mode.
All of the abortion logic is invoked when device_disconnect is called.
So there's no point calling device_disassociated in this case. This
also prevents us from entering into autoconnect mode too early.
Prevents situations like this:
src/device.c:device_enter_state() Old State: connecting, new state:
connected
src/scan.c:scan_periodic_stop() Stopping periodic scan for ifindex: 3
src/device.c:device_dbus_disconnect()
src/device.c:device_connect_cb() 3
src/device.c:device_disassociated() 3
src/device.c:device_enter_state() Old State: connected, new state:
autoconnect
Also, remove the check for device->state == DEVICE_STATE_CONNECTING.
device_connect_cb should always called when the state is CONNECTING.
If this is not so, it indicates a bug inside the netdev layer.
This was introduced by commit f468fceb02.
However, after commit 2d78f51fac66b9beff03a56f12e5fb8456625f07, the
connect_cb is called from inside netdev_disconnect. This in turn causes
the dbus-reply to be sent out if needed. So by the time we get to the
code in question, connect_pending is always NULL.
Try to make the connect and disconnect operations look more like a
transaction where the callback is always called eventually, also with a
clear indication if the operation is in profress. The connected state
lasts from the start of the connection attempt until the disconnect.
1. Non-null netdev->connected or disconnect_cb indicate that the operation
is active.
2. Every entry-point in netdev.c checks if connected is still set
before executing the next step of the connection setup. CMD_CONNECT and
the subsequent commands may succeed even if CMD_DISCONNECT is called
in the middle so they can't only rely on the error value for that.
3. netdev->connect_cb and other elements of the connection state are
reset by netdev_connect_free which groups the clean-up operations to
make sure we don't miss anything. Since the callback pointers are
reset device.c doesn't need to check that it receives a spurious
event in those callbacks for example after calling netdev_disconnect.
If initial bring up returns ERFKILL proceed and the inteface can be
explicitly brought up by the client once rfkill is disabled.
Also fix the error number returned to netdev_set_powered callback to be
negative as expected by netdev_initial_up_cb.
map_wiphy made the assumption that phy names follow the "phyN" pattern
but phys created or renamed by the "iw" command can have arbitrary
names. It seems that /sys/class/rfkill/rfkill%u/name is not updated on
a phy rename, so we can't use it to subsequently read
/sys/class/ieee80211/<name>/index but both
/sys/class/rfkill/rfkill%u/../index and
/sys/class/rfkill/rfkill%u/device/index point to that file.
==3059== 7 bytes in 1 blocks are still reachable in loss record 1 of 2
==3059== at 0x4C2C970: malloc (vg_replace_malloc.c:296)
==3059== by 0x50BB319: strndup (in /lib64/libc-2.22.so)
==3059== by 0x417B4D: l_strndup (util.c:180)
==3059== by 0x417E1B: l_strsplit (util.c:311)
==3059== by 0x4057FC: netdev_init (netdev.c:1658)
==3059== by 0x402E26: nl80211_appeared (main.c:112)
==3059== by 0x41F577: get_family_callback (genl.c:1038)
==3059== by 0x41EE3F: process_unicast (genl.c:390)
==3059== by 0x41EE3F: received_data (genl.c:506)
==3059== by 0x41C6F4: io_callback (io.c:120)
==3059== by 0x41BAA9: l_main_run (main.c:381)
==3059== by 0x402B9C: main (main.c:234)
Previously device.c would remove the whole object at the path of the
Device and the WSC interfaces but now the watches are called without the
whole object appearing and disappearing.
Change the path for net.connman.iwd.Device objects to /phyX/Y and
register net.connman.iwd.Adapter at /phyX grouping devices of the same
wiphy.
Turns out no changes to the test/* scripts are needed.
The boolean property indicates if a scan is ongoing. Only the scans
triggered by device.c are reflected (not the ones from WSC) because only
those scans affect the list of networks seen by Dbus.
Add rfkill.c/rfkill.h to be used for watching per-wiphy RFkill state.
It uses both /dev/rfkill and /sys because /dev/rfkill is the recommended
way of interfacing with rfkill but at the same time it doesn't provide
the information on mapping to wiphy IDs.
Note that the autoconnect_list may still contain the network. Currently
only the top entry from the list is ever used and only on
new_scan_results(), i.e. at the same time the list is being created.
If at some point it becomes part of actual device state it needs to also
be reset when a network is being forgotten.
If Disconnect is called during an ongoing connection attempt send a
CMD_DEAUTHENTICATE command same as when we're already connected, and
send a reply to potential dbus Connect call.
When a new wiphy is added, the kernel usually adds a default STA
interface as well. This interface is currently not signaled over
nl80211 in any way.
This implements a selective dump of the wiphy interfaces in order to
obtain the newly added netdev. Selective dump is currently not
supported by the kernel, so all netdevs will be returned. A patch on
linux-wireless is pending that implements the selective dump
functionality.
==24934== 16 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24934== at 0x4C2C970: malloc (vg_replace_malloc.c:296)
==24934== by 0x41675D: l_malloc (util.c:62)
==24934== by 0x4033B3: netdev_set_linkmode_and_operstate
(netdev.c:149)
==24934== by 0x4042B9: netdev_free (netdev.c:221)
==24934== by 0x41735D: l_queue_clear (queue.c:107)
==24934== by 0x4173A8: l_queue_destroy (queue.c:82)
==24934== by 0x40543D: netdev_exit (netdev.c:1459)
==24934== by 0x402D6F: nl80211_vanished (main.c:126)
==24934== by 0x41E607: l_genl_family_unref (genl.c:1057)
==24934== by 0x402B50: main (main.c:237)
Instead of calling the device added or removed callback when the
interface is detected, call it when interface goes up or down. This
only affects the addition and removal of the WSC interface now.
During the network_info refactoring the adding of the connected BSS to
device->bss_list in case it is not in the scan results has moved to
after the l_hashmap_foreach_remove call meaning that the network could
be removed even though it is still pointed at by
device->connected networks. Reverse the order to what it was before.
Alternatively network_process network could take not of the fact the
network is connected and not call network_remove on it leaving it with
an empty bss_list.
It is probably rare that a disconnect should fail but if it happens the
device->state is not returned to CONNECTED and I'm not sure if it should
be, so the ConnectedNetwork property and other bits should probably be
reset at the start of the disconnection instead of at the end.
Also check if state is CONNECTED before calling network_disconnected
because network_connected may have not been called yet.
--interfaces (-i) tells iwd which interfaces to manage. If the option
is ommitted, all interfaces will be managed.
--nointerfaces (-I) tells iwd which interfaces to blacklist. If the
option is ommitted, no interfaces will be blacklisted.
When setting operstate to dormant or down, give it a callback for debug
purposes. It looks like that operstate down message does not have a
chance to go out currently.
knownnetworks.c/.h implements the KnownNetworks interface and loads the
known networks from storage on startup. The list of all the networks
including information on whether a network is known is managed in
network.c to avoid having two separate lists of network_info structures
and keeping them in sync. That turns out to be difficult because the
network.c list is sorted by connected_time and connected_time changes
can be triggered in both network.c or knownnetworks.c. Both can also
trigger a network_info to be removed completely.
network_info gets a is_known flag that is used for the
GetOrderedNetworks tracking and to implement the KnownNetworks
interface - loading of the list of known networks on startup and
forgetting networks.
For simplicity and future use (possibly performance), every struct network
gets a pointer to a network_info structure, there's one network_info for
every network being by any interface, not only known networks. The SSID
and security type information is removed from struct network because the
network_info holds that information.
network_info also gets a seen_count field to count how many references
from network.info fields it has, so as to fix the removal of
network_info structures. Previously, once they were added to the
networks list, they'd stay there forever possibly skewing the network
ranking results.
This also fixed the network ranking used by GetOrderedNetwork which
wasn't working due to a missing assignment of *index in
network_find_info also triggering valgrind alerts.