ERP (EAP Reauthentication Protocol) allows a station to quickly
reauthenticate using keys from a previous EAP authentication.
This change both implements ERP as well as moves the key cache into
the ERP module.
ERP in its current form is here to only support FILS. ERP is likely not
widespread and there is no easy way to determine if an AP supports ERP
without trying it. Attempting ERP with a non-ERP enabled AP will actually
result in longer connection times since ERP must fail and then full EAP
is done afterwards. For this reason ERP was separated from EAP and a
separate ERP state machine must be created. As it stands now, ERP cannot
be used on its own, only with FILS.
Quick scan uses a set of frequencies associated with the
known networks. This allows to reduce the scan latency.
At this time, the frequency selection follows a very simple
logic by taking all known frequencies from the top 5 most
recently connected networks.
If connection isn't established after the quick scan attempt,
we fall back to the full periodic scan.
Instead of handling NEW_WIPHY events and WIHPY_DUMP events in a similar
fashion, split up the paths to optimize iwd startup time. There's
fundamentally no reason to wait a second (and eat up file-descriptor
resources for timers unnecessarily) when we can simply start an
interface dump right after the wiphy dump.
In case a new wiphy is added in the middle of a wiphy dump, we will
likely get a new wiphy event anyway, in which case a setup_timeout will
be created and we will ignore this phy during the interface dump
processing.
This also optimizes the case of iwd being re-started, in which case
there are no interfaces present.
Separate out the two types of NEW_WIPHY handlers into separate paths and
factor out the common code into a utility function.
Dumps of CMD_NEW_WIPHY can be split up over several messages, while
CMD_NEW_WIPHY events (generated when a new card is plugged in) are
stuffed into a single message.
This also prepares ground for follow-on commits where we will handle the
two types of events differently.
src/netdev.c:netdev_create_from_genl() Skipping duplicate netdev wlp2s0[3]
Aborting (signal 11) [/home/denkenz/iwd/src/iwd]
++++++++ backtrace ++++++++
#0 0x7fc4c7a4e930 in /lib64/libc.so.6
#1 0x40ea13 in netdev_getlink_cb() at src/netdev.c:4654
#2 0x468cab in process_message() at ell/netlink.c:183
#3 0x4690a3 in can_read_data() at ell/netlink.c:289
#4 0x46681d in io_callback() at ell/io.c:126
#5 0x4651cd in l_main_iterate() at ell/main.c:473
#6 0x46530e in l_main_run() at ell/main.c:516
#7 0x465626 in l_main_run_with_signal() at ell/main.c:642
#8 0x403df8 in main() at src/main.c:513
#9 0x7fc4c7a39bde in /lib64/libc.so.6
Mirror netdev.c white/blacklist logic. If either or both the whitelist
and the blacklist are given also fall back to not touching the existing
interface setup on the wiphy.
If we get an error during DEL_INTERFACE or NEW_INTERFACE we may be
dealing with a driver that doesn't implement virtual interfaces or
doesn't implement deleting the default interface. In this case fall
back to using the first usable interface that we've detected on this
wiphy.
There's at least one full-mac driver that doesn't implement the cfg80211
.del_virtual_intf and .add_virtual_intf methods and at least one that
only allows P2P interfaces to be manipulated. mac80211 drivers seem to
at least implement those methods but I didn't check to see if there are
driver where they'd eventually return EOPNOTSUPP.
This is probably the trickiest part in this patchset. I'm introducing a
new logic where instead of using the interfaces that we find present
when a wiphy is detected, which would normally be the one default
interface per wiphy but could be 0 or more than one, we create one
ourselves with the socket owner attribute and use exactly one for
Station, AP and Ad-Hoc modes. When IWD starts we delete all the
interfaces on existing wiphys that we're going to use (as determined by
the wiphy white/blacklists) or freshly hotplugged ones, and only then we
register the interface we're going to use meaning that the wiphy's
limits on the number of concurrent interfaces of each type should be at
0. Otherwise we'd be unlikely to be abe to create the station interface
as most adapters only allow one. After that we ignore any interfaces
that may be created by other processes as we have no use for multiple
station interfaces.
At this point manager.c only keeps local state for wiphys during
the interface setup although when we start adding P2P code we will be
creating and removing interfaces multiple times during the wiphy's
runtime and may need to track it here or in wiphy.c. We do not
specifically check the interface number limits received during the wiphy
dump, if we need to create any interfaces and we're over the driver's
maximum for that specific iftype we'll still attempt it and report error
if it fails.
I tested this and it seems to work with my laptop's intel card and some
USB hotplug adapters.
The latest refactoring ended up assuming that FT related elements would
be handled in netdev_associate_event. However, FullMac cards (that do
not generate netdev_associate_event) could still connect using FT AKMs
and perform the Initial mobility association. In such cases the FTE
element was required but ended up not being set into the handshake.
This caused the handshake to fail during PTK 1_of_4 processing.
Fix this by making sure that FTE + related info is set into the
handshake, albeit with a lower sanity checking level since the
elements have been processed by the firmware already.
Note that it is currently impossible for actual FTs to be performed on
FullMac cards, so the extra logic and sanity checking to handle these
can be skipped.
Add functionality to read and parse the known frequencies
from permanent storage on start of the service. On service
shutdown, we sync the known frequencies back to the permanent
storage.
Each known network (previously connected) will have a set
of known frequencies associated with it, e.g. a set of
frequencies from all BSSs observed. The list of known
frequencies is sorted with the most recently observed
frequency in the head.
Previously, the scan results were disregarded once the new
ones were available. To enable the scan scenarios where the
new scan results are delivered in parts, we introduce a
concept of aging BSSs and will remove them based on
retention time.
Add manager.c, a new file where the wiphy and interface creation/removal
will be handled and interface use policies will be implemented. Since
not all kernel-side nl80211 interfaces are tied to kernel-side netdevs,
netdev.c can't manage all of the interfaces that we will be using, so
the logic is being moved to a common place where all interfaces on a
wiphy will be managed according to the policy, device support for things
like P2P and user enabling/disabling/connecting with P2P which require
interfaces to be dynamically added and removed.
Add wiphy_create, wiphy_update_from_genl and wiphy_destroy that together
will let a new file command the wiphy creation, updates and deletion
with the same functionality the current config notification handler
implements in wiphy.c.
As mentioned in code comments the name is NUL-terminated so there's no
need to return the length path, which was ignored in some occasions
anyway. Consistently treat it as NUL-terminated but also validate.
Make netdev_create_from_genl public and change signature to return the
created netdev or NULL. Also add netdev_destroy that destroys and
unregisters the created netdevs. Both will be used to move the
whole interface management to a new file.
The handshake_state only holds a single AKM value. FILS depends on the AP
supporting EAP as well as FILS. The first time IWD connects, it will do a
full EAP auth. Subsequent connections (assuming FILS is supported) will use
FILS. But if the AP does not support FILS there is no reason to cache the
ERP keys.
This adds the supp_fils to the handshake_state. Now, station.c can set this
flag while building the handshake. This flag can later be checked when
caching the ERP keys.
This allows IWD to cache ERP keys after a full EAP run. Caching
allows IWD to quickly connect to the network later on using ERP or
FILS.
The cache will contain the EAP Identity, Session ID, EMSK, SSID and
optionally the ERP domain. For the time being, the cache entry
lifetimes are hard coded to 24 hours. Eventually the cache should
be written to disk to allow ERP/FILS to work after a reboot or
IWD restart.
mschaputil already had similar functionality, but ERP will need this
as well. These two functions will also handle identities with either
'@' or '\' to separate the user and domain.
Many operations performed during an error in load_settings were the same
as the ones performed when freeing the eap object. Add eap_free_common
to unify these.
EAP identites are recommended to follow RFC 4282 (The Network Access
Identifier). This RFC recommends a maximum NAI length of 253 octets.
It also mentions that RADIUS is only able to support NAIs of 253
octets.
Because of this, IWD should not allow EAP identities larger than 253
bytes. This change adds a check in eap_load_settings to verify the
identity does not exceed this limit.
The associate event is only important for OWE and FT. If neither of
these conditions (or FT initial association) are happening we do
not need to continue further processing the associate event.
802.11 mandates that IEs inside management frames are presented in a
given order. However, in the real world, many APs seem to ignore the
rules and send their IEs in seemingly arbitrary order, especially when
it comes to VENDOR tags. Change this function to no longer be strict in
enforcing the order.
Also, drop checking of rules specific to Probe Responses. These will
have to be handled separately (most likely by the AP module) since
802.11-2016, Section 11.1.4.3.5 essentially allows just about anything.
In netdev_associate_event the ignore_connect_event was getting set true,
but afterwards there were still potential failure paths. Now, once in
assoc_failed we explicitly set ignore_connect_event to false so the
the failure can be handled properly inside netdev_connect_event
The list of PSK/8021x AKM's in security_determine was getting long,
and difficult to keep under 80 characters. This moves them all into
two new macros, AKM_IS_PSK/AKM_IS_8021X.
It was assumed that the hunt-and-peck loop was guarenteed to find
a PWE. This was incorrect in terms of kernel support. If a system
does not have support for AF_ALG or runs out of file descriptors
the KDFs may fail. The loop continued to run if found == false,
which is also incorrect because we want to stop after 20 iterations
regarless of success.
This changes the loop to a for loop so it will always exit after
the set number of iterations.
CC src/scan.o
src/scan.c: In function ‘scan_bss_compute_rank’:
src/scan.c:1048:4: warning: this decimal constant is unsigned only in ISO C90
factor = factor * data_rate / 2340000000 +
The auto-connect state will now consist of the two phases:
STATION_STATE_AUTOCONNECT_QUICK and STATION_STATE_AUTOCONNECT_FULL.
The auto-connect will always start with STATION_STATE_AUTOCONNECT_QUICK
and then transition into STATION_STATE_AUTOCONNECT_FULL if no
connection has been established. During STATION_STATE_AUTOCONNECT_QUICK
phase we take advantage of the wireless scans with the limited number
of channels on which the known networks have been observed before.
This approach allows to shorten the time required for the network
sweeps, therefore decreases the connection latency if the connection
is possible. Thereafter, if no connection has been established after
the first phase we transition into STATION_STATE_AUTOCONNECT_FULL and
do the periodic scan just like we did before the split in
STATION_STATE_AUTOCONNECT state.
For simplicity 160Mhz and 80+80Mhz were grouped together when
parsing the VHT capabilities, but the 80+80 bits were left in
vht_widht_map. This could cause an overflow when getting the
width map.
wiphy_select_akm will now check if BIP is supported, and if MFPR is
set in the scan_bss before returning either SAE AKMs. This will allow
fallback to another PSK AKM (e.g. hybrid APs) if any of the requirements
are not met.
Replace existing uses of memset to clear secrets with explicit_bzero to
make sure it doesn't get optimized away. This has some side effects as
documented in gcc docs but is still recommended.
In eap_secret_info_free make sure we clear both strings in the case of
EAP_SECRET_REMOTE_USER_PASSWORD secrets.
Environments with several AP's, all at low signal strength may
want to lower the roaming RSSI threshold to prevent IWD from
roaming excessively. This adds an option 'roam_rssi_threshold',
which is still defaulted to -70.
Also printing keys with l_debug conditional on an environment variable
as someone wanting debug logs, or leaving debug on accidentally, does
not necessarily want the keys in the logs and in memory.
At some point the connect command builder was modified, and the
control port over NL80211 check was moved to inside if (is_rsn).
For WPS, no supplicant_ie was set, so CONTROL_PORT_OVER_NL80211
was never set into CMD_CONNECT. This caused IWD to expect WPS
frames over netlink, but the kernel was sending them over the
legacy route.
This commit hardens the iwd.service.in template file for systemd
services. The following is a short explanation for each added directive:
+PrivateTmp=true
If true, sets up a new file system namespace for the executed processes
and mounts private /tmp and /var/tmp directories inside it that is not
shared by processes outside of the namespace.
+NoNewPrivileges=true
If true, ensures that the service process and all its children can never
gain new privileges through execve() (e.g. via setuid or setgid bits, or
filesystem capabilities).
+PrivateDevices=true
If true, sets up a new /dev mount for the executed processes and only
adds API pseudo devices such as /dev/null, /dev/zero or /dev/random (as
well as the pseudo TTY subsystem) to it, but no physical devices such as
/dev/sda, system memory /dev/mem, system ports /dev/port and others.
+ProtectHome=yes
If true, the directories /home, /root and /run/user are made
inaccessible and empty for processes invoked by this unit.
+ProtectSystem=strict
If set to "strict" the entire file system hierarchy is mounted
read-only, except for the API file system subtrees /dev, /proc and /sys
(protect these directories using PrivateDevices=,
ProtectKernelTunables=, ProtectControlGroups=).
+ReadWritePaths=/var/lib/iwd/
Sets up a new file system namespace for executed processes. These
options may be used to limit access a process might have to the file
system hierarchy. Each setting takes a space-separated list of paths
relative to the host's root directory (i.e. the system running the
service manager). Note that if paths contain symlinks, they are resolved
relative to the root directory set with RootDirectory=/RootImage=.
Paths listed in ReadWritePaths= are accessible from within
the namespace with the same access modes as from outside of
it.
+ProtectControlGroups=yes
If true, the Linux Control Groups (cgroups(7)) hierarchies accessible
through /sys/fs/cgroup will be made read-only to all processes of the
unit.
+ProtectKernelModules=yes
If true, explicit module loading will be denied. This allows module
load and unload operations to be turned off on modular kernels.
For further explanation to all directives see `man systemd.directives`
Hostapd has now been updated to include the group number when rejecting
the connection with UNSUPP_FINITE_CYCLIC_GROUP. We still need the existing
len == 0 check because old hostapd versions will still behave this way.
The single-use password is apparently sent in plaintext over the network
but at least try to prevent it from staying in the memory until we know
it's been used.
station.c generates the IEs we will need to use for the
Authenticate/Associate and EAPoL frames and sets them into the
handshake_state object. However the driver may modify some of them
during CMD_CONNECT and we need to use those update values so the AP
isn't confused about differing IEs in diffent frames from us.
Specifically the "wl" driver seems to do this at least for the RSN IE.
The KDF function processes data in 32 byte chunks so for groups which
primes are not divisible by 32 bytes, you will get a buffer overflow
when copying the last chunk of data.
Now l_checksum_get_digest is limited to the bytes remaining in the
buffer, or 32, whichever is the smallest.
Since eapol_encrypt_key_data already calculates the key data length and
encodes it into the key frame, we can just return this length and avoid
having to obtain it again from the frame.
Similar to SAE, EAP-PWD derives an ECC point (PWE). It is possible
for information to be gathered from the timing of this derivation,
which could be used to to recover the password.
This change adapts EAP-PWD to use the same mitigation technique as
SAE where we continue to derive ECC points even after we have found
a valid point. This derivation loop continues for a set number of
iterations (20 in this case), so anyone timing it will always see
the same timings for every run of the protocol.
This is not used by any of the scan notify callback implementations and
for P2P we're going to need to scan on an interface without an ifindex
so without this the other changes should be mostly contained in scan.
Also add a mask parameter to wiphy_get_supported_iftypes to make sure
the SupportedModes property only contains the values that can be used
as Device.Mode.
dbus_iftype_to_string returns NULL for unknown iftypes, the strdup will
also return NULL and ret[i] will be assigned a NULL. As a result
the l_strjoinv will not print the known iftypes that might have come
after that and will the l_strfreev will leak the strduped strings.
sc->state would get set when the TRIGGERED event arrived or when the
triggered callback for our own SCAN_TRIGGER command is received.
However it would not get reset to NOT_RUNNING when the NEW_SCAN_RESULTS
event is received, instead we'd first request the results with GET_SCAN
and only reset sc->state when that returns. If during that command a
new scan gets triggered, the GET_SCAN callback would still reset
sc->state and clobber the value set by the new scan.
To fix that repurpose sc->state to only track that period from the
TRIGGERED signal to the NEW_SCAN_RESULTS signal. sc->triggered can be
used to check if we're still waiting for the GET_SCAN command and
sc->start_cmd_id to check if we're waiting for the scan to get
triggered, so one of these three variables will now always indicate if
a scan is in progress.
We can crash if we abort the connection, but the connect command has
already gone through. In this case we will get a sequence of
authenticate_event, associate_event, connect_event. The first and last
events don't crash since they check whether netdev->connected is true.
However, this causes an annoying warning to be printed.
Fix this by introducing an 'aborting' flag and ignore all connection
related events if it is set.
++++++++ backtrace ++++++++
Now that the OWE failure/retry is handled in netdev, we can catch
all associate error status' inside owe_rx_associate rather than only
catching UNSUPP_FINITE_CYCLIC_GROUP.
Apart from OWE, the association event was disregarded and all association
processing was done in netdev_connect_event. This led to
netdev_connect_event having to handle all the logic of both success and
failure, as well as parsing the association for FT and OWE. Also, without
checking the status code in the associate frame there is the potential
for the kernel to think we are connected even if association failed
(e.g. rogue AP).
This change introduces two flags into netdev, expect_connect_failure and
ignore_connect_event. All the FT processing that was once in
netdev_connect_event has now been moved into netdev_associate_event, as
well as non-FT associate frame processing. The connect event now only
handles failure cases for soft/half MAC cards.
Note: Since fullmac cards rely on the connect event, the eapol_start
and netdev_connect_ok were left in netdev_connect_event. Since neither
auth/assoc events come in on fullmac we shouldn't have any conflict with
the new flags.
Once a connection has completed association, EAPoL is started from
netdev_associate_event (if required) and the ignore_connect_event flag can
be set. This will bypass the connect event.
If a connection has failed during association for whatever reason, we can
set expect_connect_failure, the netdev reason, and the MPDU status code.
This allows netdev_connect_event to both handle the error, and, if required,
send a deauth telling the kernel that we have failed (protecting against the
rogue AP situation).
OWE processing can be completely taken care of inside
netdev_authenticate_event and netdev_associate_event. This removes
the need for OWE specific checks inside netdev_connect_event. We can
now return early out of the connect event if OWE is in progress.
Several Auth/Assoc failure status codes indicate that the connection
failed for reasons such as bandwidth issues, poor channel conditions
etc. These conditions should not result in the BSS being blacklisted
since its likely only a temporary issue and the AP is not actually
"broken" per-se.
This adds support in station.c to temporarily blacklist these BSS's
on a per-network basis. After the connection has completed we clear
out these blacklist entries.
Certain error conditions require that a BSS be blacklisted only for
the duration of the current connection. The existing blacklist
does not allow for this, and since this blacklist is shared between
all interfaces it doesnt make sense to use it for this purpose.
Instead, each network object can contain its own blacklist of
scan_bss elements. New elements can be added with network_blacklist_add.
The blacklist is cleared when the connection completes, either
successfully or not.
Now inside network_bss_select both the per-network blacklist as well as
the global blacklist will be checked before returning a BSS.
Several netdev events benefit from including event data in the callback.
This is similar to how the connect callback works as well. The content
of the event data is documented in netdev.h (netdev_event_func_t).
By including event data for the two disconnect events, we can pass the
reason code to better handle the failure in station.c. Now, inside
station_disconnect_event, we still check if there is a pending connection,
and if so we can call the connect callback directly with HANDSHAKE_FAILED.
Doing it this way unifies the code path into a single switch statment to
handle all failures.
In addition, we pass the RSSI level index as event data to
RSSI_LEVEL_NOTIFY. This removes the need for a getter to be exposed in
netdev.h.
On successful send, scan_send_start(..) used to set msg to NULL,
therefore the further management of the command by the caller was
impossible. This patch removes wrapper around l_genl_family_send()
and lets the callers to take responsibility for the command.
This change cleans up the mess of status vs reason codes. The two
types of codes have already been separated into different enumerations,
but netdev was still treating them the same (with last_status_code).
A new 'event_data' argument was added to the connect callback, which
has a different meaning depending on the result of the connection
(described inside netdev.h, netdev_connect_cb_t). This allows for the
removal of netdev_get_last_status_code since the status or reason
code is now passed via event_data.
Inside the netdev object last_status_code was renamed to last_code, for
the purpose of storing either status or reason. This is only used when
a disconnect needs to be emitted before failing the connection. In all
other cases we just pass the code directly into the connect_cb and do
not store it.
All ocurrences of netdev_connect_failed were updated to use the proper
code depending on the netdev result. Most of these simply changed from
REASON_CODE_UNSPECIFIED to STATUS_CODE_UNSPECIFIED. This was simply for
consistency (both codes have the same value).
netdev_[authenticate|associate]_event's were updated to parse the
status code and, if present, use that if their was a failure rather
than defaulting to UNSPECIFIED.
Even though .check_settings in our EAP method implementations does the
settings validation, .load_settings also has minimum sanity checks to
rule out segfaults if the settings have changed since the last
.check_settings call.
If OWE fails in association there is no reason to send a disconnect
since its already known that we failed. Instead we can directly
call netdev_connect_failed
Instead of sending a reason_code to netdev_setting_keys_failed, make it
take an errno (negative) instead. Since key setting failures are
entirely a system / software issue, and not a protocol issue, it makes
no sense to use a protocol error code.
Some users may need their own control over 2.4/5GHz preference. This
adds a new user option, 'rank_5g_factor', which allows users to increase
or decrease their 5G preference.
This adds support for parsing the VHT IE, which allows a BSS supporting
VHT (80211ac) to be ranked higher than a BSS supporting only HT/basic
rates. Now, with basic/HT/VHT parsing we can calculate the theoretical
maximum data rate for all three and rank the BSS based on that.
This adds HT IE parsing and data rate calculation for HT (80211n)
rates. Now, a BSS supporting HT rates will be ranked higher than
a basic rate BSS, assuming the RSSI is at an acceptable level.
The spec dictates RSSI thresholds for different modulation schemes, which
correlate to different data rates. Until now were were ranking a BSS with
only looking at its advertised data rate, which may not even be possible
if the RSSI does not meet the threshold.
Now, RSSI is taken into consideration and the data rate returned from
parsing (Ext) Supported Rates IE(s) will reflect that.
All over the place we do "ie[1] + 2" for getting the IE length. It
is much clearer to use a macro to do this. The macro also checks
for NULL, and returns zero in this case.
Supported rates will soon be parsed along with HT/VHT capabilities
to determine the best data rate. This will remove the need for the
supported_rates uintset element in scan_bss, as well as the single
API to only parse the supported rates IE. AP still does rely on
this though (since it only supports basic rates), so the parsing
function was moved into ap.c.
In the methods' check_settings do a more complete early check for
possible certificate / private key misconfiguration, including check
that the certificate and the private key are always present or absent
together and that they actually match each other. Do this by encrypting
and decrypting a small buffer because we have no better API for that.
A method's .check_settings method checks for inconsistent setting files
and prints readable errors so there's no need to do that again in
.load_settings, although at some point after removing the duplicate
error messages from the load_settings methods we agreed to keep minimum
checks that could cause a crash e.g. in a corner case like when the
setting file got modified between the check_settings and the
load_settings call. Some error messages have been re-added to
load_settings after that (e.g. in
bb4e1ebd4f) but they're incomplete and not
useful so remove them.
Previously, the storage dir has only been created after a successful
network connection, causing removal of Known Network interface from
Dbus and failure to register dir watcher until daemon is restarted.
A length check was still assuming the 256 bit ECC group. This
was updated to scale with the group. The commit buffer was also
not properly sized. This was changed to allow for the largest
ECC group supported.
SAE was hardcoded to work only with group 19. This change fixes up the
hard coded lengths to allow it to work with group 20 since ELL supports
it. There was also good amount of logic added to support negotiating
groups. Before, since we only supported group 19, we would just reject
the connection to an AP unless it only supported group 19.
This did lead to a discovery of a potential bug in hostapd, which was
worked around in SAE in order to properly support group negotiation.
If an AP receives a commit request with a group it does not support it
should reject the authentication with code 77. According to the spec
it should also include the group number which it is rejecting. This is
not the case with hostapd. To fix this we needed to special case a
length check where we would otherwise fail the connection.
Most of this work was already done after moving ECC into ELL, but
there were still a few places where the 256-bit group was assumed.
This allows the 384-bit group to be used, and theoretically any
other group added to ELL in the future.
If we have a BSS list where all BSS's have been blacklisted we still
need a way to force a connection to that network, instead of having
to wait for the blacklist entry to expire. network_bss_select now
takes a boolean 'fallback_to_blacklist' which causes the selection
to still return a connectable BSS even if the entire list was
blacklisted.
In most cases this is set to true, as these cases are initiated by
DBus calls. The only case where this is not true is inside
station_try_next_bss, where we do want to honor the blacklist.
This both prevents an explicit connect call (where all BSS's are
blacklisted) from trying all the blacklisted BSS's, as well as the
autoconnect case where we simply should not try to connect if all
the BSS's are blacklisted.
There are is some implied behavior here that may not be obvious:
On an explicit DBus connect call IWD will attempt to connect to
any non-blacklisted BSS found under the network. If unsuccessful,
the current BSS will be blacklisted and IWD will try the next
in the list. This will repeat until all BSS's are blacklisted,
and in this case the connect call will fail.
If a connect is tried again when all BSS's are blacklisted IWD
will attempt to connect to the first connectable blacklisted
BSS, and if this fails the connect call will fail. No more
connection attempts will happen until the next DBus call.
If IWD fails to connect to a BSS we can attempt to connect to a different
BSS under the same network and blacklist the first BSS. In the case of an
incorrect PSK (MMPDU code 2 or 23) we will still fail the connection.
station_connect_cb was refactored to better handle the dbus case. Now the
netdev result switch statement is handled before deciding whether to send
a dbus reply. This allows for both cases where we are trying to connect
to the next BSS in autoconnect, as well as in the dbus case.
This makes __station_connect_network even less intelligent by JUST
making it connect to a network, without any state changes. This makes
the rekey logic much cleaner.
We were also changing dbus properties when setting the state to
CONNECTING, so those dbus property change calls were moved into
station_enter_state.
A new driver extended feature bit was added signifying if the driver
supports PTK replacement/rekeying. During a connect, netdev checks
for the driver feature and sets the handshakes 'no_rekey' flag
accordingly.
At some point the AP will decide to rekey which is handled inside
eapol. If no_rekey is unset we rekey as normal and the connection
remains open. If we have set no_rekey eapol will emit
HANDSHAKE_EVENT_REKEY_FAILED, which is now caught inside station. If
this happens our only choice is to fully disconnect and reconnect.
If we receive handshake message 1/4 after we are already connected
the AP is attempting to rekey. This may not be allowed and if not
we do not process the rekey and emit HANDSHAKE_EVENT_REKEY_FAILED
so any listeners can handle accordingly.
The AP structure was getting cleaned up twice. When the DBus stop method came
in we do AP_STOP on nl80211. In this callback the AP was getting freed in
ap_reset. Also when the DBus interface was cleaned up it triggered ap_reset.
Since ap->started gets set to false in ap_reset, we now check this and bail
out if the AP is already stopped.
Fixes:
++++++++ backtrace ++++++++
0 0x7f099c11ef20 in /lib/x86_64-linux-gnu/libc.so.6
1 0x43fed0 in l_queue_foreach() at ell/queue.c:441 (discriminator 3)
2 0x423a6c in ap_reset() at src/ap.c:140
3 0x423b69 in ap_free() at src/ap.c:162
4 0x44ee86 in interface_instance_free() at ell/dbus-service.c:513
5 0x451730 in _dbus_object_tree_remove_interface() at ell/dbus-service.c:1650
6 0x405c07 in netdev_newlink_notify() at src/netdev.c:4449 (discriminator 9)
7 0x440775 in l_hashmap_foreach() at ell/hashmap.c:534
8 0x4455d3 in process_broadcast() at ell/netlink.c:158
9 0x4439b3 in io_callback() at ell/io.c:126
10 0x442c4e in l_main_iterate() at ell/main.c:473
11 0x442d1c in l_main_run() at ell/main.c:516
12 0x442f2b in l_main_run_with_signal() at ell/main.c:644
13 0x403ab3 in main() at src/main.c:504
14 0x7f099c101b97 in /lib/x86_64-linux-gnu/libc.so.6
+++++++++++++++++++++++++++
This will allow for blacklisting a BSS if the connection fails. The
actual blacklist module is simple and must be driven by station. All
it does is add BSS addresses, a timestamp, and a timeout to a queue.
Entries can also be removed, or checked if they exist. The blacklist
timeout is configuratble in main.conf, as well as the blacklist
timeout multiplier and maximum timeout. The multiplier is used after
a blacklisted BSS timeout expires but we still fail to connect on the
next connection attempt. We multiply the current timeout by the
multiplier so the BSS remains in the blacklist for a larger growing
amount of time until it reaches the maximum (24 hours by default).
Soon BSS blacklisting will be added, and in order to properly decide if
a BSS should be blacklisted we need the status code on a failed
connection. This change stores the status code when there is a failure
in netdev and adds a getter to retrieve later. In many cases we have
the actual status code from the AP, but in some corner cases its not
obtainable (e.g. an error sending an NL80211 command) in which case we
just default to MMPDU_REASON_CODE_UNSPECIFIED.
Rather than continue with the pattern of setting netdev->result and
now netdev->last_status_code, the netdev_connect_failed function was
redefined so its no longer used as both a NL80211 callback and called
directly. Instead a new function was added, netdev_disconnect_cb which
just calls netdev_connect_failed. netdev_disconnect_cb should not be
used for all the NL80211 disconnect commands. Now netdev_connect_failed
takes both a result and status code which it sets in the netdev object.
In the case where we were using netdev_connect_failed as a callback we
still need to set the result and last_status_code but at least this is
better than having to set those in all cases.
Remove an unneeded buffer and its memcpy, remove the now unneeded use of
l_checksum_digest_length and use l_checksum_reset instead of creating a
new l_checksum for each chunk.
ELL ECC supports group 20 (P384) so OWE can also support it. This also
adds group negotiation, where OWE can choose a different group than the
default if the AP requests it.
A check needed to be added in netdev in order for the negotiation to work.
The RFC says that if a group is not supported association should be rejected
with code 77 (unsupported finite cyclic group) and association should be
started again. This rejection was causing a connect event to be emitted by
the kernel (in addition to an associate event) which would result in netdev
terminating the connection, which we didn't want. Since OWE receives the
rejected associate event it can intelligently decide whether it really wants
to terminate (out of supported groups) or try the next available group.
This also utilizes the new MIC/KEK/KCK length changes, since OWE dictates
the lengths of those keys.
Rather than hard coding to SHA256, we can pass in l_checksum_type
and use that SHA. This will allow for OWE/SAE/PWD to support more
curves that use different SHA algorithms for hashing.
OWE defines KEK/KCK lengths depending on group. This change adds a
case into handshake_get_key_sizes. With OWE we can determine the
key lengths based on the PMK length in the handshake.
In preparation for OWE supporting multiple groups eapol needed some
additional cases to handle the OWE AKM since OWE dictates the KEK,
KCK and MIC key lengths (depending on group).
Right now the PMK is hard coded to 32 bytes, which works for the vast
majority of cases. The only outlier is OWE which can generate a PMK
of 32, 48 or 64 bytes depending on the ECC group used. The PMK length
is already stored in the handshake, so now we can just pass that to
crypto_derive_pairwise_ptk
The crypto_ptk was hard coded for 16 byte KCK/KEK. Depending on the
AKM these can be up to 32 bytes. This changes completely removes the
crypto_ptk struct and adds getters to the handshake object for the
kck and kek. Like before the PTK is derived into a continuous buffer,
and the kck/kek getters take care of returning the proper key offset
depending on AKM.
To allow for larger than 16 byte keys aes_unwrap needed to be
modified to take the kek length.
The MIC length was hard coded to 16 bytes everywhere, and since several
AKMs require larger MIC's (24/32) this needed to change. The main issue
was that the MIC was hard coded to 16 bytes inside eapol_key. Instead
of doing this, the MIC, key_data_length, and key_data elements were all
bundled into key_data[0]. In order to retrieve the MIC, key_data_len,
or key_data several macros were introduced which account for the MIC
length provided.
A consequence of this is that all the verify functions inside eapol now
require the MIC length as a parameter because without it they cannot
determine the byte offset of key_data or key_data_length.
The MIC length for a given handshake is set inside the SM when starting
EAPoL. This length is determined by the AKM for the handshake.
Non-802.11 AKMs can define their own key lengths. Currently only OWE does
this, and the MIC/KEK/KCK lengths will be determined by the PMK length so
we need to save it.
Make sure we don't pass NULLs to memcmp or l_memdup when the prefix
buffer is NULL. There's no point having callers pass dummy buffers if
they need to watch frames independent of the frame data.
Start using l_key_generate_dh_private and l_key_validate_dh_payload to
check for the disallowed corner case values in the DH private/public
values generated/received.
Some of the EAP methods don't require a clear-text identity to
be sent with the Identity Response packet. The mandatory identity
filed has resulted in unnecessary transmission of the garbage
values. This patch makes the Identity field to be optional and
shift responsibility to ensure its existence to the individual
methods if the field is required. All necessary identity checks
have been previously propagated to individual methods.
If a network is being forgotten, then make sure to reset connected_time.
Otherwise the rank logic thinks that the network is known which can
result in network_find_rank_index returning -1.
Found by sanitizer:
src/network.c:1329:23: runtime error: index -1 out of bounds for type
'double [64]'
==25412==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000000421ab0 at pc 0x000000402faf bp 0x7fffffffdb00 sp 0x7fffffffdaf0
READ of size 4 at 0x000000421ab0 thread T0
#0 0x402fae in validate_mgmt_ies src/mpdu.c:128
#1 0x403ce8 in validate_probe_request_mmpdu src/mpdu.c:370
#2 0x404ef2 in validate_mgmt_mpdu src/mpdu.c:662
#3 0x405166 in mpdu_validate src/mpdu.c:706
#4 0x402529 in ie_order_test unit/test-mpdu.c:156
#5 0x418f49 in l_test_run ell/test.c:83
#6 0x402715 in main unit/test-mpdu.c:171
#7 0x7ffff5d43ed9 in __libc_start_main (/lib64/libc.so.6+0x20ed9)
#8 0x4019a9 in _start (/home/denkenz/iwd-master/unit/test-mpdu+0x4019a9)
This fixes the valgrind warning:
==14804== Conditional jump or move depends on uninitialised value(s)
==14804== at 0x402E56: sae_is_quadradic_residue (sae.c:218)
==14804== by 0x402E56: sae_compute_pwe (sae.c:272)
==14804== by 0x402E56: sae_build_commit (sae.c:333)
==14804== by 0x402E56: sae_send_commit (sae.c:591)
==14804== by 0x401CC3: test_confirm_after_accept (test-sae.c:454)
==14804== by 0x408A28: l_test_run (test.c:83)
==14804== by 0x401427: main (test-sae.c:566)