The initial ANQP parser design did not work well with how the hotspot
implementation was turning out. For one, much care was taken into parsing
the EAP credentials which are not really required. The assumption is
that any hotspot network will already be provisioned, so checking that
the EAP parameters match is a bit overkill. Instead only the NAI Realms
will be checked. This greatly simplifies the NAI realm parser, as now it
can just return a string list of realms instead of the full EAP
credential info.
This module will be in charge of managing Hotspot provisioning files
stored under the .hotspot/ directory. This includes a dir watch to
handle file changes/removal as well as an API to match a network
object to a hotspot provisioning file.
Hotspot networks are supposed to include an HESSID in the scan
results. This is more or less an identifier for the overall
network. In addition, the NAI Realms can be obtained via ANQP
and should be the same for each BSS. Since both HESSID and NAI
realms should be the same for a given network in range we can
store these values in the network object itself. This also allows
us to easily find hotspot configuration files by looking at
the HESSID/NAI Realms directly in the network object as opposed
to individual scan_bss's.
In order to do ANQP efficiently IWD needs the ability to suspend scanning
temporarily. This is because both scanning and ANQP go offchannel and must
remain off channel for some amount of time. This cannot be done
simultaneously and if e.g. ANQP is requested after a scan is already
pending, the kernel will wait till that scan finishes before sending out
the frame.
Use memset instead. explicit_bzero should only be used when we're
wiping a secret just prior to the encopassing storage being freed. The
compiler would usually optimize away the memset, leaving the secrets
around.
In rtnlutil we're simply zeroing the structure prior to filling it, so
the use of explicit_bzero is not needed and brings confusion to the
reader since no secrets are being wiped.
netconfig is interested in three station states: connected,
disconnected and connected after it has roamed. On connected
it tries to obtain a new DHCP lease, on disconnected it stops
the DHCP client and discards all addresses from interface, on
connected after roaming it will try to request a previously
issued address.
iwd keeps track of the addresses assigned to the managed
interfaces. The list of assigned IPv4/IPv6 addresses is stored
in ifaddr_list inside of netconfig. The tracking of the IP
addresses will help to remove them from an interface once they
are no longer valid.
netconfig module will be responsible for the orchestration
of the network configuration with the IP addresses.
iwd creates one netconfig structure per interface index.
The purpose of this struct is to hold all of the interface
related addressing states such as: assigned dhcp
clients, known addresses, routes, etc.
A not-yet-merged kernel patch will enable the FRAME_WAIT_CANCEL
event to be emitted when a CMD_FRAME duration expires. This can
shortcut the ridiculously long timeout that is required making
GAS requests with no response drastically quicker to handle.
This adds a new API netdev_anqp_request which will send out a GAS
request, parses the GAS portion of the response and forwards the
ANQP response to the callers callback.
This IE tells us what Advertisement Protocols the AP supports. This
is only here to look for ANQP support, so all this does is iterate
through all other Advertisement Protocol tuples looking for ANQP.
If found, anqp_capable is set in the scan_bss
Currently these are geared to support the WiFi Alliance Hotspot 2.0
ANQP elements, which all fall under the vendor specific ANQP element.
anqp_iter_next behaves similar to the genl parsers, where the id, length
and data will be returned as out parameters. Currently there is only
vendor support for Hotspot 2.0. anqp_iter_is_hs20 can be used to setup
the subtype, length, and data pointer to parse any Hotspot 2.0 ANQP
elements. From here the subtype can be checked and a vendor specific
parser for that subtype can be used to parse the data, e.g.
hs20_parse_osu_provider_nai.
The vendor specific IE was being parsed only to check if the AP supported
WPA, which used a Microsoft OUI. Hotspot/OSEN uses neither WPA or RSN
(although its nearly identical to RSN) so the we also need to check for
this Wifi-Alliance OUI and set bss->osen (new) if found.
The OSEN AKM uses the vendor specific IE, so when finding the RSNE
element we need to handle it specially to ensure that its both
a vendor specific element and it matches the WFA OUI since other
vendor specific elements may be included.
The OSEN AKM is nearly identical to the RSN IE, but differs slightly.
For one, OSEN is encapsulated into the vendor specific IE, and includes
the WFA OUI before the 'normal' RSN elements. OSEN also does not include
a WPA version, since its not technically WPA/WPA2.
Some of the RSN parsing was made common so both RSN/OSEN parsing could
use it.
The handshake object had 4 setters for authenticator/supplicant IE.
Since the IE ultimately gets put into the same buffer, there really
only needs to be a single setter for authenticator/supplicant. The
handshake object can deal with parsing to decide what kind of IE it
is (WPA or RSN).
The Hotspot 2.0 spec introduces 'Anonymous EAP-TLS' as a new EAP method
to be used with OSEN/Hotspot. The protocol details of this aren't
relevant to this patch, but one major difference is that it uses the
expanded EAP type rather than the TLS type. Since the common TLS code
was written with only EAP_TYPE_TLS in mind the vendor ID/type cause the
EAP packet to be malformed when using the expanded EAP type.
To handle this the common TLS code now checks the EAP type, and if its
expanded we shift the payload 7 bytes further to account for the extra
header data.
802.11 defines GAS (generic advertisement service) which can be used
to query supported advertisement protocols from an AP before
authentication/association. Hotspot/OSEN only care about the ANQP
protocol, but the way the IE is structured potentially requires
iterating through several tuples before you reach the ANQP protocol
identifier. Because of this we define all protocol identifiers.
This adds some checks for the FT_OVER_FILS AKMs in station and netdev
allowing the FILS-FT AKMs to be selected during a connection.
Inside netdev_connect_event we actually have to skip parsing the IEs
because FILS itself takes care of this (needs to handle them specially)
FILS unfortunately is a special case when it comes to fast transition.
We have to process the FT IEs internally since we cannot trigger the
same initial mobility association code path (via netdev).
FT over FILS-SHA384 uses a 24 byte FT MIC rather than the 16 byte MIC
used for all other AKMs. This change allows both the FT builder/parser
to handle both lengths of MIC. The mic length is now passed directly
into ie_parse_fast_bss_transition and ie_build_fast_bss_transition
FILS-FT is a special case with respect to the PTK keys. The KCK getter
was updated to handle both FT-FILS AKMs, by returning the offset in
the PTK to the special KCK generated during FILS. A getter for the KCK
length was added, which handles the SHA384 variant. The PTK size was
also updated since FILS-FT can generate an additional 56 bytes of PTK
ifaddr is not guaranteed to be initialized, I'm not sure why there was
no compiler warning. Also replace a | with a || for boolean conditions
and merge the wiphy check with that line.
When handling a scan finished event for a scan we haven't started check
that we were not halfway through a scan request that would have its
results flushed by the external scan.
FT-over-DS is a way to do a Fast BSS Transition using action frames for
the authenticate step. This allows a station to start a fast transition
to a target AP while still being connected to the original AP. This,
in theory, can result in less carrier downtime.
The existing ft_sm_new was removed, and two new constructors were added;
one for over-air, and another for over-ds. The internals of ft.c mostly
remain the same. A flag to distinguish between air/ds was added along
with a new parser to parse the action frames rather than authenticate
frames. The IE parsing is identical.
Netdev now just initializes the auth-proto differently depending on if
its doing over-air or over-ds. A new TX authenticate function was added
and used for over-ds. This will send out the IEs from ft.c with an
FT Request action frame.
The FT Response action frame is then recieved from the AP and fed into
the auth-proto state machine. After this point ft-over-ds behaves the
same as ft-over-air (associate to the target AP).
Some simple code was added in station.c to determine if over-air or
over-ds should be used. FT-over-DS can be beneficial in cases where the
AP is directing us to roam, or if the RSSI falls below a threshold.
It should not be used if we have lost communication to the AP all
(beacon lost) as it only works while we can still talk to the original
AP.
To support FT-over-DS this API needed some slight modifications:
- Instead of setting the DA to netdev->handshake->aa, it is just set to
the same address as the 'to' parameter. The kernel actually requires
and checks for these addresses to match. All occurences were passing
the handshake->aa anyways so this change should have no adverse
affects; and its actually required by ft-over-ds to pass in the
previous BSSID, so hard coding handshake->aa will not work.
- The frequency is is also passed in now, as ft-over-ds needs to use
the frequency of the currently connected AP (netdev->frequency get
set to the new target in netdev_fast_transition. Previous frequency
is also saved now).
- A new vector variant (netdev_send_action_framev) was added as well
to support sending out the FT Request action frame since the FT
TX authenticate function provides an iovec of the IEs. The existing
function was already having to prepend the action frame header to
the body, so its not any more or less copying to do the same thing
with an iovec instead.
Since FT already handles processing the FT IE's (and building for
associate) it didn't make sense to have all the IE building inside
netdev_build_cmd_ft_authenticate. Instead this logic was moved into
ft.c, and an iovec is now passed from FT into
netdev_ft_tx_authenticate. This leaves the netdev command builder
unburdened by the details of FT, as well as prepares for FT-over-DS.
Blacklist some drivers known to crash when interfaces are deleted or
created so that we don't even attempt that before falling back to using
the default interface.
Read the driver name for each wiphy from sysfs if available. I didn't
find a better way to obtain the driver name for a phy than by reading
the dir name that the "driver" symlink points at. For an existing
netdev this can be done using the SIOCETHTOOL ioctl.
manager_interface_dump_done would use manager_create_interfaces() at the
end of the loop iterating over pending_wiphys. To prevent it from
crashing make sure manager_create_interfaces never frees the pending
wiphy state and instead make the caller check whether it needs to be
freed so it can be done safely inside loops.
Instead of having two separate types of scans make the periodic scan
logic a layer on top of the one-off scan requests, with minimum code to
account for the lower priority of those scans and the fact that periodic
scans also receive results from external scans. Also try to simplify
the code for both the periodic and one-off scans. In the SCAN_RESULTS
and SCAN_ABORT add more complete checks of the current request's state
so we avoid some existing crashes related to external scans.
scan_send_next_cmd and start_next_scan_request are now just one function
since their funcionality was similar and start_next_scan_request is used
everywhere. Also the state after the trigger command receives an EBUSY
is now the same as when a new scan is on top of the queue so we have
fewer situations to consider.
This code still does not account for fragmented scans where an external
scan between two or our fragments flushes the results and we lose some
of the results, or for fragmented scans that take over 30s and the
kernel expires some results (both situations are unlikely.)
In both netdev_{authenticate,associate}_event there is no need to check
for in_ft at the start since netdev->ap will always be set if in_ft is
set.
There was also no need to set eapol_sm_set_use_eapol_start, as setting
require_handshake implies this and achieves the same result when starting
the SM.
Since FT operates over Authenticate/Associate, it makes the most sense
for it to behave like the other auth-protos.
This change moves all the FT specific processing out of netdev and into
ft.c. The bulk of the changes were strait copy-pastes from netdev into
ft.c with minor API changes (e.g. remove struct netdev).
The 'in_ft' boolean unforunately is still required for a few reasons:
- netdev_disconnect_event relies on this flag so it can ignore the
disconnect which comes in when doing a fast transition. We cannot
simply check netdev->ap because this would cause the other auth-protos
to not handle a disconnect correctly.
- netdev_associate_event needs to correctly setup the eapol_sm when
in FT mode by setting require_handshake and use_eapol_start to false.
This cannot be handled inside eapol by checking the AKM because an AP
may only advertise a FT AKM, and the initial mobility association
does require the 4-way handshake.
Now the 'ft' module, previously ftutil, will be used to drive FT via
the auth-proto virtual class. This renaming is in preparation as
ftutil will become obsolete since all the IE building/processing is
going to be moved out of netdev. The new ft.c module will utilize
the existing ftutil functionality, but since this is now a full blown
auth protocol naming it 'ft' is better suited.
The duplicate/similar code in netdev_associate_event and
netdev_connect_event leads to very hard to follow code, especially
when you throw OWE/SAE/FILS or full mac cards into the mix.
Currently these protocols finish the connection inside
netdev_associate_event, and set ignore_connect_event. But for full
mac cards we must finish the connection in netdev_connect_event.
In attempt to simplify this, all connections will be completed
and/or the 4-way started in netdev_connect_event. This satisfies
both soft/full mac cards as well as simplifies the FT processing
in netdev_associate_event. Since the FT IEs can be processed in
netdev_connect_event (as they already are to support full mac)
we can assume that any FT processing inside netdev_associate_event
is for a fast transition, not initial mobility association. This
simplifies netdev_ft_process_associate by removing all the blocks
that would get hit if transition == false.
Handling FT this way also fixes FT-SAE which was broken after the
auth-proto changes since the initial mobility association was
never processed if there was an auth-proto running.
SAE was a bit trickier than OWE/FILS because the initial implementation
for SAE did not include parsing raw authenticate frames (netdev skipped
the header and passed just the authentication data). OWE/FILS did not
do this and parse the entire frame in the RX callbacks. Because of this
it was not as simple as just setting some RX callbacks. In addition,
the TX functions include some of the authentication header/data, but
not all (thanks NL80211), so this will require an overhaul to test-sae
since the unit test passes frames from one SM to another to test the
protocol end-to-end (essentially the header needs to be prepended to
any data coming from the TX functions for the end-to-end tests).
Since ERP is only used for FILS and not behaving in the 'normal' ERP
fashion (dealing with actual EAP data, timeouts etc.) we can structure
ERP as a more synchronous protocol, removing the need for a complete
callback.
Now, erp_rx_packet returns a status, so FILS can decide how to handle
any failures. The complete callback was also removed in favor of a
getter for the RMSK (erp_get_rmsk). This allows FILS to syncronously
handle ERP, and potentially fail directly in fils_rx_authenticate.
A new eapol API was added specifically for FILS (eapol_set_started). Since
either way is special cased for FILS, its a bit cleaner to just check the
AKM inside eapol_start and, if FILS, dont start any timeouts or start the
handshake (effectively what eapol_set_started was doing).
This is a new concept applying to any protocol working over authenticate
and/or associate frames (OWE/SAE/FILS). All these protocols behave
similarly enough that they can be unified into a handshake driver
structure.
Now, each protocol will initialize this auth_proto structure inside
their own internal data. The auth_proto will be returned from
the initializer which netdev can then use to manage the protocol by
forwarding authenticate/associate frames into the individual drivers.
The auth_proto consists only of function pointers:
start - starts the protocol
free - frees the driver data
rx_authenticate - receive authenticate frame
rx_associate - receive associate frame
auth_timeout - authenticate frame timed out
assoc_timeout - associate frame timed out
If the setting is true we'll not attempt to remove or create
interfaces on any wiphys and will only use the default interface
(if it exists). If false, force us managing the interfaces. Both
values override the auto logic.
An unexpected Associate event would cause iwd to crash when accessing
netdev->handshake->mde. netdev->handshake is only set if we're
attempting to connect or connected somewhere so check netdev->connected
first.
FILS needs to allocate an extra 16 bytes of key data for the AES-SIV
vector. Instead of leaving it up to the caller to figure this out (as
was done with the GTK builder) eapol_create_common can allocate the
extra space since it knows the MIC length.
This also updates _create_gtk_2_of_2 as it no longer needs to create
an extra data array.
Since FILS does not use a MIC, the 1/4 handler would always get called
for FILS PTK rekeys. We can use the fact that message 1/4 has no MIC as
well as no encrypted data to determine which packet it is. Both no MIC
and no encrypted data means its message 1/4. Anything else is 3/4.
For FILS rekeys, we still derive the PTK using the 4-way handshake.
And for FILS-SHA384 we need the SHA384 KDF variant when deriving.
This change adds both FILS-SHA256 and FILS-SHA384 to the checks
for determining the SHA variant.
crypto_derive_pairwise_ptk was taking a boolean to decide whether to
use SHA1 or SHA256, but for FILS SHA384 may also be required for
rekeys depending on the AKM.
crypto_derive_pairwise_ptk was changed to take l_checksum_type instead
of a boolean to allow for all 3 SHA types.
AP still relies on the get_data/set_length semantics. Its more convenient
to still use these since it avoids the need for extra temporary buffers
when building the rates IE.
The TLV builder APIs were not very intuative, and in some (or all)
cases required access to the builder structure directly, either to
set the TLV buffer or to get the buffer at the end.
This change adds a new API, ie_tlv_builder_set_data, which both sets
the length for the current TLV and copies the TLV data in one go.
This will avoid the need for memcpy(ie_tlv_builder_get_data(...),...)
ie_tlv_builder_finalize was also changed to return a pointer to the
start of the build buffer. This will eliminate the need to access
builder.tlv after building the TLVs.
ie_tlv_builder_init was changed to take an optional buffer to hold
the TLV data. Passing NULL/0 will build the TLV in the internal
buffer. Passing in a pointer and length will build into the passed
in buffer.
Let manager.c signal to wiphy.c when the wiphy parsing from the genl
messages is complete. When we query for existing wiphy using the
GET_WIPHY dump command we get many genl messages per wiphy, on a
notification we only get one message. So after wiphy_create there may
be one or many calls to wiphy_update_from_genl. wiphy_create_complete
is called after all of them, so wiphy.c can be sure it's done with
parsing the wiphy attributes when in prints the new wiphy summary log
message, like it did before manager.c was added.
I had wrongly assumed that all the important wiphy attributes were in
the first message in the dump, but NL80211_ATTR_EXT_FEATURES was not and
wasn't being parsed which was breaking at least testRSSIAgent.
SAE was behaving inconsitently with respect to freeing the state.
It was freeing the SM internally on failure, but requiring netdev
free it on success.
This removes the call to sae_sm_free in sae.c upon failure, and
instead netdev frees the SM in the complete callback in all cases
regardless of success or failure.
From netdev's prospective FILS works the same as OWE/SAE where we create
a fils_sm and forward all auth/assoc frames into the FILS module. The
only real difference is we do not start EAPoL once FILS completes.
FILS (Fast Initial Link Setup) allows a station to negotiate a PTK during
authentication and association. This allows for a faster connection as
opposed to doing full EAP and the 4-way. FILS uses ERP (EAP Reauth Protocol)
to achieve this, but encapsulates the ERP data into an IE inside
authenticate frames. Association is then used to verify both sides have
valid keys, as well as delivering the GTK/IGTK.
FILS will work similar to SAE/OWE/FT where netdev registers a fils_sm, and
then forwards all Auth/Assoc frame data to and from the FILS module.
Keeping the ERP cache on the handshake object allows station.c to
handle all the ERP details and encapsulate them into a handshake.
FILS can then use the ERP cache right from the handshake rather
than getting it itself.
wiphy_select_akm needed to be updated to take a flag, which can be
set to true if there are known reauth keys for this connection. If
we have reauth keys, and FILS is available we will choose it.
If the AP send an associate with an unsupported group status, OWE
was completely starting over and sending out an authenticate frame
when it could instead just resend the associate frame with a
different group.
With FILS support coming there needs to be a way to set the PTK directly.
Other AKMs derive the PTK via the 4-way handshake, but FILS computes the
PTK on its own.
This reverts commit 1e337259ce.
Using util_get_username was wrong in this context. MSCHAPv2 expects us
to only strip the domain name from identities of the form
domain\identity. util_get_username would also strip identities of the
form username@domain.com.
FILS-SHA384 got overlooked and the kek length was being hard coded
to 32 bytes when encrypting the key data. There was also one occurence
where the kek_len was just being set incorrectly.
If the input length is 16 bytes, this means aes_siv_decrypt should
only be verifying the 16 byte SIV and not decrypting any data. If
this is the case, we can skip over the whole AES-CTR portion of
AES-SIV and only verify the SIV.
In eapol_key_handle, 'have_snonce' is checked before decrypting the
key data. For FILS, there will be no snonce so this check can be
skipped if mic_len == 0.
The GTK handshake for FILS uses AES-SIV to encrypt the key data, and
does away with the MIC completely. Now, when finalizing the 2/2 GTK
packet we check the MIC length, and if zero we assume FILS is being
used and we use AES-SIV to encrypt the key data.
For FILS, there is no actual data being encrypted for GTK 2/2 (hence
why the input data length is zero). This results in only the SIV
being generated, which essentially serves the same purpose as a MIC.
FILS does not use a MIC, as well as requires encrypted data on GTK 2/2.
This updates eapol_create_gtk_2_of_2 to pass in extra data to
eapol_create_common, which will reserve room for this encrypted data.
Extra data is only reserved if mic_len == 0.
FILS does not use a MIC in EAPoL frames and also requires encrypted
data on all EAPoL frames. In the common builder the mic_len is now
checked and the flags are set appropriately.
FILS authentication does away with the MIC, so checking for key_mic
in the eapol key frame does not allow FILS to work. Now we pass in
the mic_len to eapol_verify_gtk_1_of_2, and if it is non-zero we can
check that the MIC is present in the frame.
FILS does not require an eapol_sm for authentication, but rekeys
are still performed using the 4-way handshake. Because of this
FILS needs to create a eapol_sm in a 'started' state, but without
calling eapol_start as this will initialize EAP and create handshake
timeouts.
This allows EAPoL to wait for any 4-way packets, and handle them
as rekeys.
ERP (EAP Reauthentication Protocol) allows a station to quickly
reauthenticate using keys from a previous EAP authentication.
This change both implements ERP as well as moves the key cache into
the ERP module.
ERP in its current form is here to only support FILS. ERP is likely not
widespread and there is no easy way to determine if an AP supports ERP
without trying it. Attempting ERP with a non-ERP enabled AP will actually
result in longer connection times since ERP must fail and then full EAP
is done afterwards. For this reason ERP was separated from EAP and a
separate ERP state machine must be created. As it stands now, ERP cannot
be used on its own, only with FILS.
Quick scan uses a set of frequencies associated with the
known networks. This allows to reduce the scan latency.
At this time, the frequency selection follows a very simple
logic by taking all known frequencies from the top 5 most
recently connected networks.
If connection isn't established after the quick scan attempt,
we fall back to the full periodic scan.
Instead of handling NEW_WIPHY events and WIHPY_DUMP events in a similar
fashion, split up the paths to optimize iwd startup time. There's
fundamentally no reason to wait a second (and eat up file-descriptor
resources for timers unnecessarily) when we can simply start an
interface dump right after the wiphy dump.
In case a new wiphy is added in the middle of a wiphy dump, we will
likely get a new wiphy event anyway, in which case a setup_timeout will
be created and we will ignore this phy during the interface dump
processing.
This also optimizes the case of iwd being re-started, in which case
there are no interfaces present.
Separate out the two types of NEW_WIPHY handlers into separate paths and
factor out the common code into a utility function.
Dumps of CMD_NEW_WIPHY can be split up over several messages, while
CMD_NEW_WIPHY events (generated when a new card is plugged in) are
stuffed into a single message.
This also prepares ground for follow-on commits where we will handle the
two types of events differently.
src/netdev.c:netdev_create_from_genl() Skipping duplicate netdev wlp2s0[3]
Aborting (signal 11) [/home/denkenz/iwd/src/iwd]
++++++++ backtrace ++++++++
#0 0x7fc4c7a4e930 in /lib64/libc.so.6
#1 0x40ea13 in netdev_getlink_cb() at src/netdev.c:4654
#2 0x468cab in process_message() at ell/netlink.c:183
#3 0x4690a3 in can_read_data() at ell/netlink.c:289
#4 0x46681d in io_callback() at ell/io.c:126
#5 0x4651cd in l_main_iterate() at ell/main.c:473
#6 0x46530e in l_main_run() at ell/main.c:516
#7 0x465626 in l_main_run_with_signal() at ell/main.c:642
#8 0x403df8 in main() at src/main.c:513
#9 0x7fc4c7a39bde in /lib64/libc.so.6
Mirror netdev.c white/blacklist logic. If either or both the whitelist
and the blacklist are given also fall back to not touching the existing
interface setup on the wiphy.
If we get an error during DEL_INTERFACE or NEW_INTERFACE we may be
dealing with a driver that doesn't implement virtual interfaces or
doesn't implement deleting the default interface. In this case fall
back to using the first usable interface that we've detected on this
wiphy.
There's at least one full-mac driver that doesn't implement the cfg80211
.del_virtual_intf and .add_virtual_intf methods and at least one that
only allows P2P interfaces to be manipulated. mac80211 drivers seem to
at least implement those methods but I didn't check to see if there are
driver where they'd eventually return EOPNOTSUPP.
This is probably the trickiest part in this patchset. I'm introducing a
new logic where instead of using the interfaces that we find present
when a wiphy is detected, which would normally be the one default
interface per wiphy but could be 0 or more than one, we create one
ourselves with the socket owner attribute and use exactly one for
Station, AP and Ad-Hoc modes. When IWD starts we delete all the
interfaces on existing wiphys that we're going to use (as determined by
the wiphy white/blacklists) or freshly hotplugged ones, and only then we
register the interface we're going to use meaning that the wiphy's
limits on the number of concurrent interfaces of each type should be at
0. Otherwise we'd be unlikely to be abe to create the station interface
as most adapters only allow one. After that we ignore any interfaces
that may be created by other processes as we have no use for multiple
station interfaces.
At this point manager.c only keeps local state for wiphys during
the interface setup although when we start adding P2P code we will be
creating and removing interfaces multiple times during the wiphy's
runtime and may need to track it here or in wiphy.c. We do not
specifically check the interface number limits received during the wiphy
dump, if we need to create any interfaces and we're over the driver's
maximum for that specific iftype we'll still attempt it and report error
if it fails.
I tested this and it seems to work with my laptop's intel card and some
USB hotplug adapters.
The latest refactoring ended up assuming that FT related elements would
be handled in netdev_associate_event. However, FullMac cards (that do
not generate netdev_associate_event) could still connect using FT AKMs
and perform the Initial mobility association. In such cases the FTE
element was required but ended up not being set into the handshake.
This caused the handshake to fail during PTK 1_of_4 processing.
Fix this by making sure that FTE + related info is set into the
handshake, albeit with a lower sanity checking level since the
elements have been processed by the firmware already.
Note that it is currently impossible for actual FTs to be performed on
FullMac cards, so the extra logic and sanity checking to handle these
can be skipped.
Add functionality to read and parse the known frequencies
from permanent storage on start of the service. On service
shutdown, we sync the known frequencies back to the permanent
storage.
Each known network (previously connected) will have a set
of known frequencies associated with it, e.g. a set of
frequencies from all BSSs observed. The list of known
frequencies is sorted with the most recently observed
frequency in the head.
Previously, the scan results were disregarded once the new
ones were available. To enable the scan scenarios where the
new scan results are delivered in parts, we introduce a
concept of aging BSSs and will remove them based on
retention time.
Add manager.c, a new file where the wiphy and interface creation/removal
will be handled and interface use policies will be implemented. Since
not all kernel-side nl80211 interfaces are tied to kernel-side netdevs,
netdev.c can't manage all of the interfaces that we will be using, so
the logic is being moved to a common place where all interfaces on a
wiphy will be managed according to the policy, device support for things
like P2P and user enabling/disabling/connecting with P2P which require
interfaces to be dynamically added and removed.
Add wiphy_create, wiphy_update_from_genl and wiphy_destroy that together
will let a new file command the wiphy creation, updates and deletion
with the same functionality the current config notification handler
implements in wiphy.c.
As mentioned in code comments the name is NUL-terminated so there's no
need to return the length path, which was ignored in some occasions
anyway. Consistently treat it as NUL-terminated but also validate.
Make netdev_create_from_genl public and change signature to return the
created netdev or NULL. Also add netdev_destroy that destroys and
unregisters the created netdevs. Both will be used to move the
whole interface management to a new file.
The handshake_state only holds a single AKM value. FILS depends on the AP
supporting EAP as well as FILS. The first time IWD connects, it will do a
full EAP auth. Subsequent connections (assuming FILS is supported) will use
FILS. But if the AP does not support FILS there is no reason to cache the
ERP keys.
This adds the supp_fils to the handshake_state. Now, station.c can set this
flag while building the handshake. This flag can later be checked when
caching the ERP keys.
This allows IWD to cache ERP keys after a full EAP run. Caching
allows IWD to quickly connect to the network later on using ERP or
FILS.
The cache will contain the EAP Identity, Session ID, EMSK, SSID and
optionally the ERP domain. For the time being, the cache entry
lifetimes are hard coded to 24 hours. Eventually the cache should
be written to disk to allow ERP/FILS to work after a reboot or
IWD restart.
mschaputil already had similar functionality, but ERP will need this
as well. These two functions will also handle identities with either
'@' or '\' to separate the user and domain.
Many operations performed during an error in load_settings were the same
as the ones performed when freeing the eap object. Add eap_free_common
to unify these.
EAP identites are recommended to follow RFC 4282 (The Network Access
Identifier). This RFC recommends a maximum NAI length of 253 octets.
It also mentions that RADIUS is only able to support NAIs of 253
octets.
Because of this, IWD should not allow EAP identities larger than 253
bytes. This change adds a check in eap_load_settings to verify the
identity does not exceed this limit.
The associate event is only important for OWE and FT. If neither of
these conditions (or FT initial association) are happening we do
not need to continue further processing the associate event.
802.11 mandates that IEs inside management frames are presented in a
given order. However, in the real world, many APs seem to ignore the
rules and send their IEs in seemingly arbitrary order, especially when
it comes to VENDOR tags. Change this function to no longer be strict in
enforcing the order.
Also, drop checking of rules specific to Probe Responses. These will
have to be handled separately (most likely by the AP module) since
802.11-2016, Section 11.1.4.3.5 essentially allows just about anything.
In netdev_associate_event the ignore_connect_event was getting set true,
but afterwards there were still potential failure paths. Now, once in
assoc_failed we explicitly set ignore_connect_event to false so the
the failure can be handled properly inside netdev_connect_event
The list of PSK/8021x AKM's in security_determine was getting long,
and difficult to keep under 80 characters. This moves them all into
two new macros, AKM_IS_PSK/AKM_IS_8021X.
It was assumed that the hunt-and-peck loop was guarenteed to find
a PWE. This was incorrect in terms of kernel support. If a system
does not have support for AF_ALG or runs out of file descriptors
the KDFs may fail. The loop continued to run if found == false,
which is also incorrect because we want to stop after 20 iterations
regarless of success.
This changes the loop to a for loop so it will always exit after
the set number of iterations.
CC src/scan.o
src/scan.c: In function ‘scan_bss_compute_rank’:
src/scan.c:1048:4: warning: this decimal constant is unsigned only in ISO C90
factor = factor * data_rate / 2340000000 +
The auto-connect state will now consist of the two phases:
STATION_STATE_AUTOCONNECT_QUICK and STATION_STATE_AUTOCONNECT_FULL.
The auto-connect will always start with STATION_STATE_AUTOCONNECT_QUICK
and then transition into STATION_STATE_AUTOCONNECT_FULL if no
connection has been established. During STATION_STATE_AUTOCONNECT_QUICK
phase we take advantage of the wireless scans with the limited number
of channels on which the known networks have been observed before.
This approach allows to shorten the time required for the network
sweeps, therefore decreases the connection latency if the connection
is possible. Thereafter, if no connection has been established after
the first phase we transition into STATION_STATE_AUTOCONNECT_FULL and
do the periodic scan just like we did before the split in
STATION_STATE_AUTOCONNECT state.
For simplicity 160Mhz and 80+80Mhz were grouped together when
parsing the VHT capabilities, but the 80+80 bits were left in
vht_widht_map. This could cause an overflow when getting the
width map.
wiphy_select_akm will now check if BIP is supported, and if MFPR is
set in the scan_bss before returning either SAE AKMs. This will allow
fallback to another PSK AKM (e.g. hybrid APs) if any of the requirements
are not met.
Replace existing uses of memset to clear secrets with explicit_bzero to
make sure it doesn't get optimized away. This has some side effects as
documented in gcc docs but is still recommended.
In eap_secret_info_free make sure we clear both strings in the case of
EAP_SECRET_REMOTE_USER_PASSWORD secrets.
Environments with several AP's, all at low signal strength may
want to lower the roaming RSSI threshold to prevent IWD from
roaming excessively. This adds an option 'roam_rssi_threshold',
which is still defaulted to -70.
Also printing keys with l_debug conditional on an environment variable
as someone wanting debug logs, or leaving debug on accidentally, does
not necessarily want the keys in the logs and in memory.
At some point the connect command builder was modified, and the
control port over NL80211 check was moved to inside if (is_rsn).
For WPS, no supplicant_ie was set, so CONTROL_PORT_OVER_NL80211
was never set into CMD_CONNECT. This caused IWD to expect WPS
frames over netlink, but the kernel was sending them over the
legacy route.
This commit hardens the iwd.service.in template file for systemd
services. The following is a short explanation for each added directive:
+PrivateTmp=true
If true, sets up a new file system namespace for the executed processes
and mounts private /tmp and /var/tmp directories inside it that is not
shared by processes outside of the namespace.
+NoNewPrivileges=true
If true, ensures that the service process and all its children can never
gain new privileges through execve() (e.g. via setuid or setgid bits, or
filesystem capabilities).
+PrivateDevices=true
If true, sets up a new /dev mount for the executed processes and only
adds API pseudo devices such as /dev/null, /dev/zero or /dev/random (as
well as the pseudo TTY subsystem) to it, but no physical devices such as
/dev/sda, system memory /dev/mem, system ports /dev/port and others.
+ProtectHome=yes
If true, the directories /home, /root and /run/user are made
inaccessible and empty for processes invoked by this unit.
+ProtectSystem=strict
If set to "strict" the entire file system hierarchy is mounted
read-only, except for the API file system subtrees /dev, /proc and /sys
(protect these directories using PrivateDevices=,
ProtectKernelTunables=, ProtectControlGroups=).
+ReadWritePaths=/var/lib/iwd/
Sets up a new file system namespace for executed processes. These
options may be used to limit access a process might have to the file
system hierarchy. Each setting takes a space-separated list of paths
relative to the host's root directory (i.e. the system running the
service manager). Note that if paths contain symlinks, they are resolved
relative to the root directory set with RootDirectory=/RootImage=.
Paths listed in ReadWritePaths= are accessible from within
the namespace with the same access modes as from outside of
it.
+ProtectControlGroups=yes
If true, the Linux Control Groups (cgroups(7)) hierarchies accessible
through /sys/fs/cgroup will be made read-only to all processes of the
unit.
+ProtectKernelModules=yes
If true, explicit module loading will be denied. This allows module
load and unload operations to be turned off on modular kernels.
For further explanation to all directives see `man systemd.directives`
Hostapd has now been updated to include the group number when rejecting
the connection with UNSUPP_FINITE_CYCLIC_GROUP. We still need the existing
len == 0 check because old hostapd versions will still behave this way.
The single-use password is apparently sent in plaintext over the network
but at least try to prevent it from staying in the memory until we know
it's been used.
station.c generates the IEs we will need to use for the
Authenticate/Associate and EAPoL frames and sets them into the
handshake_state object. However the driver may modify some of them
during CMD_CONNECT and we need to use those update values so the AP
isn't confused about differing IEs in diffent frames from us.
Specifically the "wl" driver seems to do this at least for the RSN IE.
The KDF function processes data in 32 byte chunks so for groups which
primes are not divisible by 32 bytes, you will get a buffer overflow
when copying the last chunk of data.
Now l_checksum_get_digest is limited to the bytes remaining in the
buffer, or 32, whichever is the smallest.
Since eapol_encrypt_key_data already calculates the key data length and
encodes it into the key frame, we can just return this length and avoid
having to obtain it again from the frame.
Similar to SAE, EAP-PWD derives an ECC point (PWE). It is possible
for information to be gathered from the timing of this derivation,
which could be used to to recover the password.
This change adapts EAP-PWD to use the same mitigation technique as
SAE where we continue to derive ECC points even after we have found
a valid point. This derivation loop continues for a set number of
iterations (20 in this case), so anyone timing it will always see
the same timings for every run of the protocol.
This is not used by any of the scan notify callback implementations and
for P2P we're going to need to scan on an interface without an ifindex
so without this the other changes should be mostly contained in scan.
Also add a mask parameter to wiphy_get_supported_iftypes to make sure
the SupportedModes property only contains the values that can be used
as Device.Mode.
dbus_iftype_to_string returns NULL for unknown iftypes, the strdup will
also return NULL and ret[i] will be assigned a NULL. As a result
the l_strjoinv will not print the known iftypes that might have come
after that and will the l_strfreev will leak the strduped strings.
sc->state would get set when the TRIGGERED event arrived or when the
triggered callback for our own SCAN_TRIGGER command is received.
However it would not get reset to NOT_RUNNING when the NEW_SCAN_RESULTS
event is received, instead we'd first request the results with GET_SCAN
and only reset sc->state when that returns. If during that command a
new scan gets triggered, the GET_SCAN callback would still reset
sc->state and clobber the value set by the new scan.
To fix that repurpose sc->state to only track that period from the
TRIGGERED signal to the NEW_SCAN_RESULTS signal. sc->triggered can be
used to check if we're still waiting for the GET_SCAN command and
sc->start_cmd_id to check if we're waiting for the scan to get
triggered, so one of these three variables will now always indicate if
a scan is in progress.
We can crash if we abort the connection, but the connect command has
already gone through. In this case we will get a sequence of
authenticate_event, associate_event, connect_event. The first and last
events don't crash since they check whether netdev->connected is true.
However, this causes an annoying warning to be printed.
Fix this by introducing an 'aborting' flag and ignore all connection
related events if it is set.
++++++++ backtrace ++++++++
Now that the OWE failure/retry is handled in netdev, we can catch
all associate error status' inside owe_rx_associate rather than only
catching UNSUPP_FINITE_CYCLIC_GROUP.
Apart from OWE, the association event was disregarded and all association
processing was done in netdev_connect_event. This led to
netdev_connect_event having to handle all the logic of both success and
failure, as well as parsing the association for FT and OWE. Also, without
checking the status code in the associate frame there is the potential
for the kernel to think we are connected even if association failed
(e.g. rogue AP).
This change introduces two flags into netdev, expect_connect_failure and
ignore_connect_event. All the FT processing that was once in
netdev_connect_event has now been moved into netdev_associate_event, as
well as non-FT associate frame processing. The connect event now only
handles failure cases for soft/half MAC cards.
Note: Since fullmac cards rely on the connect event, the eapol_start
and netdev_connect_ok were left in netdev_connect_event. Since neither
auth/assoc events come in on fullmac we shouldn't have any conflict with
the new flags.
Once a connection has completed association, EAPoL is started from
netdev_associate_event (if required) and the ignore_connect_event flag can
be set. This will bypass the connect event.
If a connection has failed during association for whatever reason, we can
set expect_connect_failure, the netdev reason, and the MPDU status code.
This allows netdev_connect_event to both handle the error, and, if required,
send a deauth telling the kernel that we have failed (protecting against the
rogue AP situation).
OWE processing can be completely taken care of inside
netdev_authenticate_event and netdev_associate_event. This removes
the need for OWE specific checks inside netdev_connect_event. We can
now return early out of the connect event if OWE is in progress.
Several Auth/Assoc failure status codes indicate that the connection
failed for reasons such as bandwidth issues, poor channel conditions
etc. These conditions should not result in the BSS being blacklisted
since its likely only a temporary issue and the AP is not actually
"broken" per-se.
This adds support in station.c to temporarily blacklist these BSS's
on a per-network basis. After the connection has completed we clear
out these blacklist entries.
Certain error conditions require that a BSS be blacklisted only for
the duration of the current connection. The existing blacklist
does not allow for this, and since this blacklist is shared between
all interfaces it doesnt make sense to use it for this purpose.
Instead, each network object can contain its own blacklist of
scan_bss elements. New elements can be added with network_blacklist_add.
The blacklist is cleared when the connection completes, either
successfully or not.
Now inside network_bss_select both the per-network blacklist as well as
the global blacklist will be checked before returning a BSS.
Several netdev events benefit from including event data in the callback.
This is similar to how the connect callback works as well. The content
of the event data is documented in netdev.h (netdev_event_func_t).
By including event data for the two disconnect events, we can pass the
reason code to better handle the failure in station.c. Now, inside
station_disconnect_event, we still check if there is a pending connection,
and if so we can call the connect callback directly with HANDSHAKE_FAILED.
Doing it this way unifies the code path into a single switch statment to
handle all failures.
In addition, we pass the RSSI level index as event data to
RSSI_LEVEL_NOTIFY. This removes the need for a getter to be exposed in
netdev.h.
On successful send, scan_send_start(..) used to set msg to NULL,
therefore the further management of the command by the caller was
impossible. This patch removes wrapper around l_genl_family_send()
and lets the callers to take responsibility for the command.
This change cleans up the mess of status vs reason codes. The two
types of codes have already been separated into different enumerations,
but netdev was still treating them the same (with last_status_code).
A new 'event_data' argument was added to the connect callback, which
has a different meaning depending on the result of the connection
(described inside netdev.h, netdev_connect_cb_t). This allows for the
removal of netdev_get_last_status_code since the status or reason
code is now passed via event_data.
Inside the netdev object last_status_code was renamed to last_code, for
the purpose of storing either status or reason. This is only used when
a disconnect needs to be emitted before failing the connection. In all
other cases we just pass the code directly into the connect_cb and do
not store it.
All ocurrences of netdev_connect_failed were updated to use the proper
code depending on the netdev result. Most of these simply changed from
REASON_CODE_UNSPECIFIED to STATUS_CODE_UNSPECIFIED. This was simply for
consistency (both codes have the same value).
netdev_[authenticate|associate]_event's were updated to parse the
status code and, if present, use that if their was a failure rather
than defaulting to UNSPECIFIED.
Even though .check_settings in our EAP method implementations does the
settings validation, .load_settings also has minimum sanity checks to
rule out segfaults if the settings have changed since the last
.check_settings call.
If OWE fails in association there is no reason to send a disconnect
since its already known that we failed. Instead we can directly
call netdev_connect_failed
Instead of sending a reason_code to netdev_setting_keys_failed, make it
take an errno (negative) instead. Since key setting failures are
entirely a system / software issue, and not a protocol issue, it makes
no sense to use a protocol error code.
Some users may need their own control over 2.4/5GHz preference. This
adds a new user option, 'rank_5g_factor', which allows users to increase
or decrease their 5G preference.
This adds support for parsing the VHT IE, which allows a BSS supporting
VHT (80211ac) to be ranked higher than a BSS supporting only HT/basic
rates. Now, with basic/HT/VHT parsing we can calculate the theoretical
maximum data rate for all three and rank the BSS based on that.
This adds HT IE parsing and data rate calculation for HT (80211n)
rates. Now, a BSS supporting HT rates will be ranked higher than
a basic rate BSS, assuming the RSSI is at an acceptable level.
The spec dictates RSSI thresholds for different modulation schemes, which
correlate to different data rates. Until now were were ranking a BSS with
only looking at its advertised data rate, which may not even be possible
if the RSSI does not meet the threshold.
Now, RSSI is taken into consideration and the data rate returned from
parsing (Ext) Supported Rates IE(s) will reflect that.
All over the place we do "ie[1] + 2" for getting the IE length. It
is much clearer to use a macro to do this. The macro also checks
for NULL, and returns zero in this case.
Supported rates will soon be parsed along with HT/VHT capabilities
to determine the best data rate. This will remove the need for the
supported_rates uintset element in scan_bss, as well as the single
API to only parse the supported rates IE. AP still does rely on
this though (since it only supports basic rates), so the parsing
function was moved into ap.c.
In the methods' check_settings do a more complete early check for
possible certificate / private key misconfiguration, including check
that the certificate and the private key are always present or absent
together and that they actually match each other. Do this by encrypting
and decrypting a small buffer because we have no better API for that.
A method's .check_settings method checks for inconsistent setting files
and prints readable errors so there's no need to do that again in
.load_settings, although at some point after removing the duplicate
error messages from the load_settings methods we agreed to keep minimum
checks that could cause a crash e.g. in a corner case like when the
setting file got modified between the check_settings and the
load_settings call. Some error messages have been re-added to
load_settings after that (e.g. in
bb4e1ebd4f) but they're incomplete and not
useful so remove them.
Previously, the storage dir has only been created after a successful
network connection, causing removal of Known Network interface from
Dbus and failure to register dir watcher until daemon is restarted.
A length check was still assuming the 256 bit ECC group. This
was updated to scale with the group. The commit buffer was also
not properly sized. This was changed to allow for the largest
ECC group supported.
SAE was hardcoded to work only with group 19. This change fixes up the
hard coded lengths to allow it to work with group 20 since ELL supports
it. There was also good amount of logic added to support negotiating
groups. Before, since we only supported group 19, we would just reject
the connection to an AP unless it only supported group 19.
This did lead to a discovery of a potential bug in hostapd, which was
worked around in SAE in order to properly support group negotiation.
If an AP receives a commit request with a group it does not support it
should reject the authentication with code 77. According to the spec
it should also include the group number which it is rejecting. This is
not the case with hostapd. To fix this we needed to special case a
length check where we would otherwise fail the connection.
Most of this work was already done after moving ECC into ELL, but
there were still a few places where the 256-bit group was assumed.
This allows the 384-bit group to be used, and theoretically any
other group added to ELL in the future.
If we have a BSS list where all BSS's have been blacklisted we still
need a way to force a connection to that network, instead of having
to wait for the blacklist entry to expire. network_bss_select now
takes a boolean 'fallback_to_blacklist' which causes the selection
to still return a connectable BSS even if the entire list was
blacklisted.
In most cases this is set to true, as these cases are initiated by
DBus calls. The only case where this is not true is inside
station_try_next_bss, where we do want to honor the blacklist.
This both prevents an explicit connect call (where all BSS's are
blacklisted) from trying all the blacklisted BSS's, as well as the
autoconnect case where we simply should not try to connect if all
the BSS's are blacklisted.
There are is some implied behavior here that may not be obvious:
On an explicit DBus connect call IWD will attempt to connect to
any non-blacklisted BSS found under the network. If unsuccessful,
the current BSS will be blacklisted and IWD will try the next
in the list. This will repeat until all BSS's are blacklisted,
and in this case the connect call will fail.
If a connect is tried again when all BSS's are blacklisted IWD
will attempt to connect to the first connectable blacklisted
BSS, and if this fails the connect call will fail. No more
connection attempts will happen until the next DBus call.
If IWD fails to connect to a BSS we can attempt to connect to a different
BSS under the same network and blacklist the first BSS. In the case of an
incorrect PSK (MMPDU code 2 or 23) we will still fail the connection.
station_connect_cb was refactored to better handle the dbus case. Now the
netdev result switch statement is handled before deciding whether to send
a dbus reply. This allows for both cases where we are trying to connect
to the next BSS in autoconnect, as well as in the dbus case.
This makes __station_connect_network even less intelligent by JUST
making it connect to a network, without any state changes. This makes
the rekey logic much cleaner.
We were also changing dbus properties when setting the state to
CONNECTING, so those dbus property change calls were moved into
station_enter_state.
A new driver extended feature bit was added signifying if the driver
supports PTK replacement/rekeying. During a connect, netdev checks
for the driver feature and sets the handshakes 'no_rekey' flag
accordingly.
At some point the AP will decide to rekey which is handled inside
eapol. If no_rekey is unset we rekey as normal and the connection
remains open. If we have set no_rekey eapol will emit
HANDSHAKE_EVENT_REKEY_FAILED, which is now caught inside station. If
this happens our only choice is to fully disconnect and reconnect.
If we receive handshake message 1/4 after we are already connected
the AP is attempting to rekey. This may not be allowed and if not
we do not process the rekey and emit HANDSHAKE_EVENT_REKEY_FAILED
so any listeners can handle accordingly.
The AP structure was getting cleaned up twice. When the DBus stop method came
in we do AP_STOP on nl80211. In this callback the AP was getting freed in
ap_reset. Also when the DBus interface was cleaned up it triggered ap_reset.
Since ap->started gets set to false in ap_reset, we now check this and bail
out if the AP is already stopped.
Fixes:
++++++++ backtrace ++++++++
0 0x7f099c11ef20 in /lib/x86_64-linux-gnu/libc.so.6
1 0x43fed0 in l_queue_foreach() at ell/queue.c:441 (discriminator 3)
2 0x423a6c in ap_reset() at src/ap.c:140
3 0x423b69 in ap_free() at src/ap.c:162
4 0x44ee86 in interface_instance_free() at ell/dbus-service.c:513
5 0x451730 in _dbus_object_tree_remove_interface() at ell/dbus-service.c:1650
6 0x405c07 in netdev_newlink_notify() at src/netdev.c:4449 (discriminator 9)
7 0x440775 in l_hashmap_foreach() at ell/hashmap.c:534
8 0x4455d3 in process_broadcast() at ell/netlink.c:158
9 0x4439b3 in io_callback() at ell/io.c:126
10 0x442c4e in l_main_iterate() at ell/main.c:473
11 0x442d1c in l_main_run() at ell/main.c:516
12 0x442f2b in l_main_run_with_signal() at ell/main.c:644
13 0x403ab3 in main() at src/main.c:504
14 0x7f099c101b97 in /lib/x86_64-linux-gnu/libc.so.6
+++++++++++++++++++++++++++
This will allow for blacklisting a BSS if the connection fails. The
actual blacklist module is simple and must be driven by station. All
it does is add BSS addresses, a timestamp, and a timeout to a queue.
Entries can also be removed, or checked if they exist. The blacklist
timeout is configuratble in main.conf, as well as the blacklist
timeout multiplier and maximum timeout. The multiplier is used after
a blacklisted BSS timeout expires but we still fail to connect on the
next connection attempt. We multiply the current timeout by the
multiplier so the BSS remains in the blacklist for a larger growing
amount of time until it reaches the maximum (24 hours by default).
Soon BSS blacklisting will be added, and in order to properly decide if
a BSS should be blacklisted we need the status code on a failed
connection. This change stores the status code when there is a failure
in netdev and adds a getter to retrieve later. In many cases we have
the actual status code from the AP, but in some corner cases its not
obtainable (e.g. an error sending an NL80211 command) in which case we
just default to MMPDU_REASON_CODE_UNSPECIFIED.
Rather than continue with the pattern of setting netdev->result and
now netdev->last_status_code, the netdev_connect_failed function was
redefined so its no longer used as both a NL80211 callback and called
directly. Instead a new function was added, netdev_disconnect_cb which
just calls netdev_connect_failed. netdev_disconnect_cb should not be
used for all the NL80211 disconnect commands. Now netdev_connect_failed
takes both a result and status code which it sets in the netdev object.
In the case where we were using netdev_connect_failed as a callback we
still need to set the result and last_status_code but at least this is
better than having to set those in all cases.
Remove an unneeded buffer and its memcpy, remove the now unneeded use of
l_checksum_digest_length and use l_checksum_reset instead of creating a
new l_checksum for each chunk.
ELL ECC supports group 20 (P384) so OWE can also support it. This also
adds group negotiation, where OWE can choose a different group than the
default if the AP requests it.
A check needed to be added in netdev in order for the negotiation to work.
The RFC says that if a group is not supported association should be rejected
with code 77 (unsupported finite cyclic group) and association should be
started again. This rejection was causing a connect event to be emitted by
the kernel (in addition to an associate event) which would result in netdev
terminating the connection, which we didn't want. Since OWE receives the
rejected associate event it can intelligently decide whether it really wants
to terminate (out of supported groups) or try the next available group.
This also utilizes the new MIC/KEK/KCK length changes, since OWE dictates
the lengths of those keys.
Rather than hard coding to SHA256, we can pass in l_checksum_type
and use that SHA. This will allow for OWE/SAE/PWD to support more
curves that use different SHA algorithms for hashing.
OWE defines KEK/KCK lengths depending on group. This change adds a
case into handshake_get_key_sizes. With OWE we can determine the
key lengths based on the PMK length in the handshake.
In preparation for OWE supporting multiple groups eapol needed some
additional cases to handle the OWE AKM since OWE dictates the KEK,
KCK and MIC key lengths (depending on group).
Right now the PMK is hard coded to 32 bytes, which works for the vast
majority of cases. The only outlier is OWE which can generate a PMK
of 32, 48 or 64 bytes depending on the ECC group used. The PMK length
is already stored in the handshake, so now we can just pass that to
crypto_derive_pairwise_ptk
The crypto_ptk was hard coded for 16 byte KCK/KEK. Depending on the
AKM these can be up to 32 bytes. This changes completely removes the
crypto_ptk struct and adds getters to the handshake object for the
kck and kek. Like before the PTK is derived into a continuous buffer,
and the kck/kek getters take care of returning the proper key offset
depending on AKM.
To allow for larger than 16 byte keys aes_unwrap needed to be
modified to take the kek length.
The MIC length was hard coded to 16 bytes everywhere, and since several
AKMs require larger MIC's (24/32) this needed to change. The main issue
was that the MIC was hard coded to 16 bytes inside eapol_key. Instead
of doing this, the MIC, key_data_length, and key_data elements were all
bundled into key_data[0]. In order to retrieve the MIC, key_data_len,
or key_data several macros were introduced which account for the MIC
length provided.
A consequence of this is that all the verify functions inside eapol now
require the MIC length as a parameter because without it they cannot
determine the byte offset of key_data or key_data_length.
The MIC length for a given handshake is set inside the SM when starting
EAPoL. This length is determined by the AKM for the handshake.
Non-802.11 AKMs can define their own key lengths. Currently only OWE does
this, and the MIC/KEK/KCK lengths will be determined by the PMK length so
we need to save it.
Make sure we don't pass NULLs to memcmp or l_memdup when the prefix
buffer is NULL. There's no point having callers pass dummy buffers if
they need to watch frames independent of the frame data.
Start using l_key_generate_dh_private and l_key_validate_dh_payload to
check for the disallowed corner case values in the DH private/public
values generated/received.
Some of the EAP methods don't require a clear-text identity to
be sent with the Identity Response packet. The mandatory identity
filed has resulted in unnecessary transmission of the garbage
values. This patch makes the Identity field to be optional and
shift responsibility to ensure its existence to the individual
methods if the field is required. All necessary identity checks
have been previously propagated to individual methods.
If a network is being forgotten, then make sure to reset connected_time.
Otherwise the rank logic thinks that the network is known which can
result in network_find_rank_index returning -1.
Found by sanitizer:
src/network.c:1329:23: runtime error: index -1 out of bounds for type
'double [64]'
==25412==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000000421ab0 at pc 0x000000402faf bp 0x7fffffffdb00 sp 0x7fffffffdaf0
READ of size 4 at 0x000000421ab0 thread T0
#0 0x402fae in validate_mgmt_ies src/mpdu.c:128
#1 0x403ce8 in validate_probe_request_mmpdu src/mpdu.c:370
#2 0x404ef2 in validate_mgmt_mpdu src/mpdu.c:662
#3 0x405166 in mpdu_validate src/mpdu.c:706
#4 0x402529 in ie_order_test unit/test-mpdu.c:156
#5 0x418f49 in l_test_run ell/test.c:83
#6 0x402715 in main unit/test-mpdu.c:171
#7 0x7ffff5d43ed9 in __libc_start_main (/lib64/libc.so.6+0x20ed9)
#8 0x4019a9 in _start (/home/denkenz/iwd-master/unit/test-mpdu+0x4019a9)
This fixes the valgrind warning:
==14804== Conditional jump or move depends on uninitialised value(s)
==14804== at 0x402E56: sae_is_quadradic_residue (sae.c:218)
==14804== by 0x402E56: sae_compute_pwe (sae.c:272)
==14804== by 0x402E56: sae_build_commit (sae.c:333)
==14804== by 0x402E56: sae_send_commit (sae.c:591)
==14804== by 0x401CC3: test_confirm_after_accept (test-sae.c:454)
==14804== by 0x408A28: l_test_run (test.c:83)
==14804== by 0x401427: main (test-sae.c:566)
The return from l_ecc_point_from_data was not being checked for NULL,
which would cause a segfault if the peer sent an invalid point.
This adds a check and fails the protocol if p_element is NULL, as the
spec defines.
src/eap-ttls.c:766:50: error: ‘Password’ directive output may be truncated writing 8 bytes into a region of size between 1 and 72 [-Werror=format-truncation=]
snprintf(password_key, sizeof(password_key), "%sPassword", prefix);
^~~~~~~~
In file included from /usr/include/stdio.h:862,
from src/eap-ttls.c:28:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’ output between 9 and 80 bytes into a destination of size 72
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Stop using l_pem_load_certificate which has been removed from ell, use
the same functions to load certificate files to validate them as those
used by the TLS implementation itself.
Check that the TLS logic has verified the server is trusted by the CA if
one was configured. This is more of an assert as ell intentionally only
allows empty certificate chains from the peer in server mode (if a CA
certficate is set) although this could be made configurable.
This should not change the behaviour except for fixing a rare crash
due to scan_cancel not working correctly when cancelling the first scan
request in the queue while a periodic scan was running, and potentially
other corner cases. To be able to better distinguish between a periodic
scan in progress and a scan request in progress add a sc->current_sr
field that points either at a scan request or is NULL when a periodic
scan is in ongoing. Move the triggered flag from scan_request and
scan_preiodic directly to scan_context so it's there together with
start_cmd_id. Hopefully make scan_cancel simpler/clearer.
Note sc->state and sc->triggered have similar semantics so one of them
may be easily removed. Also the wiphy_id parameter to the scan callback
is rather useless, note I temporarily pass 0 as the value on error but
perhaps it should be dropped.
In the name of failing earlier try to generate the PSK from the
passphrase as soon as we receive the passphrase or read it from the
file, mainly to validate it has the right number of characters.
The passphrase length currently gets validates inside
crypto_psk_from_passphrase which will be called when we receive a new
passphrase from the agent or when the config file has no PSK in it. We
do not do this when there's already both the PSK and the passphrase
available in the settings -- we can add that separately if needed.
The main difference with this is that scan_context removal will also
trigger the .destroy calls. Normally there won't be any requests left
during scan_context but if there were any we should call destroy on
them.
If we haven't sent a PMKID, and we're not running EAP, then ignore
whatever PMKID the AP sends us. Frequently the APs send us garbage in
this field. For PSK and related AKMs, if the PMK is wrong, then we
simply fail to generate a proper MIC and the handshake would fail at a
later stage anyway.
Fix incorrect usage of the caller’s scan triggered callback.
In case of a failure, destroy scan request and notify caller
about the issue by returning zero scan id instead of calling
callers’ scan triggered callback with an error code.
Using backtrace() is of no use when building with PIE (which most
distro compilers do by default) and prevents catching the coredump
for later retracing, which is needed since distros usually don't
install debug symbols by default either.
This patch thus only enables backtrace() when --enable-maintainer-mode
is passed and also tries to explicitly disable PIE.
ECDH was expecting the private key in LE, but the public key in BE byte ordering.
For consistency the ECDH now expect all inputs in LE byte ordering. It is up to
the caller to order the bytes appropriately.
This required adding some ecc_native2be/be2native calls in OWE
The changes to station.c are minor. Specifically,
station_build_handshake_rsn was modified to always build up the RSN
information, not just for SECURITY_8021X and SECURITY_PSK. This is
because OWE needs this RSN information, even though it is still
SECURITY_NONE. Since "regular" open networks don't need this, a check
was added (security == NONE && akm != OWE) which skips the RSN
building.
netdev.c needed to be changed in nearly the same manor as it was for
SAE. When connecting, we check if the AKM is for OWE, and if so create
a new OWE SM and start it. OWE handles all the ECDH, and netdev handles
sending CMD_AUTHENTICATE and CMD_ASSOCIATE when triggered by OWE. The
incoming authenticate/associate events just get forwarded to OWE as they
do with SAE.
This module is similar to SAE in that it communicates over authenticate
and associate frames. Creating a new OWE SM requires registering two TX
functions that handle sending the data out over CMD_AUTHENTICATE/ASSOCIATE,
as well as a complete function.
Once ready, calling owe_start will kick off the OWE process, first by
sending out an authenticate frame. There is nothing special here, since
OWE is done over the associate request/response.
After the authenticate response comes in OWE will send out the associate
frame which includes the ECDH public key, and then receive the AP's
public key via the associate response. From here OWE will use ECDH to
compute the shared secret, and the PMK/PMKID. Both are set into the
handshake object.
Assuming the PMK/PMKID are successfully computed the OWE complete callback
will trigger, meaning the 4-way handshake can begin using the PMK/PMKID
that were set in the handshake object.
The RFC (5869) for this implementation defines two functions,
HKDF-Extract and HKDF-Expand. The existing 'hkdf_256' was implementing
the Extract function, so it was renamed appropriately. The name was
changed for consistency when the Expand function will be added in the
future.
In the current version SECURITY_PSK was handled inside the is_rsn block
while the SECURITY_8021X was off in its own block. This was weird and a
bit misleading. Simplify the code flow through the use of a goto and
decrease the nesting level.
Also optimize out unnecessary use of scan_bss_get_rsn_info
In network_autoconnect, when the network was SECURITY_8021X there was no
check (for SECURITY_PSK) before calling network_load_psk. Since the
provisioning file was for an 8021x network neither PreSharedKey or
Passphrase existed so this would always fail. This fixes the 8021x failure
in testConnectAutoconnect.
During the handshake setup, if security != SECURITY_PSK then 8021x settings
would get set in the handshake object. This didn't appear to break anything
(e.g. Open/WEP) but its better to explicitly check that we are setting up
an 8021x network.
Check for HAVE_EXECINFO_H for all __iwd_backtrace_init usages.
Fixes:
src/main.o: In function `main':
main.c:(.text.startup+0x798): undefined reference to `__iwd_backtrace_init'
collect2: error: ld returned 1 exit status
A sorted list of hidden network BSSs observed in the recent scan
is kept for the informational purposes of the clients. In addition,
it has deprecated the usage of seen_hidden_networks variable.
Refactor the network->psk and network->passphrase loading and saving
logic to not require the PreSharedKey entry in the psk config file and
to generate network->psk lazily on request. Still cache the computed
PSK in memory and in the .psk file to avoid recomputing it which uses
many syscalls. While there update the ask_psk variable to
ask_passphrase because we're specifically asking for the passphrase.
According to the specification, Supported rates IE is supposed
to have a maximum length of eight rate bytes. In the wild an
Access Point is found to add 12 bytes of data instead of placing
excess rate bytes in an Extended Rates IE.
BSS: len 480
BSSID 44:39:C4:XX:XX:XX
Probe Response: true
TSF: 0 (0x0000000000000000)
IEs: len 188
...
Supported rates:
1.0(B) 2.0(B) 5.5(B) 6.0(B) 9.0 11.0(B) 12.0(B) 18.0 Mbit/s
24.0(B) 36.0 48.0 54.0 Mbit/s
82 84 8b 8c 12 96 98 24 b0 48 60 6c .......$.H`l
DSSS parameter set: channel 3
03
...
Any following IEs decode nicely, thus it seems that we can relax
Supported Rates IE length handling to support this thermostat.
After moving AP EAPoL code into eapol.c there were a few functions that
no longer needed to be public API's. These were changed to static's and
the header definition was removed.
Set an upper limit on a fragmented EAP-TLS request size similar to how
we do it in EAP-TTLS. While there make the code more similar to the
EAP-TTLS flag processing to keep them closer in sync. Note that the
spec suggests a 64KB limit but it's not clear if that is for the TLS
record or EAP request although it takes into account the whole TLS
negotiation so it might be good for both.
Some of the TTLS server implementations set the L flag in the fragment
packets other than the first one. To stay interoperable with such devices,
iwd is relaxing the L bit check.
Switch EAP-MD5 to use the common password setting key nomenclature.
The key name has been changed from PREFIX-MD5-Secret to PREFIX-Password.
Note: The old key name is supported.
In addition, this patch adds an ability to request Identity and/or
Password from user.
Adhoc was not waiting for BOTH handshakes to complete before adding the
new peer to the ConnectedPeers property. Actually waiting for the gtk/igtk
(in a previous commit) helps with this, but adhoc also needed to keep track
of which handshakes had completed, and only add the peer once BOTH were done.
This required a small change in netdev, where we memcmp the addresses from
both handshakes and only set the PTK on one.
Currently, netdev triggers the HANDSHAKE_COMPLETE event after completing
the SET_STATION (after setting the pairwise key). Depending on the timing
this may happen before the GTK/IGTK are set which will result in group
traffic not working initially (the GTK/IGTK would still get set, but group
traffic would not work immediately after DBus said you were connected, this
mainly poses a problem with autotests).
In order to fix this, several flags were added in netdev_handshake_state:
ptk_installed, gtk_installed, igtk_installed, and completed. Each of these
flags are set true when their respective keys are set, and in each key
callback we try to trigger the handshake complete event (assuming all the
flags are true). Initially the gtk/igtk flags are set to true, for reasons
explained below.
In the WPA2 case, all the key setter functions are called sequentially from
eapol. With this change, the PTK is now set AFTER the gtk/igtk. This is
because the gtk/igtk are optional and only set if group traffic is allowed.
If the gtk/igtk are not used, we set the PTK and can immediately trigger the
handshake complete event (since gtk_installed/igtk_installed are initialized
as true). When the gtk/igtk are being set, we immediately set their flags to
false and wait for their callbacks in addition to the PTK callback. Doing it
this way handles both group traffic and non group traffic paths.
WPA1 throws a wrench into this since the group keys are obtained in a
separate handshake. For this case a new flag was added to the handshake_state,
'wait_for_gtk'. This allows netdev to set the PTK after the initial 4-way,
but still wait for the gtk/igtk setters to get called before triggering the
handshake complete event. As a precaution, netdev sets a timeout that will
trigger if the gtk/igtk setters are never called. In this case we can still
complete the connection, but print a warning that group traffic will not be
allowed.
==1628== Invalid read of size 1
==1628== at 0x405E71: hardware_rekey_cb (netdev.c:1381)
==1628== by 0x444E5B: process_unicast (genl.c:415)
==1628== by 0x444E5B: received_data (genl.c:534)
==1628== by 0x442032: io_callback (io.c:126)
==1628== by 0x4414CD: l_main_iterate (main.c:387)
==1628== by 0x44158B: l_main_run (main.c:434)
==1628== by 0x403775: main (main.c:489)
==1628== Address 0x5475208 is 312 bytes inside a block of size 320 free'd
==1628== at 0x4C2ED18: free (vg_replace_malloc.c:530)
==1628== by 0x43D94D: l_queue_clear (queue.c:107)
==1628== by 0x43D998: l_queue_destroy (queue.c:82)
==1628== by 0x40B431: netdev_shutdown (netdev.c:4765)
==1628== by 0x403B17: iwd_shutdown (main.c:81)
==1628== by 0x4419D2: signal_callback (signal.c:82)
==1628== by 0x4414CD: l_main_iterate (main.c:387)
==1628== by 0x44158B: l_main_run (main.c:434)
==1628== by 0x403775: main (main.c:489)
==1628== Block was alloc'd at
==1628== at 0x4C2DB6B: malloc (vg_replace_malloc.c:299)
==1628== by 0x43CA4D: l_malloc (util.c:62)
==1628== by 0x40A853: netdev_create_from_genl (netdev.c:4517)
==1628== by 0x444E5B: process_unicast (genl.c:415)
==1628== by 0x444E5B: received_data (genl.c:534)
==1628== by 0x442032: io_callback (io.c:126)
==1628== by 0x4414CD: l_main_iterate (main.c:387)
==1628== by 0x44158B: l_main_run (main.c:434)
==1628== by 0x403775: main (main.c:489)
Adhoc requires 2 GTK's to be set, a single TX GTK and a per-mac RX GTK.
The per-mac RX GTK already gets set via netdev_set_gtk. The single TX GTK
is created the same as AP, where, upon the first station connecting a GTK
is generated and set in the kernel. Then any subsequent stations use
GET_KEY to retrieve the GTK and set it in the handshake.
AdHoc will also need the same functionality to verify and parse the
key sequence from GET_KEY. This block of code was moved from AP's
GET_KEY callback into nl80211_parse_get_key_seq.
Netdev/AP share several NL80211 commands and each has their own
builder API's. These were moved into a common file nl80211_util.[ch].
A helper was added to AP for building NEW_STATION to make the associate
callback look cleaner (rather than manually building NEW_STATION).
Check that netdev->device is not NULL before doing device_remove()
(which would crash) and emitting NETDEV_WATCH_EVENT_DEL. It may be
NULL if the initial RTM_SETLINK has failed to bring device UP.
If there are Ad-hoc BSSes they should be present in the scan results
together with regular APs as far as scan.c is concerned. But in
station mode we can't connect to them -- the Connect method will fail and
autoconnect would fail. Since we have no property to indicate a
network is an IBSS just filter these results out for now. There are
perhaps better solutions but the benefit is very low.
This is a replacement for station's static select_akm_suite. This was
done because wiphy can make a much more intellegent decision about the
akm suite by checking the wiphy supported features e.g. SAE support.
This allows a connection to hybrid WPA2/WPA3 AP's if SAE is not
supported in the kernel.
Set a default GTK cipher type same as our current PTK type, generate a
random GTK when the first STA connects and set it up in the kernel, then
pass the values that EAPoL is going to need to the handshake_state.
In netdev_set_powered also check that no NL80211_CMD_SET_INTERFACE is in
progress because once it returned we would overwrite
netdev->set_powered_cmd_id (could also add a check there but it seems
more logical to just disallow Powered property changes while Mode is
being changed, since we also disallow Mode changes while Powered is
being changed.)
Since device object no longer creates / destroys station objects, use
station_find inside ap directed roam events to direct these to the
station interface.
Add places to store the GTK data, index and RSC in struct
handshake_state and add a setter function for these fields. We may want
to also convert install_gtk to use these fields similar to install_ptk.
As a consequence of the previous commit, netdev watches are always
called when the device object is valid. As a result, we can drop the
netdev_get_device calls and checks from individual AP/AdHoc/Station/WSC
netdev watches
Instead of creating the Station interface in device.c create it directly
on the netdev watch event the same way that the AP and AdHoc interfaces
are created and freed. This fixes some minor incosistencies, for
example station_free was previously called twice, once from device.c and
once from the netdev watch.
device.c would previously keep the pointer returned by station_create()
but that pointer was not actually useful so remove it. Autotests still
seem to pass.
Call netdev_disconnect() to make netdev forget any of station.c's
callbacks for connections or transitions in progress or established.
Otherwise station.c will crash as soon as we're connected and try to
change interface mode:
==17601== Invalid read of size 8
==17601== at 0x11DFA0: station_disconnect_event (station.c:775)
==17601== by 0x11DFA0: station_netdev_event (station.c:1570)
==17601== by 0x115D18: netdev_disconnect_event (netdev.c:868)
==17601== by 0x115D18: netdev_mlme_notify (netdev.c:3403)
==17601== by 0x14E287: l_queue_foreach (queue.c:441)
==17601== by 0x1558B4: process_multicast (genl.c:469)
==17601== by 0x1558B4: received_data (genl.c:532)
==17601== by 0x152888: io_callback (io.c:123)
==17601== by 0x151BCD: l_main_iterate (main.c:376)
==17601== by 0x151C9B: l_main_run (main.c:423)
==17601== by 0x10FE20: main (main.c:489)
Since the interfaces are not supposed to exist when the device is DOWN
(we destroy the interfaces on NETDEV_WATCH_EVENT_DOWN too), don't
create the interfaces if the device hasn't been brought up yet.
When we detect a new device we either bring it down and then up or only
up. The IFF_UP flag in netdev->ifi_flags is updated before that, then
we send the two rtnl commands and then fire the NETDEV_WATCH_EVENT_NEW
event if either the bring up succeeded or -ERFKILL was returned, so the
device may either be UP or DOWN at that point.
It seems that a RTNL NEWLINK notification is usually received before
the RTNL command callback but I don't think this is guaranteed so update
the IFF_UP flag in the callbacks so that the NETDEV_WATCH_EVENT_NEW
handlers can reliably use netdev_get_is_up()
The NL80211_ATTR_KEY_DEFAULT_TYPES attribute is only parsed by the
kernel if either NL80211_ATTR_KEY_DEFAULT or
NL80211_ATTR_KEY_DEFAULT_MGMT are also present, however these are only
used with NL80211_CMD_SET_KEY and ignored for NEW_KEY. As far as I
understand the default key concept only makes sense for a Tx key because
on Rx all keys can be tried, so we don't need this for client mode. The
kernel decides whether the NEW_KEY is for unicast or multicast based on
whether NL80211_ATTR_KEY_MAC was supplied.
device password was read from settings using l_settings_get_string which
returns a newly-allocated string due to un-escape semantics. However,
when assigning wsc->device_password, we strdup-ed the password again
unnecessarily.
==1069== 14 bytes in 2 blocks are definitely lost in loss record 1 of 1
==1069== at 0x4C2AF0F: malloc (vg_replace_malloc.c:299)
==1069== by 0x16696A: l_malloc (util.c:62)
==1069== by 0x16B14B: unescape_value (settings.c:108)
==1069== by 0x16D12C: l_settings_get_string (settings.c:971)
==1069== by 0x149680: eap_wsc_load_settings (eap-wsc.c:1270)
==1069== by 0x146113: eap_load_settings (eap.c:556)
==1069== by 0x12E079: eapol_start (eapol.c:2022)
==1069== by 0x1143A5: netdev_connect_event (netdev.c:1728)
==1069== by 0x118751: netdev_mlme_notify (netdev.c:3406)
==1069== by 0x1734F1: notify_handler (genl.c:454)
==1069== by 0x168987: l_queue_foreach (queue.c:441)
==1069== by 0x173561: process_multicast (genl.c:469)
wsc_pin_is_valid allows two types of PINs through:
1. 4 digit numeric PIN
2. 8 digit numeric PIN
The current code always calls wsc_pin_is_checksum_valid to determine
whether a DEFAULT or USER_SPECIFIED PIN is used. However, this function
is not safe to call on 4 digit PINs and causes a buffer overflow.
Add simple checks to treat 4 digit PINs as DEFAULT PINs and do not call
wsc_pin_is_checksum_valid on these.
Reported-By: Matthias Gerstner <matthias.gerstner@suse.de>
EAP-WSC handles 4 digit, 8 digit and out-of-band Device passwords. The
latter in particular can be anything, so drop the mandatory minimum
password length check here.
This also has the effect of enabling 4-digit PINs to actually work as
they are intended.
The struct allows to support multiple types of the tunneled methods.
Previously, EAP-TTLS was supporting only the eap based ones.
This patch is also starts to move some of the phase 2 EAP
functionality into the new structure.
Boiled down, FT over SAE is no different than FT over PSK, apart from
the different AKM suite. The bulk of this change fixes the current
netdev/station logic related to SAE by rebuilding the RSNE and adding
the MDE if present in the handshake to match what the PSK logic does.
A common function was introduced into station which will rebuild the
handshake rsne's for a target network. This is used for both new
network connections as well as fast transitions.
To prepare for FT over SAE, several case/if statements needed to include
IE_RSN_AKM_SUITE_FT_OVER_SAE. Also a new macro was introduced to remove
duplicate if statement code checking for both FT_OVER_SAE and SAE AKM's.
All the watchlist notify macros were broken in that they did not check
that the watchlist item was still valid before calling it. This only
came into play when a watchlist was being notified and one of the notify
functions removed an item from the same watchlist. It appears this was
already thought of since watchlist_remove checks 'in_notify' and will
mark the item's id as stale (0), but that id never got checked in the
notify macros.
This fixes testAdHoc valgrind warning:
==3347== Invalid read of size 4
==3347== at 0x416612: eapol_rx_auth_packet (eapol.c:1871)
==3347== by 0x416DD4: __eapol_rx_packet (eapol.c:2334)
==3347== by 0x40725B: netdev_pae_read (netdev.c:3515)
==3347== by 0x440958: io_callback (io.c:123)
==3347== by 0x43FDED: l_main_iterate (main.c:376)
==3347== by 0x43FEAB: l_main_run (main.c:423)
==3347== by 0x40377A: main (main.c:489)
...
In the case of the open networks with hidden SSIDs
the settings object is already created.
Valgrind:
==4084== at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==4084== by 0x43B44D: l_malloc (util.c:62)
==4084== by 0x43E3FA: l_settings_new (settings.c:83)
==4084== by 0x41D101: network_connect_new_hidden_network (network.c:1053)
==4084== by 0x4105B7: station_hidden_network_scan_results (station.c:1733)
==4084== by 0x419817: scan_finished (scan.c:1165)
==4084== by 0x419CAA: get_scan_done (scan.c:1191)
==4084== by 0x443562: destroy_request (genl.c:139)
==4084== by 0x4437F7: process_unicast (genl.c:424)
==4084== by 0x4437F7: received_data (genl.c:534)
==4084== by 0x440958: io_callback (io.c:123)
==4084== by 0x43FDED: l_main_iterate (main.c:376)
==4084== by 0x43FEAB: l_main_run (main.c:423)
Some of the PEAP server implementations set the L flag along with
redundant TLS Message Length field for the un-fragmented packets.
This patch allows to identify and handle such occasions.
EAP Extensions type 33 is used in PEAPv0 as a termination
mechanism for the tunneled EAP methods. In PEAPv1
the regular EAP-Success/Failure packets must be used to terminate
the method. Some of the server implementations of PEAPv1
rely on EAP Extensions method to terminate the conversation
instead of the required Success/Failure packets. This patch
makes iwd interoperable with such devices.
The "H" function used by SAE and EAP-PWD was effectively the same
function, EAP-PWD just used a zero key for its calls. This removes
the duplicate implementations and merges them into crypto.c as
"hkdf_256".
Since EAP-PWD always uses a zero'ed key, passing in a NULL key to
hkdf_256 will actually use a 32 byte zero'ed array as the key. This
avoids the need for EAP-PWD to store or create a zero'ed key for
every call.
Both the original "H" functions never called va_end, so that was
added to hkdf_256.
The ifindex as reported by netdev is unsigned, so make sure that it is
printed as such. It is astronomically unlikely that this causes any
actual issues, but lets be paranoid.
Move the roam initiation (signal loss, ap directed roaming) and scanning
details into station from device. Certain device functions have been
exposed temporarily to make this possible.
process_bss performs two main operations. It adds a seen BSS to a
network object (existing or new) and if the device is in the autoconnect
state, it adds an autoconnect entry as needed. Split this operation
into two separate & independent steps.
To avoid confusion in case of an authenticator side handshake_state
structure and eapol_sm structure, rename own_ie to supplicant_ie and
ap_ie to authenticator_ie. Also rename
handshake_state_set_{own,ap}_{rsn,wpa} and fix when we call
handshake_state_setup_own_ciphers. As a result
handshake_state_set_authenticator, if needed, should be called before
handshake_state_set_{own,ap}_{rsn,wpa}.
After EAPOL logic was moved to eapol.c a check was added to
ap_associate_sta_cb to bitwise compare the AP's RSNE to the RSNE
received in the (Re)Association frame. There is as far as I know no
reason for them to be the same (although they are in our autotest) and
if there was a reason we'd rather validate the (Re)Association RSNE
immediately when received. We also must set different RSNEs as the
"own" (supplicant) and "ap" RSNEs in the handshake_state for validation
of step 2/4 in eapol.c (fixes wpa_supplicant's and MS Windows
connections being rejected)
Make sure we interrupt eapol traffic (4-way handshake) if we receive a
Disassociation from station. Actually do this in ap_del_station because
it's called from both ap_disassoc_cb and ap_success_assoc_resp_cb and
seems to make sense in both cases.
On one hand when we're called with HANDSHAKE_EVENT_FAILED or
HANDSHAKE_EVENT_SETTING_KEYS_FAILED the eapol_sm will be freed in
eapol.c, fix a double-free by setting it to NULL before ap_free_sta
is called.
On the other hand make sure we call eapol_sm_free before setting
sta->sm to NULL in ap_drop_rsna to avoid potential leak and avoid
the eapol_sm continuing to use the handshake_state we freed.
timespec_compare wanted to receive network_info structures as arguments
to compare connected_time timestamps but in one instance we were passing
actual timespec structures. Add a new function to compare plain timespec
values and switch the names for readability.
==7330== 112 bytes in 1 blocks are still reachable in loss record 1 of 1
==7330== at 0x4C2CF8F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==7330== by 0x14CF7D: l_malloc (util.c:62)
==7330== by 0x152A25: l_io_new (io.c:172)
==7330== by 0x16B217: l_fswatch_init (fswatch.c:171)
==7330== by 0x16B217: l_fswatch_new (fswatch.c:198)
==7330== by 0x13B9D9: known_networks_init (knownnetworks.c:401)
==7330== by 0x110020: main (main.c:439)
There was somewhat overlapping functionality in the device_watch
infrastructure as well as the netdev_event_watch. This commit combines
the two into a single watch based on the netdev object and cleans up the
various interface additions / removals.
With this commit the interfaces are created when the netdev/device is
switched to Powered=True state AND when the netdev iftype is also in the
correct state for that interface. If the device is brought down, then
all interfaces except the .Device interface are removed.
This will make it easy to implement Device.Mode property properly since
most nl80211 devices need to be brought into Powered=False state prior
to switching the iftype.
The way that netdev_set_linkmode_and_operstate was used resulted in
potential crashes when the netdev was destroyed. This is because netdev
was given as data to l_netlink_send and could be destroyed between the
time of the call and the callback. Since the result of calls to
netdev_set_linkmode_and_operstate is inconsequential, it isn't really
worthwhile tracking these calls in order to cancel them.
This patch simplies the handling of these rtnl calls, makes sure that
netdev isn't passed as user data and rewrites the
netdev_set_linkmode_and_operstate signature to be more consistent with
rtnl_set_powered.
Since all netdevs share the rtnl l_netlink object, it was possible for
netdevs to be destroyed with outstanding commands still executing on the
rtnl object. This can lead to crashes and other nasty situations.
This patch makes sure that Powered requests are always tracked via
set_powered_cmd_id and the request is canceled when netdev is destroyed.
This also implies that netdev_set_powered can now return an -EBUSY error
in case a request is already outstanding.
SAE is meant to work in a peer-to-peer fashion where neither side acts
as a dedicated authenticator or supplicant. This was not the case with
the current code. The handshake state authenticator address was hard
coded as the destination address for all packets, which will not work
when mesh comes into play. This also made unit testing the full SAE
procedure with two sae_sm's impossible.
This patch adds a peer address element to sae_sm which is filled with
either aa/spa based on the value of handshake->authenticator
This removes the authenticator bit in eapol_sm as well as unifies
eapol_register_authenticator and eapol_register. Taking advantage
of the handshake state authenticator bit we no longer have a need
for 2 separate register functions.
ap, and adhoc were also updated to set the authenticator bit in
the handshake and only use eapol_register to register their sm's.
netdev was updated to use the authenticator bit when choosing the
correct key address for adhoc.
Both SAE and adhoc can benefit from knowing whether the handshake state
is an authenticator or a supplicant. It will allow both to easily
obtain the remote address rather than sorting out if aa/spa match the
devices own address.
The send confirm counter is incremented before calling sae_send_confirm
in all cases, but the function itself was also incrementing sc after
sending the packet. This isn't critical to the successful execution of
SAE as the AP just uses the sc value in the packet but it did violate
the 802.11 spec.
In order to plug SAE into the existing connect mechanism the actual
CMD_CONNECT message is never sent, rather sae_register takes care
of sending out CMD_AUTHENTICATE. This required some shuffling of
code in order to handle both eapol and sae. In the case of non-SAE
authentication everything behaves as it did before. When using SAE
an sae_sm is created when a connection is attempted but the eapol_sm
is not. After SAE succeeds it will start association and then create
the eapol_sm and start the 4-way handshake.
This change also adds the handshake SAE events to device and
initializes SAE in main.
SAE (Simultaneous Authentication of Equals) takes place during
authentication, and followed by EAPoL/4-way handshake. This
module handles the entire SAE commit/confirm exchange. This was
done similar to eapol.
SAE begins when sae_register is called. At this point a commit
message will be created and sent out which kicks off the SAE
authentication procedure.
The commit/confirm exchange is very similar to EAP-PWD, so all
the ecc utility functions could be re-used as-is. A few new ecc
utility functions were added to conform to the 80211 'blinding'
technique for computing the password element.
For an SAE network, the raw passphrase is required. For this reason,
known network psk files should now always contain a 'Passphrase' entry.
If a psk file is found without a Passphrase entry the agent will be asked
for the Passphrase before connecting. This will update the legacy psk
file with the Passphrase entry.
Due to the quirk in how storage_network_sync implements file writing,
iwd was generating unnecessary KnownNetwork removal events (and
preventing certain test cases from passing successfully)
storage_network_sync tries to perform atomic writes by writing to a
temporary storage location first, unlinking the existing file and
renaming the tmp file as the original.
This generates a set of inotify events which confuses the current
implementation.
The previous change did not consider the case of the PSK being written
for the very first time. In this case storage_network_open would return
NULL and an empty file would be written.
Change this so that if storage_network_open fails, then the current
network settings are written to disk and not a temporary.
Reload the network settings from disk before calling
storage_network_sync in network_sync_psk to avoid potentially
overwriting changes made to the storage by user since the connection
attempt started. This won't account for all situations but it
covers some of them and doesn't cost us much.
Our logic would set CONTROL_PORT_OVER_NL80211 even in cases where
CONTROL_PORT wasn't used (e.g. for open networks). While the kernel
ignored this attribute in this case, it is nicer to set this only if
CONTROL_PORT is intended to be used.
SAE will require some of the same CMD_ASSOCIATE building code that
FT currently uses. This breaks out the common code from FT into
netdev_build_cmd_associate_common.
This also required passing in the akm suite in case the key description
version was zero. In the zero case the akm must be checked. For now this
only supports the SAE akm.
Update the known networks list and network properties on file creations,
removals and modifications. We watch for these filesystem events using
ell's fswatch and react accordingly.
This makes testEAP-PEAP-GTC pass for me by re-adding the check for the
GTC-Secret setting which was replaced with the check for the secrets
list in 3d2285ec7e.
eap_append_secret now takes a new cache_policy parameter which can be
used by the EAP method to signal that the value received from the agent
is to never be cached, i.e. each value can only be used once. The
parameter value should be EAP_CACHE_NEVER for this and we use this in
value EAP-GTC where the secret tokens are one time use. The
EAP_CACHE_TEMPORARY value is used in other methods, it preserves the
default behaviour where a secret can be cached for as long as the
network stays in range (this is the current implementation more than a
design choice I believe, I didn't go for a more specific enum name as
this may still change I suppose).
SAE generates the PMKID during the authentication process, rather than
generating it on-the-fly using the PMK. For this reason SAE needs to be
able to set the PMKID once its generated. A new flag was also added
(has_pmkid) which signifies if the PMKID was set or if it should be
generated.
SAE needs access to the raw passphrase, not the PSK which network
saves. This changes saves the passphrase in network and handshake
objects, as well as adds getters to both objects so SAE can retrieve
the passphrase.
This fixes improper cleanup when ofono leaves the bus after a simauth
instance has been cleaned up. The problem was that the plugin
exit was being called after the simauth module, causing there to
be stale simauth instances that were no longer valid. Now plugins
cleanup before simauth.
This fix fixes the print seen when iwd exits:
"Auth provider queue was not empty on exit!"
Make the network_storage_* functions uniformly accept an enum value
instead of a string so that he conversion to string doesn't need to
happen in all callers.
Now, EAP-GTC behaves similar to MSCHAPv2 where check_settings allows
for missing EAP-Identity and GTC-Secret fields. Either or both can be
missing and the agent will request the missing fields.
Add ObjectManager objects with properties for each Known Network so that
signals are emitted for creation or removal of a Known Network and a
Property Changed is emitted on LastConnectedTime change. Remove the
ListKnownNetworks method from the old KnownNetworks interface.
Note this breaks clients that used the known networks interface.
Drop the corresponding network_info field, function and D-Bus property.
The last seen times didn't seem useful but if a client needs them it can
probably implement the same logic with the information already available
through DBus.
If the sm object (or the handshake object) is NULL, don't call the
corresponding function.
0 0x7fb6cd37da80 in /lib64/libc.so.6
1 0x414764 in eapol_sm_destroy() at eapol.c:673
2 0x42e402 in ap_sta_free() at ap.c:97
3 0x439dbe in l_queue_clear() at /home/parallels/wrk/iwd/ell/queue.c:109
4 0x439e09 in l_queue_destroy() at /home/parallels/wrk/iwd/ell/queue.c:83
5 0x42e4bf in ap_reset() at ap.c:132
6 0x42e519 in ap_free() at ap.c:147
7 0x447456 in interface_instance_free() at /home/parallels/wrk/iwd/ell/dbus-service.c:513
8 0x449be0 in _dbus_object_tree_remove_interface() at /home/parallels/wrk/iwd/ell/dbus-service.c:1595
9 0x449ced in _dbus_object_tree_object_destroy() at /home/parallels/wrk/iwd/ell/dbus-service.c:787
10 0x40fb8c in device_free() at device.c:2717
11 0x405cdb in netdev_free() at netdev.c:605
12 0x439dbe in l_queue_clear() at /home/parallels/wrk/iwd/ell/queue.c:109
13 0x439e09 in l_queue_destroy() at /home/parallels/wrk/iwd/ell/queue.c:83
14 0x40aac2 in netdev_shutdown() at netdev.c:4483
15 0x403b75 in iwd_shutdown() at main.c:80
16 0x43d9f3 in signal_callback() at /home/parallels/wrk/iwd/ell/signal.c:83
17 0x43d4ee in l_main_iterate() at /home/parallels/wrk/iwd/ell/main.c:376
18 0x43d5ac in l_main_run() at /home/parallels/wrk/iwd/ell/main.c:419
19 0x40379b in main() at main.c:454
20 0x7fb6cd36788a in /lib64/libc.so.6
Until now network.c managed the list of network_info structs including
for known networks and networks that are seen in at least one device's
scan results, with the is_known flag to distinguish known networks.
Each time the list was processed though the code was either interested
in one subset of networks or the other. Split the list into a Known
Networks list and the list of other networks seen in scans. Move all
code related to Known Networks to knownnetworks.c, this simplifies
network.h. It also gets rid of network_info_get_known which actually
returned the list of all network_infos (not just for known networks),
which logically should have been private to network.c. Update device.c
and scan.c to use functions specific to Known Networks instead of
filtering the lists by the is_known flag.
This will also allow knownnetworks.c to export DBus objects and/or
properties for the Known Networks information because it now knows when
Known Networks are added, removed or modified by IWD.
The return value from network_connected is not checked and even if one
of the storage operations fails the function should probably continue
so only print a message on error.
If the device mode it toggled from 'ap' back to 'station' without actually
starting the access point ap_free attempts to zero out the psk, which
causes a crash because it had never been allocated (Start() never was
called). Since ap->psk is actually never used this was removed. Also added
a memset to zero out the pmk on cleanup.
This is the crash observed:
++++++++ backtrace ++++++++
0 0x7f6ffe978a80 in /lib64/libc.so.6
1 0x7f6ffe9d6766 in /lib64/libc.so.6
2 0x42dd51 in memset() at /usr/include/bits/string3.h:90
3 0x42ddd9 in ap_free() at src/ap.c:144
4 0x445ec6 in interface_instance_free() at ell/dbus-service.c:513
5 0x448650 in _dbus_object_tree_remove_interface() at ell/dbus-service.c:1595
6 0x40d980 in device_set_mode_sta() at src/device.c:2113
7 0x447d4c in properties_set() at ell/dbus-service.c:1861
8 0x448a33 in _dbus_object_tree_dispatch() at ell/dbus-service.c:1691
9 0x442587 in message_read_handler() at ell/dbus.c:285
10 0x43cac9 in io_callback() at ell/io.c:123
11 0x43bf5e in l_main_iterate() at ell/main.c:376
12 0x43c01c in l_main_run() at ell/main.c:419
13 0x40379d in main() at src/main.c:460
14 0x7f6ffe96288a in /lib64/libc.so.6
+++++++++++++++++++++++++++
- wsc module does not need nl80211 any longer, so remove it.
- Move wsc_init & wsc_exit declarations to iwd.h and remove wsc.h
- re-arrange how wsc_init & wsc_exit is called inside main.c.
The plugin_exit was in the wrong place, it should be triggered in case
genl creation fails. Also adhoc_exit was in the wrong sequence compared
to _init()
Rather than have device.c manage the creation/removal of
AP/AdHoc interfaces this new event was introduced. Now
anyone can listen for device events and if the mode changes
handle accordingly. This fixes potential memory leaks
in WSC when switching modes as well.
These will issue a JOIN/LEAVE_IBSS to the kernel. There is
a TODO regarding network configuration. For now, only the
SSID is configurable. This configuration is also required
for AP, but needs to be thought out. Since the current
AP Dbus API has nothing related to configuration items
such as freq/channel or RSN elements they are hard coded,
and will be for Ad-Hoc as well (for now).
Now that the device mode can be changed, netdev must check that
the iftype is correct before starting a connection or disconnecting.
netdev_connect, netdev_connect_wsc, and netdev_disconnect now check
that the iftype is station before continuing.
With the introduction of Ad-Hoc, its not as simple as choosing
aa/spa addresses when setting the keys. Since Ad-Hoc acts as
both the authenticator and supplicant we must check how the netdev
address relates to the particular handshake object as well as
choose the correct key depending on the value of the AA/SPA address.
802.11 states that the higher of the two addresses is to be used
to set the key for the Ad-Hoc connection.
A simple helper was added to choose the correct addressed based on
netdev type and handshake state. netdev_set_tk also checks that
aa > spa in the handshake object when in Ad-Hoc mode. If this is
true then the keys from that handshake are used, otherwise return
and the other handshake key will be used (aa will be > spa).
The station/ap mode behaves exactly the same as before.
For Ad-Hoc networks, the kernel takes care of auth/assoc
and issues a NEW_STATION event when that is complete. This
provides a way to notify when NEW_STATION events occur as
well as forward the MAC of the station to Ad-Hoc.
The two new API's added:
- netdev_station_watch_add()
- netdev_station_watch_remove()
When the EAPOL-Key data field is encrypted using AES Wrap, check
that the data field is large enough before calculating the expected
plaintext length.
Previously, if the encrypted data field was smaller than 8 bytes, an
integer underflow would occur when calculating the expected plaintext
data length. This would cause iwd to try to allocate a huge amount of
memory, which causes it to abort and terminate. If the data field was
equal to 8 bytes, iwd would try to allocate 0 bytes of memory, making
l_new return NULL, which subsequently causes iwd to crash on a NULL
pointer deference.
Reported-by: Mathy Vanhoef <Mathy.Vanhoef@cs.kuleuven.be>
triggered flag was being reset to false in all cases. However, due to
how scan_finished logic works, it should have remained true if no more
commands were left to be sent (e.g. the scan was finished).
Having hidden SSIDs or SSIDs with non-UTF8 characters around make iwd
flood the logs with messages. Make iwd less verbose and show these
messages with enabled debug output only.
In addition, the periodic scan can now alternate between the
active or passive modes. The active mode is enabled by existence of
the known hidden networks and observation of them in the
previous scan result.
To support an auto-connect for the hidden networks and having
a limited number of SSIDs that can be appended into a probe
request, introduced a concept of a command batch. Now, scan request
may consist of a series of commands. The commands in the batch
are triggered sequentially. Once we are notified about the
results from a previous command, a consequent command in the
batch is triggered. The collective results are reported once
the batch is complete. On a command failure, the batch
processing is canceled and scan request is removed
Rework the logic slightly to simplify the need for error labels. Also
the connect_pending variable might not have been properly reset to NULL
in case of error, so make sure we reset it prior to calling into
network_connect_new_hidden_network
1) Change signature of process_bss to return a confirmation
that bss has been added to a network otherwise we can
discard it.
2) Implements logic for the discovery and connection to
a hidden network.
This removes the need for duplicate code in AP/netdev for issuing
a DEL_STATION command. Now AP can issue a DEL_STATION with
netdev_del_station, and specify to either disassociate or deauth
depending on state.
If netdev fails to set the keys, there was no way for device/ap to
know. A new handshake event was added for this. The key setting
failure function was also fixed to support both AP/station iftypes.
It will now automatically send either a disconnect or del_station
depending on the interface type.
In similar manner, netdev_handshake_failed was also modified to
support both AP/station iftypes. Now, any handshake event listeners
should call netdev_handshake_failed upon a handshake failure
event, including AP.
If device is already disconnected or in autoconnect mode, don't return
an error if .Disconnect is called. Instead simply silently return
success after disabling autoconnect.
==1058== 231 (32 direct, 199 indirect) bytes in 1 blocks are definitely lost in loss record 10 of 10
==1058== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1058== by 0x452472: l_malloc (util.c:62)
==1058== by 0x456324: l_settings_new (settings.c:83)
==1058== by 0x427D45: storage_network_open (storage.c:262)
==1058== by 0x42806C: network_settings_load (network.c:75)
==1058== by 0x428C2F: network_autoconnect (network.c:490)
==1058== by 0x4104E9: device_autoconnect_next (device.c:194)
==1058== by 0x410E38: device_set_scan_results (device.c:393)
==1058== by 0x410EFA: new_scan_results (device.c:414)
==1058== by 0x424A6D: scan_finished (scan.c:1012)
==1058== by 0x424B88: get_scan_done (scan.c:1038)
==1058== by 0x45DC67: destroy_request (genl.c:134)