- Ensure that input isn't an empty string
- Ensure that EINVAL errno (which could be optionally returned by
strto{ul|l} is also checked.
- Since strtoul allows '+' and '-' characters in input, ensure that
input which is expected to be an unsigned number doesn't start with
'-'
Given an ASN1 blob of the right form, parse and create
an l_ecc_point object. The form used is specific to DPP
hence why this isn't general purpose and put into dpp-util.
Like in ap.c, allow the event callback to mark the handshake state as
destroyed, without causing invalid accesses after the callback has
returned. In this case the crash was because try_handshake_complete
needed to access members of handshake_state after emitting the event,
as well as access the netdev, which also has been destroyed:
==257707== Invalid read of size 8
==257707== at 0x408C85: try_handshake_complete (netdev.c:1487)
==257707== by 0x408C85: try_handshake_complete (netdev.c:1480)
(...)
==257707== Address 0x4e187e8 is 856 bytes inside a block of size 872 free'd
==257707== at 0x484621F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==257707== by 0x437887: ap_stop_handshake (ap.c:151)
==257707== by 0x439793: ap_del_station (ap.c:316)
==257707== by 0x43EA92: ap_station_disconnect (ap.c:3411)
==257707== by 0x43EA92: ap_station_disconnect (ap.c:3399)
==257707== by 0x454276: p2p_group_event (p2p.c:1006)
==257707== by 0x439147: ap_event (ap.c:281)
==257707== by 0x4393AB: ap_new_rsna (ap.c:390)
==257707== by 0x4393AB: ap_handshake_event (ap.c:1010)
==257707== by 0x408C7F: try_handshake_complete (netdev.c:1485)
==257707== by 0x408C7F: try_handshake_complete (netdev.c:1480)
(...)
Previously we added logic to defer doing anything in ap_free() to after
the AP event handler has returned so that ap_event() has a chance to
inform whoever called it that the ap_state has been freed. But there's
also a chance that the event handler is destroying both the AP and the
netdev it runs on, so after the handler has returned we can't even use
netdev_get_wdev_id or netdev_get_ifindex. The easiest solution seems to
be to call ap_reset() in ap_free() even if we're within an event handler
to ensure we no longer need any external objects. Also make sure
ap_reset() can be called multiple times.
Another option would be to watch for NETDEV_WATCH_EVENT_DEL and remove
our reference to the netdev (because there's no need actually call
l_rtnl_ifaddr_delete or frame_watch_wdev_remove if the netdev was
destroyed -- frame_watch already tracks netdev removals), or to save
just the ifindex and the wdev id...
The purpose of this was to have a single utility to both cancel an
existing offchannel operation (if one exists) and start a new one.
The problem was the previous offchannel operation was being canceled
first which opened up the radio work queue to other items. This is
not desireable as, for example, a scan would end up breaking the
DPP protocol most likely.
Starting the new offchannel then canceling is the correct order of
operations but to do this required saving the new ID, canceling, then
setting offchannel_id to the new ID so dpp_presence_timeout wouldn't
overwrite the new ID to zero.
This also removes an explicit call to offchannel_cancel which is
already done by dpp_offchannel_start.
Several members are named based on initiator/responder (i/r)
terminology. Eventually both initiator and responder will be
supported so rename these members to use own/peer naming
instead.
ASN1 parsing will soon be required which will need some utilities in
asn1-private.h. To avoid duplication include this private header and
replace the OID's with the defined structures as well as remove the
duplicated macros.
station_set_scan_results takes an autoconnect flag which was being
set true in both regular/quick autoconnect scans. Since OWE networks
are processed after setting the scan results IWD could end up
connecting to a network before all the OWE hidden networks are
populated.
To fix this regular/quick autoconnect results will set the flag to
false, then process OWE networks, then start autoconnect. If any
OWE network scans are pending station_autoconnect_start will fail
but will pick back up after the hidden OWE scan.
scan_request_failed and scan_finished remove the finished scan_request
from the request queue right away, before calling the callback. This
breaks those clients that rely on scan_cancel working on such requests
(i.e. to force the destroy callback to be invoked synchronously, see
a0911ca778 ("station: Make sure roam_scan_id is always canceled").
Fix this by removing the scan_request from the request queue after
invoking the callback. Also provide a re-entrancy guard that will make
sure that the scan_request isn't removed in scan_cancel itself.
There are similar operations being performed but with different
callbacks and userdata, depending on whether 'sr' is NULL or not.
Optimize the function flow slightly to make if-else unnecessary.
While here, update the comment. periodic scans are now scheduled only
based on the periodic timeout timer.
If periodic scan is active and we receive a SCAN_ABORTED event, we would
still invoke the periodic scan callback with an error. This is rather
pointless since the periodic scan callback cannot do anything useful
with this information. Fix that.
We should never reach a point where NEW_SCAN_RESULTS or SCAN_ABORTED are
received before a corresponding TRIGGER_SCAN is received. Even if this
does happen, there's no harm from processing the commands anyway.
This makes it a little easier to book-keep the started variable. Since
scan_request already has a 'passive' bit-field, there should be no
storage penalty.
If scan_cancel is called on a scan_request that is 'finished' but with
the GET_SCAN command still in flight, it will trigger a crash as
follows:
Received Deauthentication event, reason: 2, from_ap: true
src/station.c:station_disconnect_event() 11
src/station.c:station_disassociated() 11
src/station.c:station_reset_connection_state() 11
src/station.c:station_roam_state_clear() 11
src/scan.c:scan_cancel() Trying to cancel scan id 6 for wdev 200000002
src/scan.c:scan_cancel() Scan is at the top of the queue, but not triggered
src/scan.c:get_scan_done() get_scan_done
Aborting (signal 11) [/home/denkenz/iwd-master/src/iwd]
++++++++ backtrace ++++++++
#0 0x7f9871aef3f0 in /lib64/libc.so.6
#1 0x41f470 in station_roam_scan_notify() at /home/denkenz/iwd-master/src/station.c:2285
#2 0x43936a in scan_finished() at /home/denkenz/iwd-master/src/scan.c:1709
#3 0x439495 in get_scan_done() at /home/denkenz/iwd-master/src/scan.c:1739
#4 0x4bdef5 in destroy_request() at /home/denkenz/iwd-master/ell/genl.c:676
#5 0x4c070b in l_genl_family_cancel() at /home/denkenz/iwd-master/ell/genl.c:1960
#6 0x437069 in scan_cancel() at /home/denkenz/iwd-master/src/scan.c:842
#7 0x41dc2e in station_roam_state_clear() at /home/denkenz/iwd-master/src/station.c:1594
#8 0x41dd2b in station_reset_connection_state() at /home/denkenz/iwd-master/src/station.c:1619
#9 0x41dea4 in station_disassociated() at /home/denkenz/iwd-master/src/station.c:1644
The happens because get_scan_done callback is still called as a result of
l_genl_cancel. Add a re-entrancy guard in the form of 'canceled'
variable in struct scan_request. If set, get_scan_done will skip invoking
scan_finished.
It isn't clear what 'l_queue_peek_head() == results->sr' check was trying
to accomplish. If GET_SCAN dump was scheduled, then it should be
reported. Drop it.
results->sr is set to NULL for 'opportunistic' scans which were
triggered externally. See scan_notify() for details. However,
get_scan_done would only invoke scan_finished (and thus the periodic
scan callback sc->sp.callback) only if the scan queue was empty. It
should do so in all cases.
The point type was being hard coded to 0x3 (BIT1) which may have resulted
in the peer subtracting Y from P when reading in the point (depending on
if Y was odd or not).
Instead set the compressed type to whatever avoids the subtraction which
both saves IWD from needing to do it, as well as the peer.
The intent of this check is to make sure that at least 2 bytes are
available for reading. However, the unintended consequence is that tags
with a zero length at the end of input would be rejected.
While here, rework the check to be more resistant to potential
overflow conditions.
The DPP spec says nothing about how to handle re-transmits but it
was found in testing this can happen relatively easily for a few
reasons.
If the configurator requests a channel switch but does not get onto
the new channel quick enough the enrollee may have already sent the
authenticate response and it was missed. Also by nature of how the
kernel goes offchannel there are moments in time between ROC when
the card is idle and not receiving any frames.
Only frames where there was no ACK will be retransmitted. If the
peer received the frame and dropped it resending the same frame wont
do any good.
Now the result is sent immediately. Prior a connect attempt or
scan could have started, potentially losing this frame. In addition
the offchannel operation is cancelled after sending the result
which will allow the subsequent connect or scan to happen much
faster since it doesn't have to wait for ROC to expire.
The previous (incorrect) else was removed since it ended up
printing in most cases since the if clause returned. This should
have been an else if conditional from the start and only print if the
station device was not found.
IWD may be in the middle of some long operation, e.g. scanning.
If the URI is returned before IWD is ready, a configurator could
start sending frames and IWD either wont receive them, or will
be unable to respond quickly.
The offchannel priority was also changed to zero, which matches the
priority of frames. Currently there should be no interaction between
offchannel and connect (previous offchannel priority).
Periodic scans were handled specially where they were only
started if no other requests were pending in the scan queue.
This is fine, and what we want, but this can actually be
handled automatically by nature of the wiphy work queue rather
than needing to check the request queue explicitly.
Instead we can insert periodic scans at a lower priority than
other scans. This puts them at the end of the work queue, as
well as allows future requests to jump ahead if a periodic scan
has not yet started.
Eventually, once all pending scans are done, the peridoic scan
may begin. This is no different than the preivous behavior and
avoids the need for any special checks once scan requests
complete.
One check was added to address the problem of the periodic scan
timer firing before the scan could even start. Currently this
happened to be handled fine in scan_periodic_queue, as it checks
the queue length. Since this check was removed we must see check
for this condition inside scan_periodic_timeout.
This adds a priority argument to scan_common rather than hard
coding it when inserting the work item and uses the newly
defined wiphy priority for scanning.
Work priority was never explicitly defined anywhere, and a module
using wiphy_radio_work APIs needed to ensure it was not inserting
at a priority that would interfere with other work.
Now all the types of work have been defined with their own priority
and future priorities can easily be added before, after, or in
between existing priorities.
- Mostly problems with whitespace:
- Use of spaces instead of tabs
- Stray spaces before closing ')
- Missing spaces
- Missing 'void' from function declarations & definitions that
take no arguments.
- Wrong indentation level
When this attribute is included, the initiator is requesting all
future frames be sent on this channel. There is no reason for a
configurator to act on this attribute (at least for now) so the
request frame will be dropped in this case. Enrollees will act
on it by switching to the new channel and sending the authentication
response.
While connected the driver ends up choosing quite small ROC
durations leading to excessive calls to ROC. This also will
negatively effect any wireless performance for the current
network and possibly lead to missed DPP frames.
Currently the enrollee relied on autoconnect to handle connecting
to the newly configured network. This usually resulted in poor
performance since periodic scans are done at large intervals apart.
Instead first check if the newly configured network is already
in IWD's network queue. If so it can be connected to immediately.
If not, a full scan must be done and results given to station.
With better JSON support the configuration request object
can now be fully parsed. As stated in the previous comment
there really isn't much use from the configurator side apart
from verifying mandatory values are included.
This patch also modifies the configuration result to handle
sending non 'OK' status codes in case of JSON parsing errors.
json_iter_parse is only meant to work on objects while
json_iter_next is only meant to work on arrays.
This adds checks in both APIs to ensure they aren't being
used incorrectly.
Arrays can now be parsed using the JSON_ARRAY type (stored in
a struct json_iter) then iterated using json_iter_next. When
iterating the type can be checked with json_iter_get_type. For
each iteration the value can be obtained using any of the type
getters (int/uint/boolean/null).
This adds support for boolean, (unsigned) integers, and
null types. JSON_PRIMITIVE should be used as the type when
parsing and the value should be struct json_iter.
Once parsed the actual value can be obtained using one of
the primitive getters. If the type does not match they will
return false.
If using JSON_OPTIONAL with JSON_PRIMITIVE the resulting
iterator can be checked with json_iter_is_valid. If false
the key/value was not found or the type was not matching.
First, this was renamed to 'count_tokens_in_container' to be
more general purpose (i.e. include future array counting).
The way the tokens are counted also changed to be more intuitive.
While the previous way was correct, it was somewhat convoluted in
how it worked (finding the next parent of the objects parent).
Instead we can use the container token itself as the parent and
begin counting tokens. When we find a token with a parent index
less than the target we have reached the end of this container.
This also works for nested containers, including arrays since we
no longer rely on a key (which an array element would not have).
For example::
{
"first":{"foo":"bar"},
"second":{"foo2":"bar2"}
}
index 0 <overall object>
index 1 "first" with parent 0
index 2 {"foo":"bar"} with parent 1
Counting tokens inside "first"'s object we have:
index 3 "foo" with parent 2
index 4 "bar" with parent 3
If we continue counting we reach:
index 5 "second" with parent 0
This terminates the counting loop since the parent index is
less than '2' (the index of {"foo":"bar"} object).
In file included from ./ell/ell.h:15,
from ../../src/dpp.c:29:
../../src/dpp.c: In function ‘authenticate_request’:
../../ell/log.h:79:22: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 8 has type ‘size_t’ {aka ‘unsigned int’} [-Wformat=]
79 | l_log(L_LOG_DEBUG, "%s:%s() " format, __FILE__, \
| ^~~~~~~~~~
../../ell/log.h:54:16: note: in definition of macro ‘l_log’
54 | __func__, format "\n", ##__VA_ARGS__)
| ^~~~~~
../../ell/log.h:103:31: note: in expansion of macro ‘L_DEBUG_SYMBOL’
103 | #define l_debug(format, ...) L_DEBUG_SYMBOL(__debug_desc, format, ##__VA_ARGS__)
| ^~~~~~~~~~~~~~
../../src/dpp.c:1235:3: note: in expansion of macro ‘l_debug’
1235 | l_debug("I-Nonce has unexpected length %lu", i_nonce_len);
| ^~~~~~~
Direct leak of 64 byte(s) in 1 object(s) allocated from:
#0 0x7fa226fbf0f8 in __interceptor_malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/9.4.0/libasan.so.5+0x10c0f8)
#1 0x688c98 in l_malloc ell/util.c:62
#2 0x6c2b19 in msg_alloc ell/genl.c:740
#3 0x6cb32c in l_genl_msg_new_sized ell/genl.c:1567
#4 0x424f57 in netdev_build_cmd_authenticate src/netdev.c:3285
#5 0x425b50 in netdev_sae_tx_authenticate src/netdev.c:3385
Direct leak of 7 byte(s) in 1 object(s) allocated from:
#0 0x7fd748ad00f8 in __interceptor_malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/9.4.0/libasan.so.5+0x10c0f8)
#1 0x688c21 in l_malloc ell/util.c:62
#2 0x4beec7 in handshake_state_set_vendor_ies src/handshake.c:324
#3 0x464e4e in station_handshake_setup src/station.c:1203
#4 0x472a2f in __station_connect_network src/station.c:2975
#5 0x473a30 in station_connect_network src/station.c:3078
#6 0x4ed728 in network_connect_8021x src/network.c:1497
Fixes: f24cfa481b ("handshake: Add setter for vendor IEs")
This implements a configurator in the responder role. Currently
configuring an enrollee is limited to only the connected network.
This is to avoid the need to go offchannel for any reason. But
because of this a roam, channel switch, or disconnect will cause
the configuration to fail as none of the frames are being sent
offchannel.
Added both enrollee and configurator roles, as well as the needed
logic inside the authentication protocol to verify role compatibility.
The dpp_sm's role will now be used when setting capability bits making
the auth protocol agnostic to enrollees or configurators.
This also allows the card to re-issue ROC if it ends in the middle of
authenticating or configuring as well as add a maximum timeout for
auth/config protocols.
IO errors were also handled as these sometimes can happen with
certain drivers but are not fatal.
Allows creating a new configuration object based on settings, ssid,
and akm suite (for configurator role) as well as converting a
configuration object to JSON.
Rather than hard coding ad0, use the actual frame data. There really
isn't a reason this would differ (only status attribute) but just
in case its better to use the frame data directly.
This is a minimal implementation only supporting legacy network
configuration, i.e. only SSID and PSK/passphrase are supported.
Missing features include:
- Fragmentation/comeback delay support
- DPP AKM support
- 8021x/PKEX support
This implements the DPP protocol used to authenticate to a
DPP configurator.
Note this is not a full implementation of the protocol and
there are a few missing features which will be added as
needed:
- Mutual authentication (needed for BLE bootstrapping)
- Configurator support
- Initiator role
The presence procedure implemented is a far cry from what the spec
actually wants. There are two reason for this: a) the kernels offchannel
support is not at a level where it will work without rather annoying
work arounds, and b) doing the procedure outlined in the spec will
result in terrible discovery performance.
Because of this a simpler single channel announcement is done by default
and the full presence procedure is left out until/if it is needed.
This is a minimal wrapper around jsmn.h to make things a bit easier
for iterating through a JSON object.
To use, first parse the JSON and create a contents object using
json_contents_new(). This object can then be used to initialize a
json_iter object using json_iter_init().
The json_iter object can then be parsed with json_iter_parse by
passing in JSON_MANDATORY/JSON_OPTIONAL arguments. Currently only
JSON_STRING and JSON_OBJECT types are supported. Any JSON_MANDATORY
values that are not found will result in an error.
If a JSON_OPTIONAL string is not found, the pointer will be NULL.
If a JSON_OPTIONAL object is not found, this iterator will be
initialized but 'start' will be -1. This can be checked with a
convenience macro json_object_not_found();
Static analysis was not happy since this return can be negative and
it was being fed into an unsigned argument. In reality this cannot
happen since the key buffer is always set to the maximum size supported
by any curves.
This module provides a convenient wrapper around both
CMD_[CANCEL_]_REMAIN_ON_CHANNEL APIs.
Certain protocols require going offchannel to send frames, and/or
wait for a response. The frame-xchg module somewhat does this but
has some limitations. For example you cannot just go offchannel;
an initial frame must be sent out to start the procedure. In addition
frame-xchg does not work for broadcasts since it expects an ACK.
This module is much simpler and only handles going offchannel for
a duration. During this time frames may be sent or received. After
the duration the caller will get a callback and any included error
if there was one. Any offchannel request can be cancelled prior to
the duration expriring if the offchannel work has finished early.
The disconnect event handler was mistakenly bailing out if FT or
reassociation was going on. This was done because a disconnect
event is sent by the kernel when CMD_AUTH/CMD_ASSOC is used.
The problem is an AP could also disconnect IWD which should never
be ignored.
To fix this always parse the disconnect event and, if issued by
the AP, always notify watchers of the disconnect.
LLD 13 and GNU ld 2.37 support -z start-stop-gc which allows garbage
collection of C identifier name sections despite the __start_/__stop_
references. GNU ld before 2015-10 had the behavior as well. Simply set
the retain attribute so that GCC 11 (if configure-time binutils is 2.36
or newer)/Clang 13 will set the SHF_GNU_RETAIN section attribute to
prevent garbage collection.
Without the patch, there are linker errors with -z start-stop-gc
(LLD default) when -Wl,--gc-sections is used:
```
ld.lld: error: undefined symbol: __start___eap
>>> referenced by eap.c
>>> src/eap.o:(eap_init)
```
The remain attribute will not be needed if the metadata sections are
referenced by code directly.
ap.c has been mostly careful to call the event handler at the end of any
externally called function to allow methods like ap_free() to be called
within the handler, but that isn't enough. For example in
ap_del_station we may end up emitting two events: STATION_REMOVED and
DHCP_LEASE_EXPIRED. Use a slightly more complicated mechanism to
explicitly guard ap_free calls inside the event handler.
To make it easier, simplify cleanup in ap_assoc_reassoc with the use of
_auto_.
In ap_del_station reorder the actions to send the STATION_REMOVED event
first as the DHCP_LEASE_EXPIRED is a consequence of the former and it
makes sense for the handler to react to it first.
src/eap.c: In function 'eap_rx_packet':
src/eap.c:419:50: error: 'vendor_type' may be used uninitialized in this function [-Werror=maybe-uninitialized]
419 | (type == EAP_TYPE_EXPANDED && vendor_id == (id) && vendor_type == (t))
| ^~
src/eap.c:430:11: note: 'vendor_type' was declared here
430 | uint32_t vendor_type;
It isn't clear why GCC complains about vendor_type, but not vendor_id.
But in all cases if type == EAP_TYPE_EXPANDED, then vendor_type and
vendor_id are set. Silence this spurious warning.
There is an unchecked NULL pointer access in network_has_open_pair.
open_info can be NULL, when out of multiple APs in range that advertise
the same SSID some advertise OWE transition elments and some don't.
The Hotspot 2.0 spec has some requirements that IWD was missing depending
on a few bits in extended capabilities and the HS2.0 indication element.
These requirements correspond to a few sysfs options that can be set in
the kernel which are now set on CONNECTED and unset on DISCONNECTED.
Netconfig was the only user of sysfs but now other modules will
also need it.
Adding existing API for IPv6 settings, a IPv4 and IPv6 'supports'
checker, and a setter for IPv4 settings.
The way a SA Query was done following a channel switch was slightly
incorrect. One because it is only needed when OCVC is set, and two
because IWD was not waiting a random delay between 0 and 5000us as
lined out by the spec. This patch fixes both these issues.
Cache the latest v4 and v6 domain string lists in struct netconfig state
to be able to more easily detect changes in those values in future
commits. For that split netconfig_set_domains's code into this function,
which now only commits the values in netconfig->v{4,6}_domain{,s} to the
resolver, and netconfig_domains_update() which figures out the active
domains string list and saves it into netconfig->v{4,6}_domain{,s}. This
probably saves some cycles as the callers can now decide to only
recalculate the domains list which may have changed.
While there simplify netconfig_set_domains return type to void as the
result was always 0 anyway and was never checked by callers.
Cache the latest v4 and v6 DNS IP string lists in struct netconfig state
to be able to more easily detect changes in those values in future
commits. For that split netconfig_set_dns's code into this function,
which now only commit the values in netconfig->dns{4,6}_list to the
resolver, and netconfig_dns_list_update() which figures out the active
DNS IP address list and saves it in netconfig->dns{4,6} list. This
probably saves some cycles as the callers can now decide to only
recalculate the dns_list which may have changed.
While there simplify netconfig_set_dns return type to void as the result
was always 0 anyway and was never checked by callers.
Cache the latest v4 and v6 gateway IP string in struct netconfig state
to be able to more easily detect changes in those values in future
commits and perhaps to simplify the ..._routes_install functions.
netconfig_ipv4_get_gateway's out_mac parameter can now be NULL. While
editing that function fix a small formatting annoyance.
Use a separate fils variable to make the code a bit prettier.
Also make sure that the out_mac parameter is not NULL prior to storing
the gateway_mac in it.
Add netconfig_enabled() and use that in all places that want to know
whether network configuration is enabled. Drop the enable_network_config
deprecated setting, which was only being handled in one of these 5 or so
places.
This code path was never tested and used to ensure a OWE transition
candidate gets selected over an open one (e.g. if all the BSS's are
blacklisted). But this logic was incorrect and the path was being
taken for BSS's that did not contain the owe_trans element, basically
all BSS's. For RSN's this was somewhat fine since the final check
would set a candidate, but for open BSS's the loop would start over
and potentially complete the loop without ever returning a candidate.
If fallback was false, NULL would be returned.
To fix this only take the OWE transition path if its an OWE transition
BSS, i.e. inverse the logic.