If EnableNetworkConfiguration was enabled ap.c required that
APRanges also be set. This prevents IWD from starting which
effects a perfectly valid station configuration. Instead if
APRanges is not provided IWD still allows ap_init to pass but
DHCP just will not be enabled.
Users can now supply an AP provisioning file containing an [IPv4]
section and define various DHCP settings:
[IPv4]
Address=<address>
Netmask=<netmask>
Gateway=<gateway>
IPRange=<start_address>,<end_address>
DNSList=<dns1>,<dns2>,...<dnsN>
LeaseTime=<lease_time>
There are a few notes/requirements to keep in mind when using a
provisioning file:
- All settings are optional but [IPv4].Address is required if the
interface does not already have an address set.
- If no [IPv4].Address is defined in the provisioning file and the AP
interface does not already have an address set, StartWithConfig()
will fail with -EINVAL.
- If a provisioning file is provided it will take precedence, and the
AP will not pull from the IP pool.
- A provisioning file containing an IPv4 section assumes DHCP is being
enabled and will override [General].EnableNetworkConfiguration.
- Any address that AP sets on the interface will be deleted when the AP
is stopped.
Users can now start an AP from settings based on a profile
on disk. The only argument is the SSID which will be used to
lookup the profile. If no profile is found a NotFound error
will be returned. Any invalid profiles will result in an
Invalid return.
This seems to happen occationally with testAP (potentially others).
The invalid read appears to happen when the frame_xchg_tx_cb detects
an early status and no ACK. In this particular case there is no
retry interval so we reach the retry limit and 'done' the frame.
This frees the 'fx' data all before the destroy callback can get
called. Once we finally return and the destroy callback is called
'fx' is freed and we see the invalid write.
==206== Memcheck, a memory error detector
==206== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==206== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==206== Command: iwd -p rad1,rad2,rad3,rad4 -d
==206== Parent PID: 140
==206==
==206== Invalid write of size 4
==206== at 0x4493A0: frame_xchg_tx_destroy (frame-xchg.c:941)
==206== by 0x46DAF6: destroy_request (genl.c:673)
==206== by 0x46DAF6: process_unicast (genl.c:1002)
==206== by 0x46DAF6: received_data (genl.c:1101)
==206== by 0x46AA4B: io_callback (io.c:118)
==206== by 0x469D6C: l_main_iterate (main.c:477)
==206== by 0x469E1B: l_main_run (main.c:524)
==206== by 0x469E1B: l_main_run (main.c:506)
==206== by 0x46A02B: l_main_run_with_signal (main.c:646)
==206== by 0x403E78: main (main.c:490)
==206== Address 0x4c59c6c is 172 bytes inside a block of size 176 free'd
==206== at 0x483B9F5: free (vg_replace_malloc.c:538)
==206== by 0x40F14C: destroy_work (wiphy.c:248)
==206== by 0x40F14C: wiphy_radio_work_done (wiphy.c:1578)
==206== by 0x44A916: frame_xchg_tx_cb (frame-xchg.c:930)
==206== by 0x46DAD9: process_unicast (genl.c:993)
==206== by 0x46DAD9: received_data (genl.c:1101)
==206== by 0x46AA4B: io_callback (io.c:118)
==206== by 0x469D6C: l_main_iterate (main.c:477)
==206== by 0x469E1B: l_main_run (main.c:524)
==206== by 0x469E1B: l_main_run (main.c:506)
==206== by 0x46A02B: l_main_run_with_signal (main.c:646)
==206== by 0x403E78: main (main.c:490)
==206== Block was alloc'd at
==206== at 0x483A809: malloc (vg_replace_malloc.c:307)
==206== by 0x4643CD: l_malloc (util.c:61)
==206== by 0x44AF8C: frame_xchg_startv (frame-xchg.c:1155)
==206== by 0x44B2A4: frame_xchg_start (frame-xchg.c:1108)
==206== by 0x42BC55: ap_send_mgmt_frame (ap.c:709)
==206== by 0x42F513: ap_probe_req_cb (ap.c:1869)
==206== by 0x449752: frame_watch_unicast_notify (frame-xchg.c:233)
==206== by 0x46DA2F: dispatch_unicast_watches (genl.c:961)
==206== by 0x46DA2F: process_unicast (genl.c:980)
==206== by 0x46DA2F: received_data (genl.c:1101)
==206== by 0x46AA4B: io_callback (io.c:118)
==206== by 0x469D6C: l_main_iterate (main.c:477)
==206== by 0x469E1B: l_main_run (main.c:524)
==206== by 0x469E1B: l_main_run (main.c:506)
==206== by 0x46A02B: l_main_run_with_signal (main.c:646)
==206==
The DHCP server can be enabled by enabling network configuration
with [General].EnableNetworkConfiguration. If an IP is not set
on the interface before the AP is started a valid IP range must
also be provided under [General].APRanges in IP prefix format e.g.
[General]
EnableNetworkConfiguration=true
APRanges=192.168.1.1/24
Each AP started will get assigned a new subnet within the range
specified by APRanges as to not conflict with other AP interfaces.
If there are no subnets left in the pool when an AP is started
it will fail with -EEXIST. Any AP's that are stopped will release
their subnet back into the pool to be used with other APs.
The DHCP IP pool will be automatically chosen by the ELL DHCP
implementation (+1 the AP's IP to *.254). The remaining DHCP
settings will be defaults chosen by ELL (DNS, lease time, etc).
periodic_scan_stop is called whenever we exit the autoscan state but a
periodic scan may not be running at the time. If we have a
user-triggered scan running, or the autoconnect_quick scan, and we reset
Scanning to false before that scan finished, a client could en up
calling GetOrderedNetwork too early and not receiving the scan results.
ConnectHiddenNetwork can be seen a triggering this sequence:
1. the active scan,
2. the optional agent request,
3. the Authentication/Association/4-Way Handshake/netconfig,
4. connected state
Currently Disconnect() interrupts 3 and 4, allow it to also interrupt
state 1. It's difficult to tell whether we're in state 2 from within
station.c.
Since our DBus API and our use cases only support initiating connections
and not accepting incoming connections we don't really need to reply to
Probe Requests on the P2P-Device interface. Start doing it firstly so
that we can test the scenario where we get discovered and pre-authorized
to connect in an autotest (wpa_supplicant doesn't seem to have a way to
authorize everyone, which is probably why most Wi-Fi Display dongles
don't do it and instead reply with "Fail: Information not available" and
then restart connection from their side) and secondly because the spec
wants us to do it.
Make sure dev->peer_list is non-NULL before using l_queue_push_tail()
same as we do when the peer info comes from a Probe Response (active
scan in Find Phase). Otherwise peers discovered through Probe Requests
before any Probe Responses are received will be lost.
The device type category array is indexed by the category ID so if we're
skipping i == 0 in the iteration, we should also skip the 0'th element
in device_type_categories.
The callback for the FRAME command was causing a crash in
wiphy_radio_work_done when not cancelled when the wiphy was being
removed from the system. This was likely to happen if this radio work
item was waiting for another item to finish. When the first one was
being cancelled due to the wiphy being removed, this one would be
started and immediately stopped by the radio work queue.
Now this crash could be fixed by dropping all frame exchange instances
on an interface that is being removed which is easy to do, but properly
cancelling the commands saves us the headache of analysing whether
there's a race condition in other situations where a frame exchange is
being aborted.
We want to use this flag only on the interfaces with one of the three
P2P iftypes so set the flag automatically depending on the iftype from
the last 'config' notification.
Convert ap_send_mgmt_frame() to use frame_xchg_start for sending frames,
this fixes among other things the ACK-received checks.
One side effect is that we're no longer sending Probe Responses with the
don't-wait-for-ack flag because frame-xchg doesn't support it, but other
AP implementations don't use that flag either.
Another side-effect is that we do use the no-cck-rate flag
unconditionally, something we may want to fix but would need to add
another parameter to frame-xchg.
Add a "psk" setting to allow the user to pass the binary PSK directly
instead of generating it from the passphrase and the SSID. In that case
we'll only send the PSK to WSC enrollees.
There has been a desire to remove the ELL plugin dependency from
IWD which is the only consumer of the plugin API. This removes
the dependency and prepares the tree for converting the existing
ofono plugin into a regular module.
sim_hardcoded was removed completely. This was originall implemented
before full ofono support purely to test the IWD side of EAP-SIM/AKA.
Since the ofono plugin (module-to-be) is now fully implemented there
really isn't a need for sim_hardcoded.
The Started property was being set in the Join IBSS callback which
isn't really when the IBSS has been started. The kernel automatically
scans for IBSS networks which takes some time. Its better to wait
on setting Started until we get the Join IBSS event.
Commit 1f910f84b4 ("eapol: Use eapol_start in authenticator mode too")
introduced the requirement that authentication eapol_sm objects also had
to be started via eapol_start. Adhoc was never updated to do that.
For multi-bss networks its nice to know which BSS is being connected
to. The ranking can hint at it, but blacklisting or network capabilities
could effect which network is actually chosen. An explicit debug print
makes debugging much easier.
Again the hs->support_ip_allocation flag is used for two purposes here,
first the user signals whether to support this mechanism through this
flag, then it reads the flag to find out if an IP was allocated.
Support IP allocation during the 4-Way Handshake as defined in the P2P
spec. This is the supplicant side implementation.
The API requires the user to set hs->support_ip_allocation true before
eapol_start(). On HANDSHAKE_EVENT_COMPLETE, if this same flag is still
set, we've received the IP lease, the netmask and the authenticator's
IP from the authenticator and there's no need to start DHCP. If the
flag is cleared, the user needs to use DHCP.
Allow the possibility of becoming the Group-owner when we parse the GO
Negotiation Request, build GO Negotiation Response and parse the GO
Negotiation Confirmation, i.e. if we're responding to a negotiation
initiated by the peer after it needed to request user action.
Until now the code assumed we can't become the GO or we'd report error.
Allow the possibility of becoming the Group-owner when we build the GO
Negotiation Request, parse GO Negotiation Response and build the GO
Negotiation Confirmation, i.e. if we're the initiator of the
negotiation.
Until now the code assumed we can't become the GO or we'd report error.
Add a utility to select random characters from the set defined in P2P
v1.7 Section 3.2.1. In this version the assumption is that we're only
actually using this for the two SSID characters.
explicit_bzero is used in src/ap.c since commit
d55e00b31d but src/missing.h is not
included, as a result build with uclibc fails on:
/srv/storage/autobuild/run/instance-1/output-1/host/lib/gcc/xtensa-buildroot-linux-uclibc/9.3.0/../../../../xtensa-buildroot-linux-uclibc/bin/ld: src/ap.o: in function `ap_probe_req_cb':
ap.c:(.text+0x23d8): undefined reference to `explicit_bzero'
Fixes:
- http://autobuild.buildroot.org/results/c7a0096a269bfc52bd8e23d453d36d5bfb61441d
Add the special case "DIRECT-" SSID, called the P2P Wildcard SSID, in
ap_probe_req_cb so as not to reject those Probe Requests on the basis of
ssid mismatch. I'd have preferred to keep all the P2P-specific bits in
p2p.c but in this case there's little point in adding a generic
config setting for SSID-matching quirks.
Prefix all the struct p2p_device members that are part of the connection
state with the "conn_" string for consistency. If we needed to support
multiple client connections, these members are the ones that would
probably land in a separate structure, without that prefix.
For WSC we should have been sending our probe requests from the same
address we're going to be doing EAP-WSC with the GO. Somehow I was able
to connect to most devices without that but other implementations seem
to use the Interface Address (the P2P-Client's MAC), not the Device
Address (P2P-Device's MAC). We could switch the order to first create
the new interface and scan from it is simpler to use the scan_context we
already have created on the device interface and set a different mac.
Check the conditions for PBC enrollee registration when we receive the
Association Request with WSC IE and indicate to the enrollee whether we
accept the association using a WSC IE in the Association Response.
After this, a NULL sta->assoc_rsne indicates that the station is not
establishing the RSNA and is a WSC enrollee.
Implement the caching of WSC probe requests -- when an Enrollee later
associates to start registration we need to have its Probe Request on
file. Also use this cache for PBC "Session Overlap" detection.
This adds the API for putting the AP in Push Button mode, which we'll
need to P2P GO side but may be useful on its own too. A WSC IE is added
to our beacons and probe responses indicating whether the PBC mode is
active.
On a new association or re-association, in addition to forgetting a
complete RSN Association, also stop the EAPoL SM to stop any ongoing
handshake.
Do this in a new function ap_stop_handshake that is now used in a few
places that had copies of the same few lines. I'll be adding some more
lines to this function for WSC support.
Reuse this flag on the authenticator side with a slightly different
meaning: when it's true we're forced to wait for the EAPoL-Start before
sending the first EAPoL-EAP frame to the supplicant, such as is required
in a WSC enrollee registration when the Association Request didn't have
a v2.0 WSC IE.
Add the wfa_build_authorized_macs function (wfa_ prefix following the
wfa_extract_ naming) and use it in wsc_build_probe_response. The logic
is changed slightly to treat the first 6-zeros address in the array as
the end of the array.
Setting 'match' false wouldn't do anything because it was already false.
If the frame is addressed to some other non-broadcast address ignore it
directly and exit ap_probe_req_cb.
To limit the number of ap_start parameters, group basic AP config
parameters in the ap_config struct that is passed as a pointer and owned
by the ap_state.
The intent was to read the UUID-E from the settings rather than generate
it from the enrollee's MAC because it needs to match the UUID-E from
enrolee's Probe Requests, fix this. The UUID-E supplied in the unit
test was being ignored but the test still passed because the supplied
UUID-E was generated the same way we generated it in eap-wsc.c.
When we're sending our probe response to the same peer that we're
currently connected or connecting to, use current WSC Configuration
Methods, UUID-E and WFD IE selected for this connection attempt, not the
ones we'd use when discovering peers or being discovered by peers.
In the case of the WFD IE, the "Available for WFD Session" flag is going
to differ between the two cases -- we may be unavailable for other peers
but we're still available for the peer we're trying to start the WFD
session with.
When we send our GO Negotiation Response, send the Configuration Method
selected for the current connection rather than the accepted methods mask
that we hold in dev->device_info.
When building the scan IEs for our provisioning scans, use the UUID-E
based on the Interface Address, not the Device Address, as that is what
wsc.c will be using to in the registration protocol.
Eventually we may have to base the UUID-E on the Device Address or
something else that is persistent, and pass the actual UUID-E to wsc.c,
as the Interface Address is randomly generated on every connect attempt.
IIRC the UUID-E is supposed to be persistent.
wsc_attr_builder_start_attr and wsc_attr_builder_free look at
builder->curlen to see whether the TLV's length needs to be updated to
include the previous attribute. If builder->curlen is 0
wsc_attr_builder_start_attr assumes there's no previous attribute and
starts writing at current builder->offset. If the previous attribute
length was 0 curlen would stay at 0 and that attribute would get
overwritten with the new one. To solve this add the 4 bytes of the T
and L to curlen as soon as a new attribute is started, and subtract
them when writing the L value. The alternative would be to set a flag
to say whether an attribute was started.
The spec explicitly allows 0-length attributes in section 12:
"The variable length string attributes, e.g., Device Name, are encoded
without null-termination, i.e., no 0x00 octets added to the end of the
value. If the string is empty, the attribute length is set to zero."
Add ability to populate search domains for resolvconf based systems.
Search domains are added using the 'search' directive and added using
the <ifname>.domain key into resolvconf.
Introduce a new resolvconf_invoke function that takes care of all the
details of invoking resolvconf and simplify the code a bit.
Introduce have_dns that tracks whether DNS servers were actually
provided. If no DNS info was provided, do not invoke resolvconf to
remove it.
Instead of interface index, resolvconf is now invoked with the printable
name of the interface and the dns entries are placed in the "dns"
protocol. This makes it a bit simpler to add additional info to
resolvconf instead of trying to generate a monolithic entry.
Resolve module does not currently track any state that has been set on
a per ifindex basis. This was okay while the set of information we
supported was quite small. However, with dhcpv6 support being prepared,
a more flexible framework is needed.
Change the resolve API to allocate and return an instance for a given
ifindex that has the ability to track information that was provided.
Found using lsan:
==29896==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 9 byte(s) in 1 object(s) allocated from:
#0 0x7fcd41e0c710 in __interceptor_malloc /var/tmp/portage/sys-devel/gcc-8.2.0-r6/work/gcc-8.2.0/libsanitizer/asan/asan_malloc_linux.cc:86
#1 0x606abd in l_malloc ell/util.c:62
#2 0x460230 in ie_tlv_vendor_ie_concat src/ie.c:140
#3 0x4605d1 in ie_tlv_extract_wfd_payload src/ie.c:216
#4 0x4a8773 in scan_parse_bss_information_elements src/scan.c:1105
#5 0x4a94a8 in scan_parse_attr_bss src/scan.c:1181
#6 0x4a99f8 in scan_parse_result src/scan.c:1238
#7 0x4abe4e in get_scan_callback src/scan.c:1451
#8 0x6442d9 in process_unicast ell/genl.c:979
#9 0x6453ff in received_data ell/genl.c:1087
#10 0x62e1a4 in io_callback ell/io.c:126
#11 0x628fca in l_main_iterate ell/main.c:473
#12 0x6294e8 in l_main_run ell/main.c:520
#13 0x629d8b in l_main_run_with_signal ell/main.c:642
#14 0x40681b in main src/main.c:505
#15 0x7fcd40a55bdd in __libc_start_main (/lib64/libc.so.6+0x21bdd)
This commit has all the changes to extend and generalise the current
eap-wsc.c code to handle both the Enrollee and Registrar side of the
protocol, reusing existing functions and structures.
Alongside the current EAP-WSC enrollee side support, add the initial
part of registrar side. In the same file, register a new method with
the name string of "WSC-R". In this patch only the load_settings
method is added. validate_identity and handle_response are added in
later patches.
Handle EAPoL-EAP frames using our eap.c methods in authenticator mode
same as we do on the supplicant side. The user (ap.c) will only need to
set a valid 8021x_settings in the handshake object, same as on the
supplicant side.
The goal is to add specifically EAP-WSC registrar side and it looks like
extending our EAP and EAPoL code to support both supplicant and
authenticator-side methods is simpler than adding just EAP-WSC as a
special case.
Since EAP-WSC always ends in an EAP failure, I haven't actually tested
the success path.
On the supplicant side eapol_register would only register the eapol_sm
on a given netdev to start receiving frames and an eapol_start call is
required for the state machine to start executing. On the authenticator
side we shouldn't have the "early frame" problem but there's no reason
for the semantics of the two methods to be different. Somehow we were
doing everything in eapol_register and not using eapol_start if
hs->authenticator was true, so bring this in line with the supplicant
side and require eapol_start to be called also from ap.c.
Move the update of station->networks_sorted order to before we set
station->connected_network NULL to avoid a crash when we attempt to
use the NULL pointer.
Besides being undefined behaviour, signed integer overflow can cause
unexpected comparison results. In the case of network_rank_compare(),
a connected network with rank INT_MAX would cause newly inserted
networks with negative rank to be inserted earlier in the ordered
network list. This is reflected in the GetOrderedMethods() DBus method
as can be seen in the following iwctl output:
[iwd]# station wlan0 get-networks
Network name Security Signal
----------------------------------------------------
BEOLAN 8021x **** }
BeoBlue psk *** } all unknown,
UI_Test_Network psk *** } hence assigned
deneb_2G psk *** } negative rank
BEOGUEST open **** }
> titan psk ****
Linksys05274_5GHz_dmt psk ****
Lyngby-4G-4 5GHz psk ****
Doing so ensures that the currently connected network is always at the
beginning of the list. Previously, the list would only get updated after
a scan.
This fixes the documented behaviour of GetOrderedNetworks() DBus method,
which states that the currently connected network is always at the
beginning of the returned array.
Fix a logic error which prevented iwd from using SAE/WPA3 when
attempting to connect to APs that are in transition mode. The SAE/WPA3
check incorrectly required mfpr bit to be set, which is true for
APs in WPA3-Personal only mode, but is set to 0 for APs in
WPA3-Personal transition mode.
This patch also adds a bit more diagnostic output to help diagnose
causes for connections where WPA3 is not attempted even when advertised
by the AP.
Replace the usage of eap_send_response() in the method implementations
with a new eap_method_respond that skips the redundant "type" parameter.
The new eap_send_packet is used inside eap_method_respond and will be
reused for sending request packets in authenticator side EAP methods.
Throughout the supplicant mode we'd use the eapol_sm_write wrapper but
in the authenticator mode we'd call __eapol_tx_packet directly. Adapt
eapol_sm_write to use the right destination address and use it
consistently.
sm->handshake already contains our RSN/WPA IE so there's no need to
rebuild it for msg 3/4, especially since we hardcode the fact that we
only support one pairwise cipher. If we start declaring more supported
ciphers and need to include a second RSNE we can first parse
sm->hs->authenticator_ie into a struct ir_rsn_info, overwrite the cipher
and rebuild it from that struct.
This way we duplicate less code and we hardcode fewer facts about the AP
in eapol.c which also helps in adding EAP-WSC.
In both FT or FILS EAPoL isn't used for the initial handshake and only
for the later re-keys. For FT we added the
eapol_sm_set_require_handshake mechanism to tell EAPoL to not require
the initial handshake and we can re-use it for FILS.
Currently an adversary can retransmit EAPOL Msg4/4 to make the AP
reinstall the PTK. Against older Linux kernels this can subsequently
be used to decrypt, replay, and possibly decrypt frames. See the
KRACK attacks research at krackattacks.com for attack scenarios.
In this case no machine-in-the-middle position is needed to trigger
the key reinstallation.
Fix this by using the ptk_complete boolean to track when the 4-way
handshake has completed (similar to its usage for clients). When
receiving a retransmitted Msg4/4 accept this frame but do not reinstall
the PTK.
Credits to Chris M. Stone, Sam Thomas, and Tom Chothia of Birmingham
University to help discover this issue.
Instead of creating the results->bss_list l_queue lazily, always create
one before sending the GET_SCAN command. This is to make sure that an
empty list is passed to the scan callback (e.g. in station.c) instead of
a NULL. Passing NULL has been causing difficult to debug crashes in
station.c, in fact I think I've been seeing them for over a year now
but can't be sure. station_set_scan_results has been taking ownership
of the new BSS list and, if station->connected_bss was not on the list,
it would try to add it not realizing that l_queue_push_tail() was doing
nothing. Always passing a valid list may help us prevent similar
problems in the future.
The crash might start with:
==120489== Invalid read of size 8
==120489== at 0x425D38: network_bss_select (network.c:709)
==120489== by 0x415BD1: station_try_next_bss (station.c:2263)
==120489== by 0x415E31: station_retry_with_status (station.c:2323)
==120489== by 0x415E31: station_connect_cb (station.c:2367)
==120489== by 0x407E66: netdev_connect_failed (netdev.c:569)
==120489== by 0x40B93D: netdev_connect_event (netdev.c:1801)
==120489== by 0x40B93D: netdev_mlme_notify (netdev.c:3678)
Incorporate the LGPL v2.1 licensed implementation of ARC4, taken from
the Nettle project (https://git.lysator.liu.se/nettle/nettle.git,
commit 3e7a480a1e351884), and tweak it a bit so we don't have to
operate on a skip buffer to fast forward the stream cipher, but can
simply invoke it with NULL dst or src arguments to achieve the same.
This removes the dependency [via libell] on the OS's implementation of
ecb(arc4), which may be going away, and which is not usually accelerated
in the first place.
Use a constant control flow in the derivation loop, avoiding leakage
in the iteration succesfuly converting the password.
Increase number of iterations (20 to 30) to avoid issues with
passwords needing more iterations.
With some devices the 10 seconds are not enough for the P2P Group Owner
to give us an address but I think we still want to use a timeout as
short as possible so that the user doesn't wait too long if the
connection isn't working.
p2p_connection_reset may be called as a result of a WFD service
unregistering and p2p_own_wfd is going to be NULL, don't update
p2p_own_wfd->available in this case.
With some WFD devices we occasionally get a Disconnect before or during
the DHCP setup on the first connection attempt to a newly formeg group,
with the reason code MMPDU_REASON_CODE_PREV_AUTH_NOT_VALID. Retrying a
a few times makes the connections consistently successful. Some
conditions are simplified/update in this patch because
conn_dhcp_timeout now implies conn_wsc_bss, and both imply
conn_retry_count.
In 98cf2bf3ec frame_xchg_stop was removed
and its use in p2p.c was changed to frame_xchg_cancel with the slight
complication that the ID returned by frame_xchg_start had do be stored.
Re-add frame_xchg_stop, (renamed as frame_xchg_stop_wdev) to simplify
this bit in p2p.c.
Since there may now be multiple frames-xchg record for each wdev, when
we receive the TX Status event, make sure we find the record who's radio
work has started, as indicated by fx->retry_cnt > 0. Otherwise we're
relying on the ordering of the frames in the "frame_xchgs" queue and
constant priority.
The BSSID (address_3) in response frames was being checked to be the
same as in the request frame, or all-zeros for faulty drivers. At least
one Wi-Fi Display device sends a GO Negotiation Response with the BSSID
different from its Device Address (by 1 bit) and I didn't see an easy
way to obtain that address beforhand so we can "whitelist" it for this
check, so just drop that check for now.
ANQP didn't have this check before it started using frame-xchg so it
shouldn't be critical.
When a frame registered in a given group Id triggers a callback and that
callback ends up calling frame_watch_group_remove for that group Id,
that call will happen inside WATCHLIST_NOTIFY_MATCHES and will free the
memory used by the watchlist. watchlist.h has protection against the
watchlist being "destroyed" inside WATCHLIST_NOTIFY_MATCHES, but not
against its memory being freed -- the memory where it stores the in_notify
and destroy_pending flags. Free the group immediately after
WATCHLIST_NOTIFY_MATCHES to avoid reads/writes to those flags triggering
valgrind warnings.
frame_xchg_destroy is passed as the wiphy radio work's destroy callback
to wiphy.c. If it's also called directly in frame_xchg_exit, there's
going to be a use-after-free when it's called again from wiphy_exit, so
instead use wiphy_radio_work_done which will call frame_xchg_destroy and
forget the frame_xchg record.
This patch lets us establish WFD connections by parsing, validating and
acting on WFD IEs in received frames, and adding our own WFD IEs in the
GO Negotiation and Association frames. Applications should assume that
any connection to a WFD-capable peer when we ourselves have a WFD
service registered, are WFD connections and should handle RTSP and
other IP-based protocols on those connections.
When connecting to a WFD-capable peer and when we have a WFD service
registered, the connection will fail if there are any conflicting or
invalid WFD parameters during GO Negotiation.
If anyone's registered as implementing the WFD service, add the
net.connman.iwd.p2p.Display DBus interface on peer objects that are
WFD-capable and are available for a WFD Session.
The net.connman.iwd.p2p.ServiceManager interface on the /net/connman/iwd
object lets user applications register/unregister the Wi-Fi Display
service. In this commit all it does is it adds local WFD information
as given by the app, to the frames we send out during discovery.
Instead of accepting raw WFD IE contents from the app and exposing
peers' raw WFD IEs to the app, we build the WFD IEs in our code based on
the few meaningful DBus properties that we support and using default
values for the rest. If an app ever needs any of the other WFD
capabilities more properties can be added.
The are useful for P2P service implementations to know unambiguously
which network interface a new P2P connection is on and the peer's IPv4
address if they need to initiate an IP connection or validate an
incoming connection's address from the peer.
This uses l_dhcp_lease_get_server_id to get the IP of the server that
offered us our current lease. l_dhcp_lease_get_server_id returns the
vaue of the L_DHCP_OPTION_SERVER_IDENTIFIER option, which is the address
that any unicast DHCP frames are supposed to be sent to so it seems to
be the best way to get the P2P group owner's IP address as a P2P-client.
peer->device_addr is a pointer to the Device Address contained in
one of two possible places in peer->bss. If during discovery we've
received a new beacon/probe response for an existing peer and we're
going to replace peer->bss, we also have to update peer->device_addr.
If we were in discovery only to be able to receive the target peer's
GO Negotiation Request (i.e. we have no users requesting discovery)
and we've received the frame and decided that the connection has
failed, exit discovery.
To use the wiphy radio work queue, scanning mostly remained the same.
start_next_scan_request was modified to be used as the work callback,
as well as not start the next scan if the current one was done
(since this is taken care of by wiphy work queue now). All
calls to start_next_scan_request were removed, and more or less
replaced with wiphy_radio_work_done.
scan_{suspend,resume} were both removed since radio management
priorities solve this for us. ANQP requests can be inserted ahead of
scan requests, which accomplishes the same thing.
Before connecting to a hidden network we must scan. During this scan
if another connection attempt comes in the expected behavior is to
abort the original connection. Rather than waiting for the scan to
complete, then canceling the original hidden connection we can just
cancel the hidden scan immediately, reply to dbus, and continue with
the new connection attempt.
The new frame-xchg module now handles a lot of what ANQP used to do. ANQP
now does not need to depend on nl80211/netdev for building and sending
frames. It also no longer needs any of the request lookups, frame watches
or to maintain a queue of requests because frame-xchg filters this for us.
From an API perspective:
- anqp_request() was changed to take the wdev_id rather than ifindex.
- anqp_cancel() was added so that station can properly clean up ANQP
requests if the device disappears.
During testing a bug was also fixed in station on the timeout path
where the request queue would get popped twice.
In order to first integrate frame-xchg some refactoring needed to
be done. First it is useful to allow queueing frames up rather than
requiring the module (p2p, anqp etc) to wait for the last frame to
finish. This can be aided by radio management but frame-xchg needed
some refactoring as well.
First was getting rid of this fx pointer re-use. It looks like this
was done to save a bit of memory but things get pretty complex
needed to check if the pointer is stale or has been reset. Instead
of this we now just allocate a new pointer each frame-xchg. This
allows for the module to queue multiple requests as well as removes
the complexity of needed to check if the fx pointer is stale.
Next was adding the ability to track frame-xchgs by ID. If a module
can queue up multiple requests it also needs to be able to cancel
them individually vs per-wdev. This comes free with the wiphy work
queue since it returns an ID which can be given directly to the
caller.
Then radio management was simply piped in by adding the
insert/done APIs.
These APIs will handle fairness and order in any operations which
radios can only do sequentially (offchannel, scanning, connection etc.).
Both scan and frame-xchg are complex modules (especially scanning)
which is why the radio management APIs were implemented generic enough
where the changes to both modules will be minimal. Any module that
requires this kind of work can push a work item into the radio
management work queue (wiphy_radio_work_insert) and when the work
is ready to be started radio management will call back into the module.
Once the work is completed (and this may be some time later e.g. in
scan results or a frame watch) the module can signal back that the
work is finished (wiphy_radio_work_done). Wiphy will then pop the
queue and continue with the next work item.
A concept of priority was added in order to allow important offchannel
operations (e.g. ANQP) to take priority over other work items. The
priority is an integer, where lower values are of a higher priority.
The concept of priority cleanly solves a lot of the complexity that
was added in order to support ANQP queries (suspending scanning and
waiting for ANQP to finish before connecting).
Instead ANQP queries can be queued at a higher priority than scanning
which removes the need for suspending scans. In addition we can treat
connections as radio management work and insert them at a lower
priority than ANQP, but higher than scanning. This forces the
connection to wait for ANQP without having to track any state.
When roaming, iwd tries to scan a limited number of frequencies to keep
the roaming latency down. Ideally the frequency list would come in from
a neighbor report, but if neighbor reports are not supported, we fall
back to our internal database for known frequencies of this network.
iwd tries to keep the number of scans down to a bare minimum, which
means that we might miss APs that are in range. This could happen
because the user might have moved physically and our frequency list is
no longer up to date, or if the AP frequencies have been reconfigured.
If a limited scan fails to find any good roaming candidates, re-attempt
a full scan right away.
If the roam failed and we are no longer connected, station_disassociated
is called which ends up calling station_roam_state_clear. Thus
resetting the variables is not needed. Reflow the logic to make this a
bit more explicit.
If the roam attempt fails, do not reset this to false. Generally this
is set by the fact that we lost beacon and to not attempt neighbor
reports, etc. This hint should be preserved across roam attempts.
frame_xchg_startv was using sizeof(mmpdu) to check the minimum length
for a frame. Instead mmpdu_header_len should be used since this checks
fc.order and returns either 24 or 28 bytes, not 28 bytes always.
This change adds the requirement that the first iovec in the array
must contain at least the first 2 bytes (mmpdu_fc) of the header.
This really shouldn't be a problem since all current users of
frame-xchg put the entire header (or entire frame) into the first
iovec in the array.
explicit_bzero is used in src/p2p.c since commit
1675c765a3 but src/missing.h is not
included, as a result build with uclibc fails on:
/home/naourr/work/instance-0/output-1/per-package/iwd/host/opt/ext-toolchain/bin/../lib/gcc/mips64el-buildroot-linux-uclibc/5.5.0/../../../../mips64el-buildroot-linux-uclibc/bin/ld: src/p2p.o: in function `p2p_connection_reset':
p2p.c:(.text+0x2cf4): undefined reference to `explicit_bzero'
/home/naourr/work/instance-0/output-1/per-package/iwd/host/opt/ext-toolchain/bin/../lib/gcc/mips64el-buildroot-linux-uclibc/5.5.0/../../../../mips64el-buildroot-linux-uclibc/bin/ld: p2p.c:(.text+0x2cfc): undefined reference to `explicit_bzero'
This logic was using l_hashmap_insert, which supports duplicates. Since
some entries were inserted multiple times, they ended up being printed
multiple times. Fix that by introducing a macro that uses
l_hashmap_replace instead.
Right now, if the connection fails, then network always thinks that the
password should be re-asked. Loosen this to only do so if the
connection failed at least in the handshake phase. If the connection
failed due to Association / Authentication timeout, it is likely that
something is wrong with the AP and it can't respond.
Using the new station ANQP watch network can delay the connection
request until after ANQP has finished. Since station may be
autoconnecting we must also add a check in network_autoconnect
which prevents it from autoconnecting if we have a pending Connect
request.
This is to allow network to watch for ANQP activity in order to
fix the race condition between scanning finishing and ANQP finishing.
Without this it is possible for a DBus Connect() to come in before
ANQP has completed and causing the network to return NotConfigured,
when its actually in the process of obtaining all the network info.
The watch was made globally in station due to network not having
a station object until each individual network is created. Adding a
watch during network creation would result in many watchers as well
as a lot of removal/addition as networks are found and lost.
Change signature of network_connect_new_hidden_network to take
reference to the caller's l_dbus_message struct. This allows to
set the caller's l_dbus_message struct to NULL after replying in
the case of a failure.
==201== at 0x467C15: l_dbus_message_unref (dbus-message.c:412)
==201== by 0x412A51: station_hidden_network_scan_results (station.c:2504)
==201== by 0x41EAEA: scan_finished (scan.c:1505)
==201== by 0x41EC10: get_scan_done (scan.c:1535)
==201== by 0x462592: destroy_request (genl.c:673)
==201== by 0x462987: process_unicast (genl.c:988)
==201== by 0x462987: received_data (genl.c:1087)
==201== by 0x45F5A2: io_callback (io.c:126)
==201== by 0x45E8FD: l_main_iterate (main.c:474)
==201== by 0x45E9BB: l_main_run (main.c:521)
==201== by 0x45EBCA: l_main_run_with_signal (main.c:643)
==201== by 0x403B15: main (main.c:512)
Introduce hidden_pending to keep reference to the dbus message object
while we wait for the scan results to be returned while trying to
connect to a hidden network. This simplifies the logic by separating it
into two independent logical units: scanning, connecting and eliminates
a possibility of a memory leak in the case when Network.Connect being
initiated while Station.ConnectHiddenNetwork is in progress.
If a connection is initiated (via dbus) while a quick scan is in
progress, the quick scan will be aborted. In this case,
station_quick_scan_results will always transition to the
AUTOCONNECT_FULL state regardless of whether it should or not.
Fix this by making sure that we only enter AUTOCONNECT_FULL if we're
still in the AUTOCONNECT_QUICK state.
Reported-by: Alvin Šipraga <alsi@bang-olufsen.dk>
If start_scan_next_request() is called while a scan request
(NL80211_CMD_TRIGGER_SCAN) is still running, the same scan request will
be sent again. Add a check in the function to avoid sending a request if
one is already in progress. For consistency, check also that scan
results are not being requested (NL80211_CMD_GET_SCAN), before trying to
send the next scan request. Finally, remove similar checks at
start_next_scan_request() callsites to simplify the code.
This also fixes a crash that occurs if the following conditions are met:
- the duplicated request is the only request in the scan request
queue, and
- both scan requests fail with an error not EBUSY.
In this case, the first callback to scan_request_triggered() will delete
the request from the scan request queue. The second callback will find
an empty queue and consequently pass a NULL scan_request pointer to
scan_request_failed(), causing a segmentation fault.