With some WFD devices we occasionally get a Disconnect before or during
the DHCP setup on the first connection attempt to a newly formeg group,
with the reason code MMPDU_REASON_CODE_PREV_AUTH_NOT_VALID. Retrying a
a few times makes the connections consistently successful. Some
conditions are simplified/update in this patch because
conn_dhcp_timeout now implies conn_wsc_bss, and both imply
conn_retry_count.
In 98cf2bf3ec frame_xchg_stop was removed
and its use in p2p.c was changed to frame_xchg_cancel with the slight
complication that the ID returned by frame_xchg_start had do be stored.
Re-add frame_xchg_stop, (renamed as frame_xchg_stop_wdev) to simplify
this bit in p2p.c.
Since there may now be multiple frames-xchg record for each wdev, when
we receive the TX Status event, make sure we find the record who's radio
work has started, as indicated by fx->retry_cnt > 0. Otherwise we're
relying on the ordering of the frames in the "frame_xchgs" queue and
constant priority.
The BSSID (address_3) in response frames was being checked to be the
same as in the request frame, or all-zeros for faulty drivers. At least
one Wi-Fi Display device sends a GO Negotiation Response with the BSSID
different from its Device Address (by 1 bit) and I didn't see an easy
way to obtain that address beforhand so we can "whitelist" it for this
check, so just drop that check for now.
ANQP didn't have this check before it started using frame-xchg so it
shouldn't be critical.
When a frame registered in a given group Id triggers a callback and that
callback ends up calling frame_watch_group_remove for that group Id,
that call will happen inside WATCHLIST_NOTIFY_MATCHES and will free the
memory used by the watchlist. watchlist.h has protection against the
watchlist being "destroyed" inside WATCHLIST_NOTIFY_MATCHES, but not
against its memory being freed -- the memory where it stores the in_notify
and destroy_pending flags. Free the group immediately after
WATCHLIST_NOTIFY_MATCHES to avoid reads/writes to those flags triggering
valgrind warnings.
frame_xchg_destroy is passed as the wiphy radio work's destroy callback
to wiphy.c. If it's also called directly in frame_xchg_exit, there's
going to be a use-after-free when it's called again from wiphy_exit, so
instead use wiphy_radio_work_done which will call frame_xchg_destroy and
forget the frame_xchg record.
This patch lets us establish WFD connections by parsing, validating and
acting on WFD IEs in received frames, and adding our own WFD IEs in the
GO Negotiation and Association frames. Applications should assume that
any connection to a WFD-capable peer when we ourselves have a WFD
service registered, are WFD connections and should handle RTSP and
other IP-based protocols on those connections.
When connecting to a WFD-capable peer and when we have a WFD service
registered, the connection will fail if there are any conflicting or
invalid WFD parameters during GO Negotiation.
If anyone's registered as implementing the WFD service, add the
net.connman.iwd.p2p.Display DBus interface on peer objects that are
WFD-capable and are available for a WFD Session.
The net.connman.iwd.p2p.ServiceManager interface on the /net/connman/iwd
object lets user applications register/unregister the Wi-Fi Display
service. In this commit all it does is it adds local WFD information
as given by the app, to the frames we send out during discovery.
Instead of accepting raw WFD IE contents from the app and exposing
peers' raw WFD IEs to the app, we build the WFD IEs in our code based on
the few meaningful DBus properties that we support and using default
values for the rest. If an app ever needs any of the other WFD
capabilities more properties can be added.
First, looking for DeviceState.connected gives a much better indication
if we are actually connected vs the connected property on the network
object. Second, its good practice to also check that hostapd sees that
the station is connected.
Restarting hostapd from python was actually leaking memory and
causing the hostapd object to stay referenced in python. The
GLib timeout in wait_for_event was the ultimate cause, but this
had no come to light because no tests restarted hostapd then
used wait_for_event.
In addition, any use of wait_for_event after a restart would
cause an exception because the event socket was never re-attached
after hostapd restarted.
Now we properly clean up the timeout in wait_for_event and
re-initialize the hostapd object on restart.
The are useful for P2P service implementations to know unambiguously
which network interface a new P2P connection is on and the peer's IPv4
address if they need to initiate an IP connection or validate an
incoming connection's address from the peer.
This uses l_dhcp_lease_get_server_id to get the IP of the server that
offered us our current lease. l_dhcp_lease_get_server_id returns the
vaue of the L_DHCP_OPTION_SERVER_IDENTIFIER option, which is the address
that any unicast DHCP frames are supposed to be sent to so it seems to
be the best way to get the P2P group owner's IP address as a P2P-client.
peer->device_addr is a pointer to the Device Address contained in
one of two possible places in peer->bss. If during discovery we've
received a new beacon/probe response for an existing peer and we're
going to replace peer->bss, we also have to update peer->device_addr.
If we were in discovery only to be able to receive the target peer's
GO Negotiation Request (i.e. we have no users requesting discovery)
and we've received the frame and decided that the connection has
failed, exit discovery.
To use the wiphy radio work queue, scanning mostly remained the same.
start_next_scan_request was modified to be used as the work callback,
as well as not start the next scan if the current one was done
(since this is taken care of by wiphy work queue now). All
calls to start_next_scan_request were removed, and more or less
replaced with wiphy_radio_work_done.
scan_{suspend,resume} were both removed since radio management
priorities solve this for us. ANQP requests can be inserted ahead of
scan requests, which accomplishes the same thing.
Before connecting to a hidden network we must scan. During this scan
if another connection attempt comes in the expected behavior is to
abort the original connection. Rather than waiting for the scan to
complete, then canceling the original hidden connection we can just
cancel the hidden scan immediately, reply to dbus, and continue with
the new connection attempt.
The new frame-xchg module now handles a lot of what ANQP used to do. ANQP
now does not need to depend on nl80211/netdev for building and sending
frames. It also no longer needs any of the request lookups, frame watches
or to maintain a queue of requests because frame-xchg filters this for us.
From an API perspective:
- anqp_request() was changed to take the wdev_id rather than ifindex.
- anqp_cancel() was added so that station can properly clean up ANQP
requests if the device disappears.
During testing a bug was also fixed in station on the timeout path
where the request queue would get popped twice.
In order to first integrate frame-xchg some refactoring needed to
be done. First it is useful to allow queueing frames up rather than
requiring the module (p2p, anqp etc) to wait for the last frame to
finish. This can be aided by radio management but frame-xchg needed
some refactoring as well.
First was getting rid of this fx pointer re-use. It looks like this
was done to save a bit of memory but things get pretty complex
needed to check if the pointer is stale or has been reset. Instead
of this we now just allocate a new pointer each frame-xchg. This
allows for the module to queue multiple requests as well as removes
the complexity of needed to check if the fx pointer is stale.
Next was adding the ability to track frame-xchgs by ID. If a module
can queue up multiple requests it also needs to be able to cancel
them individually vs per-wdev. This comes free with the wiphy work
queue since it returns an ID which can be given directly to the
caller.
Then radio management was simply piped in by adding the
insert/done APIs.
These APIs will handle fairness and order in any operations which
radios can only do sequentially (offchannel, scanning, connection etc.).
Both scan and frame-xchg are complex modules (especially scanning)
which is why the radio management APIs were implemented generic enough
where the changes to both modules will be minimal. Any module that
requires this kind of work can push a work item into the radio
management work queue (wiphy_radio_work_insert) and when the work
is ready to be started radio management will call back into the module.
Once the work is completed (and this may be some time later e.g. in
scan results or a frame watch) the module can signal back that the
work is finished (wiphy_radio_work_done). Wiphy will then pop the
queue and continue with the next work item.
A concept of priority was added in order to allow important offchannel
operations (e.g. ANQP) to take priority over other work items. The
priority is an integer, where lower values are of a higher priority.
The concept of priority cleanly solves a lot of the complexity that
was added in order to support ANQP queries (suspending scanning and
waiting for ANQP to finish before connecting).
Instead ANQP queries can be queued at a higher priority than scanning
which removes the need for suspending scans. In addition we can treat
connections as radio management work and insert them at a lower
priority than ANQP, but higher than scanning. This forces the
connection to wait for ANQP without having to track any state.
Many tests force a reauth after the initial connection. When the tests
were written there was no way of ensuring the reauth completed except
waiting (IWD.wait()). Now we can wait for hostapd events in the tests,
which is faster and more reliable than busy waiting.
This test was not reliably passing. Busy waiting is not really reliable,
but in this specific case its really the only option as the blacklist
must expire based on time.
When roaming, iwd tries to scan a limited number of frequencies to keep
the roaming latency down. Ideally the frequency list would come in from
a neighbor report, but if neighbor reports are not supported, we fall
back to our internal database for known frequencies of this network.
iwd tries to keep the number of scans down to a bare minimum, which
means that we might miss APs that are in range. This could happen
because the user might have moved physically and our frequency list is
no longer up to date, or if the AP frequencies have been reconfigured.
If a limited scan fails to find any good roaming candidates, re-attempt
a full scan right away.
If the roam failed and we are no longer connected, station_disassociated
is called which ends up calling station_roam_state_clear. Thus
resetting the variables is not needed. Reflow the logic to make this a
bit more explicit.
If the roam attempt fails, do not reset this to false. Generally this
is set by the fact that we lost beacon and to not attempt neighbor
reports, etc. This hint should be preserved across roam attempts.
If an application has a bug and hangs on SIGTERM this causes
test-runner to hang as well. This is obviously an issue with
the application in question, but test-runner should have a way
of continuing onto the next test rather than hanging.
Instead we can use WNOHANG and a sleep to allow applications
some amount of time to exit, and if they haven't use SIGKILL
instead as well as print an error. Similar to how
wait_for_socket works. The timeout is hard coded to 2 seconds
(100ms sleep + 20 iterations).
frame_xchg_startv was using sizeof(mmpdu) to check the minimum length
for a frame. Instead mmpdu_header_len should be used since this checks
fc.order and returns either 24 or 28 bytes, not 28 bytes always.
This change adds the requirement that the first iovec in the array
must contain at least the first 2 bytes (mmpdu_fc) of the header.
This really shouldn't be a problem since all current users of
frame-xchg put the entire header (or entire frame) into the first
iovec in the array.
explicit_bzero is used in src/p2p.c since commit
1675c765a3 but src/missing.h is not
included, as a result build with uclibc fails on:
/home/naourr/work/instance-0/output-1/per-package/iwd/host/opt/ext-toolchain/bin/../lib/gcc/mips64el-buildroot-linux-uclibc/5.5.0/../../../../mips64el-buildroot-linux-uclibc/bin/ld: src/p2p.o: in function `p2p_connection_reset':
p2p.c:(.text+0x2cf4): undefined reference to `explicit_bzero'
/home/naourr/work/instance-0/output-1/per-package/iwd/host/opt/ext-toolchain/bin/../lib/gcc/mips64el-buildroot-linux-uclibc/5.5.0/../../../../mips64el-buildroot-linux-uclibc/bin/ld: p2p.c:(.text+0x2cfc): undefined reference to `explicit_bzero'
This logic was using l_hashmap_insert, which supports duplicates. Since
some entries were inserted multiple times, they ended up being printed
multiple times. Fix that by introducing a macro that uses
l_hashmap_replace instead.