The pending wiphy state 'use_default' variable was not set early enough
in some circumstances resulting in weird behavior for blacklisted
drivers. Fix this by adding a manager_wiphy_dump_done callback which
will properly initialize the use_default value.
Fixes: c4b2f10483 ("manager: Handle missing NEW_WIPHY events")
brcmfmac does not allow the removal of the default / primary interface.
So there isn't much point in having iwd attempt this.
Another issue is that brcmfmac _does_ allow the deletion of non-default
interfaces. So starting iwd on a system with a station & ap interface
active can result in iwd attempting to delete all the interfaces. Given
the above, it succeeds in deleting the ap interface but not the station
one. In strange circumstances it might end up thinking that the ap
interface is the 'default' and trying to use it, whereas it was just
successfully removed.
==192== Conditional jump or move depends on uninitialised value(s)
==192== at 0x4531D3: l_queue_find (queue.c:346)
==192== by 0x42F1F8: manager_config_notify (manager.c:667)
==192== by 0x45A895: process_multicast (genl.c:970)
==192== by 0x45A895: received_data (genl.c:1037)
==192== by 0x4577B2: io_callback (io.c:126)
==192== by 0x456B0D: l_main_iterate (main.c:473)
==192== by 0x456BCB: l_main_run (main.c:520)
==192== by 0x456DDA: l_main_run_with_signal (main.c:642)
==192== by 0x4034B0: main (main.c:497)
The kernel emits NEW_WIPHY events whenever a new wiphy is registered.
Unfortunately these events are emitted under the 'legacy' semantics and
have a hard size limit of 4096 bytes. Unfortunately, it is possible for
a NEW_WIPHY message to exceed this limit (ath10k cards seem to be
affected in particular), which results in the kernel never sending these
messages out. This can lead to NEW_INTERFACE events being emitted with
a wiphy_id that had no corresponding NEW_WIPHY event emitted. Such a
sequence can confuse iwd's hardware detection logic, particularly during
hot-plug or system boot.
Fix this by re-dumping the wiphy if such a condition is detected. This
has some interaction with blacklisted wiphys, so the wiphy objects are
now always tracked and marked as blacklisted. Before, the blacklisted
wiphys were simply not added to the iwd list of tracked wiphys.
Sometimes, at least with brcmfmac, the default interface apparently
takes a moment to get created after the NEW_WIPHY event. We didn't
really consider this case in the NEW_WIPHY handler and we've got a race
condition. It fixes the following bug for me:
https://bugs.archlinux.org/task/63912 -- tested by removing and
re-modprobing the brcmfmac module rather than rebooting.
To work around this wait for the NEW_INTERFACE event and then retry the
setup. We still do the initial attempt directly after NEW_WIPHY to
handle cases like wiphys with no default interfaces and pre-existing
wiphys.
Allow netdev_create_from_genl callers to draw a random or non-random MAC
and pass it in the parameter instead of a bool to tell us to generating
the MAC locally. In P2P we are generating the MAC some time before
creating the netdev in order to pass it to the peer during negotiation.
A NEW_WIPHY event may not always contain all the information about a
given phy, but GET_WIPHY will. In order to get everything we must
mimic the behavior done during initalization and dump both wiphy
and interfaces when a NEW_WIPHY comes in.
Now, any NEW_WIPHY event will initialize a wiphy, but then do a
GET_WIPHY/GET_INTERFACE to obtain all the information. Because of
this we can ignore any NEW_INTERFACE notifications since we are
dumping the interface anyways.
Once some kernel changes get merged we wont need to do this anymore
so long as the 'full' NEW_WIPHY feature is supported.
If supported by the driver, we can create an interface directly with a
random MAC if configured to do so. If the driver does not have this
capability, then tell netdev to perform the necessary logic as part of
the interface initialization procedure.
Blacklist some drivers known to crash when interfaces are deleted or
created so that we don't even attempt that before falling back to using
the default interface.
manager_interface_dump_done would use manager_create_interfaces() at the
end of the loop iterating over pending_wiphys. To prevent it from
crashing make sure manager_create_interfaces never frees the pending
wiphy state and instead make the caller check whether it needs to be
freed so it can be done safely inside loops.
If the setting is true we'll not attempt to remove or create
interfaces on any wiphys and will only use the default interface
(if it exists). If false, force us managing the interfaces. Both
values override the auto logic.
Let manager.c signal to wiphy.c when the wiphy parsing from the genl
messages is complete. When we query for existing wiphy using the
GET_WIPHY dump command we get many genl messages per wiphy, on a
notification we only get one message. So after wiphy_create there may
be one or many calls to wiphy_update_from_genl. wiphy_create_complete
is called after all of them, so wiphy.c can be sure it's done with
parsing the wiphy attributes when in prints the new wiphy summary log
message, like it did before manager.c was added.
I had wrongly assumed that all the important wiphy attributes were in
the first message in the dump, but NL80211_ATTR_EXT_FEATURES was not and
wasn't being parsed which was breaking at least testRSSIAgent.
Instead of handling NEW_WIPHY events and WIHPY_DUMP events in a similar
fashion, split up the paths to optimize iwd startup time. There's
fundamentally no reason to wait a second (and eat up file-descriptor
resources for timers unnecessarily) when we can simply start an
interface dump right after the wiphy dump.
In case a new wiphy is added in the middle of a wiphy dump, we will
likely get a new wiphy event anyway, in which case a setup_timeout will
be created and we will ignore this phy during the interface dump
processing.
This also optimizes the case of iwd being re-started, in which case
there are no interfaces present.
Separate out the two types of NEW_WIPHY handlers into separate paths and
factor out the common code into a utility function.
Dumps of CMD_NEW_WIPHY can be split up over several messages, while
CMD_NEW_WIPHY events (generated when a new card is plugged in) are
stuffed into a single message.
This also prepares ground for follow-on commits where we will handle the
two types of events differently.
Mirror netdev.c white/blacklist logic. If either or both the whitelist
and the blacklist are given also fall back to not touching the existing
interface setup on the wiphy.
If we get an error during DEL_INTERFACE or NEW_INTERFACE we may be
dealing with a driver that doesn't implement virtual interfaces or
doesn't implement deleting the default interface. In this case fall
back to using the first usable interface that we've detected on this
wiphy.
There's at least one full-mac driver that doesn't implement the cfg80211
.del_virtual_intf and .add_virtual_intf methods and at least one that
only allows P2P interfaces to be manipulated. mac80211 drivers seem to
at least implement those methods but I didn't check to see if there are
driver where they'd eventually return EOPNOTSUPP.
This is probably the trickiest part in this patchset. I'm introducing a
new logic where instead of using the interfaces that we find present
when a wiphy is detected, which would normally be the one default
interface per wiphy but could be 0 or more than one, we create one
ourselves with the socket owner attribute and use exactly one for
Station, AP and Ad-Hoc modes. When IWD starts we delete all the
interfaces on existing wiphys that we're going to use (as determined by
the wiphy white/blacklists) or freshly hotplugged ones, and only then we
register the interface we're going to use meaning that the wiphy's
limits on the number of concurrent interfaces of each type should be at
0. Otherwise we'd be unlikely to be abe to create the station interface
as most adapters only allow one. After that we ignore any interfaces
that may be created by other processes as we have no use for multiple
station interfaces.
At this point manager.c only keeps local state for wiphys during
the interface setup although when we start adding P2P code we will be
creating and removing interfaces multiple times during the wiphy's
runtime and may need to track it here or in wiphy.c. We do not
specifically check the interface number limits received during the wiphy
dump, if we need to create any interfaces and we're over the driver's
maximum for that specific iftype we'll still attempt it and report error
if it fails.
I tested this and it seems to work with my laptop's intel card and some
USB hotplug adapters.
Add manager.c, a new file where the wiphy and interface creation/removal
will be handled and interface use policies will be implemented. Since
not all kernel-side nl80211 interfaces are tied to kernel-side netdevs,
netdev.c can't manage all of the interfaces that we will be using, so
the logic is being moved to a common place where all interfaces on a
wiphy will be managed according to the policy, device support for things
like P2P and user enabling/disabling/connecting with P2P which require
interfaces to be dynamically added and removed.