This page describes important issues and bug fixes found on android-mainline
that might be significant to partners.
November 15, 2024
Clang is updated to 19.0.1 for android-mainline and android16-6.12
- Summary: The new version of Clang introduces a bounds sanitizer for arrays,
where the array's size is stored in a separate variable linked to the array
using the
__counted_by
attribute. This feature might cause a kernel panic if the array size isn't properly updated. The error message looks like this:
UBSAN: array-index-out-of-bounds in common/net/wireless/nl80211.c index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]')
Details: The bounds sanitizer protects the integrity of the kernel by detecting out-of-bounds accesses. With CONFIG_UBSAN_TRAP enabled, the bounds sanitizer triggers a kernel panic on any finding.
- Previous versions of the bounds sanitizer checked only fixed-size arrays and couldn't check dynamically allocated arrays. The new version uses the __counted_by attribute to determine the array bounds at runtime and detect more cases of out-of-bounds access. However, in some cases, the array is accessed before the size variable is set, which triggers the bounds sanitizer and causes a kernel panic. To address this issue, set the array's size immediately after allocating the underlying memory, as illustrated in aosp/3343204.
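The ordering requirement can be sketched in userspace C. This is a minimal illustration and not the change made in aosp/3343204: the COUNTED_BY macro, struct channel_list, and both helper functions are hypothetical names, and the macro expands to nothing on compilers that lack the counted_by attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use the counted_by attribute only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical structure: a flexible array whose length lives in n_channels. */
struct channel_list {
	size_t n_channels;
	int channels[] COUNTED_BY(n_channels);
};

static struct channel_list *channel_list_alloc(size_t n)
{
	struct channel_list *list;

	list = calloc(1, sizeof(*list) + n * sizeof(list->channels[0]));
	if (!list)
		return NULL;

	/* Set the count first: any earlier access to channels[] would be
	 * checked against n_channels == 0 and reported as out of bounds. */
	list->n_channels = n;

	for (size_t i = 0; i < n; i++)
		list->channels[i] = (int)i;

	return list;
}

/* Usage sketch: allocate four entries and read the last one. */
static int channel_list_example(void)
{
	struct channel_list *list = channel_list_alloc(4);
	int last;

	if (!list)
		return -1;
	assert(list->n_channels == 4);
	last = list->channels[3];
	free(list);
	return last;
}
```

Moving the assignment to n_channels below the loop is the pattern that would trip the sanitizer, because the first write to channels[0] would be checked against a count of zero.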
About CONFIG_UBSAN_SIGNED_WRAP: The new version of Clang sanitizes signed integer overflow and underflow despite the -fwrapv compiler flag. The -fwrapv flag is designed to give signed integer arithmetic defined two's-complement wraparound behavior on overflow.
- While sanitizing signed integer overflow in the Linux kernel can help identify bugs, there are instances where overflow is intentional, for example, with atomic_long_t. As a result, CONFIG_UBSAN_SIGNED_WRAP has been disabled to allow UBSAN to function solely as a bounds sanitizer.
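To make the wraparound semantics concrete, here is a small userspace sketch. It is an illustration rather than kernel code: wrapping_add() is a hypothetical helper, and it relies on the GCC/Clang builtin __builtin_add_overflow(), which stores the two's-complement wrapped result and reports whether the addition overflowed, so the example does not depend on -fwrapv being set.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: add two longs with explicit two's-complement wraparound.
 * If 'wrapped' is non-NULL, it is set to true when the addition overflowed. */
static long wrapping_add(long a, long b, bool *wrapped)
{
	long sum;
	bool overflowed = __builtin_add_overflow(a, b, &sum);

	if (wrapped)
		*wrapped = overflowed;
	return sum;
}

/* Usage sketch: a counter that is allowed to wrap on purpose. */
static void wrapping_add_example(void)
{
	bool wrapped;

	/* No overflow: behaves like ordinary addition. */
	assert(wrapping_add(1, 2, &wrapped) == 3 && !wrapped);

	/* Intentional overflow: LONG_MAX + 1 wraps to LONG_MIN. */
	assert(wrapping_add(LONG_MAX, 1, &wrapped) == LONG_MIN && wrapped);
}
```

A sanitizer that flags every signed wrap would report the second call even though the wrap is the intended behavior, which is the kind of finding the paragraph above describes.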
About CONFIG_UBSAN_TRAP: To protect the integrity of the kernel, UBSAN is configured to trigger a kernel panic when it detects an issue. However, we disabled this behavior from October 23 to November 12 to unblock the compiler update while we fixed known __counted_by issues.
November 1, 2024
- Linux 6.12-rc4 landing
- Summary: CONFIG_OF_DYNAMIC potentially causing severe regressions for faulty drivers.
- The details: While merging Linux 6.12-rc1 into android-mainline, we noticed issues with out-of-tree drivers failing to load. The change that exposed the driver bugs was identified as commit 274aff8711b2 ("clk: Add KUnit tests for clks registered with struct clk_parent_data"), and we temporarily reverted it in aosp/3287735. The change selects CONFIG_OF_OVERLAY, which selects CONFIG_OF_DYNAMIC. With !OF_DYNAMIC, ref-counting on of_node_get() and of_node_put() is effectively disabled because they are implemented as no-ops. Enabling OF_DYNAMIC again exposes issues in drivers that wrongly implement ref-counting for struct device_node. This causes various types of errors like memory corruption, use-after-free, and memory leaks.
- All uses of OF parsing related APIs must be inspected. The following list is partial, but contains cases we have been observing:
- Use after free (UAF):
  - Reuse of the same device_node argument: These functions call of_node_put() on the node passed in, so you might need to add an of_node_get() before calling them (for example, when calling them repeatedly with the same node as argument):
    - of_find_compatible_node()
    - of_find_node_by_name()
    - of_find_node_by_path()
    - of_find_node_by_type()
    - of_get_next_cpu_node()
    - of_get_next_parent()
    - of_get_next_child()
    - of_get_next_available_child()
    - of_get_next_reserved_child()
    - of_find_node_with_property()
    - of_find_matching_node_and_match()
  - Use of device_node after any type of exit from certain loops:
    - for_each_available_child_of_node_scoped()
    - for_each_available_child_of_node()
    - for_each_child_of_node_scoped()
    - for_each_child_of_node()
  - Keeping direct pointers to char * properties from device_node around, for example, using:
    - const char *foo = struct device_node::name
    - of_property_read_string()
    - of_property_read_string_array()
    - of_property_read_string_index()
    - of_get_property()
- Memory leaks:
  - Getting a device_node and forgetting to unref it (of_node_put()). Nodes returned from these functions need to be released at some point:
    - of_find_compatible_node()
    - of_find_node_by_name()
    - of_find_node_by_path()
    - of_find_node_by_type()
    - of_find_node_by_phandle()
    - of_parse_phandle()
    - of_find_node_opts_by_path()
    - of_get_next_cpu_node()
    - of_get_compatible_child()
    - of_get_child_by_name()
    - of_get_parent()
    - of_get_next_parent()
    - of_get_next_child()
    - of_get_next_available_child()
    - of_get_next_reserved_child()
    - of_find_node_with_property()
    - of_find_matching_node_and_match()
  - Keeping a device_node from a loop iteration. If you're returning or breaking from within the following loops, you need to drop the remaining reference at some point:
    - for_each_available_child_of_node()
    - for_each_child_of_node()
    - for_each_node_by_type()
    - for_each_compatible_node()
    - of_for_each_phandle()
- The earlier mentioned change was restored while landing Linux 6.12-rc4 (see aosp/3315251), enabling CONFIG_OF_DYNAMIC again and potentially exposing faulty drivers.
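The reference-counting rules listed above can be illustrated with a small userspace model. This is a sketch under stated assumptions and not kernel code: struct model_node, model_node_get(), model_node_put(), and model_find_next() are hypothetical stand-ins that only mimic the convention described above, where an iterator-style lookup drops the reference on the node passed in and returns the next node with its reference count raised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: only the refcount is modeled. */
struct model_node {
	int refcount;
	struct model_node *next;
};

static struct model_node *model_node_get(struct model_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void model_node_put(struct model_node *n)
{
	if (n) {
		assert(n->refcount > 0); /* a put without a matching get is a bug */
		n->refcount--;
	}
}

/* Models the iterator convention: takes a reference on the returned node
 * and drops the reference on 'from'. A NULL 'from' starts at 'head'. */
static struct model_node *model_find_next(struct model_node *from,
					  struct model_node *head)
{
	struct model_node *next = from ? from->next : head;

	model_node_get(next);
	model_node_put(from);
	return next;
}

static void model_example(void)
{
	/* Each node starts with one reference, held by its owner. */
	struct model_node b = { .refcount = 1, .next = NULL };
	struct model_node a = { .refcount = 1, .next = &b };
	struct model_node *found;

	/* Reusing the same start node: take an extra reference first,
	 * because the lookup drops one reference on its argument. */
	found = model_find_next(model_node_get(&a), &a);
	assert(found == &b && a.refcount == 1 && b.refcount == 2);
	model_node_put(found); /* done with the result: avoid a leak */

	/* Breaking out of a loop: the node held at the break still
	 * carries a reference that must be dropped later. */
	for (found = model_find_next(NULL, &a); found;
	     found = model_find_next(found, &a)) {
		if (found == &b)
			break;
	}
	assert(b.refcount == 2);
	model_node_put(found);

	/* Balanced: every get has a matching put. */
	assert(a.refcount == 1 && b.refcount == 1);
}
```

In the model, omitting the extra model_node_get() before the first lookup leaves node a with a count of zero while its owner still uses it, which corresponds to the use-after-free case. Omitting either model_node_put(found) leaves node b at a count of two, which corresponds to the leak case.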