Info
This post is auto-generated from RSS feed The Rust Programming Language Forum - Latest topics. Source: Review this fix for unsafe_libyaml, yaml_string_extend
Review this fix for unsafe_libyaml, yaml_string_extend
โ Rust ๐ 2025-09-17 ๐ค surdeus ๐๏ธ 1The advisory has recently been posted addressing problems in libyml::string::yaml_string_extend, part of unsafe-libyaml.
This library, in one form or another, is still part of many YAML implementations (see here), including my own crate, until very recently. This code, auto-generated from C, provides low-level YAML parsing and generation. Hence, this issue potentially affects a large number of users. Let's not leave without attention.
Responding to the advisory, I switched immediately to unsafe-libyaml-norway as they recommend. However, after careful review to yaml_string_extend there, I suspect that the problem persists. The code there seems not much different from the old unsafe-libyaml and still computes sizes with
(*end).c_offset_from(*start)
which seem resolving to ptr::offset_from
on *const u8
. That operation is only defined when both pointers are derived from the same allocation (or oneโpast). Otherwise, it is undefined behavior. Also, the size math uses force_mul(2_i64)
which can silently overflow if buffers get large and offset_from
for the pointer rebase (*pointer = new_start.wrapping_offset((*pointer).c_offset_from(*start) โฆ)
) has the same precondition: *pointer
must be within the same allocation as *start
**. This may be UB too.
I propose an alternative that handles the uninitialized (NULL) case, avoiding offset_from
entirely. For size calculations, it uses integer address diffs and checks growth with checked_mul
to prevent integer overflow.
pub(crate) unsafe fn yaml_string_extend(
start: *mut *mut yaml_char_t,
pointer: *mut *mut yaml_char_t,
end: *mut *mut yaml_char_t,
) {
let cur_start = *start;
let cur_ptr = *pointer;
let cur_end = *end;
// Compute old capacity and used bytes without `offset_from`.
// Handle the uninitialized (NULL) case explicitly.
let (old_cap, used) = if cur_start.is_null() {
(0usize, 0usize)
} else {
let s = cur_start as usize;
let p = cur_ptr as usize;
let e = cur_end as usize;
// Sanity checks in debug builds; keeps Miri happy if invariants are violated upstream.
debug_assert!(p >= s && p <= e, "pointer must be within [start, end]");
debug_assert!(e >= s, "end must be >= start");
(e - s, p.saturating_sub(s))
};
// Pick a new capacity (bytes). Start with a small nonzero capacity if previously uninitialized.
let base = if old_cap == 0 { 16usize } else { old_cap };
let new_cap = match base.checked_mul(2) {
Some(v) => v,
None => {
// Overflow: leave unchanged or handle via your error reporting.
// You could also clamp to isize::MAX if your allocator accepts it.
return;
}
};
// Allocate / reallocate.
let new_start = yaml_realloc(cur_start as *mut c_void, new_cap as size_t) as *mut yaml_char_t;
if new_start.is_null() {
// Allocation failed; leave inputs unchanged (or set an error flag if you have one).
return;
}
// Zero only the grown tail.
if new_cap > old_cap {
let tail = new_start.add(old_cap);
memset(tail as *mut c_void, 0, (new_cap - old_cap) as size_t);
}
// Rebase `pointer` and `end` into the new allocation.
let new_ptr = new_start.add(used);
let new_end = new_start.add(new_cap);
*pointer = new_ptr;
*end = new_end;
*start = new_start;
}
I also looked at STRING_EXTEND here, I think that also can be rewritten more robustly, I suggest
macro_rules! STRING_EXTEND {
($string:expr) => {{
// Number of bytes you need ahead of `pointer`
const NEED: usize = 5;
let p = $string.pointer as usize;
let e = $string.end as usize;
if $string.start.is_null() || p.saturating_add(NEED) > e {
unsafe {
yaml_string_extend(
core::ptr::addr_of_mut!($string.start),
core::ptr::addr_of_mut!($string.pointer),
core::ptr::addr_of_mut!($string.end),
);
}
}
}};
}
I would be great to have the community check if these fixes are sound. Then we can open a pull request to the maintained branches of unsafe-libyaml, or if (hopefully unlikely) the community is too passive, I will create my own fork.
1 post - 1 participant
๐ท๏ธ Rust_feed