Weird benchmark differences caused by adding unrelated code

⚓ rust    📅 2025-06-07    👤 surdeus    👁️ 3      


While benchmarking some code, I noticed that the performance of one benchmark was being affected by the other ones (when I commented out the other benchmarks, the first one became ~2x faster). I've minimized the code down to:

#![feature(test)]

extern crate test;

use test::Bencher;

#[bench]
fn bench(b: &mut Bencher) {
	b.iter(|| {
		[].is_ascii();
	});
}

/*// `#[no_mangle]` to prevent this from being optimized out
#[no_mangle]
fn breaks_it(b: &mut Bencher) {
	let bytes: &[u8] = &[];
	test::bench::iter(&mut || {
		bytes.is_ascii();
		Vec::new().resize(0, ());
	});
	b.iter(|| {});
}*/

Running cargo bench on this gives a time of 0.23 ns/iter for bench. If you uncomment the function breaks_it, the time goes up to 1.60 ns/iter.

The time goes back to 0.23 ns/iter if you comment out any part of breaks_it, so the slowdown seems to appear only when breaks_it contains all of the following:

  • Two calls to test::bench::iter, at least one of them direct; the other can go through Bencher::iter.
  • At least one use of b (apparently black_box(b) doesn't count, so it must be either b.iter() or b.bench()).
  • In one of the closures passed to test::bench::iter, a call to Vec::resize and a call to [u8]::is_ascii on a slice borrowed from outside the closure.

The time also goes down to 0.23 ns/iter if I use the same closure for both calls to test::bench::iter, i.e.:

#[no_mangle]
fn breaks_it(b: &mut Bencher) {
	let bytes: &[u8] = &[];
	let mut inner = || {
		bytes.is_ascii();
		Vec::new().resize(0, ());
	};
	test::bench::iter(&mut inner);
	b.iter(inner);
}

The time also goes back down if I wrap either both closures in breaks_it or the closure in bench in a Box, but it goes back up if I box all three of them.

The time also goes down if I inline any of the functions used (test::bench::iter, [u8]::is_ascii, Vec::resize), so perhaps the fact that they live in the standard library changes something.

Incidentally, 0.23 ns/iter is the same time I get benching an empty closure, so it looks like in that case, is_ascii is being optimized out.
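To make the "optimized out" part concrete: in bench the result of is_ascii is thrown away (note the trailing semicolon) and the input is a constant, so the compiler is free to delete the call. Below is a minimal sketch on stable Rust of how both ends could be pinned down with std::hint::black_box. The name checked_is_ascii is a hypothetical helper for illustration, not part of the code above, and black_box is documented as best-effort only, so this is a hint to the optimizer rather than a guarantee.

```rust
use std::hint::black_box;

/// Hypothetical helper: hides the input from the optimizer and marks the
/// result as used, so the `is_ascii` call cannot be constant-folded away
/// or dropped as dead code (subject to `black_box` being best-effort).
fn checked_is_ascii(bytes: &[u8]) -> bool {
    // Inner black_box: the compiler must assume `bytes` could be anything.
    // Outer black_box: the compiler must assume the result is observed.
    black_box(black_box(bytes).is_ascii())
}

fn main() {
    let empty: &[u8] = &[];
    println!("empty slice is ASCII: {}", checked_is_ascii(empty));
    println!("0xFF is ASCII: {}", checked_is_ascii(&[0xFF]));
}
```

If a benchmark written this way no longer reports the empty-closure time, the 0.23 ns/iter figure is most likely measuring the loop overhead alone and not is_ascii.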

This implies that there's something about these very specific conditions that prevents optimizations in a different function.
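One possible cross-check is to time the same closure without libtest at all, to see whether the effect follows test::bench::iter or the closure itself. The following is only a rough sketch under that assumption: ns_per_iter is a hypothetical helper written for this post, it uses a fixed iteration count and a plain mean instead of libtest's statistics, and its numbers are not directly comparable to cargo bench output.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical helper: calls `f` `iters` times and returns the mean
/// wall-clock time per call in nanoseconds. `iters` must be non-zero.
fn ns_per_iter<T>(iters: u64, mut f: impl FnMut() -> T) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        // Mark each result as used so the call is not removed as dead code.
        black_box(f());
    }
    start.elapsed().as_nanos() as f64 / iters as f64
}

fn main() {
    let bytes: &[u8] = &[];
    let iters = 10_000_000;

    // Baseline: a closure that does nothing.
    let empty = ns_per_iter(iters, || ());
    // Same check as in `bench`, with the input hidden from the optimizer.
    let ascii = ns_per_iter(iters, || black_box(bytes).is_ascii());

    println!("empty closure: {empty:.2} ns/iter");
    println!("is_ascii:      {ascii:.2} ns/iter");
}
```

Build it with optimizations (for example cargo run --release), since an unoptimized build would measure something entirely different.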

Does anyone know what could cause this spooky action?

My Rust version is 1.89.0-nightly (64a124607 2025-05-30).
