Stringlet UTF-8 Hack Option Niche?
Rust, 2026-05-14, surdeus
With the release of Stringlet 0.10, I've looked at Option and Result niche optimization. Alas I didn't find anything that seems applicable here.
I've come up with a scheme that would work for all inline strings (and, if needed, other strings): according to the UTF-8 standard, no byte may currently (and maybe forever) be 0b1111_1xxx, i.e. 0xF8 through 0xFF. That gives eight possible niche values in the first byte of any non-empty UTF-8 byte array. Any way of expressing this to the compiler would be highly welcome!
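To make the idea concrete, here is a minimal sketch, not Stringlet's actual layout. It first checks the claim by scanning every Unicode scalar value, then shows one workaround available on stable Rust: store the first byte XOR 0xFF in a `NonZeroU8`. Since 0xFF never occurs in UTF-8, the stored value is never zero, and the compiler can use zero as the niche. The names `Inline`, `REST`, `is_niche_byte` and `utf8_ever_uses_niche_byte` are made up for illustration, and the size check reflects what current rustc does rather than a documented guarantee for arbitrary structs.

```rust
use std::convert::TryFrom;
use std::mem::size_of;
use std::num::NonZeroU8;

/// True for the eight bytes 0xF8..=0xFF, i.e. the pattern 0b1111_1xxx.
fn is_niche_byte(b: u8) -> bool {
    b & 0b1111_1000 == 0b1111_1000
}

/// Encodes every Unicode scalar value and reports whether any encoding
/// contains a byte matching 0b1111_1xxx.
fn utf8_ever_uses_niche_byte() -> bool {
    let mut buf = [0u8; 4];
    (0..=0x10FFFFu32)
        .filter_map(char::from_u32)
        .any(|c| c.encode_utf8(&mut buf).bytes().any(is_niche_byte))
}

/// Hypothetical non-empty inline string of exactly `REST + 1` bytes.
/// Assumption: the first byte is stored inverted, so it is never zero.
struct Inline<const REST: usize> {
    first: NonZeroU8, // holds (first byte) ^ 0xFF
    rest: [u8; REST],
}

impl<const REST: usize> Inline<REST> {
    /// Returns `None` if `s` is empty or not exactly `REST + 1` bytes long.
    fn new(s: &str) -> Option<Self> {
        let (&head, tail) = s.as_bytes().split_first()?;
        Some(Self {
            first: NonZeroU8::new(head ^ 0xFF)?,
            rest: <[u8; REST]>::try_from(tail).ok()?,
        })
    }

    /// Recovers the original first byte.
    fn first_byte(&self) -> u8 {
        self.first.get() ^ 0xFF
    }

    /// The bytes after the first one, stored unchanged.
    fn rest(&self) -> &[u8] {
        &self.rest
    }
}

/// Size comparison: with the niche in use, both numbers are equal.
fn sizes() -> (usize, usize) {
    (size_of::<Inline<7>>(), size_of::<Option<Inline<7>>>())
}
```

The price of this workaround is that the buffer can no longer be borrowed as a `&str` directly, because its first byte is stored inverted, and only one of the eight free values becomes usable. That is enough for `Option`, but it is not the same as telling the compiler about the full 0xF8..=0xFF range.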
I have another related need. I want to introduce a comfort wrapper unifying all kinds. Each being repr(C), the enum would implicitly (or if need be, explicitly) also be. So each, and thus the enum, would share the above niche:
enum Stringlet<const SIZE: usize> {
    Fixed(FixedStringlet<SIZE>),
    Var(VarStringlet<SIZE>),
    Trim(TrimStringlet<SIZE>),
    Slim(SlimStringlet<SIZE>),
}
Here only VarStringlet stores the actual length in one extra byte. Since stringlets will rarely be as large as 255 bytes, that byte has unused values that could hold the discriminant. (Currently SlimStringlet, and hence the whole enum, is even capped at size 64, but I'm looking at relaxing that.) Again, any way of expressing this to the compiler would be highly welcome!
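Done by hand rather than by the compiler, sharing the length byte with the discriminant could look roughly like the sketch below. It assumes Var lengths are capped at 252, so the three remaining byte values can name the other kinds. `Kind`, `MAX_VAR_LEN`, `encode` and `decode` are hypothetical names and not part of Stringlet.

```rust
/// Hypothetical kinds, mirroring the enum variants above.
/// Only `Var` carries a length; the other kinds derive theirs elsewhere.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Fixed,
    Var { len: u8 },
    Trim,
    Slim,
}

/// Assumed cap on the Var length, leaving 253..=255 free for tags.
const MAX_VAR_LEN: u8 = 252;

/// Packs kind and (for Var) length into a single byte.
/// Returns `None` if a Var length would collide with a tag value.
fn encode(kind: Kind) -> Option<u8> {
    match kind {
        Kind::Var { len } if len <= MAX_VAR_LEN => Some(len),
        Kind::Var { .. } => None,
        Kind::Fixed => Some(253),
        Kind::Trim => Some(254),
        Kind::Slim => Some(255),
    }
}

/// Inverse of `encode`: every byte value maps to exactly one kind.
fn decode(tag: u8) -> Kind {
    match tag {
        253 => Kind::Fixed,
        254 => Kind::Trim,
        255 => Kind::Slim,
        len => Kind::Var { len },
    }
}
```

With the current cap of 64 the same scheme has far more headroom, since every value above 64 is free. What a manual encoding like this cannot do is let `match` on a real enum read the tag, which is the part that would need compiler support.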