Should/can one trust invariants in serialized data?
⚓ Rust 📅 2025-10-27 👤 surdeus
This is a very general question for which I've heard widely diverging answers, so I'd like to hear other people's opinions and try to debias myself.
Context: you have a serialization system and you have serialized data which satisfies invariants; for example, strings are serialized as sequences of bytes, and they are UTF-8 because that's the invariant that was true when you serialized the string.
Now you access this data in a direct way: it can be zero-copy deserialization, memory-mapping, or some kind of superfast deserialization. The point is that you transfer those bytes directly into memory to create a string.
In this scenario, is it acceptable to use something like str::from_utf8_unchecked in a method that is not unsafe itself? The logic here is that the invariant was valid upon serialization, and thus we can rely on the same invariant to be true upon deserialization.
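To make the question concrete, here is a minimal sketch of the three options. `StrView` and its method names are hypothetical and used only for illustration; the only real APIs are `std::str::from_utf8` and `std::str::from_utf8_unchecked`. The sketch assumes the bytes are a borrowed slice from a buffer or memory map.

```rust
use std::str::Utf8Error;

/// Hypothetical zero-copy view over one serialized string field.
pub struct StrView<'a> {
    bytes: &'a [u8],
}

impl<'a> StrView<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        StrView { bytes }
    }

    /// Option 1: validate on access. Costs a scan of the bytes on every call.
    pub fn as_str_checked(&self) -> Result<&'a str, Utf8Error> {
        std::str::from_utf8(self.bytes)
    }

    /// Option 2: skip validation, but push the obligation onto the caller.
    ///
    /// # Safety
    /// The underlying bytes must be valid UTF-8.
    pub unsafe fn as_str_unchecked(&self) -> &'a str {
        unsafe { std::str::from_utf8_unchecked(self.bytes) }
    }

    /// Option 3 (the one asked about): skip validation behind a safe signature.
    pub fn as_str_trusted(&self) -> &'a str {
        // SAFETY (assumed, never checked here): the serializer only wrote
        // valid UTF-8 and the bytes have not changed since.
        unsafe { std::str::from_utf8_unchecked(self.bytes) }
    }
}
```

With option 3, a caller writing only safe code can end up holding a `&str` that is not valid UTF-8 if the file was corrupted, which the standard library documents as something that can lead to undefined behavior later on. Whether that is acceptable is exactly the question.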
The main objection to this idea is that the file might have been tampered with or might come from malicious sources. My personal viewpoint is that security is a system property that emerges through a number of processes. If you have bad sources, change your sources. If your files are tampered with, check your system. But treating every kind of external data defensively at the language level is approaching the problem in the wrong place. Rust is a memory-safe language, not a secure language (as TARmageddon recently showed). And obsessively checking for errors in data has a significant performance cost.
I also think in general that the purpose of Rust is not to play defensively against malicious actors; the purpose is to help you write efficient, correct software. I see it more as a safety net than as armor. And I cannot keep myself from thinking that this is somehow the mental model of the compiler team, or I cannot explain why the bug at the basis of cve-rs has been open for 10 years. It's not that they don't care, but that it doesn't make sense to invest resources in fixing an obscure problem that will never happen in real code. But, then again, here I'm trying to read people's minds, which is rarely successful.
Any kind of comment is really appreciated.