Advice for designing a struct that can iterate over its fields

⚓ Rust    📅 2025-05-24    👤 surdeus    👁️ 11      

There are a number of posts on this forum about iterating over the fields of a struct. I'm looking for advice on what the best thing to do in my situation is.

My project uses the procfs crate. In particular, each process has an associated Vec<MemoryMap> (see MemoryMap in procfs::process):

// Simplified for clarity.
pub struct MemoryMap {
    pub perms: MMPermissions,
    pub pathname: MMapPath,
    pub pss: u64,
}

pub enum MMapPath {
    Path(PathBuf),
    Heap,
    Stack,
    TStack(u32),
    Vdso,
    Vvar,
    Vsyscall,
    Rollup,
    Anonymous,
    Vsys(i32),
    Other(String),
}

Each MemoryMap represents one entry in the /proc/<pid>/smaps file for that process. I want to aggregate all of these maps into a single struct that stores a process's memory usage by type of memory. For instance, Stack and Heap are separate categories, and each unique (Path(path), MMPermissions) pair gets its own category. If the Vec<MemoryMap> contains two maps with the same (Path(path), MMPermissions), they are combined into one entry whose pss is the sum of the two. In my first design, each process has a struct that looks like this:

pub struct MemoryExt {
    pub stack_pss: u64,
    pub heap_pss: u64,
    pub thread_stack_pss: u64,
    pub file_map: HashMap<(PathBuf, MMPermissions), u64>,
    pub anon_map_pss: u64,
    pub vdso_pss: u64,
    pub vvar_pss: u64,
    pub vsyscall_pss: u64,
    pub vsys_pss: u64,
    pub other_map: HashMap<String, u64>,
}
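For concreteness, here is a minimal sketch of the aggregation step described above. The MMPermissions, MMapPath, and MemoryMap definitions are simplified stand-ins so the example compiles without procfs, `aggregate` is a hypothetical helper name, and skipping Rollup entries is an assumption (the struct has no field for them).

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// Hypothetical stand-ins for the procfs types, so the sketch compiles on its own.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct MMPermissions(pub u8);

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum MMapPath {
    Path(PathBuf),
    Heap,
    Stack,
    TStack(u32),
    Vdso,
    Vvar,
    Vsyscall,
    Rollup,
    Anonymous,
    Vsys(i32),
    Other(String),
}

pub struct MemoryMap {
    pub perms: MMPermissions,
    pub pathname: MMapPath,
    pub pss: u64,
}

#[derive(Debug, Default, PartialEq)]
pub struct MemoryExt {
    pub stack_pss: u64,
    pub heap_pss: u64,
    pub thread_stack_pss: u64,
    pub file_map: HashMap<(PathBuf, MMPermissions), u64>,
    pub anon_map_pss: u64,
    pub vdso_pss: u64,
    pub vvar_pss: u64,
    pub vsyscall_pss: u64,
    pub vsys_pss: u64,
    pub other_map: HashMap<String, u64>,
}

/// Hypothetical helper: folds a process's maps into one MemoryExt.
pub fn aggregate(maps: &[MemoryMap]) -> MemoryExt {
    let mut ext = MemoryExt::default();
    for map in maps {
        match &map.pathname {
            // Maps with the same (path, permissions) pair are summed into one entry.
            MMapPath::Path(path) => {
                *ext.file_map.entry((path.clone(), map.perms)).or_insert(0) += map.pss;
            }
            MMapPath::Heap => ext.heap_pss += map.pss,
            MMapPath::Stack => ext.stack_pss += map.pss,
            MMapPath::TStack(_) => ext.thread_stack_pss += map.pss,
            MMapPath::Vdso => ext.vdso_pss += map.pss,
            MMapPath::Vvar => ext.vvar_pss += map.pss,
            MMapPath::Vsyscall => ext.vsyscall_pss += map.pss,
            MMapPath::Anonymous => ext.anon_map_pss += map.pss,
            MMapPath::Vsys(_) => ext.vsys_pss += map.pss,
            MMapPath::Other(name) => {
                *ext.other_map.entry(name.clone()).or_insert(0) += map.pss;
            }
            // Assumption: rollup entries are skipped, since MemoryExt has no field for them.
            MMapPath::Rollup => {}
        }
    }
    ext
}
```

The entry API handles both the "first time we see this key" and the "combine with an existing entry" cases in one line.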

This works fine. Adding two of these is a meaningful operation for me, e.g., summing the memory usage of two child processes. Implementing Add for MemoryExt is a bit cumbersome, but it works.
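As a rough illustration of why this is cumbersome, an Add implementation for the struct might look like the sketch below. This is not the original implementation from the post; MMPermissions is a hypothetical stand-in for the procfs type, and `merge_into` is a helper name invented for the example.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::ops::Add;
use std::path::PathBuf;

// Hypothetical stand-in for procfs's MMPermissions, so the sketch compiles alone.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct MMPermissions(pub u8);

#[derive(Debug, Default, PartialEq)]
pub struct MemoryExt {
    pub stack_pss: u64,
    pub heap_pss: u64,
    pub thread_stack_pss: u64,
    pub file_map: HashMap<(PathBuf, MMPermissions), u64>,
    pub anon_map_pss: u64,
    pub vdso_pss: u64,
    pub vvar_pss: u64,
    pub vsyscall_pss: u64,
    pub vsys_pss: u64,
    pub other_map: HashMap<String, u64>,
}

// Hypothetical helper: folds `src` into `dst`, summing values for shared keys.
fn merge_into<K: Eq + Hash>(dst: &mut HashMap<K, u64>, src: HashMap<K, u64>) {
    for (key, pss) in src {
        *dst.entry(key).or_insert(0) += pss;
    }
}

impl Add for MemoryExt {
    type Output = MemoryExt;

    fn add(mut self, rhs: MemoryExt) -> MemoryExt {
        // Every scalar field has to be listed by hand.
        self.stack_pss += rhs.stack_pss;
        self.heap_pss += rhs.heap_pss;
        self.thread_stack_pss += rhs.thread_stack_pss;
        self.anon_map_pss += rhs.anon_map_pss;
        self.vdso_pss += rhs.vdso_pss;
        self.vvar_pss += rhs.vvar_pss;
        self.vsyscall_pss += rhs.vsyscall_pss;
        self.vsys_pss += rhs.vsys_pss;
        // The two maps are merged key by key.
        merge_into(&mut self.file_map, rhs.file_map);
        merge_into(&mut self.other_map, rhs.other_map);
        self
    }
}
```

Adding a new field means remembering to touch `add` as well, which is the maintenance cost of this layout.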

One thing I am doing with this data is plotting the stack, then the heap, and so on for each field in the struct, in a predefined order. (For each HashMap field, I can either plot the sum of its entries or each entry individually.) This struct works fine for that. However, the next thing I want to do is sort all of the fields from greatest to least memory consumption. This is where I may need to rethink the design. Here is my first attempt:

pub struct MemoryExt(HashMap<MemCategory, u64>);

pub enum MemCategory {
    File(PathBuf, MMPermissions),
    Heap,
    Stack,
    TStack,
    Vdso,
    Vvar,
    Vsyscall,
    Anonymous,
    Vsys,
    Other(String)
}

This has the following advantages (I think):

  • constant-time access when I know which category I'm looking for
  • more concise implementation of Add
  • iterable for free

But the following disadvantage:

  • When I want to iterate through all of the File keys, for example, to aggregate the usage of all memory-mapped files, I will have to iterate through all of the other keys, too. (Probably negligible performance cost in practice, but still bugs me a little bit.)
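The trade-offs listed above can be sketched as follows. This is only an illustration under assumptions: MMPermissions is a hypothetical stand-in for the procfs type, and the method names `sorted_desc` and `file_total` are made up for the example.

```rust
use std::collections::HashMap;
use std::ops::Add;
use std::path::PathBuf;

// Hypothetical stand-in for procfs's MMPermissions, so the sketch compiles alone.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct MMPermissions(pub u8);

// A HashMap key needs Eq + Hash; both can be derived here.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum MemCategory {
    File(PathBuf, MMPermissions),
    Heap,
    Stack,
    TStack,
    Vdso,
    Vvar,
    Vsyscall,
    Anonymous,
    Vsys,
    Other(String),
}

#[derive(Debug, Default, PartialEq)]
pub struct MemoryExt(pub HashMap<MemCategory, u64>);

impl Add for MemoryExt {
    type Output = MemoryExt;

    // One loop covers every category, including File and Other.
    fn add(mut self, rhs: MemoryExt) -> MemoryExt {
        for (category, pss) in rhs.0 {
            *self.0.entry(category).or_insert(0) += pss;
        }
        self
    }
}

impl MemoryExt {
    /// All categories, largest PSS first.
    pub fn sorted_desc(&self) -> Vec<(&MemCategory, u64)> {
        let mut entries: Vec<_> = self.0.iter().map(|(c, &pss)| (c, pss)).collect();
        entries.sort_by(|a, b| b.1.cmp(&a.1));
        entries
    }

    /// Total PSS of all File entries. This is the disadvantage above:
    /// every key is visited, not just the File ones.
    pub fn file_total(&self) -> u64 {
        self.0
            .iter()
            .filter(|(c, _)| matches!(c, MemCategory::File(..)))
            .map(|(_, &pss)| pss)
            .sum()
    }
}
```

Note that the order of entries with equal PSS in `sorted_desc` is unspecified, because HashMap iteration order is arbitrary.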

Another option is to make an Iter for my original struct that will visit each field in order, generating (MemCategory, u64) tuples. This seems like a good way to do it, but would also add more code.
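A minimal sketch of that iterator approach, assuming the original field-based struct is kept and MemCategory is only used for the yielded items. MMPermissions is a hypothetical stand-in for the procfs type, and `iter` and `sorted_desc` are names chosen for the example. Rather than a hand-written Iter struct, it chains existing iterators, which keeps the extra code small.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// Hypothetical stand-in for procfs's MMPermissions, so the sketch compiles alone.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct MMPermissions(pub u8);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum MemCategory {
    File(PathBuf, MMPermissions),
    Heap,
    Stack,
    TStack,
    Vdso,
    Vvar,
    Vsyscall,
    Anonymous,
    Vsys,
    Other(String),
}

#[derive(Debug, Default)]
pub struct MemoryExt {
    pub stack_pss: u64,
    pub heap_pss: u64,
    pub thread_stack_pss: u64,
    pub file_map: HashMap<(PathBuf, MMPermissions), u64>,
    pub anon_map_pss: u64,
    pub vdso_pss: u64,
    pub vvar_pss: u64,
    pub vsyscall_pss: u64,
    pub vsys_pss: u64,
    pub other_map: HashMap<String, u64>,
}

impl MemoryExt {
    /// Visits the fields in declaration order, yielding (category, pss) pairs.
    pub fn iter(&self) -> impl Iterator<Item = (MemCategory, u64)> + '_ {
        let head = vec![
            (MemCategory::Stack, self.stack_pss),
            (MemCategory::Heap, self.heap_pss),
            (MemCategory::TStack, self.thread_stack_pss),
        ];
        let files = self
            .file_map
            .iter()
            .map(|((path, perms), &pss)| (MemCategory::File(path.clone(), *perms), pss));
        let tail = vec![
            (MemCategory::Anonymous, self.anon_map_pss),
            (MemCategory::Vdso, self.vdso_pss),
            (MemCategory::Vvar, self.vvar_pss),
            (MemCategory::Vsyscall, self.vsyscall_pss),
            (MemCategory::Vsys, self.vsys_pss),
        ];
        let others = self
            .other_map
            .iter()
            .map(|(name, &pss)| (MemCategory::Other(name.clone()), pss));
        head.into_iter().chain(files).chain(tail).chain(others)
    }

    /// All categories, largest PSS first.
    pub fn sorted_desc(&self) -> Vec<(MemCategory, u64)> {
        let mut entries: Vec<_> = self.iter().collect();
        entries.sort_by(|a, b| b.1.cmp(&a.1));
        entries
    }
}
```

The cost is that each File and Other key is cloned while iterating; yielding borrowed keys instead would need a second, borrowing version of the enum.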

A third option is in between the first two where I have something like this:

pub struct MemoryExt {
    pub const_map: HashMap<MemCategory, u64>,
    pub file_map: HashMap<(PathBuf, MMPermissions), u64>,
    pub other_map: HashMap<String, u64>,
}

pub enum MemCategory {
    Heap,
    Stack,
    TStack,
    Vdso,
    Vvar,
    Vsyscall,
    Anonymous,
    Vsys,
}

I think this would give me the usage characteristics I want while still cutting down on the implementation of Add. But it feels wrong to use a HashMap with an enum as the key, when that's pretty much equivalent to a struct apart from the ability to iterate.
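To make the comparison concrete, here is a sketch of how this third layout could be used. It assumes a hypothetical stand-in for MMPermissions, and `merge_into`, `file_total`, and `total` are names invented for the example.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::ops::Add;
use std::path::PathBuf;

// Hypothetical stand-in for procfs's MMPermissions, so the sketch compiles alone.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct MMPermissions(pub u8);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum MemCategory {
    Heap,
    Stack,
    TStack,
    Vdso,
    Vvar,
    Vsyscall,
    Anonymous,
    Vsys,
}

#[derive(Debug, Default, PartialEq)]
pub struct MemoryExt {
    pub const_map: HashMap<MemCategory, u64>,
    pub file_map: HashMap<(PathBuf, MMPermissions), u64>,
    pub other_map: HashMap<String, u64>,
}

// Hypothetical helper: folds `src` into `dst`, summing values for shared keys.
fn merge_into<K: Eq + Hash>(dst: &mut HashMap<K, u64>, src: HashMap<K, u64>) {
    for (key, pss) in src {
        *dst.entry(key).or_insert(0) += pss;
    }
}

impl Add for MemoryExt {
    type Output = MemoryExt;

    // Three merges replace the field-by-field additions of the first design.
    fn add(mut self, rhs: MemoryExt) -> MemoryExt {
        merge_into(&mut self.const_map, rhs.const_map);
        merge_into(&mut self.file_map, rhs.file_map);
        merge_into(&mut self.other_map, rhs.other_map);
        self
    }
}

impl MemoryExt {
    /// Total PSS of memory-mapped files; only file_map is visited.
    pub fn file_total(&self) -> u64 {
        self.file_map.values().sum()
    }

    /// Total PSS across all three maps.
    pub fn total(&self) -> u64 {
        self.const_map.values().sum::<u64>()
            + self.file_total()
            + self.other_map.values().sum::<u64>()
    }
}
```

With this split, aggregating over files no longer touches the fixed categories, at the price of having three maps to chain together when sorting everything.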

In summary: I'm stuck and would like to know what the community thinks is the best way to do this.
