Map<K, V>
Rust's type system is renowned for its rigor and expressiveness, helping us build safe and maintainable code. However, even experienced Rust developers often encounter difficulties when naming HashMap<K, V> or BTreeMap<K, V> instances, especially when both the key (K) and value (V) are concrete types. The most common pattern is to simply concatenate the type names of the key and value, such as orderItemStatusMap. But this quickly reveals a core problem: we cannot directly and quickly discern which part of the name represents the key and which represents the value.
An interesting contrast arises when we directly read the type signature HashMap<K, V>, for example, HashMap<(OrderId, ItemId), ItemStatus>. Here, the identities of the key and value are crystal clear, naturally delimited by angle brackets < and the comma ,. This symbolic representation of type parameters is intuitive and requires no extra thought. However, once we "flatten" this clear structure into a variable name like orderItemStatusMap, this inherent clarity vanishes. We lose the visual distinction between the key and value, forcing code readers to pause, find the variable definition, and examine the type signature to confirm the specific identities of K and V. In complex codebases, this small cognitive burden accumulates and can significantly impact code comprehension efficiency.
Let's illustrate this with a complex data model from an order system:
Assume we have the following structs to represent orders, order items, and their statuses:
type OrderId = u64; // Order ID
type ItemId = u32; // Item ID

struct Order {
    id: OrderId,
    // ... other order information
}

struct Item {
    id: ItemId,
    name: String,
    // ... other item information
}

enum ItemStatus {
    Pending,
    Shipped,
    Delivered,
    Cancelled,
}
Now, consider a complex order processing scenario where we might need to store the following two mapping relationships:
Key: (OrderId, ItemId) (composite key); Value: ItemStatus; Traditional name: orderItemStatusMap
Key: OrderId; Value: HashMap<ItemId, ItemStatus> (a nested Map); Traditional name: orderItemsStatusMap
Let's see how these two mappings would be named using the traditional approach:
use std::collections::HashMap;

// Scenario 1: Mapping (Order ID, Item ID) to item status
// The actual type of `order_item_status_map`: HashMap<(OrderId, ItemId), ItemStatus>
let order_item_status_map: HashMap<(OrderId, ItemId), ItemStatus> = HashMap::new();

// Scenario 2: Mapping Order ID to (Item ID -> item status) Map
// The actual type of `order_items_status_map`: HashMap<OrderId, HashMap<ItemId, ItemStatus>>
let order_items_status_map: HashMap<OrderId, HashMap<ItemId, ItemStatus>> = HashMap::new();
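To make the difference between the two shapes concrete, here is a minimal sketch of how each one is queried. The helper names status_in_flat_map and status_in_nested_map are hypothetical and exist only for this illustration; the derives on ItemStatus are also an assumption added so the values can be copied and compared.

```rust
use std::collections::HashMap;

type OrderId = u64;
type ItemId = u32;

// Derives are an assumption for this sketch (needed for `.copied()` and comparisons).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ItemStatus {
    Pending,
    Shipped,
    Delivered,
    Cancelled,
}

/// Scenario 1 (hypothetical helper): a single lookup with a composite key.
fn status_in_flat_map(
    map: &HashMap<(OrderId, ItemId), ItemStatus>,
    order_id: OrderId,
    item_id: ItemId,
) -> Option<ItemStatus> {
    map.get(&(order_id, item_id)).copied()
}

/// Scenario 2 (hypothetical helper): two lookups, first by order, then by item.
fn status_in_nested_map(
    map: &HashMap<OrderId, HashMap<ItemId, ItemStatus>>,
    order_id: OrderId,
    item_id: ItemId,
) -> Option<ItemStatus> {
    // `?` returns None early when the order has no inner map at all.
    map.get(&order_id)?.get(&item_id).copied()
}
```

The two call sites look quite different, yet the traditional variable names differ by a single letter, which is exactly where a reader is likely to guess wrong.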
The problem is now evident:
order_item_status_map: From the name alone, it's difficult to immediately tell whether this maps (OrderId, ItemId) to ItemStatus.
order_items_status_map: This name is even more ambiguous. Does it map OrderId to HashMap<ItemId, ItemStatus>? Or (OrderId, ItemId) to ItemStatus? It might even be misread as mapping Order to Vec<ItemStatus>. With just the name, we cannot instantly tell which is the key, which is the value, or whether the value itself is another collection or a composite structure.

In real-world projects, this ambiguity introduces significant obstacles to reading and understanding code, forcing developers to constantly jump to type definitions to confirm the structure and semantics of the data.
Currently, there are some attempts in the community to alleviate this problem, such as using order_id_item_id_to_status_map or creating type aliases, but they still have shortcomings:
Although order_id_item_id_to_status_map is explicit, the excessive length of the variable name reduces code brevity. Moreover, it is still based on English descriptions and does not provide an immediate visual distinction between key and value identities.
Type aliases such as type OrderItemStatusMap = HashMap<(OrderId, ItemId), ItemStatus>; improve readability, but the alias is still a plain-text descriptive name that does not fundamentally solve the problem of symbolic differentiation between K and V. We need to look at its definition to determine the key and value types.

While these methods have their value, none of them provide a way to distinguish K and V in a clear, symbolic way directly within the variable name itself.
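As a small illustration of the type-alias limitation, the sketch below uses the OrderItemStatusMap alias from the text together with a hypothetical helper, count_with_status, that is not part of the original post. An alias is only a second name for the same type, so a plain HashMap can be passed wherever the alias is expected, and the signature alone still does not say which part is the key.

```rust
use std::collections::HashMap;

type OrderId = u64;
type ItemId = u32;

// Derives are an assumption for this sketch (needed for the `==` comparison).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ItemStatus {
    Pending,
    Shipped,
    Delivered,
    Cancelled,
}

// The alias from the text: a new name, but not a new type.
type OrderItemStatusMap = HashMap<(OrderId, ItemId), ItemStatus>;

/// Hypothetical helper: counts the entries that currently have `wanted` status.
/// Nothing in this signature reveals the key or value types; the reader
/// still has to jump to the alias definition.
fn count_with_status(statuses: &OrderItemStatusMap, wanted: ItemStatus) -> usize {
    statuses.values().filter(|status| **status == wanted).count()
}
```

Calling count_with_status with a value declared as HashMap<(OrderId, ItemId), ItemStatus> compiles without any conversion, which shows that the alias adds a label rather than structure.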
Given that HashMap<K, V> expresses the relationship between keys and values so clearly and naturally, could we allow some "angle-bracket-like" symbols in Rust variable or type names to directly represent keys and values?
Although Rust's current identifier rules do not allow the direct use of < and >, Unicode contains many visually similar characters that are not widely used in programming languages. For example, there are full-width less-than signs ＜ and greater-than signs ＞, or mathematical angle brackets ⟨ and ⟩. Since Rust code supports UTF-8 encoding, using these characters would not lead to garbled text.
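For context, here is a minimal sketch of where Rust draws the line today. Since Rust 1.53, identifiers may contain non-ASCII characters, but only those with the Unicode XID_Start / XID_Continue properties (letters, digits, and similar). The function name build_status_map and the Japanese variable name are hypothetical, chosen only to show a non-ASCII identifier that compiles.

```rust
use std::collections::HashMap;

/// Hypothetical helper, for illustration only.
fn build_status_map() -> HashMap<u64, &'static str> {
    // Accepted: every character in this identifier is a Unicode letter,
    // so it satisfies the XID_Start / XID_Continue rules.
    let mut 注文状態: HashMap<u64, &'static str> = HashMap::new();
    注文状態.insert(1, "shipped");

    // Rejected if uncommented: U+FF1C and U+FF1E (full-width less-than and
    // greater-than signs) are math symbols, not identifier characters.
    // let map＜order_id, status＞: HashMap<u64, &'static str> = HashMap::new();

    注文状態
}
```

The same restriction applies to the mathematical angle brackets ⟨ and ⟩, which are punctuation rather than letters. UTF-8 source encoding is therefore necessary for this idea but not sufficient.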
Imagine if we could name variables like this:
// Clearly indicates: the key is (OrderId, ItemId), the value is ItemStatus
let map＜＜order_id, item_id＞, item_status＞: HashMap<(OrderId, ItemId), ItemStatus> = HashMap::new();
// Clearly indicates: the key is OrderId, the value is another HashMap<ItemId, ItemStatus>
let map＜order_id, ＜item_id, item_status＞＞: HashMap<OrderId, HashMap<ItemId, ItemStatus>> = HashMap::new();

This approach visually mimics the structure of type parameters, making the identities of keys and values immediately obvious and fundamentally eliminating ambiguity. When we see map＜＜order_id, item_id＞, item_status＞, we can almost instantly understand that it represents "a mapping from the combination of order ID and item ID to item status" without having to check its type signature.
Of course, I know that many objections will immediately arise:
Despite these challenges, I believe this proposal is not entirely unfounded. With the continuous evolution of programming languages and IDEs, we may have better input methods and more comprehensive font support in the future. IDEs themselves might even be able to render these special characters in a more readable format (e.g., displaying ＜ and ＞ as distinct visual cues).

Currently, we may not be able to use these symbols directly in Rust identifiers. However, the core idea is this: Do we need a more expressive, more symbolic way to name Map<K, V> instances to eliminate the inherent ambiguity in key-value identities?
Perhaps instead of directly inserting these characters into variable names, we could consider:
This is an open discussion. I hope this bold proposal will stimulate deeper thinking within the community about the Map<K, V> naming problem. How can we explore new naming paradigms that better reflect the semantics of the data structure itself, while maintaining code clarity and readability?