Text editing differencing algorithm
⚓ Rust 📅 2026-01-14 👤 surdeus 👁️ 2Hi, I have met an algorithm issue about how to calculate the difference between two text content, I am trying to implement a text editor.
Please let me explain it in detail.
I am using the ropey crate as the data structure to store the text content of a text file in my project. In short, ropey provide a set of APIs let you directly updates the text content by lines and every chars on a line, with a great performance. The line index and char index on each line start from 0. Here we have:
pub struct Rope { ... }
And then I am trying to implement the undo, redo features. I want to create a structure Edit, it saves each editing operation from the user: inserting, deleting, changing.
pub struct Insert {
pub line: usize, // Insert at this line
pub char: usize, // Insert at this char of the line
pub payload: String, // Inserted payload
}
pub struct Delete {
pub line: usize, // Delete at this line
pub char: usize, // Delete at this char of the line
pub count: usize, // Deleted chars from the position
}
pub struct Change {
pub line: usize, // Change at this line. Note: A change only changes 1 line at a time.
pub start_char: usize, // Change start from this char of the line.
pub end_char: usize, // Change end at this char of the line.
pub payload: String // New payload
}
pub struct MultiLineChange {
pub lines: BTreeMap<usize, Change>, // Multiple line change, maps from line to the change.
}
pub enum Edit {
Insert(Insert),
Delete(Delete),
Change(Change),
MultiLineChange(MultiLineChange),
}
pub struct EditHistory {
pub edits: Vec<Edit>
}
So in my thinking, the EditHistory is a list of Edit (i.e. Vec<Edit>), the Rope is always the latest text content. So we can recover the text content before last N edits by applying the last N Edit operation reversely. This operation is a O(N) time complexity operation, we need to do N reverse editing on the current Rope.
Finally, here is my questions:
- Is there any algorithm, that can merge 2 or more continuously edit operations (
&[Edit]) into 1 edit operation (Edit)? This can help me reduce some duplicated/repeated operations such as first add a char, then delete it. - Now we have two rope snapshots
Rope-AandRope-B, theBis updated fromAafter some editing operations. Is there any algorithm that can calculate the difference between them? The difference result should be represented by a singleEditstructure above.It seems this question can vary based on different requirements:
- Line-iteration compare: Go through all the lines on
AandB, for the same line, just string-compare the two lines inAandB(I found some algorithms here: wu-diff-rs, lcs-diff-rs, diff.rs). - Some more advanced algorithms to minimize the edit payload just like
git diff?
- Line-iteration compare: Go through all the lines on
1 post - 1 participant
🏷️ Rust_feed