How to add text to a string and update a grapheme-based cursor position?

⚓ Rust    📅 2026-04-01    👤 surdeus    👁️ 5      


Say I have a function like this as part of a text editor. It inserts text at a cursor position and returns the new cursor position. For simple cases, it seems easy, but …

Imagine I have the text “cafe” and then type the Unicode combining acute accent \u{301}, which produces “café”. It seems I should also normalize my string storage, so that “e” plus the combining character is replaced by the single-code-point version of “é”. In that case the return value of insert_text("cafe", 4, "\u{301}") should be 4: the cursor doesn’t move.

But I’m not sure how the function could know this. Am I missing something simple, or am I going to need to look at the graphemes before and after the current cursor position, and then deduce from that whether something “merged”?

use unicode_normalization::UnicodeNormalization;
use unicode_segmentation::UnicodeSegmentation;

/// Return the new cursor position
/// span.content is a String
/// offset is a visual offset, a grapheme cluster position
pub fn insert_text(span: &mut Span, offset: usize, text: &str) -> usize {
    // Translate the grapheme cursor into a byte index into span.content.
    let byte_offset = grapheme_to_byte_offset(&span.content, offset);
    span.content.insert_str(byte_offset, text);
    span.content = span.content.nfc().to_string();
    ???
}

pub fn grapheme_to_byte_offset(s: &str, grapheme_offset: usize) -> usize {
    s.grapheme_indices(true)
        .nth(grapheme_offset)
        .map(|(i, _)| i)
        .unwrap_or(s.len())
}

pub fn grapheme_count(s: &str) -> usize {
    s.graphemes(true).count()
}
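
One possible way to fill in the missing return value, sketched under assumptions rather than offered as a definitive answer: instead of detecting whether something “merged”, count the grapheme clusters to the right of the cursor before the edit, then subtract that count from the total after the edit and the NFC step. The helper name `cursor_after_insert` and the stand-in counter `demo_grapheme_count` are hypothetical. The stand-in only exists so the sketch compiles without external crates; in the editor the counter would be `grapheme_count` from above.

```rust
/// Hypothetical helper: derive the new grapheme cursor from the content
/// before and after the edit. `count` is any grapheme-cluster counter.
/// Assumption: the clusters to the right of the cursor are not changed
/// by the insertion.
pub fn cursor_after_insert(
    before: &str,
    after: &str,
    offset: usize,
    count: impl Fn(&str) -> usize,
) -> usize {
    // Number of clusters to the right of the cursor before the edit.
    let tail = count(before).saturating_sub(offset);
    // Everything that is not tail now lies to the left of the cursor.
    count(after).saturating_sub(tail)
}

/// Stand-in counter for this demo only (NOT real segmentation): a char in
/// U+0300..=U+036F extends the previous cluster when there is one.
fn demo_grapheme_count(s: &str) -> usize {
    let mut n = 0;
    for c in s.chars() {
        let combining = ('\u{300}'..='\u{36F}').contains(&c);
        if !(combining && n > 0) {
            n += 1;
        }
    }
    n
}
```

In `insert_text` this would mean computing `grapheme_count(&span.content) - offset` before the insertion and returning `grapheme_count(&span.content)` minus that value after the `nfc()` step. For the “cafe” example the tail is 0 and the count after the edit is 4, so the cursor stays at 4 whether or not the string was normalized. The assumption fails when the inserted text joins with the cluster to the right of the cursor, for example a zero-width joiner typed between two emoji, so that case would need separate handling.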
