Apache Avro: repeated save to file

⚓ Rust    📅 2025-06-25    👤 surdeus    👁️ 6      

Hi,
I am trying to use apache-avro v0.18.0, but I don't understand how to use it. To write data, the crate provides a Writer<'a, W: Write> whose lifetime is bound to the schema:

#[derive(bon::Builder)]
pub struct Writer<'a, W: Write> {
    schema: &'a Schema,
    writer: W,
    #[builder(skip)]
    resolved_schema: Option<ResolvedSchema<'a>>,
    #[builder(default = Codec::Null)]
    codec: Codec,
    #[builder(default = DEFAULT_BLOCK_SIZE)]
    block_size: usize,
    #[builder(skip = Vec::with_capacity(block_size))]
    buffer: Vec<u8>,
    #[builder(skip)]
    num_values: usize,
    #[builder(default = generate_sync_marker())]
    marker: [u8; 16],
    #[builder(default = false)]
    has_header: bool,
    #[builder(default)]
    user_metadata: HashMap<String, Value>,
}

My problem is, I want to repeatedly write to a single file. The options I have:

  • Keep the writer: How do I store it in a struct together with the schema? That would lead to a self-referential struct. I know there is a crate that helps with that, but it's kind of annoying.
  • Create a writer on every write: I tried that, but either it overwrites the whole file or, if the file is opened in append mode, it corrupts the file, probably because the header is written multiple times.
  • There is also pub fn append_to(schema: &'a Schema, writer: W, marker: [u8; 16]) -> Self, which does not write the header but takes a marker argument. In the normal constructor that marker seems to be initialized with random values, so it is not obvious what to pass here.
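For the third option, the marker presumably has to match the sync marker already stored in the file. In the Avro object container format the header ends with the 16-byte sync marker, and every data block is followed by the same marker, so a complete file always ends with it. A minimal sketch that recovers it using only the standard library follows; read_sync_marker is a hypothetical helper name, not part of apache-avro, and it assumes the file was written and flushed completely.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Length of the Avro sync marker in bytes.
const SYNC_MARKER_LEN: usize = 16;

/// Hypothetical helper: returns the last 16 bytes of an existing Avro
/// object container file. Assumes the file is complete, i.e. a header
/// followed by zero or more whole blocks.
fn read_sync_marker(path: &Path) -> io::Result<[u8; SYNC_MARKER_LEN]> {
    let mut file = File::open(path)?;

    // A file shorter than the marker cannot be a valid container file.
    if file.metadata()?.len() < SYNC_MARKER_LEN as u64 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "file too short to contain a sync marker",
        ));
    }

    // Jump to 16 bytes before the end and read the marker.
    file.seek(SeekFrom::End(-(SYNC_MARKER_LEN as i64)))?;
    let mut marker = [0u8; SYNC_MARKER_LEN];
    file.read_exact(&mut marker)?;
    Ok(marker)
}
```

The returned array could then be passed as the marker argument of append_to, together with the file reopened in append mode. Whether the crate already ships an equivalent helper, and how the codec of the existing file has to be matched, is not covered by this sketch.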

Does anyone know how to do this?
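One possible direction for the first option: judging by the struct definition above, the only borrows in Writer are tied to 'a, so a schema that lives for 'static would allow the writer to be stored without a self-referential struct. The sketch below shows the idea with stand-in types; Schema and Writer here are simplified placeholders that only mimic the shape of the crate's types, and Recorder is a hypothetical name.

```rust
use std::io::{self, Write};

/// Stand-in for the crate's `Schema` (placeholder, for illustration only).
struct Schema {
    name: String,
}

/// Stand-in with the same shape as the crate's `Writer<'a, W>`:
/// it borrows the schema and owns the output.
struct Writer<'a, W: Write> {
    schema: &'a Schema,
    writer: W,
}

impl<'a, W: Write> Writer<'a, W> {
    fn new(schema: &'a Schema, writer: W) -> Self {
        Writer { schema, writer }
    }

    /// Writes one text line per record; the real crate would encode Avro.
    fn append(&mut self, record: &str) -> io::Result<()> {
        writeln!(self.writer, "{}: {}", self.schema.name, record)
    }
}

/// Long-lived holder (hypothetical). It needs no lifetime parameter
/// because the schema reference is `'static`.
struct Recorder<W: Write> {
    writer: Writer<'static, W>,
}

impl<W: Write> Recorder<W> {
    fn new(schema: Schema, out: W) -> Self {
        // Leak the schema once. It is never freed, which is usually fine
        // for a schema that lives as long as the program anyway.
        let schema: &'static Schema = Box::leak(Box::new(schema));
        Recorder { writer: Writer::new(schema, out) }
    }

    /// Can be called repeatedly; the same writer is reused every time.
    fn record(&mut self, record: &str) -> io::Result<()> {
        self.writer.append(record)
    }

    /// Gives back the underlying output, e.g. to inspect or close it.
    fn into_inner(self) -> W {
        self.writer.writer
    }
}
```

The trade-off is that each leaked schema stays allocated until the process exits. If the schema is known at startup, keeping it in a static initialized through std::sync::OnceLock would give the same 'static reference without an explicit leak.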

Thanks
