Info
This post is auto-generated from RSS feed The Rust Programming Language Forum - Latest topics. Source: Introducing "thaf": A Robust Transcriptome Extractor in Rust
We have recently released thaf, a transcriptome extractor that reads GFF3 files and generates transcriptome sequences from genomic DNA. It primarily performs functions already covered by gffread, although thaf can additionally generate gene maps for Salmon. The primary motivation for developing this tool was that gffread was failing with a segmentation fault on one of our genomes, and we could not pinpoint precisely where or why this occurred. We had no initial plans to work on this project, so the package is proudly named after "True Heroes Are Forced".
We validated the new tool by comparing its output with that of gffread on a genome where gffread performed correctly, carefully examining the resulting sequences. thaf joins exons in the proper order, applies reverse complement where necessary, and incorporates numerous internal consistency checks to validate the GFF3 input (e.g., detecting overlapping exons).
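To make the steps above concrete, here is a minimal sketch of exon joining with an overlap check and optional reverse complement. It is an illustration only and not taken from thaf's source: the `Exon` struct and the `splice_transcript` and `reverse_complement` functions are hypothetical names, and the sketch assumes the whole sequence is held in memory as ASCII bytes and that coordinates follow the GFF3 convention (1-based, inclusive).

```rust
/// One exon, using GFF3 conventions: 1-based, inclusive coordinates.
/// (Hypothetical type for illustration; not thaf's actual data model.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Exon {
    pub start: usize,
    pub end: usize,
}

/// Complement a single nucleotide; anything unrecognised becomes N.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        b'a' => b't',
        b'c' => b'g',
        b'g' => b'c',
        b't' => b'a',
        _ => b'N',
    }
}

/// Reverse the sequence and complement every base.
pub fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&b| complement(b)).collect()
}

/// Join the exons of one transcript in genomic order and, for features
/// on the minus strand, reverse-complement the joined sequence.
/// Returns an error instead of panicking when the annotation is
/// inconsistent (out-of-range or overlapping exons).
pub fn splice_transcript(
    genome: &[u8],
    exons: &[Exon],
    minus_strand: bool,
) -> Result<Vec<u8>, String> {
    // GFF3 does not guarantee exon order, so sort by start position.
    let mut sorted = exons.to_vec();
    sorted.sort_by_key(|e| e.start);

    // Check every exon against the sequence bounds.
    for e in &sorted {
        if e.start == 0 || e.start > e.end || e.end > genome.len() {
            return Err(format!("exon {}..{} is out of range", e.start, e.end));
        }
    }

    // After sorting, two neighbours overlap if the second one starts
    // at or before the end of the first.
    for pair in sorted.windows(2) {
        if pair[1].start <= pair[0].end {
            return Err(format!(
                "exons {}..{} and {}..{} overlap",
                pair[0].start, pair[0].end, pair[1].start, pair[1].end
            ));
        }
    }

    // Concatenate the exon slices (convert 1-based inclusive to 0-based).
    let mut seq = Vec::new();
    for e in &sorted {
        seq.extend_from_slice(&genome[e.start - 1..e.end]);
    }

    Ok(if minus_strand { reverse_complement(&seq) } else { seq })
}
```

Returning a `Result` rather than indexing blindly is the point of the sketch: a malformed annotation surfaces as a readable error message instead of a crash.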
Since our goal was to create a program that would never crash due to unexplained memory issues, Rust was an obvious choice. Furthermore, processing crop genomes (such as wheat, which is over 16 GB) is computationally intensive, so it is beneficial that thaf completes its tasks quickly.
This small tool serves our purposes very well. If others find it beneficial or express interest, we think it could potentially evolve into an interesting project.