Help with glob/globset, or generally getting all files in a directory based on a glob pattern

⚓ Rust    📅 2025-10-15    👤 surdeus    👁️ 2

surdeus

Hi.

Next part of the migration from Java to Rust :slight_smile:

Is the Rust way to use glob (crates.io), globset (crates.io), or a directory iterator with a filter/match combination, as shown in "Directory Traversal" in the Rust Cookbook?

I now need to collect all files in a log directory that match a file glob. The problem is the {...} alternation pattern, which glob does not support, as described in the issue "Several file extension pattern · Issue #163 · rust-lang/glob · GitHub". I thought that globset could help here. :person_shrugging:
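For comparison, the directory-iterator route mentioned above needs no glob crate at all. The following is a minimal, std-only sketch under a few assumptions: the names `matches_log_name` and `collect_log_files` are made up for illustration, the months and extensions are passed as plain string slices instead of a {...} pattern, and the check only approximates `*{months}*.{gz,log}` and does not reproduce the exact semantics of glob or globset.

```rust
use std::{
    fs, io,
    path::{Path, PathBuf},
};

/// Returns true if the part of `name` before the last '.' contains one of the
/// `months` tokens (e.g. "2025-03") and the part after it is one of `exts`.
/// This approximates the glob `*{months}*.{exts}` (hypothetical helper).
fn matches_log_name(name: &str, months: &[&str], exts: &[&str]) -> bool {
    match name.rsplit_once('.') {
        Some((stem, ext)) => exts.contains(&ext) && months.iter().any(|m| stem.contains(m)),
        None => false, // no extension at all
    }
}

/// Collects the regular files directly inside `dir` whose names match.
/// Does not recurse into subdirectories (hypothetical helper).
fn collect_log_files(dir: &Path, months: &[&str], exts: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // Skip directories and anything else that is not a regular file.
        if !entry.file_type()?.is_file() {
            continue;
        }
        // Non-UTF-8 file names are skipped in this sketch.
        let file_name = entry.file_name();
        if let Some(name) = file_name.to_str() {
            if matches_log_name(name, months, exts) {
                files.push(entry.path());
            }
        }
    }
    // read_dir yields entries in an unspecified order, so sort for stable output.
    files.sort();
    Ok(files)
}
```

Calling `collect_log_files(Path::new("/log71761"), &["2025-03", "2025-04"], &["gz", "log"])` would return the matching paths from that one directory; the results of several directories can be appended to a single Vec with `extend`.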

The globs

/log71761/{2025-03,2025-04,2025-05,2025-06,2025-07,2025-08,2025-09,2025-10}.{gz,log}

and

/log71771/{2025-03,2025-04,2025-05,2025-06,2025-07,2025-08,2025-09,2025-10}.{gz,log}

DIRECTORY1=/log71761 \
DIRECTORY2=/log71771 \
DEST_FILE=lala.csv \
cargo run

use glob::glob;
use globset::GlobBuilder;
use jiff::{ToSpan, Zoned, fmt::strtime};
use std::{
    env,
    path::{Path, PathBuf, MAIN_SEPARATOR_STR},
    process::exit,
    str::FromStr,
};

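const PATH1: &str = "/log/76/";
const PATH2: &str = "/log/77/";
// Assumed default: main() uses DEST_FILE, but the post never shows its definition.
const DEST_FILE: &str = "lala.csv";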

fn main() {
    println!("Hello, world!");
    // Fall back to the compiled-in defaults when the variables are not set.
    let path1 = env::var("DIRECTORY1").unwrap_or_else(|_| PATH1.to_string());
    let path2 = env::var("DIRECTORY2").unwrap_or_else(|_| PATH2.to_string());
    let _dest_file = env::var("DEST_FILE").unwrap_or_else(|_| DEST_FILE.to_string());

    //create_dir_glob creates the file glob
    println!(
        "dir files: {:#?}",
        get_log_file_names(&[path1, path2], create_dir_glob())
    );
}

fn get_log_file_names(paths: &[String], glob_pattern: String) -> Vec<PathBuf> {
    let mut all_files = Vec::<PathBuf>::new();

    // Iterate over given Paths and collect all files in the directory
    // based on glob pattern

    for path in paths {
        println!("My path :{:#?}:", path);
        println!("glob_pattern :{:#?}:", glob_pattern);

        // combine the given directory with file glob
        let mut full_path = String::from(path);
        full_path.push_str(MAIN_SEPARATOR_STR);
        full_path.push_str(&glob_pattern);

        // here is now my issue. globset || glob || filter?
        println!("full_path :{:#?}:", full_path);
        let mut glob = GlobBuilder::new(&full_path);
        glob.literal_separator(true);

        let my_glob = match glob.build() {
            Ok(new_glob) => {
                println!("new_glob :{}:", new_glob);
                new_glob
            }
            Err(e_glob) => {
                println!("Error at glob: {:#?}", e_glob.to_string());
                exit(-2);
            }
        };

        println!("glob :{:?}:", my_glob);
        //for entry in glob(&full_path) {
            /* for entry in match glob(&full_path) {
                Ok(my_entry) => {
                    println!("my_entry :{:?}", my_entry);
                    my_entry
                }
                Err(e_glob) => {
                    println!("Error at glob: {:#?}", e_glob.to_string());
                    exit(-2);
                }
            } { */
            //println!("entry :{:#?}", entry);
            // add file entry to the all_files Vector
            // all_files
        //}
    }

    //"".to_string()
    all_files
}

fn create_dir_glob() -> String {
    // String::from cannot fail, so no error handling is needed here.
    let mut to_glob_files = String::from("*{");
    let start: Zoned = Zoned::now();
    let start_minus_n_months = match start.checked_sub(7.months()) {
        Ok(new_months) => new_months,
        Err(e_mon) => {
            println!("Error at checked_sub error: {:#?}", e_mon.to_string());
            exit(-2);
        }
    };

    let it = start_minus_n_months
        .datetime()
        .series(1.month())
        .filter_map(|dt| dt.to_zoned(start.time_zone().clone()).ok())
        .take_while(|zdt| zdt <= start);

    for zdt in it {
        let temp = match strtime::format("%Y-%m,", &zdt) {
            Ok(new_temp) => new_temp,
            Err(e_format) => {
                println!(
                    "Error at strtime::format error: {:#?}",
                    e_format.to_string()
                );
                exit(-2);
            }
        };
        to_glob_files.push_str(&temp);
        //println!("* {}", zdt.strftime("%Y-%m"));
    }

    // println!("capa: {}", to_glob_files.capacity());
    // println!("len: {}", to_glob_files.len());

    // Remove last ',' from the loop above
    to_glob_files.truncate(to_glob_files.len() - 1);

    // println!("len after -1: {}", to_glob_files.len());

    to_glob_files.push_str("}*.{gz,log}");

    //println!("to glob: '{}'", to_glob_files);

    to_glob_files
}
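The trailing-comma cleanup in create_dir_glob and the manual separator handling in get_log_file_names can both be avoided with the standard library. A small sketch, assuming the month strings are already formatted as %Y-%m; `build_file_glob` and `full_pattern` are hypothetical helper names, not part of the code above:

```rust
use std::path::{Path, PathBuf};

/// Builds a pattern such as `*{2025-03,2025-04}*.{gz,log}`.
/// `join` puts commas only between the items, so nothing has to be truncated.
/// Note: an empty slice yields `*{}*.{gz,log}`, which callers may want to reject.
fn build_file_glob(months: &[String]) -> String {
    // `{{` and `}}` are literal braces in a format string.
    format!("*{{{}}}*.{{gz,log}}", months.join(","))
}

/// Combines a directory and a file glob without doubling the separator,
/// even when `dir` already ends with one (as "/log/76/" does).
fn full_pattern(dir: &str, file_glob: &str) -> PathBuf {
    Path::new(dir).join(file_glob)
}
```

Because PATH1 and PATH2 end with a slash, pushing MAIN_SEPARATOR_STR onto them produces a double separator in full_path; `Path::join` does not add a second one.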

Thanks for reading and for any help.
