Using yara-x in a Rust library and handling lifetime specifiers of Scanner #184

Closed
xrl1 opened this issue Sep 1, 2024 · 6 comments

Comments

@xrl1

xrl1 commented Sep 1, 2024

Hello,
Related to #139, but maybe not the same use case:
I'm trying to create a library that uses the yara-x crate. The library should initialize the rules internally, and create a struct that holds an instance of the yara-x Scanner.

Redacted code in scanner.rs:

use yara_x::{Rules, Scanner as YaraScanner};
use anyhow::Result;

struct MyScanner<'a> {
    scanner: YaraScanner<'a>,
}

impl<'a> MyScanner<'a> {
    pub fn new(rules: &'a Rules) -> Self {
        let scanner = YaraScanner::new(&rules);
        MyScanner { scanner }
    }

    pub fn scan(&mut self, data: String) -> Result<String> {
        // Some implementation
        todo!()
    }
}

Redacted code of lib.rs:

pub struct MyLib<'a> {
    scanner: MyScanner<'a>,
}

impl<'a> MyLib<'a> {
    pub fn new() -> Result<Self> {
        let rules: Rules = load_rules()?;
        let scanner = MyScanner::new(&rules);
        Ok(MyLib { scanner })
    }

    pub fn scan(&mut self, data: String) -> Result<String> {
        self.scanner.scan(data)
    }
}

load_rules is compiling and loading the rules from a resource file.

I tried countless variations of this code, but I always reach the obstacle of the lifetime specifier on Scanner and get an error of "rules does not live long enough".

I couldn't find a way to wrap YaraScanner in an object that outlives it and holds a Rules object safely.

I cannot create the rules in main.rs because I intend to export this as a library, and I don't want the user to load the Yara rules herself.

The only solution Claude Sonnet and I found was to Box::leak the rules or load them statically, so that they live until the program exits. I want to avoid that, because in the future I'd like MyLib::new to accept rule strings as arguments, which means the rules must be confined to the lifetime of a MyLib instance.

Please let me know how you think it can be solved, because currently, I think only changing Scanner to take ownership of the rules can solve this.

@plusvic
Member

plusvic commented Sep 1, 2024

I believe you can achieve what you want with a bit of unsafe code:

use std::marker::PhantomPinned;
use std::pin::Pin;

/// Wraps a yara_x::Rules while preventing it from moving around in memory.
struct PinnedRules {
    rules: yara_x::Rules,
    _pin: PhantomPinned,
}

struct MyScanner<'a> {
    // Declared before `_rules`: fields are dropped in declaration order, so the
    // scanner is dropped before the rules it references.
    scanner: yara_x::Scanner<'a>,
    // This allows MyScanner to own the yara_x::Rules and pass a reference to the
    // scanner. The use of `Pin` guarantees that the rules won't be moved.
    _rules: Pin<Box<PinnedRules>>,
}

impl<'a> MyScanner<'a> {
    pub fn new(rules: yara_x::Rules) -> Self {
        let pinned_rules = Box::pin(PinnedRules{rules, _pin: PhantomPinned});
        let rules_ptr = std::ptr::from_ref(&pinned_rules.rules);
        let rules_ref = unsafe { rules_ptr.as_ref().unwrap() };
        let scanner = yara_x::Scanner::new(rules_ref);

        Self { scanner, _rules: pinned_rules }
    }

    pub fn scan(&mut self, data: String) -> Result<String> {
        todo!()
    }
}

I haven't tested it thoroughly, so it may contain bugs.
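To make the pattern above easier to experiment with, here is a minimal sketch that compiles without the yara-x crate. `Rules` and `Scanner` are hypothetical stand-ins, not the yara_x API; they only mimic the borrowing relationship, where a scanner holds a reference to its rules. Unlike the snippet above, the wrapper stores the scanner with a private `'static` lifetime, so the caller cannot choose the lifetime. This is an assumption-laden illustration and has not been checked with Miri.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical stand-ins for yara_x::Rules and yara_x::Scanner.
pub struct Rules {
    pub needle: String,
}

pub struct Scanner<'r> {
    rules: &'r Rules,
}

impl<'r> Scanner<'r> {
    pub fn new(rules: &'r Rules) -> Self {
        Scanner { rules }
    }

    // Returns true if the rules' needle occurs in `data`.
    pub fn scan(&mut self, data: &[u8]) -> bool {
        let needle = self.rules.needle.as_bytes();
        !needle.is_empty() && data.windows(needle.len()).any(|w| w == needle)
    }
}

struct PinnedRules {
    rules: Rules,
    _pin: PhantomPinned,
}

pub struct MyScanner {
    // Declared first so it is dropped before `_rules`
    // (struct fields are dropped in declaration order).
    // Kept private: the `'static` lifetime is a fiction that must not escape.
    scanner: Scanner<'static>,
    _rules: Pin<Box<PinnedRules>>,
}

impl MyScanner {
    pub fn new(rules: Rules) -> Self {
        let pinned = Box::pin(PinnedRules { rules, _pin: PhantomPinned });
        // SAFETY (assumed, not proven): the rules live on the heap behind a
        // pinned box that this struct owns and never exposes, so the address
        // stays valid until the struct is dropped, after `scanner` is dropped.
        let rules_ref: &'static Rules = unsafe { &*(&pinned.rules as *const Rules) };
        MyScanner { scanner: Scanner::new(rules_ref), _rules: pinned }
    }

    pub fn scan(&mut self, data: &[u8]) -> bool {
        self.scanner.scan(data)
    }
}
```

Moving a `MyScanner` only moves the box pointer, not the heap allocation, which is why the stored reference is expected to stay valid.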

@xrl1
Author

xrl1 commented Sep 1, 2024

Thank you, I tested this change in my code, all the tests passed, and nothing panics.

Even though it works, I think this solution is suboptimal. I need to test it more thoroughly, and I'll dig into the std::pin docs to make sure this unsafe code won't crash in the future, won't leak memory, and that there is no race in the destructor of MyScanner that could cause invalid memory access.

May I still suggest handling this in the yara-x library itself at some point, so that library users are not forced to write unsafe code or to reach for advanced Rust concepts?
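For readers who want to avoid unsafe code altogether, one option not discussed in this thread is to store only the rules and build a scanner inside each call. The sketch below uses hypothetical stand-in types `Rules` and `Scanner` rather than the yara_x API, and `MyLib::new` taking a `needle` string is purely illustrative.

```rust
// Hypothetical stand-ins for yara_x::Rules and yara_x::Scanner.
pub struct Rules {
    pub needle: String,
}

pub struct Scanner<'r> {
    rules: &'r Rules,
}

impl<'r> Scanner<'r> {
    pub fn new(rules: &'r Rules) -> Self {
        Scanner { rules }
    }

    // Returns true if the rules' needle occurs in `data`.
    pub fn scan(&mut self, data: &[u8]) -> bool {
        let needle = self.rules.needle.as_bytes();
        !needle.is_empty() && data.windows(needle.len()).any(|w| w == needle)
    }
}

/// Owns the rules and carries no lifetime parameter.
pub struct MyLib {
    rules: Rules,
}

impl MyLib {
    pub fn new(needle: &str) -> Self {
        MyLib { rules: Rules { needle: needle.to_string() } }
    }

    pub fn scan(&self, data: &[u8]) -> bool {
        // The scanner borrows `self.rules` only for the duration of this
        // call, so no self-referential value is ever stored.
        let mut scanner = Scanner::new(&self.rules);
        scanner.scan(data)
    }
}
```

The trade-off is that a scanner is constructed on every call. Whether that cost is acceptable with the real yara_x::Scanner is something to measure rather than assume.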

@qjerome
Contributor

qjerome commented Sep 2, 2024

@xrl1 the only thing you have to do is put Rules inside your scanner struct, so that the Rust compiler knows its lifetime doesn't expire before the struct is dropped. That means your MyScanner needs to own your Rules.

pub struct MyScanner<'s> {
    rules: yara_x::Rules,
    scanner: Option<yara_x::Scanner<'s>>,
}

@plusvic
Member

plusvic commented Sep 2, 2024

@qjerome that doesn't work because yara_x::Scanner needs a reference to the rules in MyScanner.rules. You can create a scanner that receives that reference, but you can't then move it into MyScanner.scanner, because the struct would be borrowing from itself.
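The limitation described above can be reproduced without yara-x. In this sketch `Rules` and `Scanner` are hypothetical stand-ins for the yara_x types and `build` is an illustrative helper. The struct definition itself compiles; it is filling the `scanner` field from the struct's own `rules` field that the borrow checker rejects.

```rust
// Hypothetical stand-ins for yara_x::Rules and yara_x::Scanner.
pub struct Rules {
    pub needle: String,
}

pub struct Scanner<'r> {
    pub rules: &'r Rules,
}

pub struct MyScanner<'s> {
    pub rules: Rules,
    pub scanner: Option<Scanner<'s>>,
}

pub fn build<'s>(needle: &str) -> MyScanner<'s> {
    #[allow(unused_mut)]
    let mut my = MyScanner {
        rules: Rules { needle: needle.to_string() },
        scanner: None,
    };
    // Uncommenting the next line is expected to fail to compile: `my.rules`
    // would have to stay borrowed for all of `'s`, yet `my` is a local that
    // is moved out of this function on return.
    // my.scanner = Some(Scanner { rules: &my.rules });
    my
}
```

As written, the function can only return a value whose `scanner` is `None`, which is why the owned-rules layout alone does not solve the problem.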

@qjerome
Contributor

qjerome commented Sep 2, 2024

My bad, I thought it would work! That's what you get when you write code without testing it...

@qjerome
Contributor

qjerome commented Sep 2, 2024

Let me add a non-null contribution this time:

use std::ops::{Deref, DerefMut};

impl<'s> Deref for MyScanner<'s> {
    type Target = yara_x::Scanner<'s>;
    fn deref(&self) -> &Self::Target {
        &self.scanner
    }
}

impl<'s> DerefMut for MyScanner<'s> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.scanner
    }
}

This should allow you to use your MyScanner as a yara_x::Scanner.
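As a rough illustration of what the two impls buy you, the sketch below calls the inner scanner's methods directly on the wrapper. `Rules` and `Scanner`, as well as the `scan_count` method, are hypothetical stand-ins and not part of the yara_x API.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical stand-ins for yara_x::Rules and yara_x::Scanner.
pub struct Rules {
    pub needle: String,
}

pub struct Scanner<'r> {
    rules: &'r Rules,
    scans: usize,
}

impl<'r> Scanner<'r> {
    pub fn new(rules: &'r Rules) -> Self {
        Scanner { rules, scans: 0 }
    }

    // Needs `&mut self`, so it is reached through `DerefMut`.
    pub fn scan(&mut self, data: &[u8]) -> bool {
        self.scans += 1;
        let needle = self.rules.needle.as_bytes();
        !needle.is_empty() && data.windows(needle.len()).any(|w| w == needle)
    }

    // Needs only `&self`, so it is reached through `Deref`.
    pub fn scan_count(&self) -> usize {
        self.scans
    }
}

pub struct MyScanner<'s> {
    pub scanner: Scanner<'s>,
}

impl<'s> Deref for MyScanner<'s> {
    type Target = Scanner<'s>;
    fn deref(&self) -> &Self::Target {
        &self.scanner
    }
}

impl<'s> DerefMut for MyScanner<'s> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.scanner
    }
}
```

Note that if the wrapper hides an unsafely extended lifetime, as in the pinned approach earlier in the thread, exposing the inner scanner this way also exposes that lifetime, so it is worth checking that no reference can outlive the wrapper.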

@plusvic plusvic closed this as completed Oct 16, 2024
3 participants