Flush results to output as they are fetched #31

Closed
dee-see opened this issue Nov 26, 2020 · 3 comments · Fixed by #32
Labels: bug (Something isn't working), enhancement (New feature or request)

Comments

@dee-see
Contributor

dee-see commented Nov 26, 2020

Currently the tool fetches all subdomains and prints them all to stdout at the end. When running against targets with a very large number of subdomains, this allocates a very large vector for the subdomains before calling cleaner.clean(subdomains). This is made worse by the fact that I'm running Vita on a 2 GB VPS, which simply can't handle it and crashes.

vita -d comcast.net results in "memory allocation of 1610612736 bytes failed". I don't know if there's something going wrong with the error message, but that's quite a bit of memory!

Flushing results to output as they are fetched would solve that; however, it might make it difficult to output only unique results. Personally, I wouldn't mind a --flush switch that outputs duplicated results.

@junnlikestea
Owner

junnlikestea commented Nov 26, 2020

> vita -d comcast.net results in "memory allocation of 1610612736 bytes failed". I don't know if there's something going wrong with the error message, but that's quite a bit of memory!

My guess is that comcast.net returns such an absurd number of results that we really are running out of memory when trying to allocate the vector.
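
As a rough sanity check on that number, here is a small sketch. It assumes a 64-bit target, where a `String` handle (pointer, capacity, length) takes 24 bytes, and it assumes the failing allocation was a `Vec<String>` growing its buffer; the error message confirms neither. Under those assumptions, 1610612736 bytes is exactly 1.5 GiB and exactly 2^26 `String` slots, which is consistent with a vector whose capacity grows in powers of two. The constant and function names are made up for illustration.

```rust
/// The size reported in the error message from the issue.
const FAILED_ALLOC_BYTES: u64 = 1_610_612_736;

/// Assumed size of one `String` handle on a 64-bit target (an assumption,
/// not something the error message states).
const STRING_HANDLE_BYTES: u64 = 24;

/// How many fixed-size elements fit in an allocation of `bytes`.
fn element_slots(bytes: u64, element_bytes: u64) -> u64 {
    bytes / element_bytes
}

/// Converts a byte count to GiB.
fn to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 30) as f64
}

fn main() {
    let slots = element_slots(FAILED_ALLOC_BYTES, STRING_HANDLE_BYTES);
    println!("{} GiB, {} String slots", to_gib(FAILED_ALLOC_BYTES), slots);
}
```

This counts only the handles; the heap data for each subdomain name comes on top of it, so a request of this size on a 2 GB machine would plausibly fail for real.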

> Flushing results to output as they are fetched would solve that; however, it might make it difficult to output only unique results. Personally, I wouldn't mind a --flush switch that outputs duplicated results.

Yeah, I think this is a good idea. Maybe the solution is to make the Runner.run method return a stream, and the PostProcessor.clean method just return an iterator over the filtered results.

Depending on the CLI flag, we then either remove the duplicates by collecting the iterator into a HashSet or just write the results to stdout with a BufWriter. What do you think?
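
A minimal sketch of that idea using only the standard library. The function name `write_results` and the `unique` parameter are hypothetical (they stand in for whatever the CLI flag ends up controlling), and a plain iterator stands in for the async stream. Note one deliberate difference from the proposal above: instead of collecting everything into a `HashSet` first, this variant uses the set as a "seen" filter, so unique results can also be written as they arrive.

```rust
use std::collections::HashSet;
use std::io::{self, BufWriter, Write};

/// Writes each result to `out` as soon as the iterator yields it.
/// With `unique` set, results that were already written are skipped.
fn write_results<I, W>(results: I, out: W, unique: bool) -> io::Result<()>
where
    I: Iterator<Item = String>,
    W: Write,
{
    let mut out = BufWriter::new(out);
    let mut seen: HashSet<String> = HashSet::new();
    for subdomain in results {
        if unique && seen.contains(&subdomain) {
            continue;
        }
        writeln!(out, "{}", subdomain)?;
        if unique {
            // Remember the name only when deduplication was requested.
            seen.insert(subdomain);
        }
    }
    out.flush()
}

fn main() -> io::Result<()> {
    // Hypothetical results; in the real tool these would come from the sources.
    let fetched = vec!["a.example.com", "b.example.com", "a.example.com"]
        .into_iter()
        .map(String::from);
    write_results(fetched, io::stdout(), true)
}
```

With `unique` set, the set still grows with the number of distinct names, but no duplicates and no intermediate vector are held. Without it, memory use stays roughly constant regardless of how many results come back.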

@dee-see
Contributor Author

dee-see commented Nov 28, 2020

I think that sounds great!

@junnlikestea
Owner

Another thing I noticed while digging into this issue was that I was allocating another large Vec for the SonarSearch results coming over gRPC.

vita/crobat/src/lib.rs

Lines 48 to 63 in 3782231

    pub async fn get_subs(&mut self, host: Arc<String>) -> Result<Vec<String>> {
        trace!("querying crobat client for subdomains");
        let mut subdomains = Vec::new();
        let request = tonic::Request::new(QueryRequest {
            query: host.to_string(),
        });
        debug!("{:?}", &request);
        let mut stream = self.client.get_subdomains(request).await?.into_inner();
        while let Some(result) = stream.message().await? {
            debug!("crobat result {:?}", &result);
            subdomains.push(result.domain);
        }
        Ok(subdomains)
    }

Because the type returned by the line below implements the Stream trait, we could probably just return that and avoid all those extra allocations.

    let mut stream = self.client.get_subdomains(request).await?.into_inner();

So the method would look something like:

    pub async fn get_subs(&mut self, host: Arc<String>) -> Result<impl Stream<Item = std::result::Result<Domain, Status>>> {
        trace!("querying crobat client for subdomains");
        let request = tonic::Request::new(QueryRequest {
            query: host.to_string(),
        });
        debug!("{:?}", &request);

        let stream = self.client.get_subdomains(request).await?.into_inner();
        Ok(stream)
    }
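
The same trade-off can be illustrated without async, using a plain iterator as a stand-in for the tonic stream. Everything here is hypothetical and purely for illustration (the `source`, `fetch_eager` and `fetch_lazy` names, the `Cell` counter, the generated host names): the eager version materialises every result before returning, like the current `get_subs`, while the lazy version hands the iterator back and produces each result only when the caller asks for it.

```rust
use std::cell::Cell;

/// Stand-in for a remote source; `produced` counts how many results
/// have actually been generated so far.
fn source<'a>(produced: &'a Cell<usize>, total: usize) -> impl Iterator<Item = String> + 'a {
    (0..total).map(move |i| {
        produced.set(produced.get() + 1);
        format!("host{}.example.com", i)
    })
}

/// Eager: everything is collected into a Vec up front.
fn fetch_eager(produced: &Cell<usize>, total: usize) -> Vec<String> {
    source(produced, total).collect()
}

/// Lazy: the iterator itself is returned, nothing is produced yet.
fn fetch_lazy<'a>(produced: &'a Cell<usize>, total: usize) -> impl Iterator<Item = String> + 'a {
    source(produced, total)
}

fn main() {
    let eager_count = Cell::new(0);
    let all = fetch_eager(&eager_count, 1_000);
    println!("eager: holding {}, produced {}", all.len(), eager_count.get());

    let lazy_count = Cell::new(0);
    let first: Vec<String> = fetch_lazy(&lazy_count, 1_000).take(3).collect();
    println!("lazy: holding {}, produced {}", first.len(), lazy_count.get());
}
```

In the async case the caller would poll the returned stream instead of calling `next` on an iterator, but the memory behaviour is analogous: only what the consumer has pulled so far needs to exist.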

junnlikestea added the bug (Something isn't working) and enhancement (New feature or request) labels on Dec 1, 2020