This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

Very Slow #27

Closed
jhonathas opened this issue Apr 23, 2018 · 13 comments

Comments

@jhonathas

The time between the file upload and the end of the scan is more than 10 seconds. Is this normal?

Is there a faster way? Local scanning takes no more than 200 ms.

@AndrewLane
Contributor

I've run 5 tests so far with very small files and all my times are over 20 seconds. I guess this is expected?

Time: 24.584 sec (0 m 24 s)
Time: 34.061 sec (0 m 34 s)
Time: 33.114 sec (0 m 33 s)
Time: 24.806 sec (0 m 24 s)
Time: 25.074 sec (0 m 25 s)
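
To put a single number on the five runs above, here is a minimal sketch that summarizes them with the standard library. The `summarize` helper is a name made up for illustration; the timings are copied from the list above, and they average out to roughly 28 seconds.

```python
from statistics import mean, stdev

# The five timings reported above, in seconds.
times_sec = [24.584, 34.061, 33.114, 24.806, 25.074]


def summarize(samples):
    """Return (mean, sample standard deviation, min, max) for a list of timings."""
    return mean(samples), stdev(samples), min(samples), max(samples)


avg, spread, fastest, slowest = summarize(times_sec)
print(f"mean {avg:.1f}s, stdev {spread:.1f}s, range {fastest:.1f}s to {slowest:.1f}s")
```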

@AndrewLane
Contributor

I found that if you crank the lambda function settings all the way up to 3GB (as high as it will go, as of now), the timing on the scan goes down to about 13s. Still not great, but improved.
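
For anyone who would rather script the memory change than use the console, a rough sketch follows. It assumes a boto3 Lambda client is passed in; `set_lambda_memory` and the function name in the usage comment are hypothetical, and the 3008 MB default ceiling is simply the limit quoted in this thread, so check the current limit before relying on it.

```python
def set_lambda_memory(client, function_name, memory_mb, max_mb=3008):
    """Change the memory setting of a Lambda function.

    `client` is expected to be a boto3 Lambda client, e.g. boto3.client("lambda").
    `max_mb` defaults to the 3008 MB ceiling mentioned in this thread; that is an
    assumption, so check the current limit for your account and region.
    """
    if not 128 <= memory_mb <= max_mb:
        raise ValueError(f"memory_mb must be between 128 and {max_mb}, got {memory_mb}")
    return client.update_function_configuration(
        FunctionName=function_name,
        MemorySize=memory_mb,
    )


# Usage (needs AWS credentials; the function name here is hypothetical):
#   import boto3
#   set_lambda_memory(boto3.client("lambda"), "my-antivirus-function", 3008)
```

Lambda allocates CPU in proportion to memory, which is probably why the setting affects scan time even for tiny files.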

@theblockent

Has anyone had any luck improving speeds? When scanning 2 files (500 bytes each), each file takes 15 seconds at 3008 MB of memory, and the time is spent in the clamscan call itself (measured from one print to the next):
    print("Starting clamscan of %s." % path)
    av_proc = Popen(
        [CLAMSCAN_PATH, "-v", "-a", "--stdout", "-d", AV_DEFINITION_PATH, path],
        stderr=STDOUT,
        stdout=PIPE,
        env=av_env,
    )
    output = av_proc.communicate()[0]
    print("clamscan output:\n%s" % output)

Any way to speed this up? Would it be more valuable to put any of this inside of a lambda layer?

I am unsure what 'local' scan means in the above message. How do I enable/perform that?
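
To measure the clamscan call directly instead of reading it off the print timestamps, something like the sketch below could wrap the subprocess. `timed_run` is a hypothetical helper, not part of the project, and the stand-in command is only there so the example runs without ClamAV installed.

```python
import subprocess
import sys
import time


def timed_run(cmd, env=None):
    """Run `cmd` and return (elapsed_seconds, combined stdout/stderr as text)."""
    start = time.monotonic()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        env=env,
        check=False,  # clamscan exits with 1 when it finds a virus; do not raise
    )
    elapsed = time.monotonic() - start
    return elapsed, proc.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in the clamscan
    # argument list from the snippet above to time a real scan.
    elapsed, output = timed_run([sys.executable, "-c", "print('ok')"])
    print("took %.3f sec, output: %r" % (elapsed, output))
```

`time.monotonic()` is used rather than `time.time()` so that a clock adjustment during the run cannot skew the measurement.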

@vrabeshko

Same issue. No matter how large the file is, 5 KB or 300 MB, the execution time is the same.

@j1mmie

j1mmie commented Jul 1, 2019

I'm scanning the 68 byte EICAR test file and it's taking around 70 seconds. Haven't cranked the memory, but it sounds like that has diminishing returns anyway.

The majority of the time is spent scanning, not checking or downloading definitions, or even transferring the file to /tmp/. See below:

18:24:51 START RequestId: 0b015f5f-97e9-46ef-a24b-a6186ee00aea Version: $LATEST
18:24:51 Script starting at 2019/07/01 18:24:51 UTC
18:24:51 Attempting to create directory /tmp/***.
18:24:52 Attempting to create directory /tmp/clamav_defs.
18:24:52 Downloading definition file main.cvd from s3://***/clamav_defs
18:24:54 Downloading definition file daily.cvd from s3://***/clamav_defs
18:24:55 Downloading definition file bytecode.cvd from s3://***/clamav_defs
18:24:55 Starting clamscan of /tmp/***/totally_not_a_virus.png.
18:26:02 clamscan output:
18:26:02 Scanning /tmp/***/totally_not_a_virus.png
18:26:02 /tmp/***/totally_not_a_virus.png: OK
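
The per-step timings can be pulled out of a log like the one above with a few lines of Python. This is only a sketch: `step_durations` is a made-up helper, and it assumes every line starts with an `HH:MM:SS` timestamp and that the run does not cross midnight. On the lines above, the clamscan step accounts for 67 of the 71 seconds.

```python
from datetime import datetime

# Log lines copied from the comment above (the START and final output lines are omitted).
LOG = """\
18:24:51 Script starting at 2019/07/01 18:24:51 UTC
18:24:51 Attempting to create directory /tmp/***.
18:24:52 Attempting to create directory /tmp/clamav_defs.
18:24:52 Downloading definition file main.cvd from s3://***/clamav_defs
18:24:54 Downloading definition file daily.cvd from s3://***/clamav_defs
18:24:55 Downloading definition file bytecode.cvd from s3://***/clamav_defs
18:24:55 Starting clamscan of /tmp/***/totally_not_a_virus.png.
18:26:02 clamscan output:
"""


def step_durations(log_text):
    """Return [(seconds_until_next_line, message)] for every line but the last."""
    entries = []
    for line in log_text.splitlines():
        stamp, _, message = line.partition(" ")
        entries.append((datetime.strptime(stamp, "%H:%M:%S"), message))
    return [
        (int((nxt[0] - cur[0]).total_seconds()), cur[1])
        for cur, nxt in zip(entries, entries[1:])
    ]


for seconds, message in step_durations(LOG):
    print(f"{seconds:3d}s  {message}")
```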

@midnightcodr

Same here, we see average scan times north of 80 seconds per file. Most of our files are images and PDFs.

@midnightcodr

The scanning has ballooned to almost 100 seconds per file. It's not practical to use Lambda any more. We've implemented a local scan solution for our document uploads using a ClamAV Docker image from https://github.com/mko-x/docker-clamav.
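
A likely reason a long-running container is so much faster: `clamscan` loads the whole signature database on every invocation, while the `clamd` daemon loads it once and then answers scan requests over a socket. The comment above does not say how the container is called, so the following is only a sketch of one option, clamd's INSTREAM command. The host, port, and helper names are assumptions; 3310 is clamd's conventional TCP port.

```python
import socket
import struct


def frame_instream(data, chunk_size=8192):
    """Encode `data` as a clamd INSTREAM request.

    Format: the null-terminated command "zINSTREAM", then each chunk prefixed
    with its length as a 4-byte big-endian integer, then a zero-length chunk.
    """
    parts = [b"zINSTREAM\0"]
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        parts.append(struct.pack("!I", len(chunk)) + chunk)
    parts.append(struct.pack("!I", 0))
    return b"".join(parts)


def scan_bytes(data, host="127.0.0.1", port=3310, timeout=30.0):
    """Send `data` to a running clamd and return its reply as text.

    host and port are assumptions; the container you run may be
    configured differently.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(frame_instream(data))
        reply = b""
        while not reply.endswith(b"\0"):
            block = sock.recv(4096)
            if not block:
                break
            reply += block
    return reply.rstrip(b"\0").decode("utf-8", errors="replace")
```

`scan_bytes` needs a reachable clamd to do anything useful; `frame_instream` can be checked on its own.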

@JarmBlueOak

Increasing the AWS Lambda max memory helps with run time. We set it to 2048 MB and get 40-45 second run times.

@j1mmie

j1mmie commented Oct 9, 2019

FWIW I went with this implementation instead: https://github.com/widdix/aws-s3-virusscan

Autoscaling EC2 cluster that does the scanning. Was pretty easy to set up and is near instant.

@chrisgilmerproj
Collaborator

I think the answers in this thread are correct. Increase the memory allocated to the Lambda function and that will speed things up. Also, if AWS re-uses a Lambda container, spin-up can be faster, but that's not guaranteed. If you need something faster, then Lambda may not be the best option for your workflow.
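
One way to benefit from container re-use is to skip work whose results are still sitting in /tmp from a previous invocation. The sketch below is an illustration rather than the project's actual code: `ensure_definitions` and the `download` callable are hypothetical, and it does not check whether cached definitions are stale. Going by the log earlier in the thread, this would only save the few seconds of downloading, not the scan itself.

```python
import os


def ensure_definitions(def_dir, filenames, download):
    """Download only the definition files missing from `def_dir`.

    `download(name, dest_path)` is a hypothetical callable standing in for
    whatever fetches a file from S3. Returns the names actually downloaded.
    """
    os.makedirs(def_dir, exist_ok=True)
    fetched = []
    for name in filenames:
        dest = os.path.join(def_dir, name)
        if not os.path.exists(dest):
            download(name, dest)
            fetched.append(name)
    return fetched
```

On a cold start every file is fetched; on a warm start, where /tmp still holds the files, the function returns an empty list and does no network I/O.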

@sean-redmond

It looks to me like this PR will address this issue:

#112

@jhonathas
Author

Thanks!

@sean-redmond

I tested this PR out, and it works really well.
