Description of benefits of "caching" feeds is needed #163

Open
befrankt opened this issue Nov 6, 2018 · 11 comments
Labels
help wanted (an issue that the community can help with)
T: enhancement (this issue seeks an improvement of an existing feature)

Comments

@befrankt
Contributor

befrankt commented Nov 6, 2018

The MISP manual does mention the caching of feeds, but then states it will require further work in the manual:

"Jobs ~ Todo: Explain differences Default, Email, Cache"

I am trying to work out what the benefit is of enabling caching on my built-in feeds. Is this for redundancy, efficiency, or other reasons? Is there any information available?

Work environment

| Questions | Answers |
|---|---|
| Type of issue | Question |
| OS version (server) | Ubuntu |

SteveClement transferred this issue from MISP/MISP on May 22, 2019
@cbboggs
Contributor

cbboggs commented May 23, 2019

Please correct me if I'm wrong, but "caching" downloads the feed content to your instance, and allows you to correlate attributes and see matching "Feed hits" (similar to correlated "Related Events") in the event view on each attribute row, but does not actually create any events in your instance.
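
As a rough illustration of the "feed hits" idea described above, here is a minimal sketch. It assumes the cache can be modelled as a plain dictionary of value sets; the feed names, values, and the `feed_hits` helper are hypothetical and are not MISP's actual API or storage layout.

```python
# Hypothetical in-memory stand-in for a feed cache: feed name -> cached values.
# Nothing here is MISP's real data model; it only illustrates the lookup idea.
feed_cache = {
    "Feed A": {"198.51.100.7", "evil.example.com"},
    "Feed B": {"evil.example.com"},
}


def feed_hits(value, cache):
    """Return the sorted names of cached feeds that contain this value."""
    return sorted(name for name, values in cache.items() if value in values)


# Attribute values of an event that already exists locally.
event_attributes = ["evil.example.com", "203.0.113.9"]

# Annotate each attribute row with its feed hits. No events are created;
# the cache is only consulted for matches.
rows = {value: feed_hits(value, feed_cache) for value in event_attributes}

for value, hits in rows.items():
    print(value, "->", hits or "no feed hits")
```

The point of the sketch is that a lookup against cached values leaves the local event store untouched, which matches the description of caching as correlation without event creation.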

@adulau
Member

adulau commented Jun 21, 2019

Indeed we need a complete section about caching and feeds.

adulau added the "T: enhancement" label Jun 21, 2019
enjeck added the "help wanted" label Nov 17, 2020
@TTycho

TTycho commented Nov 17, 2020

hashes.csv is pretty much undocumented as well, but it might explain the caching.

I see that clicking "cache feed" retrieves hashes.csv, while "Fetch all events" pulls manifest.json and its events. I suppose hashes.csv holds hashes of all entities in an event or something, and that this is a mechanism to see which events have to be pulled again? I could not find any proper documentation. It's not mentioned in the MISP core format spec.

@adulau
Member

adulau commented Nov 17, 2020

Yep, your guess is right. hashes.csv contains the hashes of all attributes and is used for feed caching.

Indeed, it was introduced later in the manifest feed format:

https://github.com/MISP/misp-rfc/blob/master/misp-core-format/raw.md#manifest

and we will add it in the core format spec.

Thanks for the feedback.
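
To make the role of hashes.csv more concrete, here is a minimal sketch of a hash-based lookup. The thread does not spell out the file layout, so the sketch assumes one `attribute_hash,event_uuid` pair per row and MD5 as the hash function; both are assumptions, and the UUIDs, sample values, and helper names are made up for illustration.

```python
import csv
import hashlib
import io


def md5_hex(value: str) -> str:
    """Hash an attribute value (MD5 is an assumption, not confirmed in this thread)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


# Made-up event UUIDs and a made-up hashes.csv body in the assumed
# "attribute_hash,event_uuid" layout.
EVENT_1 = "11111111-1111-4111-8111-111111111111"
EVENT_2 = "22222222-2222-4222-8222-222222222222"
sample_hashes_csv = "\n".join([
    f"{md5_hex('evil.example.com')},{EVENT_1}",
    f"{md5_hex('198.51.100.7')},{EVENT_1}",
    f"{md5_hex('evil.example.com')},{EVENT_2}",
])


def load_hash_index(text: str) -> dict:
    """Build a mapping of attribute hash -> set of event UUIDs."""
    index = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip malformed or empty rows
        attr_hash, event_uuid = row
        index.setdefault(attr_hash, set()).add(event_uuid)
    return index


def events_containing(value: str, index: dict) -> list:
    """Return the feed events whose hashed attributes include this value."""
    return sorted(index.get(md5_hex(value), set()))


index = load_hash_index(sample_hashes_csv)
print(events_containing("evil.example.com", index))
print(events_containing("not-in-feed.example.org", index))
```

Under these assumptions, a consumer only needs the small hash file to tell whether a local value appears somewhere in the feed, without downloading the events themselves.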

@chrisinmtown
Contributor

chrisinmtown commented Feb 10, 2021

Yes, please document this! For me, "caching" means to save a copy of something for fast & easy access. I hear you all saying that MISP's "caching" action on a feed means "check whether the feed has new data available." Apparently "caching" does not mean to get the full feed content?

Asking the question differently, what's the difference between CACHING and FETCHING a feed? I interpret those words as nearly identical actions, but apparently that's not the correct interpretation in MISP-land.

@chrisinmtown
Contributor

chrisinmtown commented Feb 10, 2021

Found this exchange on gitter at https://gitter.im/MISP/MISP?at=5aecb1c5b37eab7d046a4cb6 and it really helped:

Enabled (the feed is enabled for fetching locally, meaning that when you run fetch feeds it will grab the data and create local events)
Caching enabled (the feed is enabled for caching; when you trigger a cache all feeds call, MISP will grab all of the values from the feed and store them in Redis for correlations, meaning you can tell whether an attribute is contained in the feed)
Lookup visible (if disabled, users from organisations other than the host organisation cannot see the lookups against the caches from this feed and cannot see the metadata of the feed, IIRC)
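
The three settings quoted above can be sketched as a toy model. This is only an illustration under stated assumptions: the `Feed` class, the function names, and the dictionary standing in for Redis are all hypothetical, and the "Lookup visible" behaviour follows the quoted description, which was itself given from memory.

```python
from dataclasses import dataclass


@dataclass
class Feed:
    """Hypothetical model of a feed and the three settings quoted above."""
    name: str
    values: set
    enabled: bool = False          # "Enabled": fetch creates local events
    caching_enabled: bool = False  # "Caching enabled": values go into the cache
    lookup_visible: bool = False   # "Lookup visible": other orgs may see hits


def fetch_feeds(feeds):
    """Create local 'events' (plain dicts here) for feeds that are enabled."""
    return [
        {"feed": f.name, "attributes": sorted(f.values)}
        for f in feeds
        if f.enabled
    ]


def cache_feeds(feeds):
    """Store values per feed for feeds with caching enabled (dict stands in for Redis)."""
    return {f.name: set(f.values) for f in feeds if f.caching_enabled}


def visible_hits(value, feeds, cache, is_host_org):
    """Return feeds whose cache contains the value and whose lookups the user may see."""
    by_name = {f.name: f for f in feeds}
    return sorted(
        name
        for name, values in cache.items()
        if value in values and (is_host_org or by_name[name].lookup_visible)
    )


feeds = [
    Feed("Feed A", {"evil.example.com"}, enabled=True),
    Feed("Feed B", {"evil.example.com", "198.51.100.7"}, caching_enabled=True),
    Feed("Feed C", {"evil.example.com"}, caching_enabled=True, lookup_visible=True),
]

events = fetch_feeds(feeds)  # only Feed A becomes local events
cache = cache_feeds(feeds)   # only Feed B and Feed C are cached

print([e["feed"] for e in events])
print(visible_hits("evil.example.com", feeds, cache, is_host_org=True))
print(visible_hits("evil.example.com", feeds, cache, is_host_org=False))
```

In this model, fetching and caching are independent switches: one produces local events, the other produces a lookup dataset, and the visibility flag only filters who can see lookup results.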

@chrisinmtown
Contributor

My follow-up question, maybe you can address this in the misp-book, is this: does it ever make sense to both cache and fetch a feed? My initial guess is that it does not -- if the data is local in the database because of fetch, it's silly to copy the data into Redis also. Please clarify or set me straight.

@chrisinmtown
Contributor

chrisinmtown commented Feb 10, 2021

Again, gitter to the rescue: https://gitter.im/MISP/Support?at=60242c0a9fa6765ef80eb008

me: is feed caching required to analyze feed overlap? Asked differently, if I fetch two feeds that share data but disable caching on both, what will the overlap analysis show me?

Answer from andras @andras:matrix.circl.lu:

Yeah it’s needed, basically it’s fully running the overlap analysis on the dataset in redis
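
A minimal sketch of why overlap analysis depends on the cache: if the analysis runs over the cached dataset, only cached feeds can take part. The dictionary below is a hypothetical stand-in for that dataset, and `overlap_counts` is an illustrative helper, not MISP's actual implementation.

```python
from itertools import combinations

# Hypothetical cached dataset: feed name -> cached values. A feed that was
# fetched but never cached would simply be absent from this mapping.
cached = {
    "Feed A": {"evil.example.com", "198.51.100.7", "203.0.113.9"},
    "Feed B": {"evil.example.com", "203.0.113.9"},
    "Feed C": {"unrelated.example.net"},
}


def overlap_counts(cache):
    """Count shared values for every pair of cached feeds."""
    return {
        (a, b): len(cache[a] & cache[b])
        for a, b in combinations(sorted(cache), 2)
    }


overlaps = overlap_counts(cached)
for (a, b), shared in overlaps.items():
    print(f"{a} / {b}: {shared} shared value(s)")
```

With caching disabled on both feeds, the mapping would be empty and there would be no pairs to compare, which is consistent with the answer that caching is needed for the overlap analysis.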

@chrisinmtown
Contributor

Are there any other benefits to caching besides analyzing feed overlap?

chrisinmtown added a commit to chrisinmtown/misp-book that referenced this issue Feb 26, 2021
Update managing feeds so figures and text match MISP v2.4.139 features
@chrisinmtown
Contributor

I tried to address this in #225; please review.

chrisinmtown added a commit to chrisinmtown/misp-book that referenced this issue Feb 26, 2021
Update managing feeds so figures and text match MISP v2.4.137 features
iglocska added a commit that referenced this issue Feb 26, 2021
chg: fix #85 #163 revise managing feeds and figs
@chrisinmtown
Contributor

@befrankt please review the latest change to the misp-book managing feeds and post here if you think this issue can be closed.

6 participants