Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LRU cache for curie lookup #61

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

jeffhhk
Copy link
Collaborator

@jeffhhk jeffhhk commented Mar 30, 2021

Benchmark exercise:

    (cd medikanren && racket)
        (require xrepl)
        (require "db.rkt")
        (require "common.rkt") (load-databases #t)
        (time (enter! "../contrib/medikanren/use-cases/PMI-20-10-Il1R1-case-reviews.rkt"))  ; this is the time measurement shown

Database list:

    covid19
    orange
    pr-owl
    robokop
    rtx
    semmed
    sri_semmeddb
    textminingprovider
    sri-reference-kg-0.2.0
    rtx2_2020_09_16
    co-occur
    umlsmeta

Configurations:

    1) 5GB cui cache (2m11s startup time)
        (in-memory-names? . #f)
        (in-memory-cuis?  . #t)
        (num-cached-cuis . #f)
            cpu time: 60928 real time: 61051 gc time: 3053
    2) 1MB cui cache (32s startup time)
        (in-memory-names? . #f)
        (in-memory-cuis?  . #f)
        (num-cached-cuis . 3000)
            cpu time: 75705 real time: 75830 gc time: 3516
    3) No cui cache (32s startup time)
        (in-memory-names? . #f)
        (in-memory-cuis?  . #f)
        (num-cached-cuis . #f)
            cpu time: 212676 real time: 213121 gc time: 4044

The current pull request is to merge with LRU caching with no change to default behavior. That is, config.defaults.scm is configuration 1).

After merging, I propose another team member use configuration similar to 2) for a few days, and if successful, make it default.

Note that this PR includes #59 because #59 fixes the benchmark exercise above.

@gregr
Copy link
Collaborator

gregr commented Apr 2, 2021

Looks like Thi's changes were included here by mistake.

refactor: extract struct payload

implement: unit test demonstrating lru bug

fix: lru-freshen: the case of oldest lrun and newest lrun

fix: handling an empty linked list would be more work, and useless, so skip it

refactor: extract define lru-remove

refactor: move all lru-evict calls to lru-ref

amend: refactor: move all lru-evict calls to lru-ref

refactor: use lru-remove from lru-evict

refactor: remove unnecessary lru-evict call

refactor: make all the counting work the same way

renames/comments/formatting
Copy link
Collaborator

@gregr gregr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few issues, but otherwise looks good.

Comment on lines +136 to +137
(time (apply make-db (cons path options)))
(apply make-db (cons path options)))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there an issue using the flat form of apply?

Comment on lines 9 to 12
(define (assert k st)
(if (not k)
(raise (format "assertion failure: ~a" st))
#f))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code.

Otherwise, use when since the result won't be useful.

Comment on lines 108 to 110
(if (>= (lru-num-entries ths) 2) ; freshen 1 entry is noop
(lru-freshen ths k)
#f)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to put this short-circuit case into lru-freshen itself so its callers don't have to think about it.


(struct payload (k v))

(struct lrun (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the n in lrun mean? n for node in a doubly-linked list?

medikanren/lru.rkt Outdated Show resolved Hide resolved


;;; If the lru is full, remove the oldest entry.
(define (lru-evict ths)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lru-evict!

#f))

;;; Make the entry with key k the newest entry.
(define (lru-freshen ths k)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lru-freshen!



;;; Fetch item from lru cache, or if absent, from ref-behind.
(define (lru-ref ths k)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lru-ref!

(begin
; we are removing the oldest
(set-lru-lrun-oldest! ths lrun2)
(set-lrun-older! lrun2 #f)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lrun2 may be #f here. Line 73 already takes care of this though, so it should be fine to delete this line.

(begin
; we are removing the newest
(set-lru-lrun-newest! ths lrun0)
(set-lrun-newer! lrun0 #f)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue with lrun0 possibly being #f here. See the other comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants