-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
my.coding.commits: test/mypy on CI + cleanup #136
Conversation
e975d23
to
c2fad10
Compare
Great, looks good 👍 |
c2fad10
to
5a17886
Compare
|
||
from my.config import commits as user_config | ||
@dataclass | ||
class commits_cfg(user_config): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name conflict with def commits
below! But I guess doesn't hurt since it's basically internal to this module
from my.config import commits as user_config | ||
@dataclass | ||
class commits_cfg(user_config): | ||
roots: Sequence[PathIsh] = field(default_factory=list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting that without it, it would complain
ValueError: mutable default <class 'list'> for field roots is not allowed: use default_factory
-- so would require you to use tuple
in the user config. Didn't know about that.. but good that relatively easy to workaround with this field trick.. Not sure if there is an easier way (maybe some @dataclass
annotation attributes).
# experiment to make it lazy? | ||
# would be nice to have a nicer syntax for it... maybe make_config could return a 'lazy' object | ||
@functools.lru_cache() | ||
def config() -> commits_cfg: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about lru_cache in particular, but thinking that 'lazy' instantiation like this might be good to use in general, often the fact that config is constructed on the top level makes it harder to override/test.
Guess relevant to #46
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..
I'd think one might run into less problems if you avoid lru_cache
for things like config, because it means the user has less options to do things like monkey patching.
It does look/feel worse to have to handle caching the result of some computation after the config has been loaded. I think previously some of that had been done in @property
functions on the config itself, but that doesn't feel like the best solution.
I think I agree with your comment:
would be nice to have a nicer syntax for it... maybe make_config could return a 'lazy' object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way that I've often handled this is to have an inputs
function which takes what the config gives you, and computes the resulting inputs for each of the functions. Then that function is the default parameter for any function which requires the input files
Is a bit confusing on first glance, but is nice for testing purposes and being able to specify a subset of the paths if I want to do something in the repl or create tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah ok, good point about caching!
I guess it doesn't hurt to instantiate the config multiple times, negligible in comparison with the rest of computation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default parameter for any function which requires the input files
Yeah, I thought about that too, but then it often requires propagating the whole thing down the call stack.. I guess it's a classic 'state' problem
@@ -168,7 +195,7 @@ def _commits(_repos: List[Path]) -> Iterator[Commit]: | |||
def _cached_commits_path(p: Path) -> str: | |||
# compute a reduced simple filepath using the absolute path of the repo | |||
simple_path = ''.join(filter(lambda c: c in _allowed_letters, str(p.absolute()))) | |||
return cache_dir() / simple_path / '_cached_commits' | |||
return str(cache_dir() / 'commits' / simple_path / '_cached_commits') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder why mypy hasn't caught missing str
before!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, strange. I think initially I didn't have this split into a separate function, it was just a lambda in the @cachew
decorator, and there its fine as cachew
expects a PathProvider
? but mypy
should've definitely caught that this wasn't returning a str
...
Followup of #132 - add REQUIRES section - use 'commits' config section & add proper schema - use dedicated subdirectory for cache
45eb018
to
ce26bc5
Compare
No description provided.