-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
function node can return the source of function rather than only the source of file #4554
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #4554 +/- ##
===========================================
+ Coverage 79.40% 79.48% +0.09%
===========================================
Files 480 482 +2
Lines 35087 35364 +277
===========================================
+ Hits 27856 28107 +251
- Misses 7231 7257 +26
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Thanks @unkcpz . As you said yourself in your comment, we shouldn't store the function's source code in the database. It should be stored in the repository, but there it already is stored, so storing it again would be inefficient and unnecessary. Your other suggestion may actually be better, to, in addition to the starting line number which is already stored, add the total number of lines. The only problem is backwards compatibility. Whatever function that may use this new total number of lines attribute, will have to deal with it not existing, since all existing process function nodes won't have that attribute and I don't think we can or should write a database migration for this. On that note of backwards compatibility, your current PR also breaks it as you change the behavior of the |
Thanks, @sphuber! I notice as you mentioned there is a backwards compatibility problem. But I have no experience in database or orm stuff, what am I suppose to do? Can you point out which part of code is involved and I should look at? |
You should revert the change of Secondly, as we discussed, it is not a good idea to store the source code of the function in the database. The best we can do is store the number of lines as an attribute. This you can then use together with the |
fb74cc8
to
b9e9d35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @sphuber I change the code as your recommend.
Just out of curiosity, what to do if I want to change the method name?
You will have to deprecate the Now, the problem is, that ideally, in the future when we remove the deprecation and make |
b9e9d35
to
b234623
Compare
b234623
to
af362d8
Compare
hi @sphuber, anymore to add of this? I also add the deprecated warning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @unkcpz , nice work! I have a couple of comments to discuss with you and @sphuber. Also, to justify why I'm suddenly interested in this, it might become necessary for an upcoming PR to solve #4401 (see this comment, the row for source_code
).
aiida/orm/utils/mixins.py
Outdated
:returns: the number of lines or None | ||
""" | ||
return self.get_attribute(self.FUNCTION_NUMBER_OF_LINES_KEY, -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This says it will return the number of lines or None
but the default value of the get
is -1. Shouldn't it be None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that is ambiguous. Originally try to use -1
here to return the whole file code if the key is not found. I think None would much more make sense. Changed.
aiida/orm/utils/mixins.py
Outdated
warnings.warn( | ||
'in the future this will return only the source of the function and' | ||
' please use `get_function_source_file` to get' | ||
' the entire content of the file instead', AiidaDeprecationWarning | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, there are a couple of comments regarding this depracation you are doing:
- You need to already include the
get_function_source_file
method so that users can immediately start fixing their scripts with the new way of doing it. - The depracation path for this is a bit complicated. I know you discussed with @sphuber in the general thread what would be the hypothetical way to do this, but I'm not sure he was actually agreeing it would be a good idea to go this way (maybe he can confirm in case I misinterpreted?). I personally agree that it is better "namewise", but changing to that might generate a bit of annoyance on the users.
- The warning itself it is a bit unclear to me, I would maybe add a legthier description if we are going to go this way. Something like:
warnings.warn( | |
'in the future this will return only the source of the function and' | |
' please use `get_function_source_file` to get' | |
' the entire content of the file instead', AiidaDeprecationWarning | |
) | |
warnings.warn( | |
'you are currently using `get_function_source_code` to get the entire' | |
' file where the process is defined, but in the future this method will' | |
' be used to return only the part of that file that corresponds to the' | |
' process itself. Please use the method `get_function_source_file`' | |
' instead.', AiidaDeprecationWarning | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to already include the get_function_source_file method so that users can immediately start fixing their scripts with the new way of doing it.
My bad. add it as the implemented function and forward it to get_function_source_code
.
The depracation path for this is a bit complicated. I know you discussed with @sphuber in the general thread what would be the hypothetical way to do this, but I'm not sure he was actually agreeing it would be a good idea to go this way (maybe he can confirm in case I misinterpreted?). I personally agree that it is better "namewise", but changing to that might generate a bit of annoyance on the users.
However, I can not figure out a better way to do this. I think it can reduce the burden to users only by a clear docstring and comprehensive warning information.
The warning itself it is a bit unclear to me, I would maybe add a legthier description if we are going to go this way.
Thanks! changed as you recommend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I can not figure out a better way to do this
So, I think the alternative is to accept that get_function_source_code
now returns the full file and create a new name for the one that just returns the function (get_function_source_def
?). This is less consistent with the original information (where source_file
had the file and source_code
had the definition) but more consistent with the current status (where source_code
is the name of the file) and also causes less friction with the users because there is no deprecations needed. This was what I understood from the first comment that @sphuber made:
You should revert the change of get_function_source_code and keep it as it was, i.e. return the entire file. You can then add a method that returns just the source code of the function, but you will have to find a different method name for this.
There was no mention of any further change there, just to rename the new function and leave the old one as it was. Then the longer path in the latter comment read more like a hipothetical situation than a prescription for this case.
Again, I am personally in favour of doing the longer deprecation path, I just want to make sure that we are all on the same page and there is not misunderstandings or something we are not considering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramirezfranciscof Thanks for the clarification. So in this simple pr, only need to add a new method for section of source code, then left deprecation and renaming to the future, correct? I rebase the PR and please have a look at your available.
aiida/orm/utils/mixins.py
Outdated
return self.get_object_content(self.FUNCTION_SOURCE_FILE_PATH) | ||
|
||
def get_function_source(self): | ||
"""Return the source code of the function and only of the function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are doing the deprecation path, I would add to this docstring that this is a temporary method.
I'm also wondering if we should add a test for this or not, considering that it is going to be temporary. I guess it would be the same that adding a unit test for the warning. @sphuber would it be a good idea to have a unit test file with all the deprecation warnings, considering it would have to be explicitly updated when removing the deprecations? Somehow it sounds both a bit insane but not a bad idea at the same time, as a way to keep track of all deprecations in a single place...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The three relevant functions are closely packed in the code. So I think the source code itself with its docstring makes enough explanation for how to do the future migration. Let's see if @sphuber has any better ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramirezfranciscof My apologise for the delay and thanks for such details reviewing.
aiida/orm/utils/mixins.py
Outdated
:returns: the number of lines or None | ||
""" | ||
return self.get_attribute(self.FUNCTION_NUMBER_OF_LINES_KEY, -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that is ambiguous. Originally try to use -1
here to return the whole file code if the key is not found. I think None would much more make sense. Changed.
aiida/orm/utils/mixins.py
Outdated
warnings.warn( | ||
'in the future this will return only the source of the function and' | ||
' please use `get_function_source_file` to get' | ||
' the entire content of the file instead', AiidaDeprecationWarning | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to already include the get_function_source_file method so that users can immediately start fixing their scripts with the new way of doing it.
My bad. add it as the implemented function and forward it to get_function_source_code
.
The depracation path for this is a bit complicated. I know you discussed with @sphuber in the general thread what would be the hypothetical way to do this, but I'm not sure he was actually agreeing it would be a good idea to go this way (maybe he can confirm in case I misinterpreted?). I personally agree that it is better "namewise", but changing to that might generate a bit of annoyance on the users.
However, I can not figure out a better way to do this. I think it can reduce the burden to users only by a clear docstring and comprehensive warning information.
The warning itself it is a bit unclear to me, I would maybe add a legthier description if we are going to go this way.
Thanks! changed as you recommend.
aiida/orm/utils/mixins.py
Outdated
return self.get_object_content(self.FUNCTION_SOURCE_FILE_PATH) | ||
|
||
def get_function_source(self): | ||
"""Return the source code of the function and only of the function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The three relevant functions are closely packed in the code. So I think the source code itself with its docstring makes enough explanation for how to do the future migration. Let's see if @sphuber has any better ideas.
ce440f5
to
cd00a8f
Compare
This method returns the source code of just the process function itself, i.e., the wrapped function plus the decorator itself. The existing method `get_function_source_code` would return the source code of the entire file in which the process function was defined, but this can at times include a lot of other code that is not always useful to have. The `get_function_source_code` method is deprecated and replaced by the `get_source_code_file` method to have consistent naming. The `get_source_code_function` implementation retrieves the source code of the function by calling `get_source_code_file` and extracting just the lines of the function. This is accomplished by using the starting line of the function, which already used to be stored, combined with the total number of lines that make up the function, which is an attribute that is stored from now on for process functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @unkcpz . I think this is good to go in. I have just suggested some changes to deal with the inconvenient name. I would deprecate get_function_source_code
and rename it get_source_code_file
and then add get_source_code_function
with your new implementation.
aiida/orm/utils/mixins.py
Outdated
def get_function_source_def(self): | ||
""" | ||
The temporary method that return the section of code for the | ||
function only correspond to the function process. It will be | ||
replaced with `get_function_source_code` in future. | ||
|
||
:returns: the function source code. | ||
""" | ||
content_list = self.get_function_source_code().splitlines() | ||
start_line = self.function_starting_line_number | ||
end_line = start_line + self.function_number_of_lines | ||
|
||
# start_line - 1 to include the decorator | ||
function_source = '\n'.join(content_list[start_line - 1:end_line]) | ||
|
||
return function_source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_function_source_def(self): | |
""" | |
The temporary method that return the section of code for the | |
function only correspond to the function process. It will be | |
replaced with `get_function_source_code` in future. | |
:returns: the function source code. | |
""" | |
content_list = self.get_function_source_code().splitlines() | |
start_line = self.function_starting_line_number | |
end_line = start_line + self.function_number_of_lines | |
# start_line - 1 to include the decorator | |
function_source = '\n'.join(content_list[start_line - 1:end_line]) | |
return function_source | |
def get_function_source_code(self): | |
"""Return the absolute path to the source file in the repository. | |
:returns: the absolute path of the source file in the repository, or None if it does not exist | |
""" | |
warn_deprecation('This method will be removed, use `get_source_code_file` instead.', version=3) | |
return self.get_object_content(self.FUNCTION_SOURCE_FILE_PATH) | |
def get_source_code_file(self) -> str | None: | |
"""Return the source code of the file in which the function was defined. | |
:returns: The source code of the file or ``None`` if not available. | |
""" | |
return self.get_object_content(self.FUNCTION_SOURCE_FILE_PATH, None) | |
def get_source_code_function(self) -> str | None: | |
"""Return the source code of the function including the decorator. | |
:returns: The source code of the function or ``None`` if not available. | |
""" | |
source_code = self.get_source_code_file() | |
if source_code is None: | |
return None | |
content_list = source_code.splitlines() | |
start_line = self.function_starting_line_number | |
end_line = start_line + self.function_number_of_lines | |
# Start at ``start_line - 1`` to include the decorator | |
return '\n'.join(content_list[start_line - 1:end_line]) |
958c451
to
450376c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, thanks @unkcpz , just a question on the method naming
@@ -179,6 +181,9 @@ def test_process_function(data): | |||
assert node.function_name == function_name | |||
assert isinstance(node.function_starting_line_number, int) | |||
|
|||
# Check the source code of the function is stored | |||
assert node.get_function_source_def() == inspect.getsource(test_process_function) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert node.get_function_source_def() == inspect.getsource(test_process_function) | |
assert node.get_function_source_file() == inspect.getsource(test_process_function) |
aiida/orm/utils/mixins.py
Outdated
:returns: The source code of the function or ``None`` if it could not be determined when storing the node. | ||
""" | ||
try: | ||
return self.base.repository.get_object_content(self.FUNCTION_SOURCE_FILE_PATH) | ||
except FileNotFoundError: | ||
return None | ||
|
||
def get_function_source_function(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally find get_function_source_function
a bit weird, with the double function
in there. I would propose get_source_code_function
and get_source_code_file
for the entire file source code. Think that makes more sense no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mess up when resolve confliction on github interface. Will change it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed, I hope it is all fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @unkcpz , two more tiny things I missed. They are suggestions so you can just commit them directly, and I will merge. Thanks!
No worries, I am also supposed to find it 😆. Done! |
7376723
to
1454ce9
Compare
@unkcpz while typing up the commit message, I realized there still was a bug. The new method |
You are right. The |
1454ce9
to
548215b
Compare
No it didn't automatically fix it. I fixed it by checking whether those attributes return def get_source_code_function(self) -> str | None:
"""Return the source code of the function including the decorator.
:returns: The source code of the function or ``None`` if not available.
"""
source_code = self.get_source_code_file()
if source_code is None or self.function_number_of_lines is None or self.function_starting_line_number is None:
return None
content_list = source_code.splitlines()
start_line = self.function_starting_line_number
end_line = start_line + self.function_number_of_lines
# Start at ``start_line - 1`` to include the decorator
return '\n'.join(content_list[start_line - 1:end_line]) See the conditional |
Thanks a lot! I retrigger the failed test (3.10) which I guess it is failed randomly. Only the docs tests failed, but I have no idea why it shows here, I don't think there is any code related to that. |
I don't understand either why it is failing all of a sudden. I can reproduce it locally, but the problem seems to be with the tutorial notebook. It creates and loads a temporary profile but doesn't set it as a default. So later when it calls |
c1d718a
to
ce1ad67
Compare
Ok I found the problem, but I understand literally nothing. In the constructor if we do try:
source_file_path = inspect.getsourcefile(func)
if source_file_path:
with open(source_file_path, 'rb') as handle:
self.base.repository.put_object_from_filelike( # type: ignore[attr-defined]
handle, self.FUNCTION_SOURCE_FILE_PATH
)
except (IOError, OSError):
pass it works without any provblems. However if we do try:
source_file_path = inspect.getsourcefile(func)
except (IOError, OSError):
pass
else:
if source_file_path is not None:
with open(source_file_path, 'rb') as handle:
self.base.repository.put_object_from_filelike( # type: ignore[attr-defined]
handle, self.FUNCTION_SOURCE_FILE_PATH
) somehow the docs build fails 😕 what the f? |
It seems the
It has some timestamps in the log, but you can find it point to the code we changed in this PR. I suspect there will be issue with create
BTW, is the "f" for "fix" as I did for the commit message? 😆 |
No, in this case it was something else 😅 |
I understand more why it was going on for the docs fail. By testing the behavior in the IPython shell, it proves my guess that It seems in ipython or Jupyter notebook, the classes/functions defined or the main module does not have a file attribute associated with them so inspect wasn't able to retrieve a source file. The issue can be reproduced by: In [1]: from aiida import orm
In [2]: from aiida.engine import calcfunction
In [3]: @calcfunction
...: def add(a, b):
...: return a + b
...:
In [4]: d = add(orm.Int(4), orm.Int(5))
<ipython-input-3-f301de63af89>
['@calcfunction\n', 'def add(a, b):\n', ' return a + b\n']
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-4-677ea2e3e348> in <module>
----> 1 d = add(orm.Int(4), orm.Int(5))
.....
~/Projects/WP-aiida/aiida-core/aiida/engine/processes/functions.py in _setup_db_record(self)
361 """Set up the database record for the process."""
362 super()._setup_db_record()
--> 363 self.node.store_source_info(self._func)
364
365 @override
~/Projects/WP-aiida/aiida-core/aiida/orm/utils/mixins.py in store_source_info(self, func)
66 print(source_file_path)
67 print(source_list)
---> 68 with open(source_file_path, 'rb') as handle:
69 self.base.repository.put_object_from_filelike( # type: ignore[attr-defined]
70 handle, self.FUNCTION_SOURCE_FILE_PATH
FileNotFoundError: [Errno 2] No such file or directory: '<ipython-input-3-f301de63af89>' So in principle, the try-catch block with |
ah yeah of course, source code storing for interactive shells and notebooks has never been supported. I would rather call this a feature request then. |
Fixes #4543
When checking the source of the process function over MaterialCloud discover(or when explore the aiida archive of others), the whole source file returns and show makes it hard to quickly find the code if there are many other things in the source file. @ltalirz You may want to comment on this ;)
In this PR, I simply rename the
node.get_function_source_code
toget_function_source_file
and retrieve the source of the function byinspect.getsource
and store it as an attribute of the node.Not sure about whether to store it as a node attribute is a good idea, since if the code of the function is very long it seems not wise to store a long string in attributes. The other choice is along with storing the 'function_starting_line_number' also storing the number of lines of function, and clip the source code of function from these two value.