This provides a convenience API for #196. The idea is that people want to parameterize a function and extract columns from its output in one step -- this should be easy. The cool thing is that this just uses the parameterize and extract APIs. It also has a from_df() function that accepts a dataframe, which is more concise.
Force-pushed from 91b12f2 to 9bbcebd.
Force-pushed from 757aa05 to 4ad63e7.
This adds testing for the decorator, and changes it to a simpler naming scheme. Currently the naming scheme appends __{i} to the node name. This is a reserved pattern that won't be used later.
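For anyone skimming the thread, the naming scheme described above can be sketched roughly as follows. This is only an illustration, not the decorator's code: `generated_node_names` is a hypothetical helper, and counting from 0 is an assumption. Only the `__{i}` suffix comes from the comment.

```python
from typing import List


def generated_node_names(fn_name: str, num_parameterizations: int) -> List[str]:
    """Return one node name per parameterization by appending __{i}.

    Hypothetical helper; the starting index of 0 is an assumption.
    """
    return [f"{fn_name}__{i}" for i in range(num_parameterizations)]


# One generated name per row of parameters.
names = generated_node_names("my_func", 2)
print(names)
```

Because the suffix is a reserved pattern, user-defined node names should not collide with the generated ones.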
Force-pushed from 4ad63e7 to b3881f4.
OK, this is ready. Calling
@elijahbenizzy as discussed, I think making this live under experimental would be better short term.
Not quite production-ready yet...
Force-pushed from 46fd862 to 5c49311.
seems reasonable -- just a few minor things. Fix them and I approve.
decorators.md
Outdated
    )
    ```

    Note that we have a double-index. Note that this is still in experimental.
Suggested change:
- Note that we have a double-index. Note that this is still in experimental.
+ Note that we have a double-index. Note that this is still experimental, and has the possibility of being changed; we'd love feedback on this API if you end up using it!
    df = pd.DataFrame(
        [
            ["outseries1a", "outseries2a", "inseries1a", "inseries2a", 10],
            ["outseries1b", "outseries2b", "inseries1b", "inseries2b", 100],
            # ...
        ],
        # Have to switch as indices have to be unique
        columns=[
            [
                "output1",
                "output2",
                "input1",
                "input2",
                "input3",
            ],  # configure whether column is source or value and also whether it's input ("source", "value") or output ("out")
            ["out", "out", "source", "source", "value"],
        ],
    )


    @parameterize_frame(df)
    def my_func(input1: pd.Series, input2: pd.Series, input3: float) -> pd.DataFrame:
        return pd.DataFrame(
            [input1 * input2 * input3, input1 + input2 + input3]
        )
indentation off
Suggested change:

    df = pd.DataFrame(
        [
            ["outseries1a", "outseries2a", "inseries1a", "inseries2a", 10],
            ["outseries1b", "outseries2b", "inseries1b", "inseries2b", 100],
            # ...
        ],
        # Have to switch as indices have to be unique
        columns=[
            [
                "output1",
                "output2",
                "input1",
                "input2",
                "input3",
            ],  # configure whether column is source or value and also whether it's input ("source", "value") or output ("out")
            ["out", "out", "source", "source", "value"],
        ],
    )


    @parameterize_frame(df)
    def my_func(input1: pd.Series, input2: pd.Series, input3: float) -> pd.DataFrame:
        return pd.DataFrame(
            [input1 * input2 * input3, input1 + input2 + input3]
        )
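To make the double-index easier to follow, here is a rough, pandas-free sketch of how each row of such a frame could be read as one parameterization. It is an illustration under assumptions, not the decorator's implementation: `rows_to_specs` and the `Spec` shape are hypothetical, and only the column names, the role labels ("out", "source", "value") and the row contents come from the example above.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical shape for one parameterization: the output column names, plus
# a mapping of function argument -> (role, payload).
Spec = Tuple[Tuple[str, ...], Dict[str, Tuple[str, Any]]]


def rows_to_specs(names: List[str], roles: List[str], rows: List[list]) -> List[Spec]:
    """Turn a two-level header and its rows into one spec per row."""
    specs: List[Spec] = []
    for row in rows:
        outputs: List[str] = []
        inputs: Dict[str, Tuple[str, Any]] = {}
        for name, role, cell in zip(names, roles, row):
            if role == "out":
                outputs.append(cell)  # the cell names an output column
            elif role in ("source", "value"):
                inputs[name] = (role, cell)  # upstream name or literal value
            else:
                raise ValueError(f"Invalid dep type: {role}")
        specs.append((tuple(outputs), inputs))
    return specs


names = ["output1", "output2", "input1", "input2", "input3"]
roles = ["out", "out", "source", "source", "value"]
rows = [
    ["outseries1a", "outseries2a", "inseries1a", "inseries2a", 10],
    ["outseries1b", "outseries2b", "inseries1b", "inseries2b", 100],
]
specs = rows_to_specs(names, roles, rows)
```

Read this way, the first header level supplies the function argument names and the second level says how each cell should be interpreted.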
        raise ValueError(f"Invalid dep type: {dep_type}")


    def _get_index_levels(index: pd.MultiIndex) -> List[list]:
doc string please
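As a possible starting point for the requested docstring, here is a sketch of what such a helper might look like. The body is a guess from the signature alone, not the PR's code: `get_index_levels` is a hypothetical stand-in that takes plain tuples, relying on the fact that iterating a pandas MultiIndex yields one tuple per entry.

```python
from typing import Iterable, List


def get_index_levels(index: Iterable[tuple]) -> List[list]:
    """Split a multi-level index into one list per level.

    Assumes `index` yields equal-length tuples, as iterating a pandas
    MultiIndex does. Hypothetical stand-in, not the function under review.

    :param index: Entries of the multi-level index, one tuple per column.
    :return: A list with one inner list per level, in entry order.
    """
    entries = list(index)
    if not entries:
        return []
    levels: List[list] = [[] for _ in entries[0]]
    for entry in entries:
        for i, key in enumerate(entry):
            levels[i].append(key)
    return levels


columns = [("output1", "out"), ("input1", "source"), ("input3", "value")]
levels = get_index_levels(columns)
```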
    from hamilton.function_modifiers.expanders import ParameterizedExtract


    def _get_dep_type(dep_type: str):
short doc string -- was not clear that this function returns the right function modifier type.
or return type annotation.
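One way to address both comments is sketched below. It is a guess at the intent, not the function under review: `Source` and `Value` are hypothetical stand-ins for the real dependency wrappers, and returning `None` for "out" is an assumption. Only the role labels and the `ValueError` message come from the diff.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in: the argument comes from an upstream node."""

    node_name: str


@dataclass(frozen=True)
class Value:
    """Hypothetical stand-in: the argument is a literal value."""

    literal: Any


def get_dep_type(dep_type: str) -> Optional[Callable[[Any], Any]]:
    """Map a role label to the constructor that wraps a cell of that role.

    Returns None for "out" (assumed here, since output columns are not
    dependencies) and raises ValueError for any unrecognized label.
    """
    if dep_type == "out":
        return None
    if dep_type == "source":
        return Source
    if dep_type == "value":
        return Value
    raise ValueError(f"Invalid dep type: {dep_type}")


wrapped = get_dep_type("source")("inseries1a")
```

With the annotation in place, a reader can tell from the signature that the result is a callable to apply to a cell, not an already-wrapped dependency.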