Aggregate network objects #607
docs/source/4-developer.md
Outdated
@@ -90,6 +90,10 @@ The API exposes four routes:

At present, the following query types are accessible through FlowAPI:

- `aggregate_network_objects`
👍
@@ -1022,3 +1022,31 @@ def total_network_objects(
        "end_date": end_date,
        "aggregation_unit": aggregation_unit,
    }


def aggregate_network_objects(
I've been giving the params for this some thought - I actually think what would be most useful is to have it take: `network_objects`, `statistic`, `by`. That can be achieved without touching the underlying Flowmachine class, by constructing a tno object, and then calling `aggregate` on it with the `by` and `statistic` arguments.
You'll have to expand on that a bit, in terms of roughly how the call would look. I gather tno is TotalNetworkObject, and "by" is possibly the date range?! But also, I thought you were trying to do away with calls to aggregate() (#599)
As you were, I understand now what you are driving at! (in view of the review comment further down)
Arguments need to be `total_network_objects_query: dict`, `statistic` and `by: str` (or a more explanatory name than `by` would also be fine!)
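As a rough sketch of the signature requested above: the parameter names follow this comment, and the `start_date`, `end_date` and `aggregation_unit` keys come from the diff context. The `query_kind` key and the example values are assumptions for illustration, not the actual flowclient code.

```python
def total_network_objects(start_date: str, end_date: str, aggregation_unit: str) -> dict:
    """Return a query specification dict for a total_network_objects query."""
    return {
        "query_kind": "total_network_objects",  # assumed key name
        "start_date": start_date,
        "end_date": end_date,
        "aggregation_unit": aggregation_unit,
    }


def aggregate_network_objects(
    total_network_objects_query: dict, statistic: str, by: str
) -> dict:
    """Wrap an existing total_network_objects spec instead of repeating its arguments."""
    return {
        "query_kind": "aggregate_network_objects",  # assumed key name
        "total_network_objects": total_network_objects_query,
        "statistic": statistic,
        "by": by,
    }


# Example values are placeholders, not taken from the PR.
spec = aggregate_network_objects(
    total_network_objects("2016-01-01", "2016-01-07", "admin3"),
    statistic="avg",
    by="day",
)
```

Nesting the inner specification keeps the new function's parameter list short and means any later change to the `total_network_objects` arguments does not have to be mirrored here.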
@@ -203,7 +203,7 @@ def aggregate(self, by=None, statistic="avg"):
 )


-class AggregateNetworkObjects(GeoDataMixin, Query):
+class AggregateNetworkObjects(TotalNetworkObjects, Query):
What's going on here?
This relates to your comment in #601 "Refactor AggregateNetworkObjects to take TotalNetworkObjects as main argument"
Ah. I think I've failed to communicate what I meant there - the suggestion was that, rather than taking all the arguments necessary to create a `TotalNetworkObjects` object in the `__init__` method, this class should just take a `TotalNetworkObjects` object instead of those arguments.
This change actually alters the class hierarchy of the class (https://www.digitalocean.com/community/tutorials/understanding-class-inheritance-in-python-3 is a decent overview of class inheritance in python), and I imagine will cause a few issues because `TotalNetworkObjects` also inherits from `Query`, which would give you a somewhat odd inheritance hierarchy.
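To illustrate the distinction drawn in this comment, here is a minimal sketch using stand-in classes. These are not the real flowmachine classes, and the constructor arguments `start`, `stop` and `period` are assumptions. It contrasts inheriting from `TotalNetworkObjects` with holding a `TotalNetworkObjects` instance.

```python
class Query:
    """Stand-in base class; not the real flowmachine Query."""


class TotalNetworkObjects(Query):
    """Stand-in; the argument names are assumed for illustration."""

    def __init__(self, start, stop, period="day"):
        self.start = start
        self.stop = stop
        self.period = period


class InheritingAggregate(TotalNetworkObjects, Query):
    """The approach in the diff: the aggregate *is a* TotalNetworkObjects."""


class ComposingAggregate(Query):
    """The suggested approach: the aggregate *has a* TotalNetworkObjects."""

    def __init__(self, total_network_objects, statistic="avg", by=None):
        self.total_network_objects = total_network_objects
        self.statistic = statistic
        self.by = by


# Method resolution order of each variant, as class names.
inheriting_mro = [c.__name__ for c in InheritingAggregate.__mro__]
composing_mro = [c.__name__ for c in ComposingAggregate.__mro__]
print(inheriting_mro)
print(composing_mro)
```

In the inheriting variant `Query` is listed as a base twice over (directly, and through `TotalNetworkObjects`), which is the redundancy the comment calls odd. The composing variant keeps a single, flat hierarchy.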
OK I'll amend the code accordingly. (I do understand python class hierarchies, but I'll have a quick skim of that page)
Codecov Report
@@ Coverage Diff @@
## master #607 +/- ##
==========================================
+ Coverage 93.38% 93.44% +0.05%
==========================================
Files 122 123 +1
Lines 6199 6253 +54
Branches 666 668 +2
==========================================
+ Hits 5789 5843 +54
+ Misses 286 285 -1
- Partials 124 125 +1
subscriber_identifier=subscriber_identifier,
)
def __init__(self, total_objs, statistic="avg", by=None):
    self.total_objs = copy.deepcopy(total_objs)
Don't need a deepcopy here, just an assign is fine.
Below this, period will now need to refer to the period attribute of the passed in object.
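A small sketch of what these two comments ask for, using stand-in classes rather than the real flowmachine ones. Exposing `period` as a property is just one possible way to read it from the passed-in object, and is an assumption here.

```python
import copy


class TotalNetworkObjects:
    """Stand-in for the real query class; only `period` matters here."""

    def __init__(self, period="day"):
        self.period = period


class AggregateNetworkObjects:
    def __init__(self, total_objs, statistic="avg", by=None):
        # Plain assignment: keep a reference to the object that was passed in.
        self.total_objs = total_objs
        self.statistic = statistic
        self.by = by

    @property
    def period(self):
        # Read the period from the wrapped query rather than storing a copy.
        return self.total_objs.period


tno = TotalNetworkObjects(period="hour")
agg = AggregateNetworkObjects(tno)

same_object = agg.total_objs is tno            # assignment shares the object
copied_is_new = copy.deepcopy(tno) is not tno  # deepcopy would create a second one
```

Assignment is enough as long as nothing mutates the wrapped query after construction. A deep copy only adds cost and produces a second object that is equal in content but distinct in identity.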
class AggregateNetworkObjectsExposed(BaseExposedQuery):
    def __init__(self, *, start_date, end_date, aggregation_unit):
This needs to match the `AggregateNetworkObjects` form now
class AggregateNetworkObjectsSchema(Schema):

    start_date = fields.Date(required=True)
Needs to have the right fields, one of which is a bit tricky because it should refer to another schema. See FlowsSchema, or ModalLocationSchema for an example of how to do that.
flowmachine/flowmachine/core/server/query_schemas/aggregate_network_objects.py
@OwlHute FYI, PRs #609 (spatial aggregate) and #611 (joined spatial aggregate) have now been merged, so hopefully that helps with this PR as well (see Jono's inline comment). GitHub is indicating a couple of merge conflicts with this branch due to the changes in #609 and #611, so these will need to be resolved (but should be very simple).
Hi Max. I've merged the latest "master" branch with my aggregate_network_objects private branch & pushed the latter to GitHub for #607. Running the tests on my workspace, there is one failure. But I am not clear on the reason for this & could use a bit of help to diagnose & fix it. (I think the code is 99% OK and expect only a very small fix is needed, but the trick is knowing where and what!)
Thanks @OwlHute. Which tests are failing for you locally? When I click on the "Details" link next to the integration checks failure here on Github (see here) it shows two failing tests:
The first one seems to be from a missing update to the API spec. Have you been able to get the approvaltests running locally? If so, it should be just a matter of running this particular test (e.g. via …). For the second failing test, if you run it via …
Thanks for the review comment Max. I have fixed the first error. However, I am still slightly puzzled by the second issue. From the test output, the following is the query being sent to flowapi :
and referring to total_network_objects() in flowclient/flowclient/client.py that 'total_network_objects' dictionary is correct:
It also matches the parameters in the AggregateNetworkObjects class in flowmachine/flowmachine/core/server/query_schemas/total_network_objects.py :
Finally, in flowmachine/flowmachine/core/server/query_schemas/aggregate_network_objects.py the actual query is run as:
The point is "spatial_aggregate" is not present in the input query, only in the output results. I assume the latter call must introduce SpatialAggregate into the mix, but I'm not clear where and what needs amending for this. (Once I've fixed this issue, I should be able to handle any similar changes in future.) It's easy to see what the code is trying to achieve in principle, but all the same the details are damned intricate!
I've got a bit further, and identified the reason for the test failure. (A bug in the test code prevents this being displayed properly!):
The bug is that the dict returned by response.json() contains no 'payload' key, so the error was being displayed as just "KeyError: 'payload'". So the next step is to find out where that message "Aggregation unit must be specified when running a query" is generated. (For the new aggregate_network_objects() API route, it is no longer appropriate!) This turns out to be in /flowapi/flowapi/user_model.py, in the function _get_query_kinds_and_aggregation_units(), and the JSON it objects to is:
and I can see nothing wrong with that query!
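One way a test could surface the server's message instead of a bare `KeyError: 'payload'` is sketched below. The helper name `get_payload` and the `"msg"` key are hypothetical; the actual structure of the error response is not shown in this thread.

```python
def get_payload(response_json: dict) -> dict:
    """Return the payload of a parsed response, or fail with the server's message."""
    if "payload" not in response_json:
        # "msg" is a hypothetical key; fall back to the whole body if it is absent.
        detail = response_json.get("msg", response_json)
        raise AssertionError(f"Response has no 'payload' key: {detail}")
    return response_json["payload"]


# A successful response passes straight through.
ok = get_payload({"payload": {"rows": []}})

# An error response now fails with a readable message.
try:
    get_payload({"msg": "Aggregation unit must be specified when running a query"})
except AssertionError as exc:
    failure_text = str(exc)
```

With a helper like this, the failing test would have reported the "Aggregation unit must be specified" message directly, rather than hiding it behind the missing key.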
Thanks @OwlHute! Yes, you are right that the missing … I had a look and the error happens in … However, … This is a bit of a hack for the time being, but unless I'm missing something it would require larger refactorings to have a more elegant solution, and I guess this is outside the scope of this particular PR. Let me know if this explanation makes sense and is sufficient to make the required tweaks.
tot_network_objs = self.total_network_objects._flowmachine_query_obj

return AggregateNetworkObjects(
    tot_network_objs, statistic=self.statistic, by=self.by
Jono's latest changes require this to be a keyword argument now:
- tot_network_objs, statistic=self.statistic, by=self.by
+ total_network_objects=tot_network_objs, statistic=self.statistic, by=self.by
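For context, a constructor that forces keyword arguments typically uses a bare `*` in its signature, as `AggregateNetworkObjectsExposed.__init__` does elsewhere in this PR. The sketch below uses a stand-in class; whether the real `AggregateNetworkObjects` is declared exactly this way is an assumption.

```python
class AggregateNetworkObjects:
    """Stand-in with a keyword-only constructor (note the bare `*`)."""

    def __init__(self, *, total_network_objects, statistic="avg", by=None):
        self.total_network_objects = total_network_objects
        self.statistic = statistic
        self.by = by


tot_network_objs = object()  # placeholder for a TotalNetworkObjects instance

# The old positional call is rejected with a TypeError.
try:
    AggregateNetworkObjects(tot_network_objs, statistic="avg", by="day")
except TypeError as exc:
    positional_error = str(exc)

# The keyword form from the suggestion works.
agg = AggregateNetworkObjects(
    total_network_objects=tot_network_objs, statistic="avg", by="day"
)
```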
flowclient/flowclient/client.py
Outdated
    TotalNetworkObjects query result
statistic : str
    Statistic type
aggregate_by : str
- aggregate_by : str
+ aggregate_by : {"second", "minute", "hour", "day", "month", "year", "century"}
CHANGELOG.md
Outdated
@@ -5,6 +5,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]
### Added
- Added new flowclient API entrypoint, aggregate_network_objects(), to access (with simplified parameters) equivalent flowmachine query [#601](https://github.com/Flowminder/FlowKit/issues/601)
- - Added new flowclient API entrypoint, aggregate_network_objects(), to access (with simplified parameters) equivalent flowmachine query [#601](https://github.com/Flowminder/FlowKit/issues/601)
+ - Added new flowclient API entrypoint, `aggregate_network_objects`, to access equivalent flowmachine query [#601](https://github.com/Flowminder/FlowKit/issues/601)
CHANGELOG.md
Outdated
@@ -5,6 +5,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]
### Added
- Added new flowclient API entrypoint, aggregate_network_objects(), to access (with simplified parameters) equivalent flowmachine query [#601](https://github.com/Flowminder/FlowKit/issues/601)

### Changed
Can we have a changed entry here as well to reflect that the names of the arguments to `TotalNetworkObjects` and `AggregateNetworkObjects` have been altered?
CHANGELOG.md
Outdated
@@ -14,10 +15,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [0.5.3]
### Added
- Amended time aggregation parameter names used in existing `total_network_objects` query from "period" in flowclient & "by" in flowmachine to "total_by" in both.
Needs to go in the [Unreleased] section, and should mention the underlying `TotalNetworkObjects` and `AggregateNetworkObjects` classes
CHANGELOG.md
Outdated
@@ -5,6 +5,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]
### Added
- Amended time aggregation parameter names used in existing `total_network_objects` query from "period" in flowclient & "by" in flowmachine to "total_by" in both.
Sorry to be pedantic, but can this go in the changed section of [Unreleased], and can we explicitly refer to `TotalNetworkObjects` and `AggregateNetworkObjects`?
👍
The CHANGELOG Unreleased "Changed" entry now reads:
So I think in that 4th line |
Closes #601
I have:
Description
Add API route aggregate_network_objects()