API functionality revamp, text fixes, README revamp #15
Merged
Conversation
Merge UNSW-CEEM master into fork master - pocket rocket nemosis changes made
prakaa changed the title from "CLI data type inferal, FCAS 4s test fixes, minor test refactoring + updates" to "API type inferral, API caching function, test fixes" on May 22, 2021
prakaa changed the title from "API type inferral, API caching function, test fixes" to "API type inference, API caching function, test fixes" on May 22, 2021
prakaa changed the title from "API type inference, API caching function, test fixes" to "API functionality revamp, test fixes" on May 22, 2021
API revamped functionaly merge
prakaa changed the title from "API functionality revamp, test fixes" to "API functionality revamp, text fixes, README revamp" on May 23, 2021
@nick-gorman see outline of all changes above. GUI still needs to be tested (checkbox unticked)
Looks good Abi, I'll merge, compile the GUI, draft a release and publish to pypi
API functionality revamp (type inference for some API functions), test fixes, and major README changes
Initial PR made 8/5/2021. Leaving the PR open while further changes are being made; as these are incorporated into the PR, I will tick tasks off.
API (Type inference & other changes)
Initial fixes
- Added `parse_data_types`, which defaults to True for the API and is set to False in a GUI wrapper function (follows the structure of other functions wrapped for the GUI). This parses data types on reading the AEMO csv (see the sketch after this list).
- `parse_data_types` will not parse the data types when reading existing files.
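As a rough illustration of the behaviour described above, a minimal sketch, assuming the new argument keeps the name `parse_data_types` and that `./cache` is an existing raw data directory (both illustrative, not taken from this PR's diff):

```python
# Hedged sketch: fetch DISPATCHLOAD with type parsing enabled (the API default
# described in this PR); the GUI wrapper would pass parse_data_types=False instead.
from nemosis import data_fetch_methods

dispatch_load = data_fetch_methods.dynamic_data_compiler(
    "2018/01/01 00:00:00",
    "2018/01/01 23:55:00",
    "DISPATCHLOAD",
    "./cache",              # illustrative raw data/cache directory
    parse_data_types=True,  # parse AEMO CSV columns to numeric/datetime types
)
print(dispatch_load.dtypes)  # columns should no longer all be object
```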
Further functionality
- A `cache_compiler` option that has the typical cache args from `dynamic_data_compiler` built in (e.g. `keep_csv=False`, `fformat=parquet` or `fformat=feather`, and `data_merge=False`). It will infer data types when CSVs from AEMO are downloaded and read in (a usage sketch follows this list).
- In addition to `cache_compiler`, `parse_data_types` will remain but will parse data types of the DataFrame regardless of file type (i.e. parsing when a cached or new file is read, not just when a new file is read). Data from csv will always be read in as string.
- Parsing occurs after `dynamic_data_compiler` has concatenated the list of DataFrames that `_dynamic_data_fetch_loop` returns. Parsing before concatenation can lead to typed columns being reverted to object once concatenation occurs (e.g. INTERVENTION went from Int to object).
- Parsing occurs before filtering via `filter_cols` and `filter_values`. If a user provides a numeric filter value (e.g. RAISE5MIN=5), the DataFrame prior to parsing would have all columns as objects and therefore return an empty DataFrame (unless the user provides RAISE5MIN="5"). This is not expected behaviour, so parsing occurs before filtering. Datetimes can be filtered using user-provided datetime strings or datetime objects (see the filtering sketch after this list).
- The GUI wrapper uses `parse_data_types=False`, since the GUI uses string joins.
- The API defaults to `parse_data_types=True`, since operations on columns may require them to be numeric.
- Package imports added (so `from nemosis import dynamic_data_compiler` is possible).
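A sketch of how a `cache_compiler` call might look, assuming it mirrors the `dynamic_data_compiler` positional arguments (start time, end time, table, cache directory) and accepts `fformat`; check the merged code for the exact signature:

```python
# Hedged sketch: compile a typed parquet cache for DISPATCHLOAD.
# AEMO CSVs are downloaded, parsed, written to parquet, and the CSVs removed
# (keep_csv=False behaviour is built in, per the bullets above).
from nemosis import data_fetch_methods

data_fetch_methods.cache_compiler(
    "2018/01/01 00:00:00",
    "2018/01/01 23:55:00",
    "DISPATCHLOAD",
    "./cache",          # illustrative cache directory
    fformat="parquet",
)
```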
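And a sketch of the parse-before-filter behaviour, reusing the RAISE5MIN example from the bullets (the column choice and filter value are illustrative only):

```python
# Hedged sketch: because parsing now happens before filtering, a numeric
# filter value matches the numeric RAISE5MIN column rather than returning
# an empty DataFrame.
from nemosis import data_fetch_methods

raise_5min = data_fetch_methods.dynamic_data_compiler(
    "2018/01/01 00:00:00",
    "2018/01/01 23:55:00",
    "DISPATCHLOAD",
    "./cache",
    filter_cols=["RAISE5MIN"],
    filter_values=([5],),  # numeric value, no need to pass "5" as a string
)
```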
Code readability
- `dynamic_data_compiler` and `cache_compiler` will be broken out into private functions.
Readme
- `dynamic_data_compiler` section with more advanced filtering examples.
- `cache_compiler` section, with a note that it will delete csvs in a cache. However, if it detects pre-cached feather or parquet files, it will not do anything (e.g. if `cache_compiler` is run in the GUI cache, it will print that the cache has already been compiled).
Changes to tests
Other
- `data_fetch_methods.py`, `filters.py` and `test_data_fetch_methods.py` styled (flake8)
Testing
- Test suite run to ensure newer changes to `data_fetch_methods` work. Report for tests: Test Report.pdf
- f963eb0 passed
- New changes tested for API (spot checks) with a fresh install of Python on Ubuntu 20.04:
  - `dynamic_data_compiler` downloads the DISPATCHLOAD csv and writes a feather file. The returned DataFrame is typed (which should happen for API users), but the saved feather had columns as objects/strings.
  - `cache_compiler` writes parquet/feather for DISPATCHLOAD and deletes the csv in the cache. The remaining file is typed. Different compression engines were passed to the write function and this worked. The file was then reloaded using `dynamic_data_compiler` and this worked, with a typed DataFrame loaded (a reload sketch appears at the end of this section).
- Quick performance test: `%timeit data_fetch_methods.dynamic_data_compiler("2018/01/01 00:00:00", "2018/01/01 23:55:00", "DISPATCHLOAD", './alt_data')` with a precompiled feather cache
- New changes tested for GUI (spot checks)
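For reference, the cache reload spot check described above could look roughly like this (a sketch, assuming the cache at `./cache` was built by `cache_compiler` with `fformat="parquet"`; neither the path nor the format is taken from the PR's diff):

```python
# Hedged sketch: reload the compiled cache and confirm the returned DataFrame
# is typed (i.e. columns are not all object/string).
from nemosis import data_fetch_methods

dispatch_load = data_fetch_methods.dynamic_data_compiler(
    "2018/01/01 00:00:00",
    "2018/01/01 23:55:00",
    "DISPATCHLOAD",
    "./cache",
    fformat="parquet",
)
print(dispatch_load.dtypes)
```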