Fix #459: In CI, polars error: col timestamp_right already exists #460

Merged
merged 9 commits into from
Dec 18, 2023

Conversation

@trentmc trentmc commented Dec 17, 2023

No description provided.

@trentmc trentmc merged commit bbfef0d into yaml-cli2 Dec 18, 2023
5 checks passed
@trentmc trentmc deleted the issue459 branch December 18, 2023 14:32
@@ -120,6 +119,9 @@ def create_xy(
assert X.shape[1] == ss.n
assert isinstance(x_df, pd.DataFrame)

assert "timestamp" not in x_df.columns
Member

I'm assuming no timestamp because now it's the pd index

Member Author

Yes

    merged_df = newraw_df
else:
    merged_df = merged_df.join(newraw_df, on="timestamp", how="outer")
    merged_df = merge_cols(merged_df, "timestamp", "timestamp_right")
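
For context: depending on the polars version, an outer join can leave a second key column (here timestamp_right) that carries the key for rows found only in the right-hand frame. The sketch below shows one way such a pair of columns might be coalesced. It is plain Python over a dict of lists, and coalesce_cols is a hypothetical stand-in, not the merge_cols from this PR, whose implementation is not shown on this page.

```python
from typing import Dict, List, Optional

Columns = Dict[str, List[Optional[int]]]


def coalesce_cols(cols: Columns, keep: str, drop: str) -> Columns:
    """Fill gaps in `keep` with values from `drop`, then remove `drop`."""
    merged = [a if a is not None else b for a, b in zip(cols[keep], cols[drop])]
    out = {name: vals for name, vals in cols.items() if name != drop}
    out[keep] = merged
    return out


# Hypothetical outer-join result: the last row existed only on the right side,
# so its key landed in "timestamp_right" instead of "timestamp".
joined: Columns = {
    "timestamp": [1000, 2000, None],
    "timestamp_right": [1000, None, 3000],
}
result = coalesce_cols(joined, "timestamp", "timestamp_right")
```

After the call, only a single timestamp column remains, with every row's key filled in.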
Member

@idiom-bytes idiom-bytes Dec 21, 2023

I believe we are getting _right columns because of the outer join. Generally, I don't like using outer joins, as they can introduce unexpected data.

As long as we use left joins on expected timestamps, this should not happen. This can be done in two ways:

  1. With a dataset you trust (e.g. kaiko, which may have guaranteed timestamps)
  2. With a dataset you build (e.g. just populate 1 column w/ expected values)

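Example:

import pandas as pd
import polars as pl

# Create a range of timestamps with 5-minute intervals
timestamps = pd.date_range(start="2023-01-01", periods=100, freq="5min")
# Convert nanoseconds since the epoch to Unix timestamps in seconds
timestamps = timestamps.astype("int64") // 10**9

# Create a Polars DataFrame that holds only the expected timestamps
df = pl.DataFrame({
    "timestamp": timestamps.to_numpy()
})

# Seed the merged frame with the expected timestamps, then left-join the raw data onto it
merged_df = df
merged_df = merged_df.join(newraw_df, on="timestamp", how="left")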

This avoids:

  • situations where you introduce bad records into the data structure due to outer join
  • situations where you get _right/_left columns due to outer join
  • the need to run merge_cols() to drop _left/_right

This also enforces the strict timestamp records you expect in your data structure: a record either joins with an expected timestamp, or it is dropped.
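
To make the difference concrete, here is a minimal sketch in plain Python (no polars, so the join semantics are spelled out by hand). The helper names expected_grid, left_join_on_grid and outer_join, the "close" column, and the sample values are all assumptions for illustration, not code from pdr-backend.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def expected_grid(start_s: int, n: int, step_s: int = 300) -> List[int]:
    """Build the trusted timestamp column: n Unix timestamps, step_s apart."""
    return [start_s + i * step_s for i in range(n)]


def left_join_on_grid(grid: List[int], raw: Dict[int, float]) -> List[Row]:
    """Keep exactly one row per expected timestamp; missing data becomes None."""
    return [{"timestamp": t, "close": raw.get(t)} for t in grid]


def outer_join(grid: List[int], raw: Dict[int, float]) -> List[Row]:
    """Keep every timestamp seen on either side, including unexpected ones."""
    all_ts = sorted(set(grid) | set(raw))
    return [{"timestamp": t, "close": raw.get(t)} for t in all_ts]


grid = expected_grid(start_s=1672531200, n=3)  # starts 2023-01-01 00:00:00 UTC
# Hypothetical raw data: the 00:05 candle is missing, and one record
# sits off the 5-minute grid (17 seconds past 00:00).
raw = {1672531200: 16500.0, 1672531217: 16501.0, 1672531800: 16510.0}

left = left_join_on_grid(grid, raw)
outer = outer_join(grid, raw)
```

In this sketch the left join yields exactly three rows and drops the off-grid record, while the outer join yields four. The left join still leaves a None in the missing 00:05 slot, so gaps have to be handled downstream either way.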

Member Author

FYI, I was building off the outer join code that you had introduced.

We can't just use "data we trust". There are always going to be missing values etc., and we need to account for them. E.g. most people using pdr-backend will not be buying Kaiko data.

The example improvement you gave is a good idea. If you want to make it happen, I recommend opening a new GitHub issue.

trizin added a commit that referenced this pull request Jan 16, 2024
* First stab at porting various functions over to polars... lots to go

* TOHLCV df initialization and type checking added. 2/8 pdutil tests passing

* black formatted

* Fixing initialization and improving test. datetime is not generated if no timestamp is present

* Restructured pdutil a bit to reduce DRY and utilize schema more strictly.

* test initializing the df and datetime

* improve init test to show exception without timestamp

* fixing test_concat such that it verifies that schemas must match, and how transform handles datetime

* saving parquet enforces datetime and transform. updated test_load_append and test_load_filtered.

* black formatted

* data_eng tests are passing

* initial data_eng tests are passing w/ black, mypy, and pylint.

* _merge_parquet_dfs updated and create_xy test_1 is passing. all data_eng tests that are enabled are passing.

* 2exch_2coins_2signals is passing

* Added polars support for fill_nans, has_nans, and create_xy__handle_nan is passing.

* Starting to deprecate references to pandas and csv in data_factory.

* Black formatted

* Deprecated csv logic in DataFactory and created tests around get_hist_df() to verify that its working as intended. I believe kraken data is returning null at the moment.

* All tests should be passing.

* Fix #370: YAML & CLI (#371)

* Towards #232: Refactoring towards ppss.yaml part 3/3
* move everything in model_eng/ to data_eng/
* Fix #352: [SW eng] High DRY violation in test_predictoor_agent.py <> test_predictoor_agent3.py
* Deprecate backend-dev.md (long obsolete), macos.md (obsolete due to vps), and envvars.md (obsolete because of ppss.yaml).
* Rename BaseConfig to web3_pp.py and make it yaml-based
* Move scripts into util/, incorporate them into pdr cli, some refactoring.
* revamp READMEs for cli. And, tighten up text for getting OCEAN & ROSE
* Deprecated ADDRESS_FILE and RPC_URL envvars.
* deprecate Predictoor approach 2. Pita to maintain 


Co-authored-by: trizin <[email protected]>

* Update CI to use pdr instead of scripts/ (#399)

* Update check script CI

* Update cron topup

* Workflow dispatch

* Nevermind, revert previous commit

* Run on push to test

* Pass ppss.web3_pp instead of web3_config

* Don't run on push

* Replace long try/except with _safe*() function; rename pdutil -> plutil; get linters to pass

* Update entrypoint script to use pdr cli (#406)

* Add main.py back (#404)

* Add main.py back

* Black

* Linter

* Linter

* Remove "switch back to version v0.1.1"

* Black

* make black happy

* small bug fix

* many bug fixes. Still >=1 left

* fix warning

* Add support for polars where needed

* tweak docstring

* Fix #408: test_sim_engine failing in yaml-cli2, bc hist_df is s not ms. Proper testing and documentation was added, as part of the fix

* BaseContract tests that Web3PP type is input

* goes with previous commit

* tweak - lowercase

* Bug fix - fix failing tests

* Remove unwanted file

* (a) better organize ppss.yaml for usability (b) ensure user isn't annoyed by git with their copy of ppss.yaml being my_ppss.yaml

* add a more precise test for modeling

* make black happy

* Small refactor: make transform_df() part of helper routine

* Fix #414: Split data_factory into (1) CEX -> parquet -> df (2) df -> X,y for models

* Fix #415: test_cli_do_dfbuyer.py is hanging #415

* test create_xy() even more. Clarify the order of timestamps

* Add a model-building test, using data shaped like data from test_model_data_factory

* Fix #416: [YAML branch] No Feeds Found - data_pp.py changes pair standards

* For barge#391: update to *not* use barge's predictoor branch

* Update vps.md: nicer order of operations

* For #417, #418 in yaml-cli2 branch. publisher TUSD -> USDT

* remove default_network from ppss.yaml (obsolete)

* Fix #427 - time now

* Fix #428: test_get_hist_df - FileNotFoundError. Includes lots of extra robustness checking

* remove dependency that we don't need, which caused problems

* Fix #421: Add cli + logic to calculate and plot traction metrics (PR #422)

Also: mild cleanup of CLI.

* bug fix: YAML_FILE

* fix breaking test; clean it up too

* add barge-calls.md

* Fix #433. Calculate metrics and draw plots for epoch-based stats (PR #434)

#433 : "Plot daily global (pair_timeframe x20) <average predictoors> and <average stake>, by sampling slots from each day."

* Tweak barge-calls.md

How: show origin of NETWORK_RPC_URL

* Tweak barge-calls.md: more compactly show RPC_URL calc

* update stake_token

* bug fix

* Update release-process.md: bug fix

* Tweak barge-calls.md

* Tune #405 (PR #406): Update entrypointsh script to use pdr CLI

* Update vps.md: docker doesn't need to prompt to delete

* Update vps.md: add docker-stop instrs

* allow CLI to have NETWORK_OVERRIDE, for more flexibility from barge

* fix pylint issue

* Update barge-calls.md: link to barge.md

* Update release-process.md: fix typo

* touch

* Update vps.md: more instrs around waiting for barge to be ready

* add unit tests for cli_module

* Towards #437: [YAML] Publisher error 'You must set RPC_URL environment variable'

* Bug fixes

* refactor tweaks to predictoor and trader

* Clean up some envvar stuff. Document ppss vars better.

* publish_assets.py now supports barge-pytest and barge-predictoor-bot

* bug fix

* bug fix the previous 'bug fix'

* Clean up how dfbuyer/predictoor/trader agents get feeds: web3_pp.query_feed_contracts() -> data_pp.filter_feeds(); no more filtering within subgraph querying; easier printing & logging. Add timeframestr.Timeframe. Add feed.mock_feed. All tests pass.

* fix breaking subgraph tests. Still breakage in trader & dfbuyer (that's next)

* Fix failing tests in trader, dfbuyer. And greatly speed up the tests, via better mocking.

* Fix bugs for failing tests of https://github.com/oceanprotocol/pdr-backend/actions/runs/7156603163/job/19486494815

* fix tmpdir bug

* Fix (hopefully) failing unit test - restricted region in querying binance api

* consolidate gas_price setting, make it consistent; set gas_price to 0 for development/barge

* fix linter complaints

* Fix remaining failing unit tests for predictoor_batcher

* Finish the consolidation of gas pricing. All tests pass

* Update vps.md: add debugging info

- Where to find queries
- Key docker debugging commands

* add to/from wei utility. Copied from ocean.py

* tweak docs in conftest_ganache

* tweaks from black for wei

* Make fixed_rate.py and its test easier to understand via better var naming & docs

* Make predictoor_contract.py easier to understand via better var naming & docs

* test fixed_rate calcBaseInGivenOutDT

* Refactor predictoor_contract: push utility methods out of the class, and into more appropriate utility modules. And, move to/from_wei() from wei.py to mathutil.py. Test it all.

* Tweak docstrings for fixed_rate.py

* Improve DX: show dev what the parameters are. Improve UX: print when done.

* Improve DX & UX for predictoor_contract

* Tweak UX (prints)

* Update vps.md: export PATH

* Logging for predictoor is way better: more calm yet more informative. Predictoors only do 1 feed now.

* TraderAgent -> BaseTraderAgent

* Rename parquet_dfs -> rawohlcv_dfs; hist_df -> mergedohlcv_df; update related

* apply black to test_plutil.py

* apply black to test_model_data_factory.py

* apply black to ohlcv_data_factory.py

* refactor test_ohlcv_data_factory: cleanup mocks; remove redundant test; pq_data_factory -> factory

* Fix #443: [YAML] yaml timescale is 5m, yet predictoor logs s_per_epoch=3600 (1h)

* Update feed str() to give full address; and order to be similar to predict_feeds_strs. Show all info used in filtering feeds.

* Small bug fix: not printing properly

* Tweak: logging in predictoor_contract.py

* Tweak: logging in trueval_agent_single.py

* Two bug fixes: pass in web3_pp not web3_config to PredictoorContract constructor

* enhance typechecking

* tweak payout.py: make args passed more obvious

* fix broken unit test

* make black happy

* fix breaking unit test

* Tweak predictoor_contract DX & UX

* Improve trueval: Have fewer layers of try/except, better DX via docstrings and more, better UX via logs

* Rename TruevalAgentBase -> BaseTruevalAgent

* (a) Fix #445: merge 3 trueval agent files into 1. (b) Fix #448 contract.py::get_address() doesn't handle 'sapphire-testnet' etc #448. (c) test_contract.py doesn't test across all networks we use, and it's missing places where we have errors (d) clean up trueval agent testing (e) move test_contract.py into test_noganache since it doesn't use ganache

* Fix #450: test_contract_main[barge-pytest] fails

* renaming pq_data_factory to ohlcv_data_factory

* Removing all TODOs

* Fix #452: Add clean code guidelines README

* removing dangling _ppss() inside predictoor_agent_runner.py

* Fixing linter

* Fix #454: Refactor: Rename MEXCOrder -> MexcOrder, ERC721Factory

* Fix #455: Cyclic import issue

* Fix #454 redux: the first commit introduced a bug, this one fixes the bug

* Fix #436 - Implement GQL data factory (PR #438)

* First pass on gql data factory

Co-authored-by: trentmc <[email protected]>

* Fix #350: [Sim] Tweaks to plot title

* make black happy

* Fix #446: [YAML] Rename/move files & dirs for proper separation among lake, AI models, analytics (#458)

* rename data_eng/ -> lake/
* rename model_factory -> aimodel_factory, model_data_factory -> aimodel_data_factory, model_ss -> aimodel_ss
* for any module in util/ that should be in analytics/, move it
* for any module in util/ that should be in lake/, move it. Including get_*_info.py
* create dir subgraph/ and move all *subgraph*.py into it. Split apart subgraph.py into core_subgraph.py and more
* apply mathutil.to_wei() and from_wei() everywhere
* move contents of util/test_data.py (a bunch of sample predictions) into models/predictions.py. Fix DRY violations in related conftest.pys
* note: there are 2 failing unit tests related to polars and "timestamp_right" column. However they were failing before. I just created a separate issue for that: #459

* Fix #459: In CI, polars error: col timestamp_right already exists (#460)

Plus:
* remove datetime from all DFs, it's too problematic, and unneeded
* bug fix: wasn't mocking check_dfbuyer(), so CI was failing

* Fix #397: Remove need to specify 'stake_token' in ppss.yaml (#461)

* Docs fixes (#456)

* Make Feeds objects instead of tuples. (#464)

* Make Feeds objects instead of tuples.
* Add namings for different feed objects.
* Move signal at the end.

* Move and rename utils (#467)

* Move and rename utils

* Objectify pairstr. (#470)

* Objectify pairstr.
* Add possibility for empty signal in feeds.
* Move and add some timeframe functions.
* Move exchangestr.

* Towards #462: Separate lake and aimodel SS, lake command (#473)

* Split aimodel and lake ss.
* Split data ss tests.
* Add aimodel ss into predictoor ss.
* Remove stray data_ss.
* Moves test_n to sim ss.
* Trader ss to use own feed instead of data pp.
* Remove data pp entirely.
* Correct ohlcv data factory.
* Add timeframe into arg feeds.
* Refine and add tests for timeframe in arg feed.
* Remove timeframe dependency in trader and predictoor.
* Remove timeframe from lake ss keys.
* Singleify trader agents.
* Adds lake command, assert timeframe in lake (needed for columns).
* Process all signals in lake.

* [Lake] integrate pdr_subscriptions into GQL Data Factory (#469)

* first commit for subscriptions

* hook up pdr_subscriptions to gql_factory

* Tests passing, expanding tests to support multiple tables

* Adding tests and improving handling of empty parquet files

* Subscriptions test

* Updating logic to use predictSubscriptions, take lastPriceValue, and to not query the subgraph more than needed.

* Moving models from contract/ -> subgraph/

* Fixing pylint

* fixing tests

* adding @enforce_types

* Improve DRY (#475)

* Improve DRY in cli module.
* Add common functionality to single and multifeed entries.
* Remove trader pp and move necessary lines into trader ss.
* Adds dfbuyer filtering.
* Remove exchange dict from multifeed mixin.
* Replace name of predict_feed.
* Add base_ss tests.
* Adds trueval filtering.

* Add Code climate. (#484)

* Adds manual trigger to pytest workflow.

* issue483: move the logic from subgraph_slot.py (#489)

* Add some test coverage (#488)

* Adds a line of coverage to test.
* Add coverage for csvs module.
* Add coverage to check_network.
* Add coverage to predictions and traction info.
* Adds coverage to predictoor stats.
* Adds full coverage to arg cli classes.
* Adds cli arguments coverage and fix a wrong parameter in cli arguments.
* Adds coverage to cli module and timeframe.
* Some reformats and coverage in contract module.
* Adds coverage and simplifications to contracts, except token.
* Add some coverage to tokens to complete contract coverage work.

* Fix #501: ModuleNotFoundError: No module named 'flask' (PR #504)

* Fix #509: Refactor test_update_rawohlcv_files (PR #508)

* Fix #505: polars.exceptions.ComputeError: datatypes of join keys don't match (PR #510)

* Refactor: new function clean_raw_ohlcv() that moves code from _update_rawohlcv_files_at_feed(). It has sub-functions with precise responsibilities. It has tests.
* Add more tests for merge_raw_ohlcv_dfs, including one that replicates the original issue
* Fix the core bug, now the new tests pass. The main fix is at the top of merge_df::_add_df_col()
* Fix failing test due to network override. NOTE: this may have caused the remaining pytest error. Will fix that after this merge

* Fix #517: aimodel_data_factory.py missing data: binance:BTC/USDT:None (PR #518)

Fixes #517

Root cause: ppss.yaml's aimodel_ss feeds section didn't have eg "c" or "ohlcv"; it assumed that they didn't need to be specified. This was an incorrect assumption: aimodel_ss needs it. In fact aimodel_ss class supports these signals, but the yaml file didn't have it.

What this PR does:
- add a test to aimodel_ss class constructor that complains if not specified
- do specify signals in the ppss.yaml file

Note: the PR pytest failed, but for an unrelated reason. Just created #520 for follow-up.

* Towards #494: Improve coverage 2 (#498)

* Adds some coverage to dfbuyer agent.
* Add dfbuyer and ppss coverage.
* Adds predictoor and sim coverage.
* Adds coverage to util.
* Add some trueval coverage.
* Adds coverage to trader agents.
* Add coverage to portfolio.
* Add coverage to subgraph consume_so_far and fix an infinite loop bug.
* More subgraph coverage.

* Fix #519: aimodel_data_factory.py missing data col: binance:ETH/USDT:close (#524)

Fix #519

Changes:
- do check for dependencies among various ppss ss feeds
- if any of those checks fails, give a user-friendly error message
  - greatly improved printing of ArgFeeds, including merging across pairs and signals. This was half the change of this PR
- appropriate unit tests

* Replace `dftool` with `pdr` (#522)

* Print texts: dftool -> pdrcli

* pdrcli -> pdr

* Fix #525: Plots pop up unwanted in tests. (PR #528)

Fix by mocking plt.show().

* Issue 519 feed dependencies (#529)

* Make missing attributes message more friendly and integrate ai ss part to multimixin.

* Update to #519: remove do_verify, it's redundant (#532)

* Fix #507: fix asyncio issues (PR #531)

How fixed: use previous asyncio version.

Calina: Asyncio has some known issues, per their changelog. Namely issues with fixture handling etc., which I believe causes the warnings and test skips in our runs. They recommend using the previous version until they are fixed. It is also why my setup didn't spew up any warnings, my asyncio version was 21.1.

https://pytest-asyncio.readthedocs.io/en/latest/reference/changelog.html

* #413 - YAML thorough system level tests (#527)

* Fix web3_config.rpc_url in test_send_encrypted_tx

* Add conftest.py for system tests

* Add system test for get_traction_info

* Add system test for get_predictions_info

* Add system test for get_predictoors_info

* Add "PDRS" argument to _ArgParser_ST_END_PQDIR_NETWORK_PPSS_PDRS class

* Fix feed.exchange type conversion in publish_assets.py

* Add print statement for payout completion

* Add system level test for pdr topup

* Add conditional break for testing via env

* Add conditional break for testing via env

* Black

* Add test for pdr rose payout system

* System level test pdr check network

* System level test pdr claim OCEAN

* System level test pdr trueval agent

* Remove unused patchs

* Fix wrong import position in conftest.py

* Remove unused imports

* System level test for pdr dfbuyer

* System level tests for pdr trader

* System level tests for publisher

* Rename publisher test file

* Add conditional break in take_step() method

* Update dftool->pdr names in system tests

* Refactor test_trader_agent_system.py

* Add mock fixtures for SubgraphFeed and PredictoorContract

* Add system tests for predictoor

* Black

* Refactor system test files - linter fixes

* Linter fixes

* Black

* Add missing mock

* Add savefig assertion in test_topup

* Update VPS configuration to use development entry

* Patch verify_feed_dependencies

* Refactor test_predictoor_system.py to use a common test function

* Refactor trader approach tests to improve DRY

* Black

* Indent

* Ditch NETWORK_OVERRIDE

* Black

* Remove unused imports

* Adds incremental waiting for subgraph tries. (#534)

* Add publisher feeds filtering. (#533)

* Add publisher feeds filtering.

* Pass the ppss.web3_pp instead of web3_config into WrappedToken class (#537)

* Fix #542: Add code climate usage to developer flow READMEs

* #538 - check network main subgraph query fails (#539)

* Use current time in seconds utc

* Remove unused import

* Fix the check_network test

* Divide current_ut by 1000

* test_check_network_without_mock, WIP

* Add missing import

* Implement current_ut_s

* Use current_ut_s

* Update tests

* Formatting

* Return int

* Remove balanceOf assert

* Remove unused import

* current_ut -> current_ut_ms

* #540 - YAML CLI topup and check network actions require address file (#541)

* GH workflow: Fetch the address file and move it to contracts directory

* Fetch and move the address file to address dir

* Remove predictoor2 ref from pytest

---------

Co-authored-by: idiom-bytes <[email protected]>
Co-authored-by: trizin <[email protected]>
Co-authored-by: Idiom <[email protected]>
Co-authored-by: Călina Cenan <[email protected]>
Co-authored-by: Mustafa Tunçay <[email protected]>
idiom-bytes added a commit that referenced this pull request Jan 19, 2024
* First stab at porting various functions over to polars... lots to go

* TOHLCV df initialization and type checking added. 2/8 pdutil tests passing

* black formatted

* Fixing initialization and improving test. datetime is not generated if no timestamp is present

* Restructured pdutil a bit to reduce DRY and utilize schema more strictly.

* test initializing the df and datetime

* improve init test to show exception without timestamp

* fixing test_concat such that it verifies that schemas must match, and how transform handles datetime

* saving parquet enforces datetime and transform. updated test_load_append and test_load_filtered.

* black formatted

* data_eng tests are passing

* initial data_eng tests are passing w/ black, mypy, and pylint.

* _merge_parquet_dfs updated and create_xy test_1 is passing. all data_eng tests that are enabled are passing.

* 2exch_2coins_2signals is passing

* Added polars support for fill_nans, has_nans, and create_xy__handle_nan is passing.

* Starting to deprecate references to pandas and csv in data_factory.

* Black formatted

* Deprecated csv logic in DataFactory and created tests around get_hist_df() to verify that its working as intended. I believe kraken data is returning null at the moment.

* All tests should be passing.

* Fix #370: YAML & CLI (#371)

* Towards #232: Refactoring towards ppss.yaml part 3/3
* move everything in model_eng/ to data_eng/
* Fix #352: [SW eng] High DRY violation in test_predictoor_agent.py <> test_predictoor_agent3.py
* Deprecate backend-dev.md (long obsolete), macos.md (obsolete due to vps), and envvars.md (obsolete because of ppss.yaml).
* Rename BaseConfig to web3_pp.py and make it yaml-based
* Move scripts into util/, incorporate them into pdr cli, some refactoring.
* revamp READMEs for cli. And, tighten up text for getting OCEAN & ROSE
* Deprecated ADDRESS_FILE and RPC_URL envvars.
* deprecate Predictoor approach 2. Pita to maintain 


Co-authored-by: trizin <[email protected]>

* Update CI to use pdr instead of scripts/ (#399)

* Update check script CI

* Update cron topup

* Workflow dispatch

* Nevermind, revert previous commit

* Run on push to test

* Pass ppss.web3_pp instead of web3_config

* Don't run on push

* Replace long try/except with _safe*() function; rename pdutil -> plutil; get linters to pass

* Update entrypoint script to use pdr cli (#406)

* Add main.py back (#404)

* Add main.py back

* Black

* Linter

* Linter

* Remove "switch back to version v0.1.1"

* Black

* make black happy

* small bug fix

* many bug fixes. Still >=1 left

* fix warning

* Add support for polars where needed

* tweak docstring

* Fix #408: test_sim_engine failing in yaml-cli2, bc hist_df is s not ms. Proper testing and documentation was added, as part of the fix

* BaseContract tests that Web3PP type is input

* goes with previous commit

* tweak - lowercase

* Bug fix - fix failing tests

* Remove unwanted file

* (a) better organize ppss.yaml for usability (b) ensure user isn't annoyed by git with their copy of ppss.yaml being my_ppss.yaml

* add a more precise test for modeling

* make black happy

* Small refactor: make transform_df() part of helper routine

* Fix #414: Split data_factory into (1) CEX -> parquet -> df (2) df -> X,y for models

* Fix #415: test_cli_do_dfbuyer.py is hanging #415

* test create_xy() even more. Clarify the order of timestamps

* Add a model-building test, using data shaped like data from test_model_data_factory

* Fix #416: [YAML branch] No Feeds Found - data_pp.py changes pair standards

* For barge#391: update to *not* use barge's predictoor branch

* Update vps.md: nicer order of operations

* For #417, #418 in yaml-cli2 branch. publisher TUSD -> USDT

* remove default_network from ppss.yaml (obsolete)

* Fix #427 - time now

* Fix #428: test_get_hist_df - FileNotFoundError. Includes lots of extra robustness checking

* remove dependency that we don't need, which caused problems

* Fix #421: Add cli + logic to calculate and plot traction metrics (PR #422)

Also: mild cleanup of CLI.

* bug fix: YAML_FILE

* fix breaking test; clean it up too

* add barge-calls.md

* Fix #433. Calculate metrics and draw plots for epoch-based stats (PR #434)

#433 : "Plot daily global (pair_timeframe x20) <average predictoors> and <average stake>, by sampling slots from each day."

* Tweak barge-calls.md

How: show origin of NETWORK_RPC_URL

* Tweak barge-calls.md: more compactly show RPC_URL calc

* update stake_token

* bug fix

* Update release-process.md: bug fix

* Tweak barge-calls.md

* Tune #405 (PR #406): Update entrypointsh script to use pdr CLI

* Update vps.md: docker doesn't need to prompt to delete

* Update vps.md: add docker-stop instrs

* allow CLI to have NETWORK_OVERRIDE, for more flexibility from barge

* fix pylint issue

* Update barge-calls.md: link to barge.md

* Update release-process.md: fix typo

* touch

* Update vps.md: more instrs around waiting for barge to be ready

* add unit tests for cli_module

* Towards #437: [YAML] Publisher error 'You must set RPC_URL environment variable'

* Bug fixes

* refactor tweaks to predictoor and trader

* Clean up some envvar stuff. Document ppss vars better.

* publish_assets.py now supports barge-pytest and barge-predictoor-bot

* bug fix

* bug fix the previous 'bug fix'

* Clean up how dfbuyer/predictoor/trader agents get feeds: web3_pp.query_feed_contracts() -> data_pp.filter_feeds(); no more filtering within subgraph querying; easier printing & logging. Add timeframestr.Timeframe. Add feed.mock_feed. All tests pass.

* fix breaking subgraph tests. Still breakage in trader & dfbuyer (that's next)

* Fix failing tests in trader, dfbuyer. And greatly speed up the tests, via better mocking.

* Fix bugs for failing tests of https://github.com/oceanprotocol/pdr-backend/actions/runs/7156603163/job/19486494815

* fix tmpdir bug

* Fix (hopefully) failing unit test - restricted region in querying binance api

* consolidate gas_price setting, make it consistent; set gas_price to 0 for development/barge

* fix linter complaints

* Fix remaining failing unit tests for predictoor_batcher

* Finish the consolidation of gas pricing. All tests pass

* Update vps.md: add debugging info

- Where to find queries
- Key docker debugging commands

* add to/from wei utility. Copied from ocean.py

* tweak docs in conftest_ganache

* tweaks from black for wei

* Make fixed_rate.py and its test easier to understand via better var naming & docs

* Make predictoor_contract.py easier to understand via better var naming & docs

* test fixed_rate calcBaseInGivenOutDT

* Refactor predictoor_contract: push utility methods out of the class, and into more appropriate utility modules. And, move to/from_wei() from wei.py to mathutil.py. Test it all.

* Tweak docstrings for fixed_rate.py

* Improve DX: show dev what the parameters are. Improve UX: print when done.

* Improve DX & UX for predictoor_contract

* Tweak UX (prints)

* Update vps.md: export PATH

* Logging for predictoor is way better: more calm yet more informative. Predictoors only do 1 feed now.

* TraderAgent -> BaseTraderAgent

* Rename parquet_dfs -> rawohlcv_dfs; hist_df -> mergedohlcv_df; update related

* apply black to test_plutil.py

* apply black to test_model_data_factory.py

* apply black to ohlcv_data_factory.py

* refactor test_ohlcv_data_factory: cleanup mocks; remove redundant test; pq_data_factory -> factory

* Fix #443: [YAML] yaml timescale is 5m, yet predictoor logs s_per_epoch=3600 (1h)

* Update feed str() to give full address; and order to be similar to predict_feeds_strs. Show all info used in filtering feeds.

* Small bug fix: not printing properly

* Tweak: logging in predictoor_contract.py

* Tweak: logging in trueval_agent_single.py

* Two bug fixes: pass in web3_pp not web3_config to PredictoorContract constructor

* enhance typechecking

* tweak payout.py: make args passed more obvious

* fix broken unit test

* make black happy

* fix breaking unit test

* Tweak predictoor_contract DX & UX

* Improve trueval: Have fewer layers of try/except, better DX via docstrings and more, better UX via logs

* Rename TruevalAgentBase -> BaseTruevalAgent

* (a) Fix #445: merge 3 trueval agent files into 1. (b) Fix #448 contract.py::get_address() doesn't handle 'sapphire-testnet' etc #448. (c) test_contract.py doesn't test across all networks we use, and it's missing places where we have errors (d) clean up trueval agent testing (e) move test_contract.py into test_noganache since it doesn't use ganache

* Fix #450: test_contract_main[barge-pytest] fails

* renaming pq_data_factory to ohlcv_data_factory

* Removing all TODOs

* Fix #452: Add clean code guidelines README

* removing dangling _ppss() inside predictoor_agent_runner.py

* Fixing linter

* Fix #454: Refactor: Rename MEXCOrder -> MexcOrder, ERC721Factory

* Fix #455: Cyclic import issue

* Fix #454 redux: the first commit introduced a bug, this one fixes the bug

* Fix #436 - Implement GQL data factory (PR #438)

* First pass on gql data factory

Co-authored-by: trentmc <[email protected]>

* Fix #350: [Sim] Tweaks to plot title

* make black happy

* Fix #446: [YAML] Rename/move files & dirs for proper separation among lake, AI models, analytics (#458)

* rename data_eng/ -> lake/
* rename model_factory -> aimodel_factory, model_data_factory -> aimodel_data_factory, model_ss -> aimodel_ss
* for any module in util/ that should be in analytics/, move it
* for any module in util/ that should be in lake/, move it. Including get_*_info.py
* create dir subgraph/ and move all *subgraph*.py into it. Split apart subgraph.py into core_subgraph.py and more
* apply mathutil.to_wei() and from_wei() everywhere
* move contents of util/test_data.py (a bunch of sample predictions) into models/predictions.py. Fix DRY violations in related conftest.pys
* note: there are 2 failing unit tests related to polars and "timestamp_right" column. However they were failing before. I just created a separate issue for that: #459

* Fix #459: In CI, polars error: col timestamp_right already exists (#460)

Plus:
* remove datetime from all DFs, it's too problematic, and unneeded
* bug fix: wasn't mocking check_dfbuyer(), so CI was failing

* Fix #397: Remove need to specify 'stake_token' in ppss.yaml (#461)

* Docs fixes (#456)

* transform data into polar data

* refactored predictoor summary stats function to use polar operations

* update feed summary function

* Make Feeds objects instead of tuples. (#464)

* Make Feeds objects instead of tuples.
* Add namings for different feed objects.
* Move signal at the end.

* Move and rename utils (#467)

* Move and rename utils

* Objectify pairstr. (#470)

* Objectify pairstr.
* Add possibility for empty signal in feeds.
* Move and add some timeframe functions.
* Move exchangestr.

* Towards #462: Separate lake and aimodel SS, lake command (#473)

* Split aimodel and lake ss.
* Split data ss tests.
* Add aimodel ss into predictoor ss.
* Remove stray data_ss.
* Moves test_n to sim ss.
* Trader ss to use own feed instead of data pp.
* Remove data pp entirely.
* Correct ohlcv data factory.
* Add timeframe into arg feeds.
* Refine and add tests for timeframe in arg feed.
* Remove timeframe dependency in trader and predictoor.
* Remove timeframe from lake ss keys.
* Singleify trader agents.
* Adds lake command, assert timeframe in lake (needed for columns).
* Process all signals in lake.

* group data by timeframe also

* fix filtering and code formatting issues

* run black to format code

* removed duplicated imports

* [Lake] integrate pdr_subscriptions into GQL Data Factory (#469)

* first commit for subscriptions

* hook up pdr_subscriptions to gql_factory

* Tests passing, expanding tests to support multiple tables

* Adding tests and improving handling of empty parquet files

* Subscriptions test

* Updating logic to use predictSubscriptions, take lastPriceValue, and to not query the subgraph more than needed.

* Moving models from contract/ -> subgraph/

* Fixing pylint

* fixing tests

* adding @enforce_types

* Improve DRY (#475)

* Improve DRY in cli module.
* Add common functionality to single and multifeed entries.
* Remove trader pp and move necessary lines into trader ss.
* Adds dfbuyer filtering.
* Remove exchange dict from multifeed mixin.
* Replace name of predict_feed.
* Add base_ss tests.
* Adds trueval filtering.

* Add Code climate. (#484)

* Adds manual trigger to pytest workflow.

* fixed failing tests

* fix mypy

* use contract address inside id instead of pair

* fix lines too long issues

* issue483: move the logic from subgraph_slot.py (#489)

* Add some test coverage (#488)

* Adds a line of coverage to test.
* Add coverage for csvs module.
* Add coverage to check_network.
* Add coverage to predictions and traction info.
* Adds coverage to predictoor stats.
* Adds full coverage to arg cli classes.
* Adds cli arguments coverage and fix a wrong parameter in cli arguments.
* Adds coverage to cli module and timeframe.
* Some reformats and coverage in contract module.
* Adds coverage and simplifications to contracts, except token.
* Add some coverage to tokens to complete contract coverage work.

* Fix #501: ModuleNotFoundError: No module named 'flask' (PR #504)

* rename prediction address field and change prediction id format

* Fix #509: Refactor test_update_rawohlcv_files (PR #508)

* fix failing test

* Fix #505: polars.exceptions.ComputeError: datatypes of join keys don't match (PR #510)

* Refactor: new function clean_raw_ohlcv() that moves code from _update_rawohlcv_files_at_feed(). It has sub-functions with precise responsibilities. It has tests.
* Add more tests for merge_raw_ohlcv_dfs, including one that replicates the original issue
* Fix the core bug, now the new tests pass. The main fix is at the top of merge_df::_add_df_col()
* Fix failing test due to network override. NOTE: this may have caused the remaining pytest error. Will fix that after this merge

* Fix #517: aimodel_data_factory.py missing data: binance:BTC/USDT:None (PR #518)

Fixes #517

Root cause: ppss.yaml's aimodel_ss feeds section didn't have eg "c" or "ohlcv"; it assumed that they didn't need to be specified. This was an incorrect assumption: aimodel_ss needs it. In fact aimodel_ss class supports these signals, but the yaml file didn't have it.

What this PR does:
- add a test to aimodel_ss class constructor that complains if not specified
- do specify signals in the ppss.yaml file
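
A rough sketch of the kind of constructor check described above. The class name, the tuple layout `(exchange, pair, signal)` and the error text are all hypothetical; the actual `aimodel_ss` code is not shown in this PR:

```python
from typing import List, Optional, Tuple

# Hypothetical feed representation: (exchange, pair, signal-or-None)
Feed = Tuple[str, str, Optional[str]]


class AimodelSSSketch:
    """Illustrative stand-in for aimodel_ss, not the real class."""

    def __init__(self, feeds: List[Feed]):
        # Complain early if any feed lacks a signal such as "close",
        # instead of failing later with "binance:BTC/USDT:None".
        missing = [f for f in feeds if f[2] is None]
        if missing:
            names = ", ".join(f"{ex}:{pair}" for ex, pair, _ in missing)
            raise ValueError(f"aimodel_ss feeds need a signal: {names}")
        self.feeds = feeds
```

Failing in the constructor surfaces the misconfigured yaml at startup rather than deep inside the data factory.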

Note: the PR pytest failed, but for an unrelated reason. Just created #520 for follow-up.

* Towards #494: Improve coverage 2 (#498)

* Adds some coverage to dfbuyer agent.
* Add dfbuyer and ppss coverage.
* Adds predictoor and sim coverage.
* Adds coverage to util.
* Add some trueval coverage.
* Adds coverage to trader agents.
* Add coverage to portfolio.
* Add coverage to subgraph consume_so_far and fix an infinite loop bug.
* More subgraph coverage.

* remove filtering and other fixes

* Fix #519: aimodel_data_factory.py missing data col: binance:ETH/USDT:close (#524)

Fix #519

Changes:
- do check for dependencies among various ppss ss feeds
- if any of those checks fails, give a user-friendly error message
  - greatly improved printing of ArgFeeds, including merging across pairs and signals. This was half the change of this PR
- appropriate unit tests

* moved filters and prints to analytics level

* Replace `dftool` with `pdr` (#522)

* Print texts: dftool -> pdrcli

* pdrcli -> pdr

* Fix #525: Plots pop up unwanted in tests. (PR #528)

Fix by mocking plt.show().

* Issue 519 feed dependencies (#529)

* Make missing attributes message more friendly and integrate ai ss part to multimixin.

* Update to #519: remove do_verify, it's redundant (#532)

* Fix #507: fix asyncio issues (PR #531)

How fixed: use the previous asyncio version.

Calina: Asyncio has some known issues, per their changelog. Namely issues with fixture handling etc., which I believe causes the warnings and test skips in our runs. They recommend using the previous version until they are fixed. It is also why my setup didn't spew up any warnings, my asyncio version was 21.1.

https://pytest-asyncio.readthedocs.io/en/latest/reference/changelog.html

* #413 - YAML thorough system level tests (#527)

* Fix web3_config.rpc_url in test_send_encrypted_tx

* Add conftest.py for system tests

* Add system test for get_traction_info

* Add system test for get_predictions_info

* Add system test for get_predictoors_info

* Add "PDRS" argument to _ArgParser_ST_END_PQDIR_NETWORK_PPSS_PDRS class

* Fix feed.exchange type conversion in publish_assets.py

* Add print statement for payout completion

* Add system level test for pdr topup

* Add conditional break for testing via env

* Add conditional break for testing via env

* Black

* Add test for pdr rose payout system

* System level test pdr check network

* System level test pdr claim OCEAN

* System level test pdr trueval agent

* Remove unused patches

* Fix wrong import position in conftest.py

* Remove unused imports

* System level test for pdr dfbuyer

* System level tests for pdr trader

* System level tests for publisher

* Rename publisher test file

* Add conditional break in take_step() method

* Update dftool->pdr names in system tests

* Refactor test_trader_agent_system.py

* Add mock fixtures for SubgraphFeed and PredictoorContract

* Add system tests for predictoor

* Black

* Refactor system test files - linter fixes

* Linter fixes

* Black

* Add missing mock

* Add savefig assertion in test_topup

* Update VPS configuration to use development entry

* Patch verify_feed_dependencies

* Refactor test_predictoor_system.py to use a common test function

* Refactor trader approach tests to improve DRY

* Black

* Indent

* Ditch NETWORK_OVERRIDE

* Black

* Remove unused imports

* updated and extended tests

* fix pylint issue

* fixed new pulled tests

* changed names to use snake case

* fix failing publisher ss test

* fix black failing issue

* removing aggregate_prediction_statistics as well, since this isn't used anywhere

* cleaning up pylint

---------

Co-authored-by: idiom-bytes <[email protected]>
Co-authored-by: Trent McConaghy <[email protected]>
Co-authored-by: trizin <[email protected]>
Co-authored-by: Idiom <[email protected]>
Co-authored-by: Călina Cenan <[email protected]>
Co-authored-by: Mustafa Tunçay <[email protected]>
idiom-bytes added a commit that referenced this pull request Jan 25, 2024
…571)

* First stab at porting various functions over to polars... lots to go

* TOHLCV df initialization and type checking added. 2/8 pdutil tests passing

* black formatted

* Fixing initialization and improving test. datetime is not generated if no timestamp is present

* Restructured pdutil a bit to reduce DRY violations and use the schema more strictly.

* test initializing the df and datetime

* improve init test to show exception without timestamp

* fixing test_concat such that it verifies that schemas must match, and how transform handles datetime

* saving parquet enforces datetime and transform. updated test_load_append and test_load_filtered.

* black formatted

* data_eng tests are passing

* initial data_eng tests are passing w/ black, mypy, and pylint.

* _merge_parquet_dfs updated and create_xy test_1 is passing. all data_eng tests that are enabled are passing.

* 2exch_2coins_2signals is passing

* Added polars support for fill_nans, has_nans, and create_xy__handle_nan is passing.

* Starting to deprecate references to pandas and csv in data_factory.

* Black formatted

* Deprecated csv logic in DataFactory and created tests around get_hist_df() to verify that it's working as intended. I believe kraken data is returning null at the moment.

* All tests should be passing.

* Fix #370: YAML & CLI (#371)

* Towards #232: Refactoring towards ppss.yaml part 3/3
* move everything in model_eng/ to data_eng/
* Fix #352: [SW eng] High DRY violation in test_predictoor_agent.py <> test_predictoor_agent3.py
* Deprecate backend-dev.md (long obsolete), macos.md (obsolete due to vps), and envvars.md (obsolete because of ppss.yaml).
* Rename BaseConfig to web3_pp.py and make it yaml-based
* Move scripts into util/, incorporate them into pdr cli, some refactoring.
* revamp READMEs for cli. And, tighten up text for getting OCEAN & ROSE
* Deprecated ADDRESS_FILE and RPC_URL envvars.
* deprecate Predictoor approach 2. Pita to maintain 


Co-authored-by: trizin <[email protected]>

* Update CI to use pdr instead of scripts/ (#399)

* Update check script CI

* Update cron topup

* Workflow dispatch

* Nevermind, revert previous commit

* Run on push to test

* Pass ppss.web3_pp instead of web3_config

* Don't run on push

* Replace long try/except with _safe*() function; rename pdutil -> plutil; get linters to pass

* Update entrypoint script to use pdr cli (#406)

* Add main.py back (#404)

* Add main.py back

* Black

* Linter

* Linter

* Remove "switch back to version v0.1.1"

* Black

* make black happy

* small bug fix

* many bug fixes. Still >=1 left

* fix warning

* Add support for polars where needed

* tweak docstring

* Fix #408: test_sim_engine failing in yaml-cli2, bc hist_df is s not ms. Proper testing and documentation was added, as part of the fix

* BaseContract tests that Web3PP type is input

* goes with previous commit

* tweak - lowercase

* Bug fix - fix failing tests

* Remove unwanted file

* (a) better organize ppss.yaml for usability (b) ensure user isn't annoyed by git with their copy of ppss.yaml being my_ppss.yaml

* add a more precise test for modeling

* make black happy

* Small refactor: make transform_df() part of helper routine

* Fix #414: Split data_factory into (1) CEX -> parquet -> df (2) df -> X,y for models

* Fix #415: test_cli_do_dfbuyer.py is hanging #415

* test create_xy() even more. Clarify the order of timestamps

* Add a model-building test, using data shaped like data from test_model_data_factory

* Fix #416: [YAML branch] No Feeds Found - data_pp.py changes pair standards

* For barge#391: update to *not* use barge's predictoor branch

* Update vps.md: nicer order of operations

* For #417, #418 in yaml-cli2 branch. publisher TUSD -> USDT

* remove default_network from ppss.yaml (obsolete)

* Fix #427 - time now

* Fix #428: test_get_hist_df - FileNotFoundError. Includes lots of extra robustness checking

* remove dependency that we don't need, which caused problems

* Fix #421: Add cli + logic to calculate and plot traction metrics (PR #422)

Also: mild cleanup of CLI.

* bug fix: YAML_FILE

* fix breaking test; clean it up too

* add barge-calls.md

* Fix #433. Calculate metrics and draw plots for epoch-based stats (PR #434)

#433 : "Plot daily global (pair_timeframe x20) <average predictoors> and <average stake>, by sampling slots from each day."

* Tweak barge-calls.md

How: show origin of NETWORK_RPC_URL

* Tweak barge-calls.md: more compactly show RPC_URL calc

* update stake_token

* bug fix

* Update release-process.md: bug fix

* Tweak barge-calls.md

* Tune #405 (PR #406): Update entrypoint.sh script to use pdr CLI

* Update vps.md: docker doesn't need to prompt to delete

* Update vps.md: add docker-stop instrs

* allow CLI to have NETWORK_OVERRIDE, for more flexibility from barge

* fix pylint issue

* Update barge-calls.md: link to barge.md

* Update release-process.md: fix typo

* touch

* Update vps.md: more instrs around waiting for barge to be ready

* add unit tests for cli_module

* Towards #437: [YAML] Publisher error 'You must set RPC_URL environment variable'

* Bug fixes

* refactor tweaks to predictoor and trader

* Clean up some envvar stuff. Document ppss vars better.

* publish_assets.py now supports barge-pytest and barge-predictoor-bot

* bug fix

* bug fix the previous 'bug fix'

* Clean up how dfbuyer/predictoor/trader agents get feeds: web3_pp.query_feed_contracts() -> data_pp.filter_feeds(); no more filtering within subgraph querying; easier printing & logging. Add timeframestr.Timeframe. Add feed.mock_feed. All tests pass.

* fix breaking subgraph tests. Still breakage in trader & dfbuyer (that's next)

* Fix failing tests in trader, dfbuyer. And greatly speed up the tests, via better mocking.

* Fix bugs for failing tests of https://github.com/oceanprotocol/pdr-backend/actions/runs/7156603163/job/19486494815

* fix tmpdir bug

* Fix (hopefully) failing unit test - restricted region in querying binance api

* consolidate gas_price setting, make it consistent; set gas_price to 0 for development/barge

* fix linter complaints

* Fix remaining failing unit tests for predictoor_batcher

* Finish the consolidation of gas pricing. All tests pass

* Update vps.md: add debugging info

- Where to find queries
- Key docker debugging commands

* add to/from wei utility. Copied from ocean.py

* tweak docs in conftest_ganache

* tweaks from black for wei

* Make fixed_rate.py and its test easier to understand via better var naming & docs

* Make predictoor_contract.py easier to understand via better var naming & docs

* test fixed_rate calcBaseInGivenOutDT

* Refactor predictoor_contract: push utility methods out of the class, and into more appropriate utility modules. And, move to/from_wei() from wei.py to mathutil.py. Test it all.

* Tweak docstrings for fixed_rate.py

* Improve DX: show dev what the parameters are. Improve UX: print when done.

* Improve DX & UX for predictoor_contract

* Tweak UX (prints)

* Update vps.md: export PATH

* Logging for predictoor is way better: more calm yet more informative. Predictoors only do 1 feed now.

* TraderAgent -> BaseTraderAgent

* Rename parquet_dfs -> rawohlcv_dfs; hist_df -> mergedohlcv_df; update related

* apply black to test_plutil.py

* apply black to test_model_data_factory.py

* apply black to ohlcv_data_factory.py

* refactor test_ohlcv_data_factory: cleanup mocks; remove redundant test; pq_data_factory -> factory

* Fix #443: [YAML] yaml timescale is 5m, yet predictoor logs s_per_epoch=3600 (1h)

* Update feed str() to give full address; and order to be similar to predict_feeds_strs. Show all info used in filtering feeds.

* Small bug fix: not printing properly

* Tweak: logging in predictoor_contract.py

* Tweak: logging in trueval_agent_single.py

* Two bug fixes: pass in web3_pp not web3_config to PredictoorContract constructor

* enhance typechecking

* tweak payout.py: make args passed more obvious

* fix broken unit test

* make black happy

* fix breaking unit test

* Tweak predictoor_contract DX & UX

* Improve trueval: Have fewer layers of try/except, better DX via docstrings and more, better UX via logs

* Rename TruevalAgentBase -> BaseTruevalAgent

* (a) Fix #445: merge 3 trueval agent files into 1. (b) Fix #448 contract.py::get_address() doesn't handle 'sapphire-testnet' etc #448. (c) test_contract.py doesn't test across all networks we use, and it's missing places where we have errors (d) clean up trueval agent testing (e) move test_contract.py into test_noganache since it doesn't use ganache

* Fix #450: test_contract_main[barge-pytest] fails

* renaming pq_data_factory to ohlcv_data_factory

* Removing all TODOs

* Fix #452: Add clean code guidelines README

* removing dangling _ppss() inside predictoor_agent_runner.py

* Fixing linter

* Fix #454: Refactor: Rename MEXCOrder -> MexcOrder, ERC721Factory

* Fix #455: Cyclic import issue

* Fix #454 redux: the first commit introduced a bug, this one fixes the bug

* Fix #436 - Implement GQL data factory (PR #438)

* First pass on gql data factory

Co-authored-by: trentmc <[email protected]>

* Fix #350: [Sim] Tweaks to plot title

* make black happy

* Fix #446: [YAML] Rename/move files & dirs for proper separation among lake, AI models, analytics (#458)

* rename data_eng/ -> lake/
* rename model_factory -> aimodel_factory, model_data_factory -> aimodel_data_factory, model_ss -> aimodel_ss
* for any module in util/ that should be in analytics/, move it
* for any module in util/ that should be in lake/, move it. Including get_*_info.py
* create dir subgraph/ and move all *subgraph*.py into it. Split apart subgraph.py into core_subgraph.py and more
* apply mathutil.to_wei() and from_wei() everywhere
* move contents of util/test_data.py (a bunch of sample predictions) into models/predictions.py. Fix DRY violations in related conftest.pys
* note: there are 2 failing unit tests related to polars and "timestamp_right" column. However they were failing before. I just created a separate issue for that: #459

* Fix #459: In CI, polars error: col timestamp_right already exists (#460)

Plus:
* remove datetime from all DFs, it's too problematic, and unneeded
* bug fix: wasn't mocking check_dfbuyer(), so CI was failing

* Fix #397: Remove need to specify 'stake_token' in ppss.yaml (#461)

* Docs fixes (#456)

* Make Feeds objects instead of tuples. (#464)

* Make Feeds objects instead of tuples.
* Add namings for different feed objects.
* Move signal at the end.

* Move and rename utils (#467)

* Move and rename utils

* Objectify pairstr. (#470)

* Objectify pairstr.
* Add possibility for empty signal in feeds.
* Move and add some timeframe functions.
* Move exchangestr.

* Towards #462: Separate lake and aimodel SS, lake command (#473)

* Split aimodel and lake ss.
* Split data ss tests.
* Add aimodel ss into predictoor ss.
* Remove stray data_ss.
* Moves test_n to sim ss.
* Trader ss to use own feed instead of data pp.
* Remove data pp entirely.
* Correct ohlcv data factory.
* Add timeframe into arg feeds.
* Refine and add tests for timeframe in arg feed.
* Remove timeframe dependency in trader and predictoor.
* Remove timeframe from lake ss keys.
* Singleify trader agents.
* Adds lake command, assert timeframe in lake (needed for columns).
* Process all signals in lake.

* [Lake] integrate pdr_subscriptions into GQL Data Factory (#469)

* first commit for subscriptions

* hook up pdr_subscriptions to gql_factory

* Tests passing, expanding tests to support multiple tables

* Adding tests and improving handling of empty parquet files

* Subscriptions test

* Updating logic to use predictSubscriptions, take lastPriceValue, and to not query the subgraph more than needed.

* Moving models from contract/ -> subgraph/

* Fixing pylint

* fixing tests

* adding @enforce_types

* Improve DRY (#475)

* Improve DRY in cli module.
* Add common functionality to single and multifeed entries.
* Remove trader pp and move necessary lines into trader ss.
* Adds dfbuyer filtering.
* Remove exchange dict from multifeed mixin.
* Replace name of predict_feed.
* Add base_ss tests.
* Adds trueval filtering.

* Add Code climate. (#484)

* Adds manual trigger to pytest workflow.

* issue483: move the logic from subgraph_slot.py (#489)

* added truevals to gql data factory

* Add some test coverage (#488)

* Adds a line of coverage to test.
* Add coverage for csvs module.
* Add coverage to check_network.
* Add coverage to predictions and traction info.
* Adds coverage to predictoor stats.
* Adds full coverage to arg cli classes.
* Adds cli arguments coverage and fix a wrong parameter in cli arguments.
* Adds coverage to cli module and timeframe.
* Some reformats and coverage in contract module.
* Adds coverage and simplifications to contracts, except token.
* Add some coverage to tokens to complete contract coverage work.

* Fix #501: ModuleNotFoundError: No module named 'flask' (PR #504)

* Fix #509: Refactor test_update_rawohlcv_files (PR #508)

* rename trueVal to trueval and slot type

* added tests and mocked values for subgraph trueval code

* Fix #505: polars.exceptions.ComputeError: datatypes of join keys don't match (PR #510)

* Refactor: new function clean_raw_ohlcv() that moves code from _update_rawohlcv_files_at_feed(). It has sub-functions with precise responsibilities. It has tests.
* Add more tests for merge_raw_ohlcv_dfs, including one that replicates the original issue
* Fix the core bug, now the new tests pass. The main fix is at the top of merge_df::_add_df_col()
* Fix failing test due to network override. NOTE: this may have caused the remaining pytest error. Will fix that after this merge

* Fix #517: aimodel_data_factory.py missing data: binance:BTC/USDT:None (PR #518)

Fixes #517

Root cause: ppss.yaml's aimodel_ss feeds section didn't have eg "c" or "ohlcv"; it assumed that they didn't need to be specified. This was an incorrect assumption: aimodel_ss needs it. In fact aimodel_ss class supports these signals, but the yaml file didn't have it.

What this PR does:
- add a test to aimodel_ss class constructor that complains if not specified
- do specify signals in the ppss.yaml file

Note: the PR pytest failed, but for an unrelated reason. Just created #520 for follow-up.

* Towards #494: Improve coverage 2 (#498)

* Adds some coverage to dfbuyer agent.
* Add dfbuyer and ppss coverage.
* Adds predictoor and sim coverage.
* Adds coverage to util.
* Add some trueval coverage.
* Adds coverage to trader agents.
* Add coverage to portfolio.
* Add coverage to subgraph consume_so_far and fix an infinite loop bug.
* More subgraph coverage.

* Fix #519: aimodel_data_factory.py missing data col: binance:ETH/USDT:close (#524)

Fix #519

Changes:
- do check for dependencies among various ppss ss feeds
- if any of those checks fails, give a user-friendly error message
  - greatly improved printing of ArgFeeds, including merging across pairs and signals. This was half the change of this PR
- appropriate unit tests

* Replace `dftool` with `pdr` (#522)

* Print texts: dftool -> pdrcli

* pdrcli -> pdr

* Fix #525: Plots pop up unwanted in tests. (PR #528)

Fix by mocking plt.show().

* Issue 519 feed dependencies (#529)

* Make missing attributes message more friendly and integrate ai ss part to multimixin.

* Update to #519: remove do_verify, it's redundant (#532)

* Fix #507: fix asyncio issues (PR #531)

How fixed: use the previous asyncio version.

Calina: Asyncio has some known issues, per their changelog. Namely issues with fixture handling etc., which I believe causes the warnings and test skips in our runs. They recommend using the previous version until they are fixed. It is also why my setup didn't spew up any warnings, my asyncio version was 21.1.

https://pytest-asyncio.readthedocs.io/en/latest/reference/changelog.html

* #413 - YAML thorough system level tests (#527)

* Fix web3_config.rpc_url in test_send_encrypted_tx

* Add conftest.py for system tests

* Add system test for get_traction_info

* Add system test for get_predictions_info

* Add system test for get_predictoors_info

* Add "PDRS" argument to _ArgParser_ST_END_PQDIR_NETWORK_PPSS_PDRS class

* Fix feed.exchange type conversion in publish_assets.py

* Add print statement for payout completion

* Add system level test for pdr topup

* Add conditional break for testing via env

* Add conditional break for testing via env

* Black

* Add test for pdr rose payout system

* System level test pdr check network

* System level test pdr claim OCEAN

* System level test pdr trueval agent

* Remove unused patches

* Fix wrong import position in conftest.py

* Remove unused imports

* System level test for pdr dfbuyer

* System level tests for pdr trader

* System level tests for publisher

* Rename publisher test file

* Add conditional break in take_step() method

* Update dftool->pdr names in system tests

* Refactor test_trader_agent_system.py

* Add mock fixtures for SubgraphFeed and PredictoorContract

* Add system tests for predictoor

* Black

* Refactor system test files - linter fixes

* Linter fixes

* Black

* Add missing mock

* Add savefig assertion in test_topup

* Update VPS configuration to use development entry

* Patch verify_feed_dependencies

* Refactor test_predictoor_system.py to use a common test function

* Refactor trader approach tests to improve DRY

* Black

* Indent

* Ditch NETWORK_OVERRIDE

* Black

* Remove unused imports

* Adds incremental waiting for subgraph tries. (#534)
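
One way to read "incremental waiting" is a retry loop whose pause grows with each failed attempt. The sketch below assumes a linear schedule; the function name, its parameters and the schedule itself are illustrative, not taken from the PR:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def query_with_incremental_wait(
    query_fn: Callable[[], T],
    max_tries: int = 5,
    base_wait_s: float = 1.0,
    sleep_fn: Callable[[float], None] = time.sleep,
) -> T:
    """Call query_fn until it succeeds, waiting longer after each failure."""
    last_err = None
    for attempt in range(1, max_tries + 1):
        try:
            return query_fn()
        except Exception as err:  # a real client would catch narrower errors
            last_err = err
            if attempt < max_tries:
                # Linear growth: 1x, 2x, 3x ... the base wait.
                sleep_fn(base_wait_s * attempt)
    raise RuntimeError(f"query failed after {max_tries} tries") from last_err
```

Passing `sleep_fn` in as a parameter lets tests record the waits instead of actually sleeping.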

* Add publisher feeds filtering. (#533)

* Add publisher feeds filtering.

* Pass the ppss.web3_pp instead of web3_config into WrappedToken class (#537)

* Fix #542: Add code climate usage to developer flow READMEs

* added test for trueval

* add test for truevals table

* fix failing test after merge from main

* filter inside subgraph query by timestamp instead of slot

* Improving subgraph error handling, rather than throwing exception on dupes, just handle them

* Fixing tests such that they are simulating 1000s records, and working correctly with subgraph chunk_size

* fixing tests

* fixing pylint errors

* Updated fetch logic to loop/fetch all records and to handle errors. Cleaned up logic to follow latest implementations and verified tests are working.

* fixing black, mypy, pylint, etc...

* Fixing up pylint

* Adjusting tests such that ppss is initialized ahead of time by passing it through the args. This should automatically play with Berkay's update, and helps us validate it's working here

* Removing print

* Adjusted tests to properly reflect that data_factory saves + loads data as ms. Plots are now charting as expected.

* Updated tests to reflect that they should be expecting a dataframe w/ ms timestamp, not raw_subgraph/Predictions... this could be improved further

* Fixing system tests, such that they are properly checking for checksummed addresses, ms, and other details that were lost in the test configuration.

* Fixing pylint

* black formatted

* cherry-pick 2a36373

* black fix

* issue481: del_network_override is removed from tests

* requested changes

* issue-481: pdr payout queries and merge with new structure

* Fixed tests, test data, and logic such that it's properly working with the dataframes. A lot of the tests are mixing up _ms with _s, and causing bugs. These bugs are being masked because tests are not set up in a clean way. subgraph_data = _s, parquet_data = _ms
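
The seconds-versus-milliseconds convention above can be kept honest with explicit conversion helpers at the boundary between the two. A small sketch; the helper names and the `_s` / `_ms` suffix convention on variables are assumptions for illustration:

```python
def s_to_ms(timestamp_s: int) -> int:
    """Subgraph-style seconds -> parquet-style milliseconds."""
    return int(timestamp_s) * 1000


def ms_to_s(timestamp_ms: int) -> int:
    """Parquet-style milliseconds -> subgraph-style seconds (truncates)."""
    return int(timestamp_ms) // 1000


# Suffix every variable with its unit so a mix-up is visible at the call site.
slot_s = 1_700_000_000          # as it would arrive from the subgraph
slot_ms = s_to_ms(slot_s)       # as it would be stored in parquet
```

Converting exactly once, where subgraph data is written to parquet, avoids the masked bugs the commit message describes.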

* Updated cli_module, analytics, and documented functionality, such that cli.args are used in post-lake filters. Not to build the lake.

---------

Co-authored-by: Trent McConaghy <[email protected]>
Co-authored-by: trizin <[email protected]>
Co-authored-by: Călina Cenan <[email protected]>
Co-authored-by: Mustafa Tunçay <[email protected]>
Co-authored-by: Norbert <[email protected]>