Normalize Binance futures orderbook data #27
Comments
There isn't. If you provide an example file for me to look into its format, I would add an example converter. |
By the way, without a local timestamp indicating when you received the feed, accurate backtesting is not possible, as there is no feed latency information. While you can artificially generate a local timestamp by assuming feed latency, it is preferable to collect the data yourself for more reliable results. |
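As a rough illustration of the "artificially generated local timestamp" idea mentioned above, the sketch below adds a constant assumed feed latency to each exchange timestamp. The function name, the tuple layout, and the 10 ms default are hypothetical choices made for this example; they are not taken from the repository, and a constant latency is only a crude stand-in for measured latency.

```python
def add_local_timestamp(events, assumed_latency_us=10_000):
    """Attach a synthetic local timestamp to each event.

    `events` is a list of (exchange_ts_us, payload) tuples (hypothetical layout).
    The local timestamp is simply exchange_ts_us + assumed_latency_us, which is
    an assumption and not a measurement of real feed latency.
    """
    return [
        (exch_ts, exch_ts + assumed_latency_us, payload)
        for exch_ts, payload in events
    ]


# Two illustrative events, timestamps in microseconds.
events = [(1_000_000, "bid update"), (1_000_500, "ask update")]
stamped = add_local_timestamp(events)
```

Because every event gets the same delay, this keeps the exchange ordering intact but cannot reproduce latency spikes seen in live feeds.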
Here is example LOB data for a single day: https://drive.google.com/file/d/1rVaDblmYJL0aPpgvdJ-fU9QFhMDga6f_/view?usp=sharing Btw, I was also happy to write it, but wanted to make sure I wasn't "reinventing the wheel". |
Yes, good point about the local timestamp. Thanks for the tip. The artificial local timestamps are fine for my purposes at the moment. |
Trade data is also required. It is still possible to backtest using depth data alone, but doing so is meaningless, especially in high-frequency backtesting. |
Right. I was not suggesting trying to use OB data alone. Actually, I found your repo while looking for an implementation for inventory models, which of course need trade data to fit them. The trade data is available from the Binance Public Data: wget https://data.binance.vision/data/futures/um/daily/trades/BTCUSDT/BTCUSDT-trades-2020-07-01.zip Here is the trade data corresponding to the above depth data. |
I added the converter: hftbacktest/data/utils/binancehistmktdata.py (a5d3f91). Could you check if it works as expected? Again, in my experience, backtest results can exhibit significant discrepancies unless precise feed latency and order latency are used. |
Excellent! My plan was to look into the inventory MM model (as you gave an example of). I will report it if anything unexpected shows up. I think you mean significant discrepancies between backtest and live trading results, but I am not doing any live trading at the moment. If you want me to try out one of your other examples with the Binance historical data, please let me know. |
I am getting an error using the following trade data, for ETHUSDT on 2022-10-03, as in your example notebook. I think it is because the first row contains the column names, unlike the previous example. My guess is that the format has changed with newer data. |
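One way to cope with files that may or may not start with a row of column names is to test whether the first field of the first row parses as a number. The sketch below shows that idea using only the standard library; the function names and the sample columns are illustrative assumptions, and this is not necessarily how the converter in the repository handles it.

```python
import csv
import io


def looks_like_header(first_row):
    """Return True if the first field is not numeric, i.e. probably a column name."""
    if not first_row:
        return False
    try:
        float(first_row[0])
    except ValueError:
        return True
    return False


def split_header(text):
    """Return (header, rows); header is None when the file has no header row."""
    rows = list(csv.reader(io.StringIO(text)))
    if rows and looks_like_header(rows[0]):
        return rows[0], rows[1:]
    return None, rows


# Illustrative samples: newer-style file with a header, older-style file without.
with_header = "id,price,qty\n1,100.5,2\n"
without_header = "1,100.5,2\n2,101.0,1\n"
header_a, rows_a = split_header(with_header)
header_b, rows_b = split_header(without_header)
```

When a header is present, its names can then be used to look up columns instead of relying on fixed positions.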
Thanks for the report. Please see the latest commit. 740feee |
For your information, I used |
I'm not sure I understand, since I also used Unless you are suggesting that the Anyhow, my plan is to collect my own data from the stream and then I can compare with the historical data from binance. |
No. But |
Another issue showed up: I was working with more recent data, and it has an additional undocumented field Here is an example of the recent snapshot data: https://drive.google.com/file/d/1y-9nt9V-eB_OV3uSq4-dzBe-eOsQDt4S/view?usp=sharing |
…hether the first row is a header or not.
See 2b3137c and let me know if it works as expected. |
Code looks much better now without hard-coded indices, and it processes the snapshot fine. But now it fails on the Here is the lob data and trade data to reproduce this. |
See 7299d9a. I fixed the mingled timestamp issue, but since the data has no local timestamp, there is no option other than sorting. That can cause another discrepancy, so beware of that. |
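To make the sorting caveat concrete, here is a minimal sketch of merging out-of-order events by timestamp. It assumes each event is a (timestamp, kind) tuple keyed on the exchange timestamp, which is a hypothetical layout for this example rather than the format used in the commit. Python's `sorted` is stable, so events with equal timestamps keep their original relative order, which may or may not match the order in which they would have arrived live.

```python
from operator import itemgetter


def sort_by_timestamp(events):
    """Sort events by their timestamp (first tuple element).

    The sort is stable: ties keep their input order, which is only a guess
    at the true arrival order when no local timestamp is available.
    """
    return sorted(events, key=itemgetter(0))


# Illustrative, deliberately out-of-order events.
mixed = [(3, "depth"), (1, "trade"), (3, "trade"), (2, "depth")]
ordered = sort_by_timestamp(mixed)
```

The two events at timestamp 3 show the ambiguity: after sorting, "depth" still precedes "trade" only because it came first in the input.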
Thanks! I tested it out and there were no more errors. I'm not sure exactly what discrepancy you mean, but perhaps it will become more clear as I continue working on it. |
What I meant by that is that any difference from the live trading environment can cause a discrepancy. |
You can find it on the tutorials page or in the examples directory. |
thanks |
I wanted to take advantage of the freely available historical futures order book level 2 data from Binance. It should be possible, by combining this with historical trade data (also available from Binance, I believe), to obtain normalized data for hftbacktest. But I couldn't find this in the repo examples. I wanted to check whether it has already been done, so I don't waste time redoing it.