feat: add csv data source #54

Open
wants to merge 2 commits into
base: develop
1 change: 1 addition & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions crates/rbuilder/Cargo.toml
@@ -106,6 +106,7 @@ derivative = "2.2.0"
mockall = "0.12.1"
shellexpand = "3.1.0"
async-trait = "0.1.80"
hex = "0.4.3"

[build-dependencies]
built = { version = "0.7.1", features = ["git2", "chrono"] }
113 changes: 113 additions & 0 deletions crates/rbuilder/src/backtest/fetch/csv.rs
@@ -0,0 +1,113 @@
use crate::primitives::Order;
use crate::{
    backtest::{
        fetch::datasource::{BlockRef, DataSource},
        OrdersWithTimestamp,
    },
    primitives::{Bundle, TransactionSignedEcRecoveredWithBlobs},
};
use alloy_rlp::Decodable;
use async_trait::async_trait;
use csv::Reader;
use eyre::Context;
use reth::primitives::TransactionSignedEcRecovered;
use std::{collections::HashMap, fs::File, path::PathBuf};
use tracing::trace;
use uuid::Uuid;

#[derive(Debug, Clone)]
pub struct CSVDatasource {
    batches: HashMap<u64, Vec<TransactionSignedEcRecovered>>,
}

impl CSVDatasource {
    pub fn new(filename: impl Into<PathBuf>) -> eyre::Result<Self> {
        let batches = Self::load_transactions_from_csv(filename.into())?;
        Ok(Self { batches })
    }

    fn load_transactions_from_csv(
        filename: PathBuf,
    ) -> eyre::Result<HashMap<u64, Vec<TransactionSignedEcRecovered>>> {
        let file = File::open(&filename)
            .wrap_err_with(|| format!("Failed to open file: {}", filename.display()))?;
        let mut reader = Reader::from_reader(file);
        let mut batches: HashMap<u64, Vec<TransactionSignedEcRecovered>> = HashMap::new();

        for result in reader.records() {
Review comment (Contributor):

Rename result -> record.

            let record = result?;
            if record.len() != 2 {
Review comment (Contributor):

The literals 0, 1, and 2 could be named constants that also document the format, e.g.:
/// col 1 contains bla bla
const COL_RAW_RLP_RX: usize = 1;
Is col 0 an arbitrary number? Is it a block number?
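
One way the suggestion above could look is sketched below. It is a std-only illustration, not the PR's code: the constant names, `parse_record`, and `decode_hex` (a stand-in for `hex::decode`) are hypothetical, and the meaning given to column 0 is an assumption, since the question about it is still open.

```rust
/// Column 0: batch number. ASSUMPTION: the PR does not document what this
/// column means; the reviewer's question about it is unanswered.
const COL_BATCH_NUMBER: usize = 0;
/// Column 1: hex-encoded raw RLP of a signed transaction.
const COL_RAW_RLP_TX: usize = 1;
/// Every record must have exactly this many columns.
const EXPECTED_COLUMNS: usize = 2;

/// Std-only stand-in for `hex::decode` (hypothetical helper for this sketch).
fn decode_hex(s: &str) -> Result<Vec<u8>, String> {
    if !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("non-hex character in input".to_string());
    }
    if s.len() % 2 != 0 {
        return Err(format!("odd-length hex string: {}", s.len()));
    }
    // Input is pure ASCII at this point, so byte-index slicing is safe.
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).map_err(|e| e.to_string()))
        .collect()
}

/// Validates one record and returns (batch number, raw RLP bytes).
/// RLP decoding itself is left out; it needs `alloy_rlp` and `reth`.
fn parse_record(fields: &[&str]) -> Result<(u64, Vec<u8>), String> {
    if fields.len() != EXPECTED_COLUMNS {
        return Err(format!(
            "expected {} columns, got {}",
            EXPECTED_COLUMNS,
            fields.len()
        ));
    }
    let batch_number = fields[COL_BATCH_NUMBER]
        .parse::<u64>()
        .map_err(|e| e.to_string())?;
    let rlp_bytes = decode_hex(fields[COL_RAW_RLP_TX])?;
    Ok((batch_number, rlp_bytes))
}
```

With named constants, the error message can also report the expected and actual column counts instead of the generic "Invalid CSV format".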

                return Err(eyre::eyre!("Invalid CSV format"));
            }

            let batch_number: u64 = record[0].parse()?;
            let rlp_hex = &record[1];
            let rlp_bytes = hex::decode(rlp_hex)?;
            let tx = TransactionSignedEcRecovered::decode(&mut &rlp_bytes[..])?;

            batches.entry(batch_number % 10).or_default().push(tx);
        }

        Ok(batches)
    }
}

#[async_trait]
impl DataSource for CSVDatasource {
    async fn get_orders(&self, block: BlockRef) -> eyre::Result<Vec<OrdersWithTimestamp>> {
        // The csv datasource is one with 10 batches, where batch is a list of transactions
        // Since we don't have full "real" blocks, we'll just use the block number to determine the batch
        // Thus the usage of mod 10 is just to determine the batch number that we get transactions from, e.g. block 100 corresponds to 0, 101 to 1, 109 to 9, etc.
        let batch_number = block.block_number % 10;
Review comment (Contributor):

This 10 should be a constant.
Maybe also add a function such as batch_number_from_u64() to isolate the % 10?
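
A minimal sketch of that suggestion follows. The function name comes from the comment above; `BATCH_COUNT` is a hypothetical name for the literal 10. Both the CSV loader and `get_orders` could call the same helper, which keeps the two `% 10` sites from drifting apart.

```rust
/// Number of batches the CSV data source cycles through.
/// Hypothetical name; the PR currently uses the literal 10 in two places.
const BATCH_COUNT: u64 = 10;

/// Maps a block number (or a batch id read from the CSV) to a batch
/// index in the range 0..BATCH_COUNT.
fn batch_number_from_u64(n: u64) -> u64 {
    n % BATCH_COUNT
}
```

This reproduces the mapping described in the code comment: block 100 maps to batch 0, 101 to 1, and 109 to 9.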

        let transactions = self.batches.get(&batch_number).cloned().unwrap_or_default();

        let mut uuid_num = 0;
        let orders: Vec<OrdersWithTimestamp> = transactions
            .into_iter()
            .map(|tx| {
                let order = transaction_to_order(block.block_number, &mut uuid_num, tx);
                OrdersWithTimestamp {
                    timestamp_ms: block.block_timestamp,
                    order,
                    sim_value: None,
                }
            })
            .collect();

        trace!(
            "Fetched synthetic transactions from CSV for block {}, batch {}, count: {}",
            block.block_number,
            batch_number,
            orders.len()
        );

        Ok(orders)
    }

    fn clone_box(&self) -> Box<dyn DataSource> {
        Box::new(self.clone())
    }
}

fn transaction_to_order(
    block: u64,
    uuid_num: &mut u128,
    tx: TransactionSignedEcRecovered,
) -> Order {
    let uuid_bytes = uuid_num.to_be_bytes();
    let tx_with_blobs = TransactionSignedEcRecoveredWithBlobs::new_no_blobs(tx).unwrap();
    let bundle = Bundle {
Review comment (Contributor):

Bundle creation is tricky (I didn't have time to correct it :().
You should create it with a dummy hash/uuid (Default::default()) and then call hash_slow().
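
The pattern described above (fill the derived fields with placeholders, then let one method compute them) can be sketched with a stand-in type. Everything here is hypothetical: `SketchBundle` is not rbuilder's `Bundle`, its fields are simplified, and its `hash_slow` uses std's `DefaultHasher` purely for illustration; the real method's hashing scheme is not shown in this PR.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for rbuilder's `Bundle`, reduced to four fields.
#[derive(Debug, Default, Clone, PartialEq)]
struct SketchBundle {
    txs: Vec<Vec<u8>>, // raw transaction bytes stand in for real tx types
    block: u64,
    hash: u64,  // derived field; placeholder until `hash_slow` runs
    uuid: u128, // derived field; placeholder until `hash_slow` runs
}

impl SketchBundle {
    /// Recomputes the derived fields from the bundle contents.
    /// ASSUMPTION: illustrative hashing only, not rbuilder's algorithm.
    fn hash_slow(&mut self) {
        let mut hasher = DefaultHasher::new();
        self.txs.hash(&mut hasher);
        self.block.hash(&mut hasher);
        self.hash = hasher.finish();
        self.uuid = u128::from(self.hash);
    }
}

/// Builds a single-transaction bundle: placeholders first, then derive.
fn bundle_from_tx(block: u64, tx: Vec<u8>) -> SketchBundle {
    let mut bundle = SketchBundle {
        txs: vec![tx],
        block,
        hash: Default::default(),
        uuid: Default::default(),
    };
    bundle.hash_slow();
    bundle
}
```

The point of the pattern is that callers never hand-assemble `hash` or `uuid`, so the manual `uuid_num` counter in `transaction_to_order` would no longer be needed if the real `hash_slow()` derives both, as the comment suggests.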

        txs: vec![tx_with_blobs.clone()],
        hash: tx_with_blobs.hash(),
        reverting_tx_hashes: vec![],
        block,
        uuid: Uuid::from_bytes(uuid_bytes),
        min_timestamp: None,
        max_timestamp: None,
        replacement_data: None,
        signer: None,
        metadata: Default::default(),
    };
    *uuid_num += 1;
    Order::Bundle(bundle)
}
1 change: 1 addition & 0 deletions crates/rbuilder/src/backtest/fetch/mod.rs
@@ -1,3 +1,4 @@
pub mod csv;
pub mod datasource;
pub mod flashbots_db;
pub mod mempool;