Skip to content

Data used in our paper entitled "An Online, Machine Learning-Based Wide Area Data Transfer Performance Predictor".

Notifications You must be signed in to change notification settings

ramsesproject/wide-area-data-transfer-logs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 

Repository files navigation

wide-area-data-transfer-logs

Data used in our paper entitled "An Online, Machine Learning-Based Wide Area Data Transfer Performance Predictor". Some features are defined in paper: https://doi.org/10.1145/3078597.3078605

Please contact Zhengchun Liu <zhengchun.liu-AT-anl.gov> if you have any questions about the data.

Data explanation:

src: The source endpoint name. It has been anonymized. In which, GCP denotes Globus Connect Personal and GCS represents Globus Connect Server.

dst: The destination endpoint name. It has been anonymized. In which, GCP denotes Globus Connect Personal and GCS represents Globus Connect Server.

request_time: the time when transfer was initiated by user. The minute and second have been anonymized to zero.

R: transfer throughput

Ksout: Contending outgoing transfer rate on source endpoint

Kdin: Contending incoming transfer rate on destination endpoint

C: Concurrency: Number of GridFTP processes

P: Parallelism: Number of TCP channels per process

D: Pipeline depth

Ssout: Number of outgoing TCP streams on source endpoint

Ssin: Number of incoming TCP streams on source endpoint

Sdout: Number of outgoing TCP streams on destination endpoint

Sdin: Number of incoming TCP streams on destination endpoint

Ksin: Contending incoming transfer rate on source endpoint

Kdout: Contending outgoing transfer rate on destination endpoint

Nd: Number of directories transferred.

Nb: Total number of bytes transferred.

Qsout: Contending outgoing transfer rate on the institution of the source endpoint

Qdin: Contending incoming transfer rate on the institution of the destination endpoint

Qsin: Contending incoming transfer rate on the institution of the source endpoint

Qdout: Contending outgoing transfer rate on the institution of the destination endpoint

Gsrc: GridFTP instance count on source endpoint

Gdst: GridFTP instance count on destination endpoint

Nf: Number of files transferred.

All values are normalized to zero-mean, by each source-to-destination endpoint pair.

Please contact zhengchun.liu-AT-anl.gov if you have any questions about the data.

About

Data used in our paper entitled "An Online, Machine Learning-Based Wide Area Data Transfer Performance Predictor".

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published