Status: Actively maintained. Questions and suggestions are welcome. Contact us by email or raise an issue, and we will respond within 48 hours.
Code and dataset for the paper "A Doubly Stochastic Simulator with Applications in Arrivals Modeling and Simulation".
We recommend using Anaconda to manage the environment. The code was tested on the latest versions of Python and PyTorch as of March 24, 2023.
Create the environment and install the required packages with the following commands:
conda create --name dsc
conda activate dsc
conda install pytorch -c pytorch -y
conda install -c conda-forge jupyterlab -y
conda install numpy -y
conda install pandas -y
conda install -c conda-forge matplotlib -y
pip install progressbar2
conda install -c anaconda scipy -y
conda install -c conda-forge colored -y
conda install -c anaconda seaborn -y
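After installation, a quick check that the packages are visible from the dsc environment can save a failed training run. The following is a minimal sketch, not part of the repository: the helper name missing_modules and the list REQUIRED_MODULES are hypothetical, and the import names are assumptions about how each package is imported (for example, progressbar2 is imported as progressbar and pytorch as torch).

```python
import importlib.util

# Assumed import names for the packages installed above.
# progressbar2 is imported as "progressbar"; pytorch is imported as "torch".
REQUIRED_MODULES = [
    "torch", "jupyterlab", "numpy", "pandas", "matplotlib",
    "progressbar", "scipy", "colored", "seaborn",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the dsc environment activated. Any name it reports can be installed with the matching command above.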
We provide two ways to train the generative model. The first uses a discriminator, with code in train_gan.py. The second uses the Sinkhorn distance, with code in train.py. The second method runs faster but may be less stable than the first. To use the second method, install the following package:
pip install geomloss
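For readers unfamiliar with the Sinkhorn distance, the sketch below illustrates the idea in plain Python: it computes an entropy-regularized optimal transport cost between two small one-dimensional samples using Sinkhorn iterations. It is an illustration under stated assumptions, not the code in train.py and not the geomloss implementation, which is more efficient and GPU-ready. The function name sinkhorn_cost and the parameters eps and n_iters are hypothetical.

```python
import math


def sinkhorn_cost(xs, ys, eps=0.5, n_iters=200):
    """Entropy-regularized transport cost between two 1-D samples.

    Both samples get uniform weights and the ground cost is the squared
    distance. Small eps values need many more iterations to converge.
    """
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n  # uniform weights on xs
    b = [1.0 / m] * m  # uniform weights on ys
    cost = [[(x - y) ** 2 for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]

    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]

    # Transport plan P[i][j] = u[i] * kernel[i][j] * v[j]; return its total cost.
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


if __name__ == "__main__":
    same = sinkhorn_cost([0.0, 1.0], [0.0, 1.0])
    shifted = sinkhorn_cost([0.0, 1.0], [1.0, 2.0])
    print("same samples:", same)
    print("shifted samples:", shifted)
```

The cost grows as the two samples move apart, which is what makes it usable as a training signal for comparing generated and real data without a discriminator.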
For the Oakland call center dataset, please download it from Kaggle. After unzipping the download, you should have a file named service-requests-received-by-the-oakland-call-center.csv. Put the file in dataset/callcenter/.
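To confirm the file landed in the expected place before opening the notebook, a small check such as the following may help. It is a sketch only: the helper names callcenter_csv_path and is_dataset_ready are hypothetical, and it assumes the script is run from the repository root unless another root is passed.

```python
from pathlib import Path

# File name as given in the instructions above.
CSV_NAME = "service-requests-received-by-the-oakland-call-center.csv"


def callcenter_csv_path(repo_root="."):
    """Build the expected location of the call center CSV under repo_root."""
    return Path(repo_root) / "dataset" / "callcenter" / CSV_NAME


def is_dataset_ready(repo_root="."):
    """Return True if the CSV exists at the expected location."""
    return callcenter_csv_path(repo_root).is_file()


if __name__ == "__main__":
    path = callcenter_csv_path()
    status = "found" if is_dataset_ready() else "missing"
    print(f"{path}: {status}")
```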
The experiment consists of three steps: prepare the dataset, train the model, and evaluate the model. We highlight the steps for reproducing the results on the call center dataset. The steps for other datasets are similar.
- Prepare dataset: run the Jupyter notebook dataset/callcenter_dataset.ipynb.
- Train model: run the training script (train_gan.py or train.py) with the corresponding exp_label. For example, with train_gan.py:
python train_gan.py --exp_label callcenter_gan_0
- Evaluate model:
python evaluate/bimodal_callcenter_evaluate.py --exp_label callcenter_0
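The training and evaluation steps above can also be chained from a single script. The following is a minimal sketch, assuming it is run from the repository root after the dataset notebook has been executed. The helper names build_commands and run_pipeline are hypothetical, and the exp_label values in the usage note are copied verbatim from the commands above.

```python
import subprocess


def build_commands(train_label, eval_label):
    """Return the training and evaluation commands as argument lists."""
    return [
        ["python", "train_gan.py", "--exp_label", train_label],
        ["python", "evaluate/bimodal_callcenter_evaluate.py",
         "--exp_label", eval_label],
    ]


def run_pipeline(train_label, eval_label):
    """Run training, then evaluation, stopping at the first failure."""
    for cmd in build_commands(train_label, eval_label):
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("callcenter_gan_0", "callcenter_0") would run the two commands shown above in order. Stopping on the first failure avoids evaluating a model that did not finish training.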
For experiments on other datasets, please refer to the corresponding Jupyter notebook in dataset/, the comments in train_gan.py and train.py, and the comments in evaluate/bimodal_callcenter_evaluate.py, evaluate/infinite_server_queue.py and evaluate/pgnorta_evaluate.py.
The call center dataset and bike sharing dataset are both downloaded from Kaggle.