
AssetManagerGym: A Custom Gym Environment for Stock Trading with RL

Welcome to AssetManagerGym, a custom OpenAI Gym environment designed for training reinforcement learning agents on stock trading tasks. This environment simulates stock market dynamics, allowing agents to learn trading strategies by interacting with historical stock data.

Table of Contents

  • Introduction
  • Environment Description
  • Installation
  • Usage
  • Agent Integration
  • Example
  • Contributing
  • License
  • Appendix

Introduction

AssetManagerGym is a simulation environment that provides a platform for developing and testing reinforcement learning (RL) agents on stock trading tasks. It supports features like:

  • Position management (long, short, flat)
  • Stop loss and take profit mechanisms
  • Customizable risk parameters
  • Randomized stock selection for each episode
  • Support for training and testing data splits

The environment is compatible with standard RL libraries and can be integrated with various deep RL algorithms.


Environment Description

Features

  • Random Stock Selection: Each episode starts with a randomly selected stock from a provided dataset.
  • Position Management: The agent can open long or short positions or choose to remain flat.
  • Risk Management: Supports stop loss and take profit levels to simulate realistic trading conditions.
  • Customizable Horizon: Set the maximum number of steps per episode.
  • Data Splitting: Ability to split data into training and testing sets based on a specified year.

Action Space

The action space is continuous, representing the agent's trading decisions:

  • Range: [-1.0, 1.0]
    • -1.0: Strong sell (open a short position)
    • 0.0: Hold (maintain current position)
    • 1.0: Strong buy (open a long position)
  • Shape: (1,)
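
For reference, an action space like this is typically declared with Gym's spaces.Box; the bounds and shape below simply restate the description above, so treat it as a sketch rather than the environment's exact definition.

import numpy as np
from gym import spaces

# Continuous scalar action in [-1.0, 1.0]: values near +1 open longs,
# values near -1 open shorts, and values near 0 hold (see the Action
# Interpretation table in the Appendix).
action_space = spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)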

Observation Space

The observation space consists of a feature vector extracted from stock data:

  • Type: Box
  • Shape: (n_features,) where n_features is determined dynamically based on the dataset.
  • Data: Includes features like price (prc) and other indicators relevant to trading.
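
Declared the same way, the observation space is an unbounded Box; n_features below is a hypothetical placeholder for the feature count derived from your dataset.

import numpy as np
from gym import spaces

n_features = 8  # hypothetical; in practice determined from feature_dict.json
observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(n_features,), dtype=np.float32)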

Reward Function

The reward is calculated based on the agent's actions and market movements:

  • Opening a Position: No immediate reward upon opening a position.
  • Holding a Position:
    • Unrealized Profit/Loss: Calculated as the percentage change in price since the position was opened, adjusted for position type (long or short).
  • Closing a Position:
    • Take Profit Hit: Agent receives a reward of +1 and the episode may terminate.
    • Stop Loss Hit: Agent receives a reward of -1 (penalty) and the episode may terminate.
  • Choosing Not to Trade:
    • Agent can choose to do nothing, and the episode continues unless specified otherwise.
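
For concreteness, here is a quick numeric check of these rules under the default risk parameters (the prices are illustrative, not taken from any dataset):

# Long position (+1) opened at 100.0; price moves to 112.0 on the next step.
entry_price, next_price, position = 100.0, 112.0, +1
price_change = (next_price - entry_price) / entry_price  # 0.12
unrealized_pnl = price_change * position                 # 0.12 for a long
# With take_profit = 0.10, unrealized_pnl >= take_profit, so the reward is +1
# and the position is closed (the episode may terminate).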

Episode Termination

An episode may terminate when:

  • Stop Loss or Take Profit is Hit: Position is closed due to risk parameters being triggered.
  • Horizon is Reached: The maximum number of steps per episode is reached.
  • No More Data: No further data is available for the selected stock.
  • Custom Conditions: Agent-specific logic can dictate episode termination.

Installation

  1. Clone the Repository

    git clone https://github.com/anashoussaini/assetmanagergym.git
    cd assetmanagergym
  2. Install Dependencies

    pip install -r requirements.txt

    Ensure you have the following installed:

    • gym
    • numpy
    • matplotlib
    • torch (for integrating with PyTorch agents)
    (json and os are part of the Python standard library and require no separate installation.)
  3. Prepare the Dataset

    • Place your stock data in a JSON file named feature_dict.json in the working directory.

    • The data should be structured as:

      {
        "AAPL": {
          "2008": {
            "1": {"prc": 150.0, "feature1": value, ...},
            "2": {"prc": 155.0, "feature1": value, ...},
            ...
          },
          "2009": { ... }
        },
        "GOOG": { ... },
        ...
      }
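
    Once feature_dict.json is in place, a quick standard-library check like the following confirms that the nesting matches what the environment expects:

    import json

    # Inspect the first ticker/year/month entry to verify the nesting:
    # ticker -> year -> month -> feature dictionary containing "prc".
    with open("feature_dict.json") as f:
        data = json.load(f)

    ticker = next(iter(data))               # e.g. "AAPL"
    year = next(iter(data[ticker]))         # e.g. "2008"
    month = next(iter(data[ticker][year]))  # e.g. "1"
    print(ticker, year, month, data[ticker][year][month]["prc"])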

Usage

Environment Initialization

from assetmanagergym import assetmanagergym

json_file = 'feature_dict.json'
risk_params = {'stop_loss': 0.05, 'take_profit': 0.1}  # 5% stop loss, 10% take profit
env = assetmanagergym(json_file_path=json_file, risk_params=risk_params, mode='train', train_test_split_year=2010)
env.set_horizon(12)  # Set the maximum number of steps per episode

Running an Episode

observation = env.reset()
done = False

while not done:
    action = env.action_space.sample()  # Replace with agent's action
    observation, reward, done, info = env.step(action)
    env.render()  # Optional: Visualize the current state

Parameters

| Parameter | Type | Description |
|---|---|---|
| json_file_path | str | Path to the JSON file containing stock data. |
| risk_params | dict | Dictionary with risk parameters: stop_loss and take_profit. |
| mode | str | 'train' or 'test'; determines which data split to use. |
| train_test_split_year | int | Year used to split the data into training and testing sets. |
| discount_factor | float | Discount factor for calculating discounted rewards. |
| horizon | int | Maximum number of steps per episode; set via env.set_horizon(horizon). |

Agent Integration

assetmanagergym is designed to be compatible with deep reinforcement learning agents. Below is an example of how to integrate a custom agent.

Example Agent Loop

agent = YourCustomAgent(env.observation_space, env.action_space)  # your agent implementation
num_episodes = 100  # number of training episodes (adjust as needed)

for episode in range(num_episodes):
    state = env.reset()
    done = False

    while not done:
        action = agent.select_action(state)
        next_state, reward, done, info = env.step(action)
        agent.store_transition(state, action, reward, next_state, done)
        agent.learn()  # Update agent's networks
        state = next_state

    agent.reset()  # Reset agent's state if necessary
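
YourCustomAgent stands for whatever agent class you provide. To smoke-test the loop above without a real learner, a hypothetical stand-in like this is enough:

class RandomAgent:
    """Minimal stand-in for YourCustomAgent: samples random actions and learns nothing."""

    def __init__(self, observation_space, action_space):
        self.action_space = action_space

    def select_action(self, state):
        return self.action_space.sample()  # random action in [-1.0, 1.0]

    def store_transition(self, state, action, reward, next_state, done):
        pass  # a real agent would buffer this transition for training

    def learn(self):
        pass  # a real agent would update its networks here

    def reset(self):
        pass  # clear any per-episode agent state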

Example

Here's a full example combining environment setup, agent interaction, and visualization.

import gym
import numpy as np
from assetmanagergym import assetmanagergym

def main():
    json_file = 'feature_dict.json'
    risk_params = {'stop_loss': 0.05, 'take_profit': 0.1}
    env = assetmanagergym(json_file_path=json_file, risk_params=risk_params, mode='train', train_test_split_year=2010)
    env.set_horizon(12)

    observation = env.reset()
    done = False

    while not done:
        action = env.action_space.sample()  # Replace with agent's action
        observation, reward, done, info = env.step(action)
        env.render()

    env.close()

if __name__ == "__main__":
    main()

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.

  2. Create a new branch:

    git checkout -b feature/your-feature-name
  3. Commit your changes:

    git commit -am 'Add new feature'
  4. Push to the branch:

    git push origin feature/your-feature-name
  5. Open a pull request.


License

This project is licensed under the MIT License - see the LICENSE file for details.


Note: This README provides an overview of the assetmanagergym environment, including its setup and usage. For detailed documentation and advanced configurations, please refer to the code comments and additional documentation files in the repository.

Appendix

Key Classes and Methods

assetmanagergym Class

Initialization

env = assetmanagergym(
    json_file_path='feature_dict.json',
    risk_params={'stop_loss': 0.05, 'take_profit': 0.1},
    mode='train',
    train_test_split_year=2010,
    discount_factor=0.1
)

Methods

  • env.reset(): Resets the environment to a new initial state.
  • env.step(action): Advances the environment by one step based on the action.
  • env.render(): Renders the current state of the environment.
  • env.close(): Performs any necessary cleanup.
  • env.set_horizon(horizon): Sets the maximum number of steps per episode.

Action Interpretation

| Action Value | Interpretation |
|---|---|
| > 0.5 | Open a long position |
| < -0.5 | Open a short position |
| -0.5 to 0.5 | Hold / do nothing |
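
In code, this thresholding amounts to something like the following sketch (the function name is illustrative, not the environment's internal API):

def interpret_action(a):
    """Map a continuous action in [-1, 1] to a trading decision using the thresholds above."""
    if a > 0.5:
        return "long"   # open a long position
    if a < -0.5:
        return "short"  # open a short position
    return "hold"       # hold / do nothing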

Reward Calculation

  • Unrealized P&L:

    $$\text{price\_change} = \frac{\text{next\_price} - \text{entry\_price}}{\text{entry\_price}}$$

    $$\text{unrealized\_pnl} = \text{price\_change} \times \text{position}$$

  • Stop Loss Triggered:

    • If unrealized_pnl <= -stop_loss, reward is -1.
  • Take Profit Triggered:

    • If unrealized_pnl >= take_profit, reward is +1.
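
Putting these rules together, the per-step reward logic can be sketched as below; this is a simplified reconstruction of the behaviour described above, not the environment's exact code.

def compute_reward(entry_price, next_price, position, stop_loss, take_profit):
    """Return (reward, closed) for one step. position is +1 (long), -1 (short), or 0 (flat)."""
    if position == 0:
        return 0.0, False  # not trading: no reward
    price_change = (next_price - entry_price) / entry_price
    unrealized_pnl = price_change * position  # sign-adjusted for short positions
    if unrealized_pnl >= take_profit:
        return 1.0, True   # take profit hit: reward +1, close position
    if unrealized_pnl <= -stop_loss:
        return -1.0, True  # stop loss hit: reward -1, close position
    return unrealized_pnl, False  # still holding: unrealized P&L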

Visualization

The environment includes methods for visualizing rewards and actions.

Reward Plot with Action Arrows

env.render_with_arrows(rewards, discounted_rewards, actions, steps)

  • Arrows:
    • ↑ (up arrow): Buy action
    • ↓ (down arrow): Sell action
    • → (right arrow): Hold action

Data Structure

JSON Data Format

The environment expects the stock data in a specific JSON format.

Example Structure

{
  "AAPL": {
    "2008": {
      "1": {
        "prc": 150.0,
        "feature1": value,
        "feature2": value,
        ...
      },
      "2": { ... },
      ...
    },
    "2009": { ... },
    ...
  },
  "GOOG": { ... },
  ...
}
  • Stock Ticker: Top-level keys (e.g., "AAPL", "GOOG").
  • Year: Second-level keys (e.g., "2008", "2009").
  • Month: Third-level keys (e.g., "1" for January).
  • Features: Dictionary containing "prc" and other features.

Tables of Parameters

Default Risk Parameters

| Parameter | Default Value | Description |
|---|---|---|
| stop_loss | 0.05 | Stop loss threshold (5%) |
| take_profit | 0.10 | Take profit threshold (10%) |

Environment Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| discount_factor | float | 0.1 | Discount factor for rewards |
| horizon | int | 12 | Max steps per episode |
| mode | str | 'train' | Mode: 'train' or 'test' |
| train_test_split_year | int | 2010 | Year to split training and testing data |

Good luck trading.

Happy trading and reinforcement learning!
