Gym Trading Environment

Qtrade provides a highly customizable Gym trading environment to facilitate research on reinforcement learning in trading.

Initialize Gym Environment

The following example demonstrates how to create a basic trading environment. For advanced customization of Actions, Rewards, and Observers, please refer to Customizing Trading Environment Guide.

import yfinance as yf
import talib as ta
from qtrade.env import TradingEnv
from qtrade.core.commission import PercentageCommission

# Download historical gold futures data
data = yf.download(
    "GC=F", 
    start="2023-01-01", 
    end="2024-01-01", 
    interval="1d",
    multi_level_index=False
)

# Calculate technical indicators
df['Rsi'] = ta.rsi(df['Close'], length=14)  # 14-period RSI
df['Diff'] = df['Close'].diff()             # Price difference
df.dropna(inplace=True)

commission = PercentageCommission(0.001)     # 0.1% commission per trade

# Initialize trading environment
env = TradingEnv(
    data=df, 
    cash=3000,                # Initial capital
    window_size=10,          # Observation window size
    max_steps=550,           # Maximum steps per episode
    commission=commission,    # Commission scheme
)

The example above uses the DefaultObserver, which includes all columns except OHLCV by default.

Training

We’ll use stable-baselines3 (sb3) for training our trading agent. First, install the library:

$ pip install stable-baselines3

Then train the model using PPO (Proximal Policy Optimization):

from stable_baselines3 import PPO

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=500000)

Evaluation

After training, evaluate the model’s performance:

obs, _ = env.reset()
for _ in range(400):
    env.render('human')           # render live trading
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break

# print result stats
env.show_stats()
# plot a result chart 
env.plot()

You can visualize the trading process using env.render('human'). For recording purposes, you can save the renders as a video using sb3’s VecVideoRecorder wrapper.

Trading Environment Render