Algorithmic Trading System

2024 | Machine Learning, Time Series Analysis, Unsupervised Clustering

Overview

This Algorithmic Trading System identifies, clusters, and optimizes trading opportunities within the S&P 500 universe. Using rolling ordinary least squares (OLS) regression, factor modeling, and K-Means clustering, it ranks equities on technical indicators and fundamental factors. It then constructs an optimized portfolio that adapts to market conditions on a monthly basis.

Challenge

Traders and investment firms are often overwhelmed by the complexity of financial markets, especially when analyzing large datasets from thousands of securities. Traditional approaches struggle to handle complex factor modeling and the volatility inherent in high-frequency market data. The challenge was to systematically incorporate multiple data sources (technical, fundamental, and factor-based) into a cohesive framework that delivers actionable trading signals.

Solution

The system addresses these challenges in six steps:

  • Rolling Factor Regression: Uses RollingOLS from statsmodels to calculate time-varying betas against the Fama-French factors, capturing the evolving risk exposure of each security.
  • Technical Feature Engineering: Computes RSI, Bollinger Bands, ATR, and Garman-Klass volatility to capture both momentum and volatility.
  • K-Means Clustering: Groups stocks with similar RSI profiles or volatility regimes to differentiate between momentum-driven and mean-reverting securities.
  • Monthly Liquidity Filter: Retains only the top 150 most liquid stocks to ensure viable trade execution with minimal slippage.
  • Portfolio Optimization: Leverages PyPortfolioOpt to generate maximum Sharpe ratio allocations within user-defined constraints.
  • Continuous Adaptation: Uses a rolling window of 12 months for optimization, retraining every month to respond rapidly to market shifts.
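
Two of the steps above can be sketched in a few lines of pandas. This is a minimal illustration, not the project's actual code. The function names are hypothetical. The Garman-Klass formula is the standard per-bar estimator, and the project may use a variant. The liquidity measure is assumed to be average monthly dollar volume (close price times share volume), which the write-up does not specify.

```python
import numpy as np
import pandas as pd


def garman_klass_variance(open_, high, low, close):
    """Per-bar Garman-Klass variance estimate from OHLC prices.

    Standard form: 0.5 * ln(H/L)^2 - (2*ln(2) - 1) * ln(C/O)^2.
    Works on scalars, NumPy arrays, or aligned pandas objects.
    """
    log_hl = np.log(high / low)
    log_co = np.log(close / open_)
    return 0.5 * log_hl ** 2 - (2.0 * np.log(2.0) - 1.0) * log_co ** 2


def top_liquid_by_month(dollar_volume, top_n=150):
    """Return a boolean (month x ticker) mask of the top_n most liquid names.

    dollar_volume: DataFrame with a DatetimeIndex and one column per ticker.
    Assumption: liquidity is the mean daily dollar volume within each month.
    """
    months = dollar_volume.index.to_period("M")
    monthly = dollar_volume.groupby(months).mean()
    # Rank 1 is the most liquid ticker in that month; ties broken by column order.
    ranks = monthly.rank(axis=1, ascending=False, method="first")
    return ranks <= top_n
```

The mask can then be used to drop tickers from that month's candidate universe before clustering and optimization. Ranking within each month, rather than over the full sample, avoids selecting stocks on liquidity they only acquired later.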

Technical Implementation

Built on a robust Python ecosystem, this project incorporates several libraries for data retrieval, feature calculation, clustering, and optimization:

  • Data Retrieval: yfinance for seamless downloading of historical prices and volumes
  • Factor Modeling: Fama-French factors from pandas_datareader to compute rolling betas
  • Clustering: Scikit-learn KMeans for grouping stocks by common technical or fundamental patterns
  • Optimization: PyPortfolioOpt for efficient frontier-based portfolio construction
  • Visualization: Matplotlib to chart cluster allocations and compare portfolio returns to benchmarks
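
To make the factor-modeling step concrete, the sketch below shows what a rolling regression estimates. The project itself uses RollingOLS from statsmodels. This stand-in fits each trailing window with NumPy's least-squares solver so that it has no dependency beyond NumPy. The function name and the window length are illustrative assumptions, since the write-up does not state the regression window.

```python
import numpy as np


def rolling_betas(excess_returns, factors, window):
    """Estimate factor betas over each trailing window by least squares.

    excess_returns: shape (T,), asset returns minus the risk-free rate.
    factors: shape (T, K), factor returns (e.g. the Fama-French factors).
    Returns an array of shape (T, K + 1): column 0 is the intercept and
    the remaining columns are betas. Rows before the first full window are NaN.
    """
    y = np.asarray(excess_returns, dtype=float)
    x = np.asarray(factors, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n_obs, n_factors = x.shape
    design = np.column_stack([np.ones(n_obs), x])  # prepend intercept column
    out = np.full((n_obs, n_factors + 1), np.nan)
    for end in range(window, n_obs + 1):
        rows = slice(end - window, end)
        coef, *_ = np.linalg.lstsq(design[rows], y[rows], rcond=None)
        out[end - 1] = coef  # estimate dated at the last row of the window
    return out
```

Each row uses only data up to and including its own date, so the betas can be lagged by one period and fed into the ranking step without look-ahead.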

Results and Impact

The system has shown promising performance in backtests and experimental live trading:

  • Delivered returns that outperformed the S&P 500 over multiple test periods
  • Maintained a more favorable risk profile by dynamically reducing exposure to volatile or illiquid stocks
  • Identified hidden factor exposures through rolling regressions, helping to limit potential drawdowns
  • Provided a scalable framework that can be extended to include alternative data sources or new market factors

Project Details

ROLE

Quantitative Analyst

DURATION

2 months

TEAM

Solo (Data Scientist)

TECHNOLOGIES

statsmodels, scikit-learn, yfinance, PyPortfolioOpt, pandas, NumPy

OUTCOME

The algorithmic strategy outperformed the S&P 500 during a period of economic growth. I learned a lot about the importance of data preprocessing and feature engineering in building a trading strategy. Moreover, I learned a lot about financial modeling and the importance of risk management in trading.