AI Fair Value

HALO uses an ensemble of machine learning models to estimate the fair probability of binary market outcomes, surfacing arbitrage opportunities when market prices diverge from that estimate.

Model Architecture

Our fair value prediction system combines multiple ML models in an ensemble approach:

XGBoost Component

A gradient boosting model that analyzes:

  • Historical price movements
  • Volume patterns
  • Market microstructure
  • Time-based features
  • Technical indicators

XGBoost_prediction = f(technical_indicators, price_features, volume_features)

LSTM Component

A long short-term memory network that captures:

  • Temporal dependencies
  • Sequential patterns
  • Long-term trends
  • Market momentum
  • Time series patterns

LSTM_prediction = g(price_sequence, volume_sequence, temporal_features)

PPO Component

Proximal Policy Optimization for reinforcement learning:

  • Optimal trading policy learning
  • Risk-adjusted decision making
  • Adaptive strategy based on market feedback

LLM Component

Large language model for sentiment and context analysis:

  • Market sentiment signals
  • News and social media context
  • Narrative understanding

Ensemble Combination

Final predictions combine all models with learned weights:

fair_value = α₁ * XGBoost + α₂ * LSTM + α₃ * PPO + α₄ * LLM

Where weights (α₁, α₂, α₃, α₄) are dynamically adjusted based on market conditions and model performance.
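
The weighted combination above can be sketched in a few lines of Python. The component probabilities and raw weight values below are illustrative placeholders, and the softmax normalization is one common choice for keeping the weights positive and summing to 1, not HALO's documented method.

```python
import math

def normalize_weights(raw_weights):
    """Softmax so the learned weights are positive and sum to 1."""
    exps = [math.exp(w) for w in raw_weights]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_fair_value(predictions, raw_weights):
    """fair_value = sum(alpha_i * model_i) with normalized alphas."""
    alphas = normalize_weights(raw_weights)
    return sum(a * p for a, p in zip(alphas, predictions))

# Hypothetical per-model probabilities for one binary market:
preds = {"xgboost": 0.62, "lstm": 0.58, "ppo": 0.65, "llm": 0.55}
fv = ensemble_fair_value(list(preds.values()), [0.4, 0.3, 0.2, 0.1])
```

Because the normalized weights form a convex combination, the ensemble output always lies between the lowest and highest component prediction.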

Feature Engineering

The model analyzes 45 features across multiple categories:

Technical Indicators (25 features)

  • Price momentum indicators (RSI, MACD, moving averages)
  • Volatility measures (Bollinger Bands, ATR)
  • Trend indicators
  • Oscillators
  • Price action patterns
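
As one concrete example of the indicators listed above, here is a simple (non-smoothed) RSI computation on synthetic prices; the 14-period window is the conventional default, and the data is a placeholder.

```python
def rsi(prices, period=14):
    """Relative Strength Index over the last `period` price changes.

    Uses plain sums of gains/losses rather than Wilder's smoothing,
    which keeps the sketch short.
    """
    changes = [b - a for a, b in zip(prices, prices[1:])]
    recent = changes[-period:]
    gains = sum(c for c in recent if c > 0)
    losses = -sum(c for c in recent if c < 0)
    if losses == 0:
        return 100.0  # all gains -> maximally overbought reading
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)
```

A strictly rising series reads 100, and a series with equal gains and losses reads 50, matching the indicator's standard interpretation.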

Sentiment Signals (4 features)

  • Market sentiment from LLM analysis
  • Social media sentiment
  • News sentiment
  • Overall market mood

Funding Rates (8 features)

  • Current funding rates
  • Historical funding rate trends
  • Funding rate spreads
  • Cross-market funding rate comparisons

Order Book Depth Metrics (8 features)

  • Bid-ask spread width
  • Order book imbalance
  • Liquidity depth at various price levels
  • Market maker activity
  • Order flow patterns
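
The order book imbalance metric above can be illustrated with a minimal sketch; the depth figures are synthetic, and summing the top levels is one simple convention.

```python
def book_imbalance(bid_sizes, ask_sizes):
    """(bid depth - ask depth) / total depth over the supplied levels.

    Ranges from -1 (all asks) to +1 (all bids); 0 means a balanced book.
    """
    b, a = sum(bid_sizes), sum(ask_sizes)
    return (b - a) / (b + a)

# Twice as much resting bid depth as ask depth -> positive imbalance:
imb = book_imbalance(bid_sizes=[3, 2, 1], ask_sizes=[1, 1, 1])
```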

Feature Selection

Features are selected using mutual information and recursive feature elimination to maximize predictive power while avoiding overfitting. The 45-feature set provides comprehensive market coverage.
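
The mutual-information half of the selection step can be sketched for discrete features and a binary outcome. The data below is synthetic, and real pipelines would discretize continuous features first.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in nats for two discrete sequences of equal length."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

# A perfectly informative feature vs. an independent one:
labels    = [0, 1, 0, 1, 0, 1, 0, 1]
feat_good = labels[:]                   # identical to the label
feat_bad  = [0, 0, 1, 1, 0, 0, 1, 1]   # independent of the label
```

Ranking features by this score and keeping the top ones (then pruning further with recursive feature elimination) is the standard pattern the text describes.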

Training Process

Models are trained on:

  • Training Set: 70% of historical data
  • Validation Set: 15% for hyperparameter tuning
  • Test Set: 15% for final evaluation
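
The 70/15/15 split above can be sketched as follows; a chronological (unshuffled) split is assumed here, which is the standard choice for time-ordered market data, though the source does not state it.

```python
def chronological_split(rows, train_frac=0.70, val_frac=0.15):
    """Split time-ordered rows into train/validation/test partitions."""
    n = len(rows)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

data = list(range(100))  # stand-in for 100 time-ordered observations
train, val, test = chronological_split(data)
```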

Training includes:

  1. Data preprocessing and normalization
  2. Feature engineering and selection
  3. Hyperparameter optimization (grid search)
  4. Cross-validation for robustness
  5. Ensemble weight learning
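
Step 3 (grid search) can be sketched as an exhaustive scan over hyperparameter combinations; the grid values and the toy scoring function below are placeholders, not HALO's actual search space.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Return the parameter combination with the highest validation score."""
    keys = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

grid = {"learning_rate": [0.01, 0.1], "max_depth": [3, 5]}
# Toy validation score: pretend lr 0.1 with shallower trees does best.
score = lambda p: p["learning_rate"] - 0.01 * p["max_depth"]
best, _ = grid_search(grid, score)
```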

Model Performance

Our ensemble model achieves:

  • Accuracy: 65-70% on binary outcomes
  • Brier Score: 0.18-0.22 (lower is better)
  • ROC AUC: 0.72-0.75
  • Sharpe Ratio: 2.1-2.5 on trading signals
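
For reference, two of the metrics listed above are computed as follows; the toy predictions and outcomes are illustrative, not the model's reported results.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def accuracy(probs, outcomes, threshold=0.5):
    """Fraction of thresholded calls that match the realized outcome."""
    hits = sum((p > threshold) == bool(o) for p, o in zip(probs, outcomes))
    return hits / len(probs)

probs    = [0.8, 0.3, 0.6, 0.9, 0.2]
outcomes = [1,   0,   1,   1,   0]
```

A well-calibrated forecaster minimizes the Brier score; always predicting 0.5 scores 0.25, so values in the 0.18-0.22 range indicate real probabilistic skill.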

On these metrics, the ensemble significantly outperforms naive market prices and simple moving-average baselines, providing a genuine edge in prediction markets.

Confidence Scoring

Each prediction includes a confidence score:

confidence = 1 - (model_uncertainty / max_uncertainty)

Confidence scores are used to:

  • Filter low-quality trading opportunities
  • Size positions appropriately
  • Manage risk exposure
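
The confidence formula above can be sketched as below. The source does not specify the uncertainty measure, so this sketch assumes disagreement (variance) across the component models as model_uncertainty, with max_uncertainty as its theoretical ceiling.

```python
def confidence(model_uncertainty, max_uncertainty):
    """confidence = 1 - (model_uncertainty / max_uncertainty), clipped to [0, 1]."""
    c = 1.0 - (model_uncertainty / max_uncertainty)
    return max(0.0, min(1.0, c))

def prediction_variance(preds):
    """Variance across component predictions: a simple disagreement proxy."""
    mean = sum(preds) / len(preds)
    return sum((p - mean) ** 2 for p in preds) / len(preds)

# Models that agree closely -> high confidence; wide disagreement -> low.
agree    = [0.61, 0.63, 0.62, 0.60]
disagree = [0.20, 0.80, 0.35, 0.70]
MAX_VAR  = 0.25  # largest possible variance for values in [0, 1]
```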

Model Updates

The models are updated on three cadences:

  • Weekly: Full retraining on latest data
  • Daily: Incremental updates with new observations
  • Real-time: Online learning for rapid adaptation
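
The real-time adaptation step can be illustrated with a multiplicative-weights update that shifts ensemble weight toward models that erred less on each observed outcome; this particular update rule is an illustrative choice, not HALO's documented method.

```python
import math

def online_weight_update(weights, preds, outcome, eta=1.0):
    """Multiplicative-weights step: w_i *= exp(-eta * squared_error_i),
    then renormalize so the weights still sum to 1."""
    scaled = [w * math.exp(-eta * (p - outcome) ** 2)
              for w, p in zip(weights, preds)]
    total = sum(scaled)
    return [w / total for w in scaled]

weights = [0.25, 0.25, 0.25, 0.25]   # start uniform across four models
preds   = [0.9, 0.6, 0.4, 0.1]       # per-model probabilities for one market
weights = online_weight_update(weights, preds, outcome=1)
```

After the update, the model that predicted 0.9 for an outcome that resolved to 1 gains weight at the expense of the model that predicted 0.1.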

Continuous Improvement

As more data becomes available, model performance improves. We track performance metrics over time to ensure the models remain effective.

Limitations

While powerful, the models have limitations:

  • Black Swan Events: Unpredictable market shocks
  • Data Quality: Dependent on accurate market data
  • Overfitting Risk: Mitigated, but not eliminated, by regular out-of-sample validation
  • Market Regime Changes: Models adapt but may lag

Transparency

We provide:

  • Model performance metrics on the dashboard
  • Feature importance rankings
  • Prediction explanations
  • Historical accuracy tracking

Next Steps