Most prediction sites hide their process. We publish ours. Every data source, every feature, every training decision, every backtest result, all open for you to scrutinize. Click into any model below to read the full methodology.
Our MLB methodology is fully documented, from Statcast data ingestion to feature construction, model training, Monte Carlo game simulation, and complete backtest results across the full 2025 season.
Every detail of our tennis model is public: the data sources, how we engineer features, the ML training process, simulation methodology, and full out-of-sample backtest results. Nothing is hidden.
Our PGA model is currently in development. When it launches, the full methodology (data sources, Strokes Gained modeling, course fit scoring, and backtest results) will be published here just like the others.
Everything you need to know about how Obsidic works.
Obsidic is a predictive sports analytics platform. We run machine learning models and thousands of Monte Carlo simulations on every event to generate win probabilities, edge calculations, player prop projections, and full probability distributions, all surfaced in a clean, daily-updated dashboard.
Most prediction sites give you a number with no methodology behind it. Obsidic is fully transparent. Every model's data sources, feature engineering, training process, and backtest results are publicly documented. We show you the model probability alongside the market-implied probability, then let you decide.
We currently cover Tennis (ATP and WTA) and MLB (game-level predictions plus player props), both with full backtested results. PGA Golf is in development and will launch next. All sports are accessible from a single dashboard.
Backtesting means we test the model on historical data it has never seen before. The model makes predictions, then we score them against actual results. This is the gold standard for validating any predictive model. Our Tennis model was backtested on 2,381 matches and MLB on 2,416 games, all out-of-sample, no hindsight bias.
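To make "score them against actual results" concrete, here is a minimal sketch of how a set of out-of-sample predictions could be scored. The metrics shown (Brier score, log loss, accuracy) and the function name `score_predictions` are assumptions chosen for illustration, not a description of the exact metrics Obsidic reports.

```python
import math

def score_predictions(probs, outcomes):
    """Score predicted win probabilities against actual results.

    probs    -- predicted probability that side A wins, one per event
    outcomes -- 1 if side A actually won, 0 otherwise
    Both metrics below are common choices; they are assumptions here.
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    n = len(probs)
    eps = 1e-12  # keeps log() finite when a prediction is exactly 0 or 1

    # Brier score: mean squared error of the probabilities (lower is better).
    brier = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n

    # Log loss: penalizes confident wrong predictions heavily (lower is better).
    log_loss = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(probs, outcomes)
    ) / n

    # Accuracy: share of events where the favored side actually won.
    accuracy = sum((p >= 0.5) == (y == 1) for p, y in zip(probs, outcomes)) / n

    return {"brier": brier, "log_loss": log_loss, "accuracy": accuracy}

# Hypothetical held-out predictions and results, for illustration only.
report = score_predictions([0.70, 0.40, 0.55, 0.20], [1, 0, 0, 0])
print(report)
```

For the scores to count as out-of-sample, the scored events must be ones the model never trained on, for example every match played after the training cutoff date.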
Monte Carlo simulations model thousands of random outcomes based on the underlying probabilities. For tennis, we run 10,000 simulations per match, simulating each set and game to produce win probabilities, set score distributions, and tiebreak likelihoods. For MLB, we run 5,000 simulations per game to model run distributions, first-five outcomes, and more.
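As a rough illustration of the idea, the sketch below simulates a best-of-three tennis match game by game. It is heavily simplified and is not the production model: it assumes a single fixed probability `p_game` that player A wins any game (ignoring serve), and it resolves a tiebreak with one weighted draw. All function and parameter names are hypothetical.

```python
import random
from collections import Counter

def simulate_set(p_game, rng):
    """Play one set game by game; return (games_a, games_b, was_tiebreak)."""
    a = b = 0
    while True:
        if rng.random() < p_game:
            a += 1
        else:
            b += 1
        # A set ends at 6+ games with a two-game lead (6-0 .. 6-4, 7-5).
        if (a >= 6 or b >= 6) and abs(a - b) >= 2:
            return a, b, False
        # At 6-6 a tiebreak decides the set (one weighted draw: a simplification).
        if a == 6 and b == 6:
            if rng.random() < p_game:
                return 7, 6, True
            return 6, 7, True

def simulate_match(p_game, n_sims=10_000, seed=0):
    """Estimate match-level probabilities from n_sims simulated matches."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    wins_a = 0
    matches_with_tiebreak = 0
    set_scores = Counter()
    for _ in range(n_sims):
        sets_a = sets_b = 0
        had_tiebreak = False
        while sets_a < 2 and sets_b < 2:  # best of three sets
            a, b, tiebreak = simulate_set(p_game, rng)
            had_tiebreak = had_tiebreak or tiebreak
            if a > b:
                sets_a += 1
            else:
                sets_b += 1
        wins_a += sets_a == 2
        matches_with_tiebreak += had_tiebreak
        set_scores[(sets_a, sets_b)] += 1
    return {
        "win_prob": wins_a / n_sims,
        "tiebreak_prob": matches_with_tiebreak / n_sims,
        "set_score_dist": {s: c / n_sims for s, c in sorted(set_scores.items())},
    }

result = simulate_match(p_game=0.55)
print(result["win_prob"], result["set_score_dist"])
```

Because every simulated match is recorded, the same run yields the win probability, the distribution over set scores, and the tiebreak likelihood at once, which is the main appeal of simulation over a single closed-form number.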
Edge is the difference between our model's probability and the implied probability from the market odds. For example, if our model says a team has a 65% chance of winning but the market implies only 55%, the edge is +10 percentage points. Positive edge means the market is undervaluing that outcome relative to our model.
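The arithmetic can be sketched in a few lines. This example assumes decimal odds, where the implied probability is simply 1 divided by the odds, and it does not remove the bookmaker's margin; both choices, along with the function names, are illustrative assumptions rather than a statement of how the dashboard computes edge.

```python
def implied_probability(decimal_odds: float) -> float:
    """Convert decimal odds to an implied probability (margin not removed)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds

def edge(model_prob: float, market_prob: float) -> float:
    """Edge = model probability minus market-implied probability."""
    return model_prob - market_prob

# The example from the text: model says 65%, market implies 55%.
model_prob = 0.65
market_prob = 0.55
print(f"edge: {edge(model_prob, market_prob) * 100:+.1f} percentage points")

# Starting from odds instead: decimal odds of 2.00 imply a 50% chance.
print(f"implied: {implied_probability(2.00):.0%}")
```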
For Tennis: 46,400+ historical matches across ATP and WTA, 1,335 player Elo profiles, surface-specific stats, recent form, and head-to-head records, all engineered into 113 ML features. For MLB: 3.4M+ Statcast pitches, 709,281 plate appearances, park factors, pitcher/batter platoon splits, bullpen usage, and rolling performance windows.
The PGA Golf model is currently in development. It will feature Strokes Gained decomposition (6 components), course-fit scoring, head-to-head matchup projections, and full-field tournament simulations. Sign up for early access to be notified the moment it goes live.