Open Access

All predictions, models and accuracy data are currently open so you can verify our results first-hand. No signup, no paywall — full transparency.

Subscription plans & API access launch May 14, 2026.

About Ganhar

Machine learning football predictions

What is Ganhar?

Ganhar is a sports prediction platform that uses machine learning and statistical modelling to generate probabilistic forecasts for football matches worldwide. Our pipeline trains specialist models for 14 betting markets on more than 1.4 million historical fixtures.

How It Works

Our prediction pipeline runs in six stages, each building on the previous:

  • Data Collection — Fixtures, scores, odds, weather, injuries, lineups, and team statistics from multiple sources.
  • Elo Ratings — Dynamic team strength ratings updated after every match, with home advantage correction.
  • Feature Engineering — 179 features across 17 groups including form, goals, head-to-head, weather, squad quality, and more.
  • Dixon-Coles Model — A classical statistical model that estimates team attack and defence parameters.
  • XGBoost Models — 14 specialist gradient-boosted models, one per market, trained with ablation-tested feature selection.
  • Predictions — Probabilistic forecasts generated daily for upcoming matches.
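
As a rough illustration of the Elo Ratings stage, the sketch below applies a standard Elo update with a home advantage offset after a single match. The function names, the K-factor of 20, and the 60-point home advantage are assumptions chosen for the example, not Ganhar's actual settings.

```python
def expected_home_score(home_rating, away_rating, home_advantage=60.0):
    """Expected result for the home side (1 = win, 0.5 = draw, 0 = loss)."""
    # The home side is treated as if it were `home_advantage` points stronger.
    diff = (home_rating + home_advantage) - away_rating
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def update_elo(home_rating, away_rating, home_goals, away_goals,
               k=20.0, home_advantage=60.0):
    """Return the (home, away) ratings after one match."""
    if home_goals > away_goals:
        actual = 1.0
    elif home_goals == away_goals:
        actual = 0.5
    else:
        actual = 0.0
    expected = expected_home_score(home_rating, away_rating, home_advantage)
    # Points gained by one side are lost by the other, so the total is conserved.
    delta = k * (actual - expected)
    return home_rating + delta, away_rating - delta


# Hypothetical fixture: two equally rated teams, home side wins 2-1.
new_home, new_away = update_elo(1500.0, 1500.0, 2, 1)
```

Because the home side is already expected to do better than its raw rating suggests, a home win earns fewer points than the same win would earn away from home.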

Models & Accuracy

Each of our 14 market models is individually optimised. Feature groups are selected through ablation testing, and models are retrained weekly with the latest data. All accuracy figures shown on this site are out-of-sample — computed on matches the models had never seen during training.
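
One common way to keep accuracy figures out-of-sample is a chronological split: train on the earlier matches and evaluate only on later ones. The sketch below shows the idea, using the 80/20 ratio of the temporal validation listed under Technology. The `temporal_split` helper and the `date` field name are hypothetical.

```python
from datetime import date


def temporal_split(matches, train_fraction=0.8):
    """Split matches chronologically: earliest part for training, rest for testing."""
    ordered = sorted(matches, key=lambda m: m["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical data: ten fixtures on consecutive days, supplied out of order.
matches = [{"date": date(2024, 1, d), "home_goals": d % 3, "away_goals": 1}
           for d in range(10, 0, -1)]
train, test = temporal_split(matches)
```

Unlike a random split, this never lets a model train on matches played after the ones it is scored on. A production version would probably also cut on a date boundary, so that fixtures from the same day do not land on both sides.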

Technology

Ganhar is a purpose-built data engineering platform, not a wrapper around third-party APIs.

  • 1.4 million historical matches indexed, covering 900+ leagues and international competitions since 2010.
  • 179 statistical features organised into 17 groups — Elo ratings, form, goals, league position, head-to-head, weather, market odds, squad quality, and more.
  • 14 specialised XGBoost models, each individually optimised via ablation testing. Weekly retraining with 80/20 temporal validation and all metrics computed out-of-sample.
  • Web application backed by a relational database, with a Python machine learning engine built on gradient boosting and numerical optimisation. A Dixon-Coles model supplies exact score probabilities.
  • Fully automated pipeline: score sync every 15 minutes, daily feature recomputation, weekly model retraining, and monthly full database refresh.
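
To show how a Dixon-Coles model yields exact score probabilities, the sketch below builds a score grid from two expected-goals values and the low-score correction term from the original Dixon-Coles formulation. The inputs `lam` and `mu` (home and away expected goals, which a fitted model would derive from attack and defence parameters), the value of `rho`, and the 10-goal cap are illustrative assumptions.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def dc_tau(x, y, lam, mu, rho):
    """Dixon-Coles adjustment for the four low-scoring results."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def score_matrix(lam, mu, rho=-0.1, max_goals=10):
    """grid[x][y] = P(home scores x, away scores y), renormalised after truncation."""
    grid = [[dc_tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
             for y in range(max_goals + 1)]
            for x in range(max_goals + 1)]
    total = sum(sum(row) for row in grid)
    return [[p / total for p in row] for row in grid]


def outcome_probabilities(grid):
    """Collapse the score grid into (home win, draw, away win) probabilities."""
    home = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x > y)
    draw = sum(grid[i][i] for i in range(len(grid)))
    away = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x < y)
    return home, draw, away


# Hypothetical fixture: home side expected to score 1.5 goals, away side 1.1.
grid = score_matrix(1.5, 1.1)
home_win, draw, away_win = outcome_probabilities(grid)
```

Market probabilities such as match result, over/under, or both teams to score can be read from the same grid by summing the relevant cells.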