Methodology

A complete description of how Beeks.ai collects, normalizes, matches, and aggregates prediction market data.

Data collection

A Cloudflare Worker runs on a one-minute cron schedule. On each run it fetches open markets from Polymarket's CLOB API, Kalshi's Trade API, and Manifold's REST API in parallel. Up to 300 markets per run (100 per platform) are ingested.
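The parallel fetch with a per-platform cap can be illustrated with a minimal sketch. It is written in Python for readability rather than as Worker code, and fetch_stub is a hypothetical stand-in for the real HTTP calls; only the three platform names and the 100-per-platform cap come from the description above. Skipping a platform whose fetch fails is an assumption, not documented behavior.

```python
import asyncio

PER_PLATFORM_LIMIT = 100  # from the text: 100 markets per platform, 300 per run


async def fetch_stub(platform: str, count: int) -> list:
    """Hypothetical stand-in for an HTTP request to one platform's API."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{"platform": platform, "market_id": f"{platform}-{i}"} for i in range(count)]


async def collect_once() -> list:
    # Start all three fetches concurrently. return_exceptions=True keeps one
    # failing platform from discarding the others (an assumed policy).
    results = await asyncio.gather(
        fetch_stub("polymarket", 150),
        fetch_stub("kalshi", 80),
        fetch_stub("manifold", 100),
        return_exceptions=True,
    )
    markets = []
    for result in results:
        if isinstance(result, BaseException):
            continue  # skip the failed platform for this run
        markets.extend(result[:PER_PLATFORM_LIMIT])  # enforce the per-platform cap
    return markets


markets = asyncio.run(collect_once())
```

Here the Polymarket stub returns 150 markets but only the first 100 are kept, so the run ingests 280 markets in total.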

Each raw API response is normalized into a common schema: market_id, platform, title, probability_yes, volume_usd, closes_at, and source_url. Kalshi prices are quoted in cents and converted to a 0–1 probability by taking the mid-price, (bid + ask) / 2, and dividing by 100. Polymarket uses the YES token price directly. Manifold reports probability natively.
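As an illustration of the common schema and the Kalshi conversion, here is a minimal Python sketch. The field names follow the schema listed above; the field types, the kalshi_probability helper, and the example values are assumptions made for the example, not the production code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NormalizedMarket:
    # Field names come from the schema above; the types are assumed.
    market_id: str
    platform: str
    title: str
    probability_yes: Optional[float]
    volume_usd: Optional[float]
    closes_at: Optional[str]
    source_url: str


def kalshi_probability(bid_cents, ask_cents):
    """Mid-price in cents, rescaled to a 0-1 probability."""
    if bid_cents is None or ask_cents is None:
        return None  # no two-sided quote, so no probability
    mid_cents = (bid_cents + ask_cents) / 2
    return mid_cents / 100


# Placeholder values for illustration only.
market = NormalizedMarket(
    market_id="EXAMPLE-1",
    platform="kalshi",
    title="Example market",
    probability_yes=kalshi_probability(42, 46),
    volume_usd=None,
    closes_at="2025-01-01T00:00:00Z",
    source_url="https://example.com/markets/EXAMPLE-1",
)
```

A bid of 42 cents and an ask of 46 cents give a mid-price of 44 cents, which becomes a probability of 0.44.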

Market matching

Markets across platforms are matched into "consensus events" using a two-signal fuzzy similarity score:

A combined score ≥ 0.85 triggers a match. Markets scoring below that threshold become their own consensus event. Those with an ambiguous score (at least 0.70 but below 0.85) are also queued in pending_matches for manual review.
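The thresholds can be expressed as a small decision function. This Python sketch takes the combined score as an input and does not attempt to reproduce how the score is computed. The labels it returns are hypothetical names, and treating exactly 0.70 as ambiguous is an assumption about the boundary.

```python
MATCH_THRESHOLD = 0.85   # from the text
REVIEW_THRESHOLD = 0.70  # lower bound of the ambiguous band, from the text


def classify_pair(score: float) -> str:
    """Map a combined similarity score to an outcome label (labels are illustrative)."""
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= REVIEW_THRESHOLD:
        return "pending_review"  # would be queued in pending_matches
    return "separate"  # each market stays its own consensus event
```

For example, classify_pair(0.91) returns "match", classify_pair(0.78) returns "pending_review", and classify_pair(0.40) returns "separate".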

Matching runs within the same category first (e.g., a sports market is only matched against other sports markets), which dramatically reduces false positives.
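Restricting comparisons to a single category is a form of blocking: candidate pairs are generated only inside each category bucket. The Python sketch below assumes each market carries a category field (it is not part of the normalized schema listed earlier) and that only markets on different platforms are compared; both are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(markets):
    """Return (market_id, market_id) pairs worth scoring, grouped by category."""
    by_category = defaultdict(list)
    for market in markets:
        by_category[market["category"]].append(market)

    pairs = []
    for group in by_category.values():
        for a, b in combinations(group, 2):
            # Assumed rule: only cross-platform pairs are matching candidates.
            if a["platform"] != b["platform"]:
                pairs.append((a["market_id"], b["market_id"]))
    return pairs


example_markets = [
    {"market_id": "p1", "platform": "polymarket", "category": "sports"},
    {"market_id": "k1", "platform": "kalshi", "category": "sports"},
    {"market_id": "m1", "platform": "manifold", "category": "politics"},
    {"market_id": "k2", "platform": "kalshi", "category": "politics"},
]
pairs = candidate_pairs(example_markets)
```

Without blocking, these four markets would produce five cross-platform pairs to score; with it, only the sports pair and the politics pair remain.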

Consensus probability

For each consensus event, the probability is computed as a volume-weighted average across all matched markets that have a non-null probability:

consensus = Σ(probability_i × volume_i) / Σ(volume_i)

When volume is unknown (e.g., Manifold, which uses play-money "mana"), an equal weight is applied. This means Manifold markets have less influence on the consensus probability when paired with high-volume Polymarket or Kalshi markets.
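The formula can be sketched in Python as follows. How the equal weight for unknown volume is chosen is not specified above, so the sketch assumes a nominal fallback weight of 1.0; that constant, and treating a volume of zero as unknown, are assumptions for illustration.

```python
FALLBACK_WEIGHT = 1.0  # assumed nominal weight when volume is unknown


def consensus_probability(markets, fallback_weight=FALLBACK_WEIGHT):
    """Volume-weighted average of probability_yes over markets with a known probability."""
    weighted_sum = 0.0
    total_weight = 0.0
    for market in markets:
        probability = market.get("probability_yes")
        if probability is None:
            continue  # markets without a probability are excluded
        volume = market.get("volume_usd")
        # Missing or zero volume falls back to the nominal weight (assumption).
        weight = volume if volume else fallback_weight
        weighted_sum += probability * weight
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


polymarket = {"probability_yes": 0.60, "volume_usd": 30_000}
kalshi = {"probability_yes": 0.50, "volume_usd": 10_000}
manifold = {"probability_yes": 0.90, "volume_usd": None}

two_platforms = consensus_probability([polymarket, kalshi])
three_platforms = consensus_probability([polymarket, kalshi, manifold])
```

With the two real-money markets alone the consensus is (0.60 × 30,000 + 0.50 × 10,000) / 40,000 = 0.575. Under the assumed fallback weight, adding the Manifold market at 0.90 moves the result by less than 0.0001, which matches the reduced influence described above.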

Spread calculation

For events tracked on multiple platforms, we compute:

A spread of ≥ 5pp on a market with ≥ $10K volume typically indicates a meaningful disagreement between platforms, caused by a liquidity imbalance, different fee structures, or genuine information asymmetry.
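A minimal Python sketch of the 5pp and $10K thresholds follows. It assumes the spread is the gap between the highest and lowest platform probability, expressed in percentage points; that definition and the function names are assumptions for illustration.

```python
def spread_pp(probabilities):
    """Assumed definition: max minus min probability, in percentage points."""
    known = [p for p in probabilities if p is not None]
    if len(known) < 2:
        return None  # a spread needs at least two platforms
    # Rounding avoids float artifacts such as 4.999999999999999 at the threshold.
    return round((max(known) - min(known)) * 100, 6)


def is_notable(spread, volume_usd, min_spread_pp=5.0, min_volume_usd=10_000):
    """Apply the 5pp and $10K thresholds from the text."""
    return spread is not None and spread >= min_spread_pp and volume_usd >= min_volume_usd


example_spread = spread_pp([0.62, 0.55, None])
```

In this example the platforms sit at 62% and 55%, giving a 7pp spread. It would be flagged at $25,000 of volume but not at $5,000.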

Price history

On each cron run, a price snapshot is recorded for every market with a known probability. These snapshots power the history chart on each event page and the Movers feed. Snapshots are retained indefinitely.

Latency

The cron Worker runs every minute. API-to-database latency is under 10 seconds on typical runs. Page renders read from a 60-second KV cache, with D1 fallback. End-to-end latency from a market move on Polymarket to display on Beeks.ai is typically 60–90 seconds.
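The cache-then-database read path can be sketched as a read-through cache with a 60-second time to live. The Python below uses an in-memory dictionary in place of KV and a plain function in place of a D1 query. The class name, the injectable clock, and the choice to repopulate the cache after a miss are assumptions for illustration.

```python
import time

CACHE_TTL_SECONDS = 60  # from the text: 60-second cache


class ReadThroughCache:
    def __init__(self, load_from_db, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._load = load_from_db  # stand-in for the database query
        self._ttl = ttl
        self._clock = clock        # injectable so expiry can be simulated
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh cache hit
        value = self._load(key)    # miss or expired: fall back to the database
        self._entries[key] = (now + self._ttl, value)
        return value


# Simulated clock and loader to show the expiry behavior.
fake_now = [0.0]
db_calls = []


def load_event(key):
    db_calls.append(key)
    return {"key": key, "loaded_at": fake_now[0]}


cache = ReadThroughCache(load_event, clock=lambda: fake_now[0])
first = cache.get("event:1")   # miss: loads from the stand-in database
fake_now[0] = 30.0
second = cache.get("event:1")  # hit: still within 60 seconds
fake_now[0] = 61.0
third = cache.get("event:1")   # expired: loads again
```

A cached page can therefore be up to 60 seconds older than the database, which is consistent with the 60–90 second end-to-end figure once the one-minute cron interval is taken into account.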

API

The public JSON API is available at /api/v1/markets.json. Parameters: limit (max 100), offset, category (politics, sports, crypto, economics, science, other). Results are sorted by total_volume_usd descending. Rate limit: 60 requests per minute per IP on the free tier.
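As a usage illustration, the Python sketch below builds a request URL from the documented parameters and rejects values outside the documented ranges. The host https://beeks.ai is an assumption (only the path is given above), and markets_url is a hypothetical helper, not part of any official client.

```python
from urllib.parse import urlencode

BASE_URL = "https://beeks.ai"  # assumed host; only the path is documented
CATEGORIES = {"politics", "sports", "crypto", "economics", "science", "other"}
MAX_LIMIT = 100


def markets_url(limit=100, offset=0, category=None, base_url=BASE_URL):
    """Build a /api/v1/markets.json URL, validating the documented parameters."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    params = {"limit": limit, "offset": offset}
    if category is not None:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        params["category"] = category
    return f"{base_url}/api/v1/markets.json?{urlencode(params)}"


url = markets_url(limit=50, category="crypto")
```

When paging through results with offset, spacing requests at least one second apart keeps a client within the 60 requests per minute limit.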

Limitations

Questions about the methodology: hello@beeks.ai