Telemetry Nov 21, 2025 6 min read

Modeling the Cost of Replay Telemetry

A storage-first cost model for replayable coding matches, written before keystroke playback became part of the product surface.

AlgoArena Research



Evidence Shape

The durable design question was whether replay should add read/write pressure or mostly storage pressure.

  • Snapshots: periodic full-code state
  • Events: minimal replay metadata
  • Storage growth: main scaling pressure

  1. Store snapshots
  2. Attach to match record
  3. Read once for replay
  4. Archive later if needed
Writes: 1 (extra data folded into the existing match completion write) | Reads: 1 (match document read for replay view) | Avg payload: ~600KB (rough 2025 design estimate per match)

Replay sounds expensive until you separate operation count from payload size.


The early match replay design stored periodic code snapshots and minimal event metadata on the match record that already existed. That meant the feature did not need a new write path for every keystroke or a new collection read for every replay. The main cost pressure became storage.
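
As a rough illustration of that shape, the sketch below models a match record that carries its replay data inline. Every class and field name here (`Snapshot`, `ReplayEvent`, `PlayerReplay`, `MatchRecord`, `t_ms`, `kind`) is hypothetical. The original design only says that snapshots and minimal events ride on the existing match record; it does not specify a schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Snapshot:
    t_ms: int  # milliseconds since match start (assumed unit)
    code: str  # full editor contents at that moment


@dataclass
class ReplayEvent:
    t_ms: int
    kind: str  # e.g. "run" or "submit"; the event kinds are assumptions


@dataclass
class PlayerReplay:
    snapshots: list[Snapshot] = field(default_factory=list)
    events: list[ReplayEvent] = field(default_factory=list)


@dataclass
class MatchRecord:
    match_id: str
    winner: str
    # Replay data lives on the match record itself, keyed by player id,
    # so completing a match stays one write and replaying it stays one read.
    replay: dict[str, PlayerReplay] = field(default_factory=dict)


def payload_bytes(record: MatchRecord) -> int:
    """Size of the JSON-serialized record in UTF-8 bytes."""
    return len(json.dumps(asdict(record)).encode("utf-8"))


record = MatchRecord(
    match_id="m-1",
    winner="alice",
    replay={
        "alice": PlayerReplay(
            snapshots=[Snapshot(0, ""), Snapshot(5000, "def solve():\n    pass\n")],
            events=[ReplayEvent(4800, "run")],
        ),
        "bob": PlayerReplay(snapshots=[Snapshot(0, "")]),
    },
)
print(payload_bytes(record))
```

Measuring the serialized size of the whole record, rather than counting operations, matches the model's framing: the record grows, the number of reads and writes does not.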


What the model assumed


The 2025 design estimate was:


  • 60 to 180 code snapshots per player
  • 500 to 2,000 minimal events per player
  • roughly 120KB to 1.8MB per two-player match, with about 600KB as the planning average

Those numbers are not a live vendor pricing table. They are a product design model: where will the cost grow if this feature works?
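
To make the scaling question concrete, here is a minimal sketch that multiplies the per-match payload range by a number of retained matches. The payload figures come from the estimate above; the match volume and the choice of decimal kilobytes are assumptions made only for illustration.

```python
KB = 1_000  # assumption: decimal kilobytes; the source does not say KB vs KiB

# Per-match payload figures from the 2025 design estimate.
LOW_BYTES = 120 * KB
PLANNING_AVG_BYTES = 600 * KB
HIGH_BYTES = 1_800 * KB


def projected_storage_gb(matches: int, bytes_per_match: int) -> float:
    """Total replay storage in decimal gigabytes for a number of retained matches."""
    return matches * bytes_per_match / 1e9


retained_matches = 10_000  # hypothetical volume, for illustration only
for label, size in (
    ("low", LOW_BYTES),
    ("planning avg", PLANNING_AVG_BYTES),
    ("high", HIGH_BYTES),
):
    print(f"{label}: {projected_storage_gb(retained_matches, size):.1f} GB")
# low: 1.2 GB
# planning avg: 6.0 GB
# high: 18.0 GB
```

The operation count per match does not appear in this calculation at all. Only the payload term and the number of retained matches move the total, which is why storage is the pressure point.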


What it changed


The answer was useful because it shaped the feature:


  • do not write every keystroke as a separate document
  • do not reconstruct the entire match from raw events if snapshots are enough
  • keep replay reads simple for the reviewer
  • treat long-term retention as the real lever
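
The snapshot-first points in this list can be sketched as a lookup: to show the code at a given moment, find the latest snapshot at or before that time instead of replaying raw events from the start. The `(t_ms, code)` tuple layout and the function name `code_at` are hypothetical, and the sketch assumes snapshots are stored sorted by time.

```python
from bisect import bisect_right


def code_at(snapshots: list[tuple[int, str]], t_ms: int) -> str:
    """Return the code from the latest snapshot taken at or before t_ms.

    `snapshots` is a list of (t_ms, code) pairs sorted by t_ms.
    Returns an empty string if t_ms is earlier than the first snapshot.
    """
    times = [t for t, _ in snapshots]
    i = bisect_right(times, t_ms)  # index of first snapshot strictly after t_ms
    if i == 0:
        return ""
    return snapshots[i - 1][1]


snapshots = [
    (0, ""),
    (5_000, "def solve():\n    pass\n"),
    (10_000, "def solve():\n    return 42\n"),
]
print(repr(code_at(snapshots, 7_500)))
# 'def solve():\n    pass\n'
```

Playback precision here is limited to the snapshot interval. That is the trade the design accepts in exchange for simple reads.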

Product lesson


Replay is an evidence feature. It helps a learner see how their solution formed and helps a reviewer understand process without pretending one final code string tells the whole story.


But replay data should still be boring operationally. The best version is easy to store, easy to retrieve, and easy to age out if the product needs retention limits later.
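
One way "easy to age out" could look, as a minimal sketch: strip the replay payload from matches older than a retention window while keeping the rest of the match record. The dictionary keys (`replay`, `completed_at`), the function name `age_out_replays`, and the 90-day window are all hypothetical; the source does not describe a retention policy.

```python
from datetime import datetime, timedelta, timezone


def age_out_replays(matches: list[dict], now: datetime, max_age: timedelta) -> int:
    """Drop replay payloads from matches completed more than max_age ago.

    The match record itself is kept; only its "replay" field is cleared.
    Returns the number of matches whose replay data was removed.
    """
    stripped = 0
    for match in matches:
        has_replay = match.get("replay") is not None
        if has_replay and now - match["completed_at"] > max_age:
            match["replay"] = None
            stripped += 1
    return stripped


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
matches = [
    {"id": "old", "completed_at": now - timedelta(days=200), "replay": {"snapshots": []}},
    {"id": "new", "completed_at": now - timedelta(days=5), "replay": {"snapshots": []}},
]
print(age_out_replays(matches, now, timedelta(days=90)))
# 1
```

Because the replay data sits in one field on one record, retention becomes a single-field cleanup rather than a sweep across a separate event collection.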