🔍 N-gram Speculation Visualizer

This tool visualizes how effective n-gram speculation is between the responses of a full model and a quantized model. We use it at Doubleword to investigate the forms of speculative decoding that are specific to batched inference and that power our Batched API. Hover over a word in the full response to see the matching n-grams in the quantized response.
Prompt
Full Model Response
Quantized Model Response
Current N-gram (Full)
Matching N-gram (Quant)
Matching Completion
Has Match Available (underlined)
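The legend above can be read as a small lookup procedure. The following is a minimal sketch of that idea, not the tool's actual implementation. It assumes whitespace-separated words stand in for tokens, a fixed n-gram size of 3, and that the longest agreeing continuation is the one worth highlighting. The names `tokenize` and `find_match` are hypothetical.

```python
from typing import List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: split on whitespace; a real tool may use model tokens instead.
    return text.split()


def find_match(
    full: List[str], quant: List[str], pos: int, n: int = 3
) -> Optional[Tuple[int, int]]:
    """Look up the n-gram ending at full[pos] in the quantized response.

    Returns (start, completion_len), where `start` is the index in `quant`
    at which the matching n-gram begins and `completion_len` is the number
    of following words that agree in both responses. Returns None when the
    n-gram does not occur in `quant` or when `pos` is too early to form one.
    """
    if pos + 1 < n:
        return None  # not enough preceding words to form an n-gram
    ngram = full[pos - n + 1 : pos + 1]  # "Current N-gram (Full)"
    best: Optional[Tuple[int, int]] = None
    for start in range(len(quant) - n + 1):
        if quant[start : start + n] != ngram:
            continue  # not a "Matching N-gram (Quant)"
        # Count the "Matching Completion": words after the n-gram that agree.
        k = 0
        while (
            pos + 1 + k < len(full)
            and start + n + k < len(quant)
            and full[pos + 1 + k] == quant[start + n + k]
        ):
            k += 1
        if best is None or k > best[1]:
            best = (start, k)
    return best


if __name__ == "__main__":
    full = tokenize("the cat sat on the mat today")
    quant = tokenize("a dog and the cat sat on the rug")
    # Positions where find_match is not None correspond to the
    # "Has Match Available (underlined)" words in the full response.
    for i, word in enumerate(full):
        print(i, word, find_match(full, quant, i))
```

In this sketch, a longer completion means more words could have been accepted from a single speculative lookup, which is one rough way to read the overlap the visualizer highlights.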

Try Your Own Texts

Paste any pair of responses to inspect their speculative decoding overlap directly in the browser.