🔍 N-gram Speculation Visualizer

This tool visualizes how effective n-gram speculation is between a full model's response and a quantized model's response to the same prompt. We use this tool at doubleword to investigate the forms of speculative decoding specific to batched inference that power our Batched API. Hover over a word in the full response to see the matching n-gram in the quantized response and the completion that follows it.
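The matching described above can be sketched in a few lines of Python. This is a minimal illustration and not the visualizer's actual code: the function names (`build_ngram_index`, `accepted_length`, `speculate`) are hypothetical, and it assumes whitespace tokenization, exact string equality between tokens, and that the n-gram starting at each position of the full response is the one looked up.

```python
from typing import Dict, List, Tuple


def build_ngram_index(tokens: List[str], n: int) -> Dict[Tuple[str, ...], List[int]]:
    """Map every n-gram in `tokens` to the list of positions where it starts."""
    index: Dict[Tuple[str, ...], List[int]] = {}
    for start in range(len(tokens) - n + 1):
        index.setdefault(tuple(tokens[start:start + n]), []).append(start)
    return index


def accepted_length(full: List[str], quant: List[str], full_pos: int, quant_pos: int) -> int:
    """Count how many consecutive tokens agree, starting at the given positions."""
    count = 0
    while (
        full_pos + count < len(full)
        and quant_pos + count < len(quant)
        and full[full_pos + count] == quant[quant_pos + count]
    ):
        count += 1
    return count


def speculate(full_text: str, quant_text: str, n: int = 3) -> List[Tuple[int, int, int]]:
    """For each n-gram of the full response that also occurs in the quantized
    response, return (full_start, quant_start, matching_completion_length).

    Assumption: when an n-gram occurs several times in the quantized response,
    the occurrence with the longest matching completion is kept.
    """
    full = full_text.split()    # assumed tokenization: split on whitespace
    quant = quant_text.split()
    index = build_ngram_index(quant, n)

    matches: List[Tuple[int, int, int]] = []
    for start in range(len(full) - n + 1):
        key = tuple(full[start:start + n])
        best = None
        for quant_start in index.get(key, []):
            length = accepted_length(full, quant, start + n, quant_start + n)
            if best is None or length > best[1]:
                best = (quant_start, length)
        if best is not None:
            matches.append((start, best[0], best[1]))
    return matches


if __name__ == "__main__":
    full_response = "the cat sat on the mat today"
    quant_response = "the cat sat on a rug"
    for full_start, quant_start, length in speculate(full_response, quant_response, n=2):
        print(full_start, quant_start, length)
```

In this sketch, a position with any entry in the result corresponds to "Has Match Available", and the third value is the number of tokens after the n-gram that the quantized response would have predicted correctly, which is what "Matching Completion" highlights.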

Controls: Question, N-gram Size

Prompt

Full Model Response

Quantized Model Response

Current N-gram (Full)

Matching N-gram (Quant)

Matching Completion

Has Match Available (underlined)

Try Your Own Texts

Paste any pair of responses to inspect their speculative decoding overlap directly in the browser.

Inputs: Prompt (optional), Full Model Response, Quantized Model Response