Can you save on LLM tokens using images instead of text? (pagewatch.ai)
21 points by lpellis 6 days ago
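Before the discussion, a minimal sketch (an editor's illustration, not taken from the linked article) of how the comparison in the title might be measured: count the text tokens of a passage with a BPE tokenizer, and estimate the image tokens for the same passage rendered to an image. The patch-based accounting (one token per 16x16 patch, as in the ViT paper linked in the thread) and the helper names text_token_count / image_token_count are assumptions for illustration; real providers use their own formulas.

# Editor's sketch under the assumptions above; not the article's method.
import math

import tiktoken  # pip install tiktoken

def text_token_count(passage: str, encoding_name: str = "cl100k_base") -> int:
    """Count text tokens with a BPE tokenizer."""
    return len(tiktoken.get_encoding(encoding_name).encode(passage))

def image_token_count(width_px: int, height_px: int, patch_px: int = 16) -> int:
    """Assumed ViT-style accounting: one token per patch of the image."""
    return math.ceil(width_px / patch_px) * math.ceil(height_px / patch_px)

passage = "word " * 1000                             # ~1000 words of text
print("text tokens :", text_token_count(passage))    # roughly one token per word for this toy input
print("image tokens:", image_token_count(256, 256))  # 256 tokens under this assumption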
bikeshaving 5 hours ago
Does this mean we’ll finally get empirical proof for the aphorism “a picture is worth a thousand words”?
https://en.wikipedia.org/wiki/A_picture_is_worth_a_thousand_...

  heltale 4 hours ago
  I suppose it’s only worth 256 words at a time right now. ;)
  https://arxiv.org/abs/2010.11929

    estebarb 4 hours ago
    The CALM paper (https://shaochenze.github.io/blog/2025/CALM/) says it is possible to compress 4 tokens into a single embedding, so... image = 4 × 256 = 1024 words > 1000 words. QED

      bikeshaving 2 hours ago
      2.4% relative error is not bad.

      behnamoh 2 hours ago
      How do you decompress those 4 words from one token?

ashed96 an hour ago
In my experience, LLMs tend to take noticeably longer to process images than text.

floodfx 5 hours ago
Why are completion tokens higher with image prompts when the text output was about the same?

  Garlef 2 hours ago
  "Thinking" mode.