NHacker Next
From 300KB to 69KB per Token: How LLM Architectures Solve the KV Cache Problem (news.future-shock.ai)
146 points by future-shock-ai 4 days ago | 11 comments