

You focus on KV cache research, we make it compatible with vLLM

By LMCache Team

[Illustration: traditional KV cache management (a frustrated user) vs. LMCache (a happy user), with labels for KV cache processing and storage.]

🚀 Building your system on KV cache? Try building it on LMCache!

By using LMCache for your research, you can focus on KV cache management, while we handle all the vLLM integration and compatibility for you.

Here’s why LMCache works as your research testbed:

  • Simple codebase, all vLLM functionality. vLLM has over 200k lines of Python code. LMCache? Just 7k lines, specialized for KV cache, so you have far less code to read and modify, while LMCache handles all the vLLM feature compatibility for you.
  • Boost your evaluations with ready-made KV cache research. With implementations of the latest research (think CacheGen and CacheBlend), you can use them as baselines or combine your system with them to demonstrate even larger gains.
  • Showcase your impact, fast. Implement in LMCache, and your research is instantly runnable with vLLM (see the sketch after this list). Plus, we have rich demos for you to showcase your system in conference talks or presentations.
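
To make the "instantly runnable with vLLM" point concrete, here is a minimal sketch of what serving with LMCache as vLLM's KV connector looks like. The connector name, config fields, and model name follow recent LMCache/vLLM documentation but are assumptions that may differ across versions; treat this as illustrative, not the definitive integration.

```python
# Minimal sketch: vLLM inference with LMCache plugged in as the KV connector.
# Field names ("LMCacheConnectorV1", kv_role="kv_both") follow recent
# LMCache docs and may vary across vLLM/LMCache releases (assumption).
from vllm import LLM, SamplingParams
from vllm.config import KVTransferConfig

# Route vLLM's KV cache stores and retrievals through LMCache.
# kv_role="kv_both" means this instance both saves and loads KV cache.
ktc = KVTransferConfig(
    kv_connector="LMCacheConnectorV1",
    kv_role="kv_both",
)

# The model here is just a placeholder; any vLLM-supported model works.
llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    kv_transfer_config=ktc,
    gpu_memory_utilization=0.8,
)

# From here on, inference is plain vLLM; KV cache management
# (offloading, reuse, your research logic) happens inside LMCache.
outputs = llm.generate(
    ["Explain KV cache reuse in one paragraph."],
    SamplingParams(temperature=0.0, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

Once the connector is in place, your KV cache logic lives entirely in LMCache's small codebase, and vLLM's serving stack stays untouched.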

Check our codebase and documentation for more information!

Update: we are delighted to see that papers (like Cake) have already started using LMCache as their research testbed.
