pymllm.mem_cache.chunk_cache
============================

.. py:module:: pymllm.mem_cache.chunk_cache

.. autoapi-nested-parse::

   No-op prefix cache used when ``disable_radix_cache=True``.

   Every request is computed from scratch: there is no prefix sharing, no
   tree structure, and no eviction logic.  This is the simplest possible
   :class:`~pymllm.mem_cache.base_prefix_cache.BasePrefixCache` implementation.


Classes
-------

.. autoapisummary::

   pymllm.mem_cache.chunk_cache.ChunkCache


Module Contents
---------------

.. py:class:: ChunkCache(token_to_kv_pool_allocator = None, device = torch.device('cpu'))

   Bases: :py:obj:`pymllm.mem_cache.base_prefix_cache.BasePrefixCache`


   No-op prefix cache: no prefix sharing, no eviction.

   When the radix cache is disabled, this class replaces it so that
   the rest of the system can call the same interface without branching.

   :param token_to_kv_pool_allocator: Pool allocator used to free KV indices on request completion.
   :param device: Device for empty tensors returned by :meth:`match_prefix`.
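
   The "same interface without branching" idea can be sketched with two
   stand-in classes chosen once by a flag. This is an illustration under
   stated assumptions, not the ``pymllm`` implementation:
   ``SketchRadixCache``, ``SketchChunkCache``, ``make_cache`` and
   ``cached_prefix_len`` are hypothetical names, a dictionary stands in for
   the radix tree, and plain lists stand in for tensors of KV indices.

   .. code-block:: python

      from typing import Dict, List, Sequence, Tuple


      class SketchRadixCache:
          """Hypothetical prefix-sharing cache (a dict replaces the radix tree)."""

          def __init__(self) -> None:
              self._store: Dict[Tuple[int, ...], List[int]] = {}

          def insert(self, key: Sequence[int], value: Sequence[int]) -> None:
              # Remember the KV indices for every prefix of the key.
              for end in range(1, len(key) + 1):
                  self._store[tuple(key[:end])] = list(value[:end])

          def match_prefix(self, key: Sequence[int]) -> List[int]:
              # Return the KV indices of the longest stored prefix, if any.
              for end in range(len(key), 0, -1):
                  hit = self._store.get(tuple(key[:end]))
                  if hit is not None:
                      return hit
              return []


      class SketchChunkCache:
          """Hypothetical no-op cache exposing the same two methods."""

          def insert(self, key: Sequence[int], value=None) -> None:
              # Nothing is stored.
              return None

          def match_prefix(self, key: Sequence[int]) -> List[int]:
              # Always an empty match.
              return []


      def make_cache(disable_radix_cache: bool):
          # The flag is consulted once, here; callers never test it again.
          return SketchChunkCache() if disable_radix_cache else SketchRadixCache()


      def cached_prefix_len(cache, tokens: Sequence[int], kv: Sequence[int]) -> int:
          # Caller code is identical for both cache types.
          cache.insert(tokens, kv)
          return len(cache.match_prefix(tokens))

   With the no-op variant, ``cached_prefix_len`` reports zero reusable tokens
   for any input, so every request is computed in full, while the calling
   code stays the same for both cache types.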


   .. py:attribute:: pool
      :value: None


   .. py:attribute:: device


   .. py:method:: reset()

      Clear all cached state and re-initialise.


   .. py:method:: match_prefix(key)

      Always return an empty match, since no prefixes are shared. Empty
      tensors in the result are created on :attr:`device`.


   .. py:method:: insert(key, value = None, **kwargs)

      No-op: nothing is cached.


   .. py:method:: evict(num_tokens, swa_num_tokens = 0)

      No-op: nothing to evict.


   .. py:method:: inc_lock_ref(node)

      No-op: nothing to lock.


   .. py:method:: dec_lock_ref(node, **kwargs)

      No-op: nothing to unlock.
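
   The methods above can be exercised together in a small sketch. It only
   mirrors the documented method names and parameters. The return values (an
   empty list, ``None``), the ``NoOpCacheSketch`` class and the
   ``run_request`` helper are assumptions made for illustration, and the real
   class may return tensors or other result objects.

   .. code-block:: python

      from typing import List, Sequence


      class NoOpCacheSketch:
          """Hypothetical stand-in mirroring the documented method names."""

          def __init__(self, pool=None) -> None:
              self.pool = pool  # assumed to hold the allocator, if any

          def reset(self) -> None:
              # There is no cached state to clear in this sketch.
              return None

          def match_prefix(self, key: Sequence[int]) -> List[int]:
              return []  # assumption: an empty list represents "no match"

          def insert(self, key, value=None, **kwargs) -> None:
              return None

          def evict(self, num_tokens: int, swa_num_tokens: int = 0) -> None:
              return None

          def inc_lock_ref(self, node) -> None:
              return None

          def dec_lock_ref(self, node, **kwargs) -> None:
              return None


      def run_request(cache, tokens: Sequence[int]) -> List[int]:
          """Hypothetical request lifecycle; returns the tokens left to compute."""
          matched = cache.match_prefix(tokens)
          node = None  # placeholder: a no-op cache has no tree node to lock
          cache.inc_lock_ref(node)
          to_compute = list(tokens[len(matched):])
          cache.insert(tokens, value=list(range(len(tokens))))
          cache.dec_lock_ref(node)
          cache.evict(num_tokens=len(tokens))
          return to_compute

   Because nothing is ever stored, running the same request twice yields the
   full token list both times, which is the "computed from scratch" behaviour
   described for this module.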