pymllm.mem_cache.base_prefix_cache
==================================

.. py:module:: pymllm.mem_cache.base_prefix_cache

.. autoapi-nested-parse::

   Abstract base class and shared data types for prefix cache
   implementations.  All concrete caches (:class:`RadixCache`,
   :class:`ChunkCache`, :class:`MambaRadixCache`) inherit from
   :class:`BasePrefixCache` and share the data classes defined here.

Classes
-------

.. autoapisummary::

   pymllm.mem_cache.base_prefix_cache.RadixKey
   pymllm.mem_cache.base_prefix_cache.MatchResult
   pymllm.mem_cache.base_prefix_cache.InsertResult
   pymllm.mem_cache.base_prefix_cache.EvictResult
   pymllm.mem_cache.base_prefix_cache.BasePrefixCache

Functions
---------

.. autoapisummary::

   pymllm.mem_cache.base_prefix_cache.hash_token_ids
   pymllm.mem_cache.base_prefix_cache.hash_to_int64
   pymllm.mem_cache.base_prefix_cache.hash_bytes

Module Contents
---------------

.. py:function:: hash_token_ids(token_ids, prior_hash = None)

   SHA-256 hash of a token-id page with optional chain-hash.

   Each token is encoded as a 4-byte little-endian unsigned integer;
   tuples (bigram / EAGLE) hash each element in order.  When
   *prior_hash* is supplied, the digest is seeded with the raw bytes of
   the previous hash, making the result position-aware.

.. py:function:: hash_to_int64(hex_str)

   Convert a hex digest to a signed 64-bit integer (first 16 hex chars).

.. py:function:: hash_bytes(data)

   SHA-256 -> unsigned 64-bit int.  Useful for multimodal embedding keys.

.. py:class:: RadixKey(token_ids, extra_key = None)

   Compound lookup key: token-id sequence + optional namespace tag.

   ``extra_key`` isolates independent namespaces so that sequences with
   identical leading tokens but different adapters / LoRA ids /
   multimodal context hashes never share prefix nodes.

   .. py:attribute:: __slots__
      :value: ('token_ids', 'extra_key')

   .. py:attribute:: token_ids

   .. py:attribute:: extra_key
      :value: None

   .. py:method:: __len__()

   .. py:method:: __iter__()

   .. py:method:: __getitem__(idx)

   .. py:method:: __repr__()

.. py:class:: MatchResult

   Returned by :meth:`BasePrefixCache.match_prefix`.

   .. py:attribute:: indices
      :type: torch.Tensor

   .. py:attribute:: last_node
      :type: Any
      :value: None

   .. py:attribute:: prefix_len
      :type: int
      :value: 0

   .. py:attribute:: mamba_branching_seqlen
      :type: Optional[int]
      :value: None

.. py:class:: InsertResult

   Returned by :meth:`BasePrefixCache.insert`.

   .. py:attribute:: prefix_len
      :type: int
      :value: 0

   .. py:attribute:: last_node
      :type: Any
      :value: None

   .. py:attribute:: mamba_exist
      :type: bool
      :value: False

.. py:class:: EvictResult

   Returned by :meth:`BasePrefixCache.evict`.

   .. py:attribute:: full_evicted
      :type: int
      :value: 0

   .. py:attribute:: swa_evicted
      :type: int
      :value: 0

   .. py:attribute:: mamba_evicted
      :type: int
      :value: 0

.. py:class:: BasePrefixCache

   Bases: :py:obj:`abc.ABC`

   Abstract interface for all prefix cache implementations.

   Concrete implementations:

   * :class:`~pymllm.mem_cache.radix_cache.RadixCache` -- radix tree
     with SWA tombstone support
   * :class:`~pymllm.mem_cache.chunk_cache.ChunkCache` -- no-op
     fallback (``disable_radix_cache=True``)
   * :class:`~pymllm.mem_cache.mamba_radix_cache.MambaRadixCache` --
     radix tree with independent Mamba/SSM state tracking

   .. py:method:: reset()
      :abstractmethod:

      Clear all cached state and re-initialise.

   .. py:method:: match_prefix(key)
      :abstractmethod:

      Find the longest cached prefix of *key*.

   .. py:method:: insert(key, value = None, **kwargs)
      :abstractmethod:

      Insert *key*/*value* into the cache.

   .. py:method:: evict(num_tokens, swa_num_tokens = 0)
      :abstractmethod:

      Evict tokens to free memory.

   .. py:method:: inc_lock_ref(node)
      :abstractmethod:

      Lock *node* (and its ancestors) to prevent eviction.  Returns an
      opaque token (e.g. ``swa_boundary_id``) that must be passed back
      to :meth:`dec_lock_ref`.

   .. py:method:: dec_lock_ref(node, **kwargs)
      :abstractmethod:

      Unlock *node* (and its ancestors).

   .. py:method:: evictable_size()

   .. py:method:: swa_evictable_size()

   .. py:method:: protected_size()

   .. py:method:: swa_protected_size()

   .. py:method:: total_size()
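The hashing scheme documented above (4-byte little-endian token encoding, digest chaining via *prior_hash*, truncation of the first 16 hex chars to a signed 64-bit integer) can be sketched in plain Python. This is a minimal illustration of the documented behaviour, not the library's actual code; in particular, the assumption that *prior_hash* is a hex string (rather than raw bytes) is ours.

```python
import hashlib
import struct

def hash_token_ids(token_ids, prior_hash=None):
    # Seed the digest with the raw bytes of the previous page's hash so
    # identical pages at different positions hash differently (chain hash).
    h = hashlib.sha256(bytes.fromhex(prior_hash) if prior_hash else b"")
    for tok in token_ids:
        # Tuples (bigram / EAGLE) hash each element in order.
        for t in tok if isinstance(tok, tuple) else (tok,):
            h.update(struct.pack("<I", t))  # 4-byte little-endian unsigned
    return h.hexdigest()

def hash_to_int64(hex_str):
    # First 16 hex chars -> signed 64-bit integer (two's complement).
    v = int(hex_str[:16], 16)
    return v - (1 << 64) if v >= (1 << 63) else v

page = [101, 102, 103]              # hypothetical token-id page
first = hash_token_ids(page)
second = hash_token_ids(page, prior_hash=first)   # same tokens, next position
assert first != second              # chaining makes the hash position-aware
assert -(1 << 63) <= hash_to_int64(first) < (1 << 63)
```

Note that the chain hash is what lets a paged cache key each page independently while still distinguishing "page P after prefix A" from "page P after prefix B".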