modeling_qwen3
==============

.. py:module:: modeling_qwen3


Classes
-------

.. autoapisummary::

   modeling_qwen3.Qwen3PreTrainedModel
   modeling_qwen3.Qwen3Model
   modeling_qwen3.Qwen3ForCausalLM
   modeling_qwen3.Qwen3ForSequenceClassification
   modeling_qwen3.Qwen3ForTokenClassification
   modeling_qwen3.Qwen3ForQuestionAnswering


Module Contents
---------------

.. py:class:: Qwen3PreTrainedModel

   Bases: :py:obj:`transformers.modeling_utils.PreTrainedModel`

   .. py:attribute:: config
      :type: transformers.models.qwen3.configuration_qwen3.Qwen3Config

   .. py:attribute:: base_model_prefix
      :value: 'model'

   .. py:attribute:: supports_gradient_checkpointing
      :value: True


.. py:class:: Qwen3Model(config)

   Bases: :py:obj:`Qwen3PreTrainedModel`

   .. py:attribute:: padding_idx

   .. py:attribute:: vocab_size

   .. py:attribute:: embed_tokens

   .. py:attribute:: layers

   .. py:attribute:: norm

   .. py:attribute:: rotary_emb

   .. py:attribute:: gradient_checkpointing
      :value: False

   .. py:attribute:: has_sliding_layers

   .. py:attribute:: sin_embedding_input_qdq

   .. py:attribute:: cos_embedding_input_qdq

   .. py:attribute:: norm_input_qdq

   .. py:method:: convert_rope_for_deploy()

   .. py:method:: forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, use_cache=None, cache_position=None, **kwargs)


.. py:class:: Qwen3ForCausalLM(config)

   Bases: :py:obj:`Qwen3PreTrainedModel`, :py:obj:`transformers.generation.GenerationMixin`

   .. py:attribute:: model

   .. py:attribute:: vocab_size

   .. py:attribute:: lm_head

   .. py:attribute:: mllm_qualcomm_max_length
      :value: None

   .. py:attribute:: lm_head_input_qdq

   .. py:attribute:: lm_head_output_qdq

   .. py:method:: forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, labels=None, use_cache=None, cache_position=None, logits_to_keep=0, **kwargs)

      labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
          Labels for computing the masked language modeling loss. Indices should either be in
          `[0, ..., config.vocab_size]` or -100 (see `input_ids` docstring). Tokens with indices set to
          `-100` are ignored (masked); the loss is only computed for the tokens with labels in
          `[0, ..., config.vocab_size]`.

      Example:

      ```python
      >>> from transformers import AutoTokenizer, Qwen3ForCausalLM

      >>> model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B")
      >>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

      >>> prompt = "Hey, are you conscious? Can you talk to me?"
      >>> inputs = tokenizer(prompt, return_tensors="pt")

      >>> # Generate
      >>> generate_ids = model.generate(inputs.input_ids, max_length=30)
      >>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
      "Hey, are you conscious? Can you talk to me?\nI'm not conscious, but I can talk to you."
      ```


.. py:class:: Qwen3ForSequenceClassification

   Bases: :py:obj:`transformers.modeling_layers.GenericForSequenceClassification`, :py:obj:`Qwen3PreTrainedModel`


.. py:class:: Qwen3ForTokenClassification

   Bases: :py:obj:`transformers.modeling_layers.GenericForTokenClassification`, :py:obj:`Qwen3PreTrainedModel`


.. py:class:: Qwen3ForQuestionAnswering

   Bases: :py:obj:`transformers.modeling_layers.GenericForQuestionAnswering`, :py:obj:`Qwen3PreTrainedModel`

   .. py:attribute:: base_model_prefix
      :value: 'transformer'
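
Unlike :py:class:`Qwen3ForCausalLM`, the classification and question-answering heads above do not carry usage examples. The snippet below is a minimal sketch of how the sequence-classification head could be driven, assuming the class is imported from this module and loaded from a standard Hugging Face checkpoint; the checkpoint name and `num_labels` value are illustrative only, and the classification head is newly initialized if the checkpoint does not provide one.

```python
>>> import torch
>>> from transformers import AutoTokenizer
>>> from modeling_qwen3 import Qwen3ForSequenceClassification

>>> # Illustrative checkpoint and label count; substitute your own fine-tuned weights.
>>> model = Qwen3ForSequenceClassification.from_pretrained("Qwen/Qwen3-0.6B", num_labels=2)
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

>>> inputs = tokenizer("The release notes look great.", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits
>>> predicted_class_id = logits.argmax(dim=-1).item()
```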