modeling_qwen3
==============

.. py:module:: modeling_qwen3


Classes
-------

.. autoapisummary::

   modeling_qwen3.Qwen3PreTrainedModel
   modeling_qwen3.Qwen3Model
   modeling_qwen3.Qwen3ForCausalLM
   modeling_qwen3.Qwen3ForSequenceClassification
   modeling_qwen3.Qwen3ForTokenClassification
   modeling_qwen3.Qwen3ForQuestionAnswering


Module Contents
---------------

.. py:class:: Qwen3PreTrainedModel

   Bases: :py:obj:`transformers.modeling_utils.PreTrainedModel`

   .. py:attribute:: config
      :type: transformers.models.qwen3.configuration_qwen3.Qwen3Config

   .. py:attribute:: base_model_prefix
      :value: 'model'

   .. py:attribute:: supports_gradient_checkpointing
      :value: True


.. py:class:: Qwen3Model(config)

   Bases: :py:obj:`Qwen3PreTrainedModel`

   .. py:attribute:: padding_idx

   .. py:attribute:: vocab_size

   .. py:attribute:: embed_tokens

   .. py:attribute:: layers

   .. py:attribute:: norm

   .. py:attribute:: rotary_emb

   .. py:attribute:: gradient_checkpointing
      :value: False

   .. py:attribute:: has_sliding_layers

   .. py:attribute:: sin_embedding_input_qdq

   .. py:attribute:: cos_embedding_input_qdq

   .. py:attribute:: norm_input_qdq

   .. py:method:: convert_rope_for_deploy()

   .. py:method:: forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, use_cache=None, cache_position=None, **kwargs)


.. py:class:: Qwen3ForCausalLM(config)

   Bases: :py:obj:`Qwen3PreTrainedModel`, :py:obj:`transformers.generation.GenerationMixin`

   .. py:attribute:: model

   .. py:attribute:: vocab_size

   .. py:attribute:: lm_head

   .. py:attribute:: mllm_qualcomm_max_length
      :value: None

   .. py:attribute:: lm_head_input_qdq

   .. py:attribute:: lm_head_output_qdq

   .. py:method:: forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, labels=None, use_cache=None, cache_position=None, logits_to_keep=0, **kwargs)

      labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
          Labels for computing the masked language modeling loss. Indices should either be in
          `[0, ..., config.vocab_size]` or -100 (see `input_ids` docstring). Tokens with indices set to
          `-100` are ignored (masked); the loss is only computed for the tokens with labels in
          `[0, ..., config.vocab_size]`.

      Example:

      ```python
      >>> from transformers import AutoTokenizer, Qwen3ForCausalLM

      >>> model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B")
      >>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

      >>> prompt = "Hey, are you conscious? Can you talk to me?"
      >>> inputs = tokenizer(prompt, return_tensors="pt")

      >>> # Generate
      >>> generate_ids = model.generate(inputs.input_ids, max_length=30)
      >>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
      "Hey, are you conscious? Can you talk to me?\nI'm not conscious, but I can talk to you."
      ```


.. py:class:: Qwen3ForSequenceClassification

   Bases: :py:obj:`transformers.modeling_layers.GenericForSequenceClassification`, :py:obj:`Qwen3PreTrainedModel`


.. py:class:: Qwen3ForTokenClassification

   Bases: :py:obj:`transformers.modeling_layers.GenericForTokenClassification`, :py:obj:`Qwen3PreTrainedModel`


.. py:class:: Qwen3ForQuestionAnswering

   Bases: :py:obj:`transformers.modeling_layers.GenericForQuestionAnswering`, :py:obj:`Qwen3PreTrainedModel`

   .. py:attribute:: base_model_prefix
      :value: 'transformer'
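
Unlike :py:class:`Qwen3ForCausalLM`, the classification and question-answering heads above do not carry usage examples. The snippet below is a minimal sketch of how the sequence-classification head could be driven, assuming the class is imported from this module and loaded from a standard Hugging Face checkpoint; the checkpoint name and `num_labels` value are illustrative only, and the classification head is newly initialized if the checkpoint does not provide one.

```python
>>> import torch
>>> from transformers import AutoTokenizer
>>> from modeling_qwen3 import Qwen3ForSequenceClassification

>>> # Illustrative checkpoint and label count; substitute your own fine-tuned weights.
>>> model = Qwen3ForSequenceClassification.from_pretrained("Qwen/Qwen3-0.6B", num_labels=2)
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

>>> inputs = tokenizer("The release notes look great.", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits
>>> predicted_class_id = logits.argmax(dim=-1).item()
```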