modeling_qwen3

Classes

Qwen3PreTrainedModel
Qwen3Model
Qwen3ForCausalLM
Qwen3ForSequenceClassification
Qwen3ForTokenClassification
Qwen3ForQuestionAnswering

Module Contents

class modeling_qwen3.Qwen3PreTrainedModel

Bases: transformers.modeling_utils.PreTrainedModel

config: transformers.models.qwen3.configuration_qwen3.Qwen3Config
base_model_prefix = 'model'
supports_gradient_checkpointing = True
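
Example (illustrative sketch): since supports_gradient_checkpointing is True, checkpointing can be toggled on any concrete subclass through the standard transformers API. The checkpoint name is illustrative, and this assumes the module loads through the usual from_pretrained path:

```python
# Sketch: enable gradient checkpointing on a Qwen3 model.
# Assumes this module exposes the standard PreTrainedModel API;
# the checkpoint name is illustrative.
from modeling_qwen3 import Qwen3ForCausalLM

model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B")
model.gradient_checkpointing_enable()  # valid because supports_gradient_checkpointing = True
model.train()                          # checkpointing only takes effect during training
```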
class modeling_qwen3.Qwen3Model(config)

Bases: Qwen3PreTrainedModel

Parameters:

config (transformers.models.qwen3.configuration_qwen3.Qwen3Config)

padding_idx
vocab_size
embed_tokens
layers
norm
rotary_emb
gradient_checkpointing = False
has_sliding_layers
sin_embedding_input_qdq
cos_embedding_input_qdq
norm_input_qdq
convert_rope_for_deploy()
forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, use_cache=None, cache_position=None, **kwargs)
Parameters:
  • input_ids (Optional[torch.LongTensor])

  • attention_mask (Optional[torch.Tensor])

  • position_ids (Optional[torch.LongTensor])

  • past_key_values (Optional[transformers.cache_utils.Cache])

  • inputs_embeds (Optional[torch.FloatTensor])

  • use_cache (Optional[bool])

  • cache_position (Optional[torch.LongTensor])

  • kwargs (transformers.processing_utils.Unpack[transformers.utils.TransformersKwargs])

Return type:

transformers.modeling_outputs.BaseModelOutputWithPast
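
Example (illustrative sketch): calling the base model directly yields hidden states rather than logits. This assumes the standard transformers calling convention implied by the signature above; tokenizer and checkpoint names are illustrative:

```python
# Sketch: run the decoder stack without the LM head and read the
# BaseModelOutputWithPast fields named in the signature above.
import torch
from transformers import AutoTokenizer
from modeling_qwen3 import Qwen3Model

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")  # illustrative checkpoint
model = Qwen3Model.from_pretrained("Qwen/Qwen3-8B")

inputs = tokenizer("Hello, Qwen3!", return_tensors="pt")
with torch.no_grad():
    out = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                use_cache=True)

hidden = out.last_hidden_state  # (batch, seq_len, hidden_size)
cache = out.past_key_values     # reusable for incremental decoding
```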

class modeling_qwen3.Qwen3ForCausalLM(config)

Bases: Qwen3PreTrainedModel, transformers.generation.GenerationMixin

model
vocab_size
lm_head
mllm_qualcomm_max_length = None
lm_head_input_qdq
lm_head_output_qdq
forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, labels=None, use_cache=None, cache_position=None, logits_to_keep=0, **kwargs)
labels (torch.LongTensor of shape (batch_size, sequence_length), optional):

Labels for computing the masked language modeling loss. Indices should either be in [0, …, config.vocab_size] or -100 (see the input_ids docstring). Tokens with indices set to -100 are ignored (masked); the loss is only computed for the tokens with labels in [0, …, config.vocab_size]. A loss-computation sketch follows the generation example below.

Example:

```python
>>> from transformers import AutoTokenizer, Qwen3ForCausalLM

>>> model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B")
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
>>> prompt = "Hey, are you conscious? Can you talk to me?"
>>> inputs = tokenizer(prompt, return_tensors="pt")
>>> # Generate
>>> generate_ids = model.generate(inputs.input_ids, max_length=30)
>>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
"Hey, are you conscious? Can you talk to me?\nI'm not conscious, but I can talk to you."
```
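
Loss-computation sketch: positions set to -100 are excluded from the loss, so padding is typically masked out before the forward pass. The checkpoint and prompt are illustrative:

```python
# Sketch: compute the causal-LM loss, masking ignored positions with -100
# as described in the labels documentation above.
from transformers import AutoTokenizer
from modeling_qwen3 import Qwen3ForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B")

batch = tokenizer("Hey, are you conscious?", return_tensors="pt")
labels = batch.input_ids.clone()
labels[batch.attention_mask == 0] = -100  # mask padding (none here; shown for completeness)

out = model(input_ids=batch.input_ids,
            attention_mask=batch.attention_mask,
            labels=labels)
print(out.loss)  # scalar loss over positions with labels in [0, vocab_size)
```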
Parameters:
  • input_ids (Optional[torch.LongTensor])

  • attention_mask (Optional[torch.Tensor])

  • position_ids (Optional[torch.LongTensor])

  • past_key_values (Optional[transformers.cache_utils.Cache])

  • inputs_embeds (Optional[torch.FloatTensor])

  • labels (Optional[torch.LongTensor])

  • use_cache (Optional[bool])

  • cache_position (Optional[torch.LongTensor])

  • logits_to_keep (Union[int, torch.Tensor])

  • kwargs (transformers.processing_utils.Unpack[transformers.utils.TransformersKwargs])

Return type:

transformers.modeling_outputs.CausalLMOutputWithPast
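
Example (illustrative sketch): in upstream transformers, passing an integer n as logits_to_keep restricts the LM head to the last n positions, which saves memory during decoding. A sketch under the assumption that this module follows those semantics (checkpoint name illustrative):

```python
# Sketch: request logits for only the final position.
# Assumes upstream transformers semantics for logits_to_keep.
import torch
from transformers import AutoTokenizer
from modeling_qwen3 import Qwen3ForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B")

inputs = tokenizer("Hey, are you conscious?", return_tensors="pt")
with torch.no_grad():
    out = model(input_ids=inputs.input_ids, logits_to_keep=1)

next_token = torch.argmax(out.logits[:, -1, :], dim=-1)  # logits shape: (batch, 1, vocab_size)
```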

class modeling_qwen3.Qwen3ForSequenceClassification

Bases: transformers.modeling_layers.GenericForSequenceClassification, Qwen3PreTrainedModel
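
Example (illustrative sketch): assuming the generic head follows the usual transformers sequence-classification interface (num_labels set via the config; checkpoint and text illustrative):

```python
# Sketch: two-way sequence classification on top of the Qwen3 backbone.
import torch
from transformers import AutoTokenizer
from modeling_qwen3 import Qwen3ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")  # illustrative
model = Qwen3ForSequenceClassification.from_pretrained("Qwen/Qwen3-8B", num_labels=2)

inputs = tokenizer("Great model!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (batch, num_labels)
pred = logits.argmax(dim=-1)
```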

class modeling_qwen3.Qwen3ForTokenClassification

Bases: transformers.modeling_layers.GenericForTokenClassification, Qwen3PreTrainedModel

class modeling_qwen3.Qwen3ForQuestionAnswering

Bases: transformers.modeling_layers.GenericForQuestionAnswering, Qwen3PreTrainedModel

base_model_prefix = 'transformer'