modeling_qwen2
==============

.. py:module:: modeling_qwen2


Classes
-------

.. autoapisummary::

   modeling_qwen2.Qwen2RMSNorm
   modeling_qwen2.Qwen2PreTrainedModel
   modeling_qwen2.Qwen2Model
   modeling_qwen2.Qwen2ForCausalLM
   modeling_qwen2.Qwen2ForSequenceClassification
   modeling_qwen2.Qwen2ForTokenClassification
   modeling_qwen2.Qwen2ForQuestionAnswering


Module Contents
---------------

.. py:class:: Qwen2RMSNorm(hidden_size, eps=1e-06, quant_bits=16)

   Bases: :py:obj:`pymllm.backends.qualcomm.transformers.core.rms_norm.QRMSNorm`


.. py:class:: Qwen2PreTrainedModel

   Bases: :py:obj:`transformers.modeling_utils.PreTrainedModel`

   .. py:attribute:: config
      :type: transformers.models.qwen2.configuration_qwen2.Qwen2Config

   .. py:attribute:: base_model_prefix
      :value: 'model'

   .. py:attribute:: supports_gradient_checkpointing
      :value: True


.. py:class:: Qwen2Model(config)

   Bases: :py:obj:`Qwen2PreTrainedModel`

   .. py:attribute:: padding_idx

   .. py:attribute:: vocab_size

   .. py:attribute:: embed_tokens

   .. py:attribute:: layers

   .. py:attribute:: norm

   .. py:attribute:: rotary_emb

   .. py:attribute:: gradient_checkpointing
      :value: False

   .. py:attribute:: has_sliding_layers

   .. py:attribute:: sin_embedding_input_qdq

   .. py:attribute:: cos_embedding_input_qdq

   .. py:attribute:: norm_input_qdq

   .. py:method:: convert_rope_for_deploy()

   .. py:method:: forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, use_cache=None, cache_position=None, **kwargs)


.. py:class:: Qwen2ForCausalLM(config)

   Bases: :py:obj:`Qwen2PreTrainedModel`, :py:obj:`transformers.generation.GenerationMixin`

   .. py:attribute:: config

   .. py:attribute:: model

   .. py:attribute:: vocab_size

   .. py:attribute:: lm_head

   .. py:attribute:: mllm_qualcomm_max_length
      :value: None

   .. py:attribute:: lm_head_input_qdq

   .. py:attribute:: lm_head_output_qdq

   .. py:method:: copy_lm_head_weight_from_embed_tokens()

   .. py:method:: forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, labels=None, use_cache=None, cache_position=None, logits_to_keep=0, **kwargs)

      Example:

      .. code-block:: python

         >>> from transformers import AutoTokenizer, Qwen2ForCausalLM

         >>> model = Qwen2ForCausalLM.from_pretrained("Qwen/Qwen2-7B-Instruct")
         >>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")

         >>> prompt = "Hey, are you conscious? Can you talk to me?"
         >>> inputs = tokenizer(prompt, return_tensors="pt")

         >>> # Generate
         >>> generate_ids = model.generate(inputs.input_ids, max_length=30)
         >>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
         "Hey, are you conscious? Can you talk to me?\nI'm not conscious, but I can talk to you."


.. py:class:: Qwen2ForSequenceClassification

   Bases: :py:obj:`transformers.modeling_layers.GenericForSequenceClassification`, :py:obj:`Qwen2PreTrainedModel`


.. py:class:: Qwen2ForTokenClassification

   Bases: :py:obj:`transformers.modeling_layers.GenericForTokenClassification`, :py:obj:`Qwen2PreTrainedModel`


.. py:class:: Qwen2ForQuestionAnswering

   Bases: :py:obj:`transformers.modeling_layers.GenericForQuestionAnswering`, :py:obj:`Qwen2PreTrainedModel`

   .. py:attribute:: base_model_prefix
      :value: 'transformer'
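
``Qwen2RMSNorm`` takes a ``quant_bits`` argument in addition to the usual RMSNorm
parameters, presumably to control activation quantization in this Qualcomm backend.
A minimal sketch, assuming the base ``QRMSNorm`` behaves like a standard
``torch.nn.Module`` that normalizes over the last dimension (the import path and
forward behavior are assumptions, not documented here):

.. code-block:: python

   import torch

   from modeling_qwen2 import Qwen2RMSNorm  # import path assumed

   hidden_size = 3584  # Qwen2-7B hidden size; any positive int works here
   norm = Qwen2RMSNorm(hidden_size, eps=1e-06, quant_bits=16)

   x = torch.randn(2, 8, hidden_size)  # (batch, seq_len, hidden)
   y = norm(x)  # assumed: normalized output with the same shape as x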
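
``Qwen2Model.convert_rope_for_deploy`` and
``Qwen2ForCausalLM.copy_lm_head_weight_from_embed_tokens`` appear to be
deployment hooks specific to this backend. The sketch below only shows how they
would be invoked; what each method does internally, and the order in which to
call them, are assumptions rather than documented behavior:

.. code-block:: python

   from modeling_qwen2 import Qwen2ForCausalLM  # import path assumed

   model = Qwen2ForCausalLM.from_pretrained("Qwen/Qwen2-7B-Instruct")

   # Assumption: for checkpoints with tied embeddings, materialize the
   # lm_head weight from embed_tokens before export.
   model.copy_lm_head_weight_from_embed_tokens()

   # Assumption: bake the rotary-embedding sin/cos tables into a
   # deploy-friendly form; Qwen2Model is exposed as ``model.model``.
   model.model.convert_rope_for_deploy()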
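
The head classes above inherit the standard Hugging Face interfaces, so the usual
``from_pretrained`` workflow applies. A hedged sketch for
``Qwen2ForSequenceClassification``; the import path, checkpoint name, and label
count are illustrative assumptions, not prescribed by this reference:

.. code-block:: python

   import torch
   from transformers import AutoTokenizer

   from modeling_qwen2 import Qwen2ForSequenceClassification  # import path assumed

   tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
   # num_labels is the standard config override for *ForSequenceClassification heads.
   model = Qwen2ForSequenceClassification.from_pretrained(
       "Qwen/Qwen2-7B-Instruct", num_labels=2
   )

   inputs = tokenizer("This backend port works on NPU.", return_tensors="pt")
   with torch.no_grad():
       logits = model(**inputs).logits  # shape: (batch_size, num_labels)
   predicted_class = logits.argmax(dim=-1).item()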