Replies: 3 comments
-
Hey @ItzBrein! I'm here to help you with your bug, questions, or anything else while you wait for a human maintainer. I'm a bot that can assist you with bug resolution, answer your questions, and guide you to become a contributor. Let's tackle this together!

To resolve this error, set the tokenizer's `pad_token_id`. Here is the relevant part of the code that addresses this issue:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id, **model_kwargs)
if tokenizer.pad_token is None:
    tokenizer.pad_token_id = model.config.eos_token_id
```

You should include this code snippet when setting up your pipeline:

```python
from langchain_huggingface import HuggingFacePipeline, ChatHuggingFace
from langchain_community.agent_toolkits import create_sql_agent
from transformers import AutoConfig, AutoTokenizer

HF_TOKEN = ''
model_id = 'meta-llama/Meta-Llama-3.1-8B-Instruct'

# Load the tokenizer and set pad_token_id if necessary
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    # Read eos_token_id from the model config instead of loading the full model
    tokenizer.pad_token_id = AutoConfig.from_pretrained(model_id).eos_token_id

llm = HuggingFacePipeline.from_model_id(
    model_id=model_id,
    task="text-generation",
    device=None,
    model_kwargs=dict(
        device_map="auto",
    ),
    pipeline_kwargs=dict(
        token=HF_TOKEN,
        temperature=0.6,
        max_new_tokens=512,
        repetition_penalty=1.1,
    ),
)

chat_model = ChatHuggingFace(llm=llm)

agent_executor = create_sql_agent(chat_model, db=db, handle_parsing_errors=True, verbose=True)
agent_executor.invoke({"input": "What state is in the database?"})
```

This should resolve the error and allow batching to work correctly [1][2].
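For intuition, the batching requirement behind this error can be sketched in plain Python, with no transformers dependency (`pad_batch` is a hypothetical helper for illustration, not the library's actual collate code): shorter sequences in a batch must be padded out to the length of the longest one, which is impossible when no pad token id is defined.

```python
# Illustrative sketch of why batching needs a pad token: every sequence in
# a batch must be extended to the same length before it can be stacked
# into a tensor, and the filler value is pad_token_id.
def pad_batch(sequences, pad_token_id):
    if pad_token_id is None:
        # Mirrors the ValueError raised by transformers' pad_collate_fn
        raise ValueError(
            "Pipeline with tokenizer without pad_token cannot do batching."
        )
    max_len = max(len(s) for s in sequences)
    return [s + [pad_token_id] * (max_len - len(s)) for s in sequences]

print(pad_batch([[1, 2, 3], [4]], pad_token_id=0))  # → [[1, 2, 3], [4, 0, 0]]
```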
-
@ItzBrein use `chat_model.llm.pipeline.tokenizer.pad_token_id = chat_model.llm.pipeline.tokenizer.eos_token_id`
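That one-liner reuses the end-of-sequence id as the padding id. The fallback logic can be shown with a stand-in object (`DummyTokenizer` and `ensure_pad_token` are hypothetical names for illustration, not part of transformers or LangChain):

```python
# Minimal sketch of the pad-token fallback: when a tokenizer ships
# without a pad token, reuse its eos_token_id for padding.
class DummyTokenizer:
    """Stand-in for a real tokenizer that has no pad token configured."""
    def __init__(self, pad_token=None, pad_token_id=None, eos_token_id=128009):
        self.pad_token = pad_token
        self.pad_token_id = pad_token_id
        self.eos_token_id = eos_token_id

def ensure_pad_token(tokenizer):
    # Only touch pad_token_id when no pad token is already set
    if tokenizer.pad_token is None:
        tokenizer.pad_token_id = tokenizer.eos_token_id
    return tokenizer

tok = ensure_pad_token(DummyTokenizer())
print(tok.pad_token_id)  # → 128009
```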
-
Is there no automated fix for this? Why do we have to keep applying all these settings manually every time?
-
Description
I am trying to use HuggingFacePipeline with ChatHuggingFace. I am expecting the agent to generate a SQL query and answer the question, but instead, I am getting the following error:
```
File ~/.local/lib/python3.11/site-packages/transformers/pipelines/base.py:146, in pad_collate_fn(tokenizer, feature_extractor)
    144 if tokenizer is not None:
    145     if tokenizer.pad_token_id is None:
--> 146         raise ValueError(
    147             "Pipeline with tokenizer without pad_token cannot do batching. You can try to set it with "
    148             "`pipe.tokenizer.pad_token_id = model.config.eos_token_id`."
    149         )
    150     else:
    151         t_padding_value = tokenizer.pad_token_id

ValueError: Pipeline with tokenizer without pad_token cannot do batching. You can try to set it with `pipe.tokenizer.pad_token_id = model.config.eos_token_id`.
```

System Info
OS: Linux