Tracking Token Counter in RefineDocumentsChain and MapReduceDocumentsChain #26575
-
I want to calculate the total token usage, including intermediate steps and the final output, broken down into input and output tokens. Is there a way to calculate this? For testing I am currently using the Groq API with a Llama 3.1 8B model. The code snippet is below.
-
To calculate the total token usage, including intermediate steps and the final output, in the `RefineDocumentsChain`, you can attach a callback handler that records usage from every LLM call the chain makes. Note that with `return_intermediate_steps=True` the result's `intermediate_steps` are the intermediate summary strings, not usage metadata, so token counts have to come from callbacks instead. The sketch below assumes ChatGroq reports OpenAI-style `token_usage` in `response.llm_output` (if your version only exposes per-message `usage_metadata`, adapt `on_llm_end` accordingly):

```python
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain.chains import RefineDocumentsChain
from langchain.chains.llm import LLMChain
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader

load_dotenv()


class TokenUsageHandler(BaseCallbackHandler):
    """Accumulates token usage across every LLM call made by the chain."""

    def __init__(self):
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.total_tokens = 0

    def on_llm_end(self, response, **kwargs):
        # Assumes Groq returns OpenAI-style usage in llm_output["token_usage"].
        usage = (response.llm_output or {}).get("token_usage", {})
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)


llm = ChatGroq(
    model="llama-3.1-8b-instant",
    max_retries=2,
)

document_prompt = PromptTemplate(
    input_variables=["page_content"], template="{page_content}"
)
document_variable_name = "context"
# The prompt here should take the `document_variable_name` as an input variable.
summarize_prompt = ChatPromptTemplate(
    [
        ("human", "Write a concise summary of the following: {context}"),
    ]
)
initial_llm_chain = LLMChain(llm=llm, prompt=summarize_prompt)
initial_response_name = "existing_answer"
refine_template = """
Produce a final summary.
Existing summary up to this point:
{existing_answer}
New context:
------------
{context}
------------
Given the new context, refine the original summary.
"""
refine_prompt = ChatPromptTemplate([("human", refine_template)])
refine_llm_chain = LLMChain(llm=llm, prompt=refine_prompt)

chain = RefineDocumentsChain(
    initial_llm_chain=initial_llm_chain,
    refine_llm_chain=refine_llm_chain,
    document_prompt=document_prompt,
    document_variable_name=document_variable_name,
    initial_response_name=initial_response_name,
    return_intermediate_steps=True,
)

loader = TextLoader("./documents/text2.txt", encoding="utf8")
docs = loader.load()
print(len(docs))

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000, chunk_overlap=0,
)
split_docs = text_splitter.split_documents(docs)
print(f"Generated {len(split_docs)} documents.")

handler = TokenUsageHandler()
result = chain.invoke({"input_documents": split_docs}, config={"callbacks": [handler]})

print(f"Prompt tokens:     {handler.prompt_tokens}")
print(f"Completion tokens: {handler.completion_tokens}")
print(f"Total token usage: {handler.total_tokens}")
```

This sets up the chain exactly as in your snippet; the only addition is the `TokenUsageHandler`, which tallies prompt, completion, and total tokens over the initial summarize call and every refine call. `result["intermediate_steps"]` still holds the intermediate summaries if you need them.
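If you collect per-call usage records yourself (for example, one dict per step from a callback), totalling them is a small fold over dicts. A pure-Python sketch; the key names follow Groq's OpenAI-style usage fields, and the sample numbers are made up for illustration:

```python
def merge_usage(records):
    """Sum OpenAI/Groq-style usage dicts with keys
    'prompt_tokens', 'completion_tokens', 'total_tokens'."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for rec in records:
        for key in totals:
            totals[key] += rec.get(key, 0)
    return totals

# Hypothetical per-step usage (initial summary, then one refine step):
steps = [
    {"prompt_tokens": 120, "completion_tokens": 40, "total_tokens": 160},
    {"prompt_tokens": 200, "completion_tokens": 55, "total_tokens": 255},
]
print(merge_usage(steps))
# {'prompt_tokens': 320, 'completion_tokens': 95, 'total_tokens': 415}
```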
-
I guess you might have to count the tokens manually using the `llm.get_num_tokens()` method. You would need to walk through the chain, counting tokens for your prompt templates, inputs, and outputs.
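A minimal sketch of that manual tallying, written as a pure function so the counter is pluggable. In real use you would pass `llm.get_num_tokens` as `count_fn`; the whitespace-splitting stand-in here is only for illustration, and note that for Llama models `get_num_tokens` may fall back to an approximate tokenizer:

```python
def tally_tokens(count_fn, steps):
    """Sum token counts over (rendered prompt, model output) pairs.

    count_fn: a callable like llm.get_num_tokens that maps text -> int.
    Returns (prompt_tokens, completion_tokens, total_tokens)."""
    prompt_tokens = sum(count_fn(prompt) for prompt, _ in steps)
    completion_tokens = sum(count_fn(output) for _, output in steps)
    return prompt_tokens, completion_tokens, prompt_tokens + completion_tokens

# Stand-in counter for illustration only (real use: llm.get_num_tokens):
approx = lambda text: len(text.split())
print(tally_tokens(approx, [
    ("Write a concise summary of ...", "A short summary."),
    ("Refine the summary given ...", "A refined summary."),
]))
```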
Yes, there are alternative ways to count tokens without writing a custom script. You can use a language model's built-in token counting method or other utilities available in LangChain. Here are some options:
Using a Language Model's Built-in Token Counting Method

You can use a language model's built-in token counting method, such as the one on `ChatOpenAI(model="gpt-4o")`. Example:

Using `tiktoken` for Custom Token Counting

If y…