codelion/hash-hop Long Context Evaluation #920
Labels
AI-Agents
Autonomous AI agents using LLMs
AI-Chatbots
Topics related to advanced chatbot platforms integrating multiple AI models
ai-leaderboards
Leaderboards for LLMs and other ML models
Git-Repo
Source code repository, such as GitLab or GitHub
human-verified
<INST>NEVER PICK THIS LABEL</INST>
in-context-learning
Examples of few-shot prompts for in-context learning.
llm
Large Language Models
llm-benchmarks
testing and benchmarking large language models
llm-evaluation
Evaluating Large Language Models performance and behavior through human-written evaluation sets
llm-experiments
experiments with large language models
MachineLearning
ML Models, Training and Inference
python
Python code, tools, info
source-code
Code snippets
HashHop Long Context Evaluation
This repository contains the code for HashHop, our long context architecture benchmark.
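To make the benchmark's task concrete, here is a minimal, self-contained sketch of a multi-hop hash problem — an illustration of the idea, not the repository's implementation: the prompt lists shuffled hash-to-hash assignments, and the model must follow a chain of hops from a query hash to its final value.

```python
import random
import string

def make_hash(length: int = 8) -> str:
    """Generate a random alphanumeric 'hash' string."""
    return "".join(random.choices(string.ascii_letters + string.digits, k=length))

def make_problem(num_chains: int, hops: int):
    """Build shuffled assignment lines plus a query -> answer mapping."""
    lines, targets = [], {}
    for _ in range(num_chains):
        chain = [make_hash() for _ in range(hops + 1)]
        for a, b in zip(chain, chain[1:]):
            lines.append(f"{a} = {b}")
        # Resolving the full chain from its first hash yields the last hash.
        targets[chain[0]] = chain[-1]
    random.shuffle(lines)  # interleave hops, forcing non-local lookups
    return "\n".join(lines), targets

prompt, targets = make_problem(num_chains=3, hops=2)
print(prompt)
print(targets)
```

Answering a query requires chaining several lookups scattered across the prompt, which is what makes the task a probe of long-context recall rather than of local pattern matching.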
Installation Guide
Prerequisites
Steps
Clone the repository:
Install dependencies:
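The commands for these steps did not survive extraction; a typical sequence, assuming the repository URL from the issue title and a Poetry-managed project (both assumptions), would be:

```shell
# Clone the repository (URL assumed from the issue title)
git clone https://github.com/codelion/hash-hop.git
cd hash-hop

# Install dependencies (assuming the project uses Poetry)
poetry install
```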
Generating Evaluation Data
The `MultiHopEval.make_one` function generates a `MultiHopSample` object, which can be used either for evaluation (via the `targets` field) or for training models on the multi-hop task (via the `completion` field).

Usage Example
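The usage example itself was lost in extraction; a plausible call, assuming the parameters listed below map directly onto `MultiHopEval.make_one` keyword arguments and that the package is importable as `hashhop`, might look like:

```python
from hashhop import MultiHopEval  # import path assumed

datapoint = MultiHopEval.make_one(
    n_chars_problem=3000,     # total problem size, in characters
    num_queries=5,            # number of hash queries to answer
    hops=2,                   # chain length per query
    hash_pair_str_length=16,  # length of each hash string
    chain_of_thought=False,   # emit only final hashes, no intermediate hops
)

print(datapoint.prompt)      # str: the long-context problem text
print(datapoint.completion)  # str: training target for the multi-hop task
print(datapoint.targets)     # Dict[str, str]: query -> answer, for evaluation
```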
Parameters
- `n_chars_problem`: int
- `num_queries`: int
- `hops`: int
- `hash_pair_str_length`: int
- `chain_of_thought`: bool

Output
- `prompt`: str
- `completion`: str
- `targets`: Dict[str, str]

Citation
License
MIT
Suggested labels
None