Add length extrapolation #126

Merged 12 commits on Oct 26, 2023
@i4never commented Oct 26, 2023

Update

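  • Add YaRN RoPE scaling
  • Add Dynamic RoPE scaling, fixing the problem described in transformers #25306
  • Rebuild the web demo with streamlit (long text calls for a more user-friendly input method)
  • Add a chat function that supports streaming inference output in both the shell and the web demo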
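
As a rough illustration of what Dynamic RoPE scaling does, the sketch below recomputes the rotary base once the sequence grows past the trained context window. It follows the commonly used Dynamic NTK formulation and is not taken from this PR's code; the function name `dynamic_ntk_inv_freq` and the default values (`max_position_embeddings=2048`, `scaling_factor=2.0`) are assumptions chosen for the example.

```python
def dynamic_ntk_inv_freq(dim, seq_len, max_position_embeddings=2048,
                         base=10000.0, scaling_factor=2.0):
    """Return RoPE inverse frequencies, one per pair of head dimensions.

    Hypothetical helper: the base is enlarged only when seq_len exceeds
    the trained window (max_position_embeddings).
    """
    if seq_len > max_position_embeddings:
        # Dynamic NTK: the base grows with the current sequence length.
        ratio = scaling_factor * seq_len / max_position_embeddings - (scaling_factor - 1)
        base = base * ratio ** (dim / (dim - 2))
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


# Inside the trained window the frequencies are the standard RoPE ones.
short = dynamic_ntk_inv_freq(dim=128, seq_len=1024)
# Beyond it, the low frequencies are stretched to cover longer contexts.
long = dynamic_ntk_inv_freq(dim=128, seq_len=8192)
print(short[-1], long[-1])
```

The highest frequency (index 0) is left unchanged while the lowest ones shrink, which is why this family of methods is often described as interpolating mostly in the low-frequency dimensions.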

Usage

  • shell infer:
    CUDA_VISIBLE_DEVICES=0 python infer.py \
    --model_path tigerbot-13b-chat \
    --max_input_length 1024 \
    --max_generate_length 1024 \
    --streaming True
  • web demo:
    export PYTHONPATH='../'; export CUDA_VISIBLE_DEVICES="2"; streamlit run apps/web_demo.py

Known issue

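  • #25104: because generate uses a KV cache and the total generated length cannot be known in advance, the position embeddings of newly generated tokens, and those already baked into the KV cache, differ from what the scaling theory prescribes. This reduces the robustness of the scaling.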
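
The mismatch can be made concrete with a small numeric sketch. It assumes the Dynamic NTK base formula and illustrative values (`dim=128`, `max_pos=2048`, `factor=2.0`); `rope_angle` is a hypothetical helper, not a function from this repository or from transformers. A key is rotated with the base that was valid when it entered the cache, while later queries use a base computed for the longer sequence.

```python
def rope_angle(pos, pair_idx, seq_len, dim=128, max_pos=2048,
               base=10000.0, factor=2.0):
    """Rotation angle for position `pos` in one dimension pair,
    using the base implied by the sequence length seen at that moment."""
    if seq_len > max_pos:
        # Assumed Dynamic NTK rule: enlarge the base for long sequences.
        base = base * (factor * seq_len / max_pos - (factor - 1)) ** (dim / (dim - 2))
    return pos * base ** (-2 * pair_idx / dim)


# The key for position 100 is cached while the sequence is still short.
cached = rope_angle(pos=100, pair_idx=32, seq_len=2048)
# The same position, as the scaling would define it once the sequence reaches 4096 tokens.
recomputed = rope_angle(pos=100, pair_idx=32, seq_len=4096)
drift = cached - recomputed
print(cached, recomputed, drift)
```

A nonzero `drift` means the cached key no longer matches the rotation that the current base assigns to its position, which is the inconsistency described above. Recomputing the cache would remove it, at the cost of losing the speed benefit of caching.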

@chentigerye merged commit 6dcd051 into main on Oct 26, 2023