KeyError: 'rope_type' #32167

Closed
pseudotensor opened this issue Jul 23, 2024 · 5 comments · Fixed by #32182

@pseudotensor

System Info

transformers 4.43.1 is broken, 4.42.4 was ok

Who can help?

@ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoConfig

# either model reproduces the error; the second assignment overrides the first
model = 'deepseek-ai/deepseek-coder-33b-instruct'
model = 'togethercomputer/Llama-2-7B-32K-Instruct'

AutoConfig.from_pretrained(model)

For either model, this now gives:

Traceback (most recent call last):
  File "/home/jon/h2ogpt/checkropetype.py", line 6, in <module>
    AutoConfig.from_pretrained(model, rope_scaling=1)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 996, in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/configuration_utils.py", line 772, in from_dict
    config = cls(**config_dict)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 192, in __init__
    rope_config_validation(self)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 546, in rope_config_validation
    validation_fn(config)
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 379, in _validate_linear_scaling_rope_parameters
    rope_type = rope_scaling["rope_type"]
KeyError: 'rope_type'

Expected behavior

no errors
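
A possible temporary workaround (only a sketch, assuming the checkpoint's rope_scaling dict carries the legacy "type" key rather than "rope_type") is to fetch the raw config dict, mirror the legacy key under the new name, and build the config from the patched dict:

from transformers import LlamaConfig, PretrainedConfig

model = 'togethercomputer/Llama-2-7B-32K-Instruct'

# Fetch config.json as a plain dict; no rope validation runs at this point.
config_dict, _ = PretrainedConfig.get_config_dict(model)

rope_scaling = config_dict.get("rope_scaling")
if rope_scaling is not None and "type" in rope_scaling and "rope_type" not in rope_scaling:
    # Mirror the legacy key under the name the new validators look up.
    rope_scaling["rope_type"] = rope_scaling["type"]

# Validation now finds "rope_type" and the config loads without the KeyError.
config = LlamaConfig.from_dict(config_dict)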

@alat-rights

Hi! Can confirm that I encountered this too, with deepseek-coder-1.3b-base.

I've encountered this issue both on 4.43.1 and 4.43.0.

[screenshot]

@mxkopy
Contributor

mxkopy commented Jul 24, 2024

It seems the issue is that the following validators still index rope_scaling["rope_type"] directly instead of also accepting the legacy config.rope_scaling["type"] key (a sketch of a backward-compatible lookup follows the quoted code below).

def _validate_default_rope_parameters(config: PretrainedConfig):
    rope_scaling = config.rope_scaling
    rope_type = rope_scaling["rope_type"]
    required_keys = {"rope_type"}
    received_keys = set(rope_scaling.keys())
    _check_received_keys(rope_type, received_keys, required_keys)


def _validate_linear_scaling_rope_parameters(config: PretrainedConfig):
    rope_scaling = config.rope_scaling
    rope_type = rope_scaling["rope_type"]
    required_keys = {"rope_type", "factor"}
    received_keys = set(rope_scaling.keys())
    _check_received_keys(rope_type, received_keys, required_keys)

    factor = rope_scaling["factor"]
    if factor is None or not isinstance(factor, float) or factor < 1.0:
        raise ValueError(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")


def _validate_dynamic_scaling_rope_parameters(config: PretrainedConfig):
    rope_scaling = config.rope_scaling
    rope_type = rope_scaling["rope_type"]
    required_keys = {"rope_type", "factor"}
    # TODO (joao): update logic for the inclusion of `original_max_position_embeddings`
    optional_keys = {"original_max_position_embeddings"}
    received_keys = set(rope_scaling.keys())
    _check_received_keys(rope_type, received_keys, required_keys, optional_keys)

    factor = rope_scaling["factor"]
    if factor is None or not isinstance(factor, float) or factor < 1.0:
        raise ValueError(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")


def _validate_yarn_parameters(config: PretrainedConfig):
    rope_scaling = config.rope_scaling
    rope_type = rope_scaling["rope_type"]
    required_keys = {"rope_type", "factor"}
    optional_keys = {"attention_factor", "beta_fast", "beta_slow"}
    received_keys = set(rope_scaling.keys())
    _check_received_keys(rope_type, received_keys, required_keys, optional_keys)

    factor = rope_scaling["factor"]
    if factor is None or not isinstance(factor, float) or factor < 1.0:
        raise ValueError(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")

    attention_factor = rope_scaling.get("attention_factor")
    if attention_factor is not None and (not isinstance(attention_factor, float) or attention_factor < 0):
        raise ValueError(
            f"`rope_scaling`'s attention_factor field must be a float greater than 0, got {attention_factor}"
        )
    beta_fast = rope_scaling.get("beta_fast")
    if beta_fast is not None and not isinstance(beta_fast, float):
        raise ValueError(f"`rope_scaling`'s beta_fast field must be a float, got {beta_fast}")
    beta_slow = rope_scaling.get("beta_slow")
    if beta_slow is not None and not isinstance(beta_slow, float):
        raise ValueError(f"`rope_scaling`'s beta_slow field must be a float, got {beta_slow}")

    if (beta_fast or 32) < (beta_slow or 1):
        raise ValueError(
            f"`rope_scaling`'s beta_fast field must be greater than beta_slow, got beta_fast={beta_fast} "
            f"(defaults to 32 if None) and beta_slow={beta_slow} (defaults to 1 if None)"
        )


def _validate_longrope_parameters(config: PretrainedConfig):
    rope_scaling = config.rope_scaling
    rope_type = rope_scaling["rope_type"]
    required_keys = {"rope_type", "short_factor", "long_factor"}
    # TODO (joao): update logic for the inclusion of `original_max_position_embeddings`
    optional_keys = {"attention_factor", "factor", "original_max_position_embeddings"}
    received_keys = set(rope_scaling.keys())
    _check_received_keys(rope_type, received_keys, required_keys, optional_keys)

    partial_rotary_factor = config.partial_rotary_factor if hasattr(config, "partial_rotary_factor") else 1.0
    dim = int((config.hidden_size // config.num_attention_heads) * partial_rotary_factor)

    short_factor = rope_scaling.get("short_factor")
    if not isinstance(short_factor, list) and all(isinstance(x, (int, float)) for x in short_factor):
        raise ValueError(f"`rope_scaling`'s short_factor field must be a list of numbers, got {short_factor}")
    if not len(short_factor) == dim // 2:
        raise ValueError(f"`rope_scaling`'s short_factor field must have length {dim // 2}, got {len(short_factor)}")

    long_factor = rope_scaling.get("long_factor")
    if not isinstance(long_factor, list) and all(isinstance(x, (int, float)) for x in long_factor):
        raise ValueError(f"`rope_scaling`'s long_factor field must be a list of numbers, got {long_factor}")
    if not len(long_factor) == dim // 2:
        raise ValueError(f"`rope_scaling`'s long_factor field must have length {dim // 2}, got {len(long_factor)}")

    # Handle Phi3 divergence: prefer the use of `attention_factor` and/or `factor` over
    # `original_max_position_embeddings` to compute internal variables. The latter lives outside `rope_scaling` and is
    # unique to longrope (= undesirable)
    if hasattr(config, "original_max_position_embeddings"):
        logger.warning_once(
            "This model has set a `original_max_position_embeddings` field, to be used together with "
            "`max_position_embeddings` to determine a scaling factor. Please set the `factor` field of `rope_scaling`"
            "with this ratio instead -- we recommend the use of this field over `original_max_position_embeddings`, "
            "as it is compatible with most model architectures."
        )
    else:
        factor = rope_scaling.get("factor")
        if factor is None:
            raise ValueError("Missing required keys in `rope_scaling`: 'factor'")
        elif not isinstance(factor, float) or factor < 1.0:
            raise ValueError(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")

        attention_factor = rope_scaling.get("attention_factor")
        if attention_factor is not None and not isinstance(attention_factor, float) or attention_factor < 0:
            raise ValueError(
                f"`rope_scaling`'s attention_factor field must be a float greater than 0, got {attention_factor}"
            )


def _validate_llama3_parameters(config: PretrainedConfig):
    rope_scaling = config.rope_scaling
    rope_type = rope_scaling["rope_type"]
    required_keys = {"rope_type", "factor", "original_max_position_embeddings", "low_freq_factor", "high_freq_factor"}
    received_keys = set(rope_scaling.keys())
    _check_received_keys(rope_type, received_keys, required_keys)

    factor = rope_scaling["factor"]
    if factor is None or not isinstance(factor, float) or factor < 1.0:
        raise ValueError(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")

    low_freq_factor = rope_scaling["low_freq_factor"]
    high_freq_factor = rope_scaling["high_freq_factor"]
    if low_freq_factor is None or not isinstance(low_freq_factor, float):
        raise ValueError(f"`rope_scaling`'s low_freq_factor field must be a float, got {low_freq_factor}")
    if high_freq_factor is None or not isinstance(high_freq_factor, float):
        raise ValueError(f"`rope_scaling`'s high_freq_factor field must be a float, got {high_freq_factor}")
    if high_freq_factor < low_freq_factor:
        raise ValueError(
            "`rope_scaling`'s high_freq_factor field must be greater than low_freq_factor, got high_freq_factor="
            f"{high_freq_factor} and low_freq_factor={low_freq_factor}"
        )

    original_max_position_embeddings = rope_scaling["original_max_position_embeddings"]
    if original_max_position_embeddings is None or not isinstance(original_max_position_embeddings, int):
        raise ValueError(
            "`rope_scaling`'s original_max_position_embeddings field must be an integer, got "
            f"{original_max_position_embeddings}"
        )
    if original_max_position_embeddings >= config.max_position_embeddings:
        raise ValueError(
            "`rope_scaling`'s original_max_position_embeddings field must be less than max_position_embeddings, got "
            f"{original_max_position_embeddings} and max_position_embeddings={config.max_position_embeddings}"
        )
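
A backward-compatible lookup along these lines (just a sketch of the idea, not the actual change in #32182; the helper name below is hypothetical) would keep old configs working by preferring the new key and falling back to the legacy one:

def _get_rope_type(rope_scaling: dict) -> str:
    # Prefer the new "rope_type" key, fall back to the legacy "type" key,
    # and default to "default" when neither is present.
    return rope_scaling.get("rope_type", rope_scaling.get("type", "default"))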

@ArthurZucker
Collaborator

Indeed, we will have a patch today to fix this.

@shoang22

Hi, I'm trying to use deepseek-ai/deepseek-coder-33b-instruct with TGI via Docker, but I'm getting similar errors in the Docker logs:

2024-09-22 09:50:15 2024-09-22T13:50:15.480520Z  INFO download: text_generation_launcher: Successfully downloaded weights for deepseek-ai/deepseek-coder-33b-instruct
2024-09-22 09:50:15 2024-09-22T13:50:15.480769Z  INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-09-22 09:50:19 2024-09-22T13:50:19.254647Z ERROR text_generation_launcher: Error when initializing model
2024-09-22 09:50:19 Traceback (most recent call last):
2024-09-22 09:50:19   File "/opt/conda/bin/text-generation-server", line 8, in <module>
2024-09-22 09:50:19     sys.exit(app())
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
2024-09-22 09:50:19     return get_command(self)(*args, **kwargs)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
2024-09-22 09:50:19     return self.main(*args, **kwargs)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
2024-09-22 09:50:19     return _main(
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
2024-09-22 09:50:19     rv = self.invoke(ctx)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
2024-09-22 09:50:19     return _process_result(sub_ctx.command.invoke(sub_ctx))
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
2024-09-22 09:50:19     return ctx.invoke(self.callback, **ctx.params)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
2024-09-22 09:50:19     return __callback(*args, **kwargs)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
2024-09-22 09:50:19     return callback(**use_params)  # type: ignore
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 118, in serve
2024-09-22 09:50:19     server.serve(
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 297, in serve
2024-09-22 09:50:19     asyncio.run(
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
2024-09-22 09:50:19     return loop.run_until_complete(main)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
2024-09-22 09:50:19     self.run_forever()
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
2024-09-22 09:50:19     self._run_once()
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
2024-09-22 09:50:19     handle._run()
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
2024-09-22 09:50:19     self._context.run(self._callback, *self._args)
2024-09-22 09:50:19 > File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 231, in serve_inner
2024-09-22 09:50:19     model = get_model(
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 706, in get_model
2024-09-22 09:50:19     return FlashCausalLM(
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_causal_lm.py", line 878, in __init__
2024-09-22 09:50:19     config = config_class.from_pretrained(
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 996, in from_pretrained
2024-09-22 09:50:19     return config_class.from_dict(config_dict, **unused_kwargs)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py", line 772, in from_dict
2024-09-22 09:50:19     config = cls(**config_dict)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 192, in __init__
2024-09-22 09:50:19     rope_config_validation(self)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 546, in rope_config_validation
2024-09-22 09:50:19     validation_fn(config)
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 379, in _validate_linear_scaling_rope_parameters
2024-09-22 09:50:19     rope_type = rope_scaling["rope_type"]
2024-09-22 09:50:19 KeyError: 'rope_type'
2024-09-22 09:50:19 2024-09-22T13:50:19.786989Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
2024-09-22 09:50:19 
2024-09-22 09:50:19 2024-09-22 13:50:16.492 | INFO     | text_generation_server.utils.import_utils:<module>:75 - Detected system cuda
2024-09-22 09:50:19 Traceback (most recent call last):
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/bin/text-generation-server", line 8, in <module>
2024-09-22 09:50:19     sys.exit(app())
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 118, in serve
2024-09-22 09:50:19     server.serve(
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 297, in serve
2024-09-22 09:50:19     asyncio.run(
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
2024-09-22 09:50:19     return loop.run_until_complete(main)
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
2024-09-22 09:50:19     return future.result()
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 231, in serve_inner
2024-09-22 09:50:19     model = get_model(
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 706, in get_model
2024-09-22 09:50:19     return FlashCausalLM(
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_causal_lm.py", line 878, in __init__
2024-09-22 09:50:19     config = config_class.from_pretrained(
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 996, in from_pretrained
2024-09-22 09:50:19     return config_class.from_dict(config_dict, **unused_kwargs)
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py", line 772, in from_dict
2024-09-22 09:50:19     config = cls(**config_dict)
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 192, in __init__
2024-09-22 09:50:19     rope_config_validation(self)
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 546, in rope_config_validation
2024-09-22 09:50:19     validation_fn(config)
2024-09-22 09:50:19 
2024-09-22 09:50:19   File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 379, in _validate_linear_scaling_rope_parameters
2024-09-22 09:50:19     rope_type = rope_scaling["rope_type"]
2024-09-22 09:50:19 
2024-09-22 09:50:19 KeyError: 'rope_type'
2024-09-22 09:50:19  rank=0
2024-09-22 09:50:19 2024-09-22T13:50:19.886203Z ERROR text_generation_launcher: Shard 0 failed to start
2024-09-22 09:50:19 2024-09-22T13:50:19.886384Z  INFO text_generation_launcher: Shutting down shards
2024-09-22 09:50:19 Error: ShardCannotStart

@LysandreJik
Member

Hey @shoang22, this is linked to text-generation-inference, not transformers; do you mind opening an issue there? Thanks!
