This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1) #225

Open
ZiluLii opened this issue Jul 29, 2022 · 3 comments

Comments

ZiluLii commented Jul 29, 2022

Hi, when running dense_retriever.py with the model checkpoint for validation, I got an error saying: IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1). What should I do to solve this?

My command:
python dense_retriever.py \
  model_file=/home/zl327/DPR/dpr/downloads/checkpoint/retriever/single/nq/bert-base-encoder.cp \
  qa_dataset=nq_test \
  ctx_datatsets=dpr_wiki \
  encoded_ctx_files="/home/zl327/DPR/dpr/downloads/data/retriever_results/nq/single/wikipedia_passages_*" \
  out_file=/home/zl327/DPR/dpr/result

Log:
2022-07-29 14:07:31,496 [INFO] faiss.loader: Loading faiss with AVX2 support.
2022-07-29 14:07:31,496 [INFO] faiss.loader: Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
2022-07-29 14:07:31,496 [INFO] faiss.loader: Loading faiss.
2022-07-29 14:07:31,512 [INFO] faiss.loader: Successfully loaded faiss.
/home/zl327/DPR/dense_retriever.py:472: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path="conf", config_name="dense_retriever")
/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'dense_retriever': Defaults list is missing _self_. See https://hydra.cc/docs/upgrades/1.0_to_1.1/default_composition_order for more information
warnings.warn(msg, UserWarning)
/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/core/default_element.py:124: UserWarning: In 'ctx_sources/default_sources': Usage of deprecated keyword in package header '# @package _group_'.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/changes_to_package_header for more information
deprecation_warning(
/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/core/default_element.py:124: UserWarning: In 'datasets/retriever_default': Usage of deprecated keyword in package header '# @package _group_'.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/changes_to_package_header for more information
deprecation_warning(
/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/core/default_element.py:124: UserWarning: In 'encoder/hf_bert': Usage of deprecated keyword in package header '# @package _group_'.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/changes_to_package_header for more information
deprecation_warning(
/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[2022-07-29 14:07:31,833][root][INFO] - CFG's local_rank=-1
[2022-07-29 14:07:31,833][root][INFO] - Env WORLD_SIZE=None
[2022-07-29 14:07:31,833][root][INFO] - Initialized host nikola-compute-16.cs.cornell.edu as d.rank -1 on device=cuda, n_gpu=4, world size=1
[2022-07-29 14:07:31,833][root][INFO] - 16-bits training: False
[2022-07-29 14:07:31,833][root][INFO] - Reading saved model from /home/zl327/DPR/dpr/downloads/checkpoint/retriever/single/nq/bert-base-encoder.cp
[2022-07-29 14:07:32,385][root][INFO] - model_state_dict keys odict_keys(['model_dict', 'optimizer_dict', 'scheduler_dict', 'offset', 'epoch', 'encoder_params'])
[2022-07-29 14:07:32,387][root][INFO] - CFG (after gpu configuration):
[2022-07-29 14:07:32,392][root][INFO] - encoder:
  encoder_model_type: hf_bert
  pretrained_model_cfg: bert-base-uncased
  pretrained_file: null
  projection_dim: 0
  sequence_length: 256
  dropout: 0.1
  fix_ctx_encoder: false
  pretrained: true
datasets:
  nq_test:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.nq-test
  nq_train:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.nq-train
  nq_dev:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.nq-dev
  trivia_test:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.trivia-test
  trivia_train:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.trivia-train
  trivia_dev:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.trivia-dev
  webq_test:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.webq-test
  curatedtrec_test:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: data.retriever.qas.curatedtrec-test
ctx_sources:
  dpr_wiki:
    _target_: dpr.data.retriever_data.CsvCtxSrc
    file: data.wikipedia_split.psgs_w100
    id_prefix: 'wiki:'
indexers:
  flat:
    _target_: dpr.indexer.faiss_indexers.DenseFlatIndexer
  hnsw:
    _target_: dpr.indexer.faiss_indexers.DenseHNSWFlatIndexer
  hnsw_sq:
    _target_: dpr.indexer.faiss_indexers.DenseHNSWSQIndexer
qa_dataset: nq_test
ctx_datatsets: dpr_wiki
encoded_ctx_files: /home/zl327/DPR/dpr/downloads/data/retriever_results/nq/single/wikipedia_passages_*
out_file: /home/zl327/DPR/dpr/result
match: string
n_docs: 100
validation_workers: 16
batch_size: 128
do_lower_case: true
encoder_path: null
index_path: null
kilt_out_file: null
model_file: /home/zl327/DPR/dpr/downloads/checkpoint/retriever/single/nq/bert-base-encoder.cp
validate_as_tables: false
rpc_retriever_cfg_file: null
rpc_index_id: null
use_l2_conversion: false
use_rpc_meta: false
rpc_meta_compressed: false
indexer: flat
special_tokens: null
local_rank: -1
global_loss_buf_sz: 150000
device: cuda
distributed_world_size: 1
distributed_port: null
no_cuda: false
n_gpu: 4
fp16: false
fp16_opt_level: O1

[2022-07-29 14:07:32,615][dpr.models.hf_models][INFO] - Initializing HF BERT Encoder. cfg_name=bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing HFBertEncoder: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight']

  • This IS expected if you are initializing HFBertEncoder from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing HFBertEncoder from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2022-07-29 14:07:34,264][dpr.models.hf_models][INFO] - Initializing HF BERT Encoder. cfg_name=bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing HFBertEncoder: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight']

  • This IS expected if you are initializing HFBertEncoder from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing HFBertEncoder from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

[2022-07-29 14:07:36,604][root][INFO] - Loading saved model state ...
[2022-07-29 14:07:36,871][root][INFO] - Selecting standard question encoder
[2022-07-29 14:07:40,468][root][INFO] - Encoder vector_size=768
[2022-07-29 14:07:40,468][root][INFO] - qa_dataset: nq_test
[2022-07-29 14:07:40,472][dpr.data.download_data][INFO] - Requested resource from https://dl.fbaipublicfiles.com/dpr/data/retriever/nq-test.qa.csv
[2022-07-29 14:07:40,472][dpr.data.download_data][INFO] - Download root_dir /home/zl327/DPR
[2022-07-29 14:07:40,473][dpr.data.download_data][INFO] - File to be downloaded as /home/zl327/DPR/downloads/data/retriever/qas/nq-test.csv
[2022-07-29 14:07:40,473][dpr.data.download_data][INFO] - File already exist /home/zl327/DPR/downloads/data/retriever/qas/nq-test.csv
[2022-07-29 14:07:40,473][dpr.data.download_data][INFO] - Loading from https://dl.fbaipublicfiles.com/dpr/nq_license/LICENSE
[2022-07-29 14:07:40,473][dpr.data.download_data][INFO] - File already exist /home/zl327/DPR/downloads/data/retriever/qas/LICENSE
[2022-07-29 14:07:40,473][dpr.data.download_data][INFO] - Loading from https://dl.fbaipublicfiles.com/dpr/nq_license/README
[2022-07-29 14:07:40,473][dpr.data.download_data][INFO] - File already exist /home/zl327/DPR/downloads/data/retriever/qas/README
[2022-07-29 14:07:40,506][root][INFO] - questions len 3610
[2022-07-29 14:07:40,506][root][INFO] - questions_text len 0
[2022-07-29 14:07:40,507][root][INFO] - Local Index class <class 'dpr.indexer.faiss_indexers.DenseFlatIndexer'>
[2022-07-29 14:07:40,507][root][INFO] - Using special token None
Error executing job with overrides: ['model_file=/home/zl327/DPR/dpr/downloads/checkpoint/retriever/single/nq/bert-base-encoder.cp', 'qa_dataset=nq_test', 'ctx_datatsets=dpr_wiki', 'encoded_ctx_files=/home/zl327/DPR/dpr/downloads/data/retriever_results/nq/single/wikipedia_passages_*', 'out_file=/home/zl327/DPR/dpr/result']
Traceback (most recent call last):
  File "/home/zl327/DPR/dense_retriever.py", line 657, in <module>
    main()
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/main.py", line 90, in decorated_main
    _run_hydra(
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
    _run_app(
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/utils.py", line 452, in _run_app
    run_and_report(
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
    raise ex
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/utils.py", line 453, in <lambda>
    lambda: hydra.run(
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/zl327/anaconda3/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/home/zl327/DPR/dense_retriever.py", line 547, in main
    questions_tensor = retriever.generate_question_vectors(questions, query_token=qa_src.special_query_token)
  File "/home/zl327/DPR/dense_retriever.py", line 122, in generate_question_vectors
    return generate_question_vectors(
  File "/home/zl327/DPR/dense_retriever.py", line 75, in generate_question_vectors
    max_vector_len = max(q_t.size(1) for q_t in batch_tensors)
  File "/home/zl327/DPR/dense_retriever.py", line 75, in <genexpr>
    max_vector_len = max(q_t.size(1) for q_t in batch_tensors)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
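For reference, the failing expression is easy to reproduce in isolation: `q_t.size(1)` asks for a second dimension, so it raises exactly this IndexError whenever the batch contains 1-D tensors. The shapes below are made up for illustration; this only mimics line 75, not the full DPR pipeline:

```python
import torch

# 1-D token tensors of shape [seq_len]; the padding code expects 2-D [1, seq_len]
batch_tensors = [torch.zeros(12, dtype=torch.long), torch.zeros(9, dtype=torch.long)]

try:
    max_vector_len = max(q_t.size(1) for q_t in batch_tensors)
except IndexError as e:
    # Dimension out of range (expected to be in range of [-1, 0], but got 1)
    print(e)

# The same expression works once each tensor carries a batch dimension:
batch_2d = [t.unsqueeze(0) for t in batch_tensors]  # shapes [1, 12] and [1, 9]
max_vector_len = max(q_t.size(1) for q_t in batch_2d)
print(max_vector_len)  # 12
```

In other words, the question tokenizer here is producing 1-D tensors where the padding logic in generate_question_vectors assumes a leading batch dimension.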
@Louai98
Copy link

Louai98 commented Sep 20, 2022

Hi @ZiluLii, I got the same error. Have you by any chance found a solution?

@shyyyds
Copy link

shyyyds commented Nov 30, 2022

Hi @ZiluLii, I ran into the same problem too. Have you managed to solve it? Many thanks!

@abstractbyte

In case anyone encounters this problem: commenting out the offending lines (75-81) of dense_retriever.py seems to fix it. There is even a comment in the code acknowledging the problem:

# TODO: this only works for Wav2vec pipeline but will crash the regular text pipeline
#max_vector_len = max(q_t.size(1) for q_t in batch_tensors)
#min_vector_len = min(q_t.size(1) for q_t in batch_tensors)
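An alternative to deleting the lines would be to make them dimension-agnostic. This is an untested sketch, not part of DPR, and `batch_seq_lens` is a hypothetical helper name: `size(-1)` reads the last dimension, so it works for both the 1-D text tensors and the 2-D Wav2vec tensors the TODO comment refers to.

```python
import torch

def batch_seq_lens(batch_tensors):
    # Hypothetical replacement for lines 75-76 of dense_retriever.py:
    # size(-1) is the last dimension, valid for 1-D [seq_len] and
    # 2-D [1, seq_len] tensors alike, so neither pipeline crashes.
    lens = [q_t.size(-1) for q_t in batch_tensors]
    return max(lens), min(lens)

print(batch_seq_lens([torch.zeros(12), torch.zeros(9)]))        # (12, 9)
print(batch_seq_lens([torch.zeros(1, 12), torch.zeros(1, 9)]))  # (12, 9)
```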
