Replies: 2 comments
-
A correction to my observations. The F16 vs F32 assert failure was probably because I introduced a new second parameter (the name of the refined model), so the extra parameter count triggered the F32 logic (as if an explicit F32 conversion had been requested). But after this fix (the source is available at the fork) I see a segmentation fault while executing main.exe, as below.
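For context, the stock convert-pt-to-ggml.py keys the F16/F32 decision off the positional argument count, along these lines (a paraphrase from memory, not the exact source, so the argv index may differ in your copy):

```python
# Paraphrased sketch of the stock argument handling in convert-pt-to-ggml.py.
use_f16 = True
if len(sys.argv) > 4:
    use_f16 = False  # any extra positional argument silently selects F32 output
```

so inserting a new positional parameter shifts the count and flips the conversion to F32.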
With the normal steps this line usually looks different. Is it possible that, in order to provide a different state_dict, I should also provide the same base file that was used when this model was trained? I suspect it's not necessary since, as I posted before, the original whisper was fine with the modified dict. Also, maybe some other tool exists in the project that could reveal what's wrong with the converted version of the bin file?
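Lacking a ready-made dump tool, one way to peek inside the converted file is a small standalone reader. This is a minimal sketch, assuming the layout written by convert-pt-to-ggml.py (magic, eleven int32 hyperparameters, mel filter bank, vocab, then per-tensor records); the field order may differ across converter versions, so verify it against your copy before trusting the output:

```python
# Minimal ggml .bin inspector (a sketch, not an official whisper.cpp tool).
import struct
import sys

def read_i32(f):
    return struct.unpack("i", f.read(4))[0]

with open(sys.argv[1], "rb") as f:
    assert read_i32(f) == 0x67676d6c, "bad magic, not a ggml file"

    # assumed hyperparameter order, as written by convert-pt-to-ggml.py
    keys = ["n_vocab", "n_audio_ctx", "n_audio_state", "n_audio_head",
            "n_audio_layer", "n_text_ctx", "n_text_state", "n_text_head",
            "n_text_layer", "n_mels", "use_f16"]
    print({k: read_i32(f) for k in keys})

    # skip the mel filter bank: two int32 dims followed by float32 data
    n_mel, n_fft = read_i32(f), read_i32(f)
    f.seek(n_mel * n_fft * 4, 1)

    # skip the vocab: token count, then length-prefixed byte strings
    for _ in range(read_i32(f)):
        f.seek(read_i32(f), 1)

    # walk the tensor records and print name, dims and ftype (0 = F32, 1 = F16)
    while True:
        header = f.read(12)
        if len(header) < 12:
            break
        n_dims, name_len, ftype = struct.unpack("iii", header)
        dims = [read_i32(f) for _ in range(n_dims)]
        name = f.read(name_len).decode("utf-8", errors="replace")
        n_bytes = 2 if ftype == 1 else 4
        for d in dims:
            n_bytes *= d
        print(f"{name}  dims={dims}  ftype={ftype}")
        f.seek(n_bytes, 1)
```

Running it over both a stock ggml model and the converted fine-tuned one and diffing the output should show any mismatch in tensor names or in the F16/F32 ftype flags.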
-
With the help of some logging in the conversion script I found the cause. The modified fragment, which derives the type flag from the actual dtype of the arrays, is below. It should stay compatible with the existing behaviour for stock models:

```python
if use_f16:
    print("dtype is:", data.dtype)
    if n_dims < 2 or \
            name == "encoder.conv1.bias" or \
            name == "encoder.conv2.bias" or \
            name == "encoder.positional_embedding" or \
            name == "decoder.positional_embedding":
        print("  Converting to float32")
        data = data.astype(np.float32)
    else:
        print("  Converting to float16")
        data = data.astype(np.float16)
    # set ftype from what the data actually is, not from an assumption
    if data.dtype == np.float16:
        ftype = 1
    else:
        ftype = 0
else:
    data = data.astype(np.float32)
    ftype = 0
```

To compare, the original fragment is:

```python
ftype = 1
if use_f16:
    if n_dims < 2 or \
            name == "encoder.conv1.bias" or \
            name == "encoder.conv2.bias" or \
            name == "encoder.positional_embedding" or \
            name == "decoder.positional_embedding":
        print("  Converting to float32")
        data = data.astype(np.float32)
        ftype = 0
else:
    data = data.astype(np.float32)
    ftype = 0
```
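A note on why the dtype-driven ftype matters: the per-tensor ftype field tells the whisper.cpp loader what element size and type each tensor holds, so a checkpoint whose weights were saved in float32 but written out with the stock ftype = 1 default produces tensors whose declared and actual types disagree, which surfaces later as the GGML_ASSERT ... GGML_TYPE_F16 failure quoted in the question below.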
-
Hi,
I have a fine-tuned model (base-refined.pt) that can't be converted directly to ggml: the convert-pt-to-ggml.py script complains about a missing `dims` entry. As I saw from the usage example, this model works fine with the original whisper via the following steps: `load_state_dict` is used to load the dict from base-refined.pt, and an additional decoding test then outputs a refined version of the recognition.

I tried to use convert-pt-to-ggml.py as the base and modify it. The first naive step converted the file, but inference complained about

```
unknown tensor 'model.encoder.positional_embedding'
```

even before "main processing". After finding that the dict entries are all prefixed with `model.`, I stripped that prefix, which moved things further: "main processing" now starts, but the new error is

```
GGML_ASSERT: C:/msys64/home/{ }/whisper.cpp/ggml.c:13455: src0->type == GGML_TYPE_F16
```

Is there something obvious I'm missing here, or is additional information about this model file needed?
Thanks
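In case a concrete repair helps: one way to make such a checkpoint digestible by the stock converter is to rebuild it with the `dims` block taken from the original base checkpoint and with the `model.` prefix stripped from the weight names. A minimal sketch, assuming the converter reads the `dims` and `model_state_dict` keys as the OpenAI checkpoints provide them, and with `base.pt` / `base-refined.pt` as placeholder paths:

```python
# Hypothetical repair: graft the "dims" hyperparameter block from the original
# base checkpoint onto the fine-tuned weights and strip the "model." prefix
# so the tensor names match what whisper.cpp expects.
import torch

base = torch.load("base.pt", map_location="cpu")             # original OpenAI checkpoint
refined = torch.load("base-refined.pt", map_location="cpu")  # fine-tuned weights

# assumption: the fine-tuned file is a bare state_dict or nests one under "state_dict"
state_dict = refined.get("state_dict", refined)

fixed = {
    "dims": base["dims"],  # the entry the converter complains about
    "model_state_dict": {
        (k[len("model."):] if k.startswith("model.") else k): v
        for k, v in state_dict.items()
    },
}
torch.save(fixed, "base-refined-fixed.pt")
```

Note that fine-tuned weights are typically saved in float32 while the stock converter assumes float16 input, so the dtype-aware fragment from the comment above may still be needed.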