
How to use pretrained models from AutoModelForSeq2SeqLM (google/flan-t5-xxl) #89

Open
DavidAdamczyk opened this issue Jun 5, 2023 · 3 comments

Comments

DavidAdamczyk commented Jun 5, 2023

I would like to ask how I can use this model from HF:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xxl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xxl")

I installed the latest diagNNose from git (version 1.20), and it is unclear to me how I can use that model:

from diagnnose.models import LanguageModel, import_model

config_dict = {"model": {}}  # minimal config; normally built from a config file
config_dict["model"]["transformer_type"] = "google/flan-t5-xxl"
model = import_model(**config_dict["model"])

I read diagnnose/models/huggingface_lm.py, and it seems to me that seq2seq models are not supported. Can you suggest how it is possible to load this model?

The error message is:

File ~/miniconda3/lib/python3.9/site-packages/diagnnose/models/huggingface_lm.py:40, in HuggingfaceLM.load_model(self, transformer_type, mode, cache_dir)
     36 auto_model = mode_to_auto_model.get(mode, AutoModel)
     38 self.is_causal = mode == "causal_lm"
---> 40 return auto_model.from_pretrained(transformer_type, cache_dir=cache_dir)

File ~/miniconda3/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py:470, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    466     model_class = _get_model_class(config, cls._model_mapping)
    467     return model_class.from_pretrained(
    468         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    469     )
--> 470 raise ValueError(
    471     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    472     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    473 )

ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, CodeGenConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, TransfoXLConfig, TrOCRConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.
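Reduced to its core, the traceback shows that each transformers Auto class keeps a mapping from config classes to model classes, and `from_pretrained` raises when the checkpoint's config class (here `T5Config`, an encoder-decoder config) is not in the mapping for the chosen class (`AutoModelForCausalLM`). A self-contained sketch of that gate, using stand-in classes rather than the real transformers internals:

```python
# Stand-ins for transformers config/model classes, so this runs without transformers.
class T5Config: ...
class GPT2Config: ...
class GPT2LMHeadModel: ...


class AutoModelForCausalLM:
    # Mirrors the role of transformers' _model_mapping: config class -> model class.
    _model_mapping = {GPT2Config: GPT2LMHeadModel}

    @classmethod
    def from_pretrained(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__} "
                f"for this kind of AutoModel: {cls.__name__}."
            )
        return model_class()


# A GPT-2 config resolves to a model class; a T5 config reproduces the
# "Unrecognized configuration class" error from the traceback above.
AutoModelForCausalLM.from_pretrained(GPT2Config())
try:
    AutoModelForCausalLM.from_pretrained(T5Config())
except ValueError as e:
    print(e)
```

This is why loading flan-t5 through diagnnose's causal-LM path fails even though `AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xxl")` works in plain transformers: the same checkpoint, routed through a different Auto class, hits a different config mapping.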
jumelet (Collaborator) commented Jun 5, 2023

Hi David,

Cool that you're planning to use diagnnose! During the development of diagnnose a few years back I never really focused on seq2seq models, mainly on causal LMs. But it shouldn't be hard to incorporate that. Which diagnnose utility were you planning to use?
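The dispatch that fails in the traceback above is a plain dict lookup, `mode_to_auto_model.get(mode, AutoModel)`, so supporting seq2seq models plausibly comes down to adding an `AutoModelForSeq2SeqLM` entry to that dict. A minimal sketch of the idea (the `"seq2seq_lm"` key is hypothetical, and stand-in classes are used here so the sketch runs without transformers installed):

```python
# Stand-ins for the transformers Auto classes.
class AutoModel: ...
class AutoModelForCausalLM: ...
class AutoModelForSeq2SeqLM: ...

# diagnnose maps a `mode` string to an Auto class, falling back to AutoModel.
mode_to_auto_model = {
    "causal_lm": AutoModelForCausalLM,
    # Hypothetical new entry that would route T5-style checkpoints correctly:
    "seq2seq_lm": AutoModelForSeq2SeqLM,
}


def resolve_auto_model(mode):
    # Same shape as the lookup in HuggingfaceLM.load_model.
    return mode_to_auto_model.get(mode, AutoModel)


print(resolve_auto_model("seq2seq_lm").__name__)  # AutoModelForSeq2SeqLM
print(resolve_auto_model("causal_lm").__name__)   # AutoModelForCausalLM
```

With such an entry, setting `mode` to the new key in the model config would route `google/flan-t5-xxl` through the seq2seq Auto class instead of `AutoModelForCausalLM`.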

jumelet (Collaborator) commented Jun 5, 2023

If you only intend to use the activation extraction utility, you may want to look into minicons as well: https://github.com/kanishkamisra/minicons

DavidAdamczyk (Author) commented Jul 18, 2023

Thank you for the response @jumelet 👍🏻
I decided to use a llama model instead of flan-t5. May I ask how I can determine which llama models from HF transformers are supported by diagNNose? Or can you suggest a particular llama model? Or maybe this is a question for @dieuwkehupkes or @LorianColtof?
