
How to use pretrained models from AutoModelForSeq2SeqLM (google/flan-t5-xxl) #89

Open
DavidAdamczyk opened this issue Jun 5, 2023 · 3 comments

Comments

DavidAdamczyk commented Jun 5, 2023

I would like to ask how I can use this model from HF:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xxl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xxl")

I installed the latest diagNNose from git (version 1.20), and it is unclear to me how I can use that model:

from diagnnose.models import LanguageModel, import_model

config_dict = {"model": {}}  # minimal config; normally built from a config file
config_dict["model"]["transformer_type"] = "google/flan-t5-xxl"
model = import_model(**config_dict["model"])

I read diagnnose/models/huggingface_lm.py, and it seems to me that seq2seq models are not supported. Can you suggest how it is possible to load this model?

The error message is:

File ~/miniconda3/lib/python3.9/site-packages/diagnnose/models/huggingface_lm.py:40, in HuggingfaceLM.load_model(self, transformer_type, mode, cache_dir)
     36 auto_model = mode_to_auto_model.get(mode, AutoModel)
     38 self.is_causal = mode == "causal_lm"
---> 40 return auto_model.from_pretrained(transformer_type, cache_dir=cache_dir)

File ~/miniconda3/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py:470, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    466     model_class = _get_model_class(config, cls._model_mapping)
    467     return model_class.from_pretrained(
    468         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    469     )
--> 470 raise ValueError(
    471     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    472     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    473 )

ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, CodeGenConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, TransfoXLConfig, TrOCRConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.
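Reduced to its core, the traceback shows that each transformers Auto class keeps a mapping from config classes to model classes, and `from_pretrained` raises when the checkpoint's config class (here `T5Config`, an encoder-decoder config) is not in the mapping for the chosen class (`AutoModelForCausalLM`). A self-contained sketch of that gate, using stand-in classes rather than the real transformers internals:

```python
# Stand-ins for transformers config/model classes, so this runs without transformers.
class T5Config: ...
class GPT2Config: ...
class GPT2LMHeadModel: ...


class AutoModelForCausalLM:
    # Mirrors the role of transformers' _model_mapping: config class -> model class.
    _model_mapping = {GPT2Config: GPT2LMHeadModel}

    @classmethod
    def from_pretrained(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__} "
                f"for this kind of AutoModel: {cls.__name__}."
            )
        return model_class()


# A GPT-2 config resolves to a model class; a T5 config reproduces the
# "Unrecognized configuration class" error from the traceback above.
AutoModelForCausalLM.from_pretrained(GPT2Config())
try:
    AutoModelForCausalLM.from_pretrained(T5Config())
except ValueError as e:
    print(e)
```

This is why loading flan-t5 through diagnnose's causal-LM path fails even though `AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xxl")` works in plain transformers: the same checkpoint, routed through a different Auto class, hits a different config mapping.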
jumelet (Collaborator) commented Jun 5, 2023

Hi David,

Cool that you're planning to use diagnnose! During the development of diagnnose a few years back I never really focused on seq2seq models, mainly on causal LMs. But it shouldn't be hard to incorporate that. Which diagnnose utility were you planning to use?
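The dispatch that fails in the traceback above is a plain dict lookup, `mode_to_auto_model.get(mode, AutoModel)`, so supporting seq2seq models plausibly comes down to adding an `AutoModelForSeq2SeqLM` entry to that dict. A minimal sketch of the idea (the `"seq2seq_lm"` key is hypothetical, and stand-in classes are used here so the sketch runs without transformers installed):

```python
# Stand-ins for the transformers Auto classes.
class AutoModel: ...
class AutoModelForCausalLM: ...
class AutoModelForSeq2SeqLM: ...

# diagnnose maps a `mode` string to an Auto class, falling back to AutoModel.
mode_to_auto_model = {
    "causal_lm": AutoModelForCausalLM,
    # Hypothetical new entry that would route T5-style checkpoints correctly:
    "seq2seq_lm": AutoModelForSeq2SeqLM,
}


def resolve_auto_model(mode):
    # Same shape as the lookup in HuggingfaceLM.load_model.
    return mode_to_auto_model.get(mode, AutoModel)


print(resolve_auto_model("seq2seq_lm").__name__)  # AutoModelForSeq2SeqLM
print(resolve_auto_model("causal_lm").__name__)   # AutoModelForCausalLM
```

With such an entry, setting `mode` to the new key in the model config would route `google/flan-t5-xxl` through the seq2seq Auto class instead of `AutoModelForCausalLM`.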

jumelet (Collaborator) commented Jun 5, 2023

If you only intend to use the activation extraction utility, you may want to look into minicons as well: https://github.com/kanishkamisra/minicons

DavidAdamczyk (Author) commented Jul 18, 2023

Thank you for the response @jumelet 👍🏻
I decided to use a llama model instead of flan-t5. May I ask how I can determine which llama models from HF transformers are supported by diagNNose? Or can you suggest a particular llama model? Or maybe this is a question for @dieuwkehupkes or @LorianColtof?
