What’s the best way to use Color palette and another image to condition outputs? #9
Comments
This is what I have tried so far: I pass the color palette as a text input and tokenize it with the text tokenizer.

batched_sample = {}
# Initialize target modalities
for target_mod, ntoks in zip(target_domains, tokens_per_target):
    batched_sample = init_empty_target_modality(batched_sample, MODALITY_INFO, target_mod, 1, ntoks, device)

# Add each conditioning input as tokenized text
batched_sample = custom_text(
    batched_sample, input_text=caption, eos_token='[EOS]',
    key='caption', device=device, text_tokenizer=text_tok
)
batched_sample = custom_text(
    batched_sample, input_text=bboxes, eos_token='[EOS]',
    key='det', device=device, text_tokenizer=text_tok
)
batched_sample = custom_text(
    batched_sample, input_text=color_palette, eos_token='[EOS]',
    key='color_palette', device=device, text_tokenizer=text_tok
)

Finally, I create out_dict and dec_dict, but the dec_dict step fails with an error.
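Since the error message itself is not shown, one possible first debugging step is to print what each modality entry contains before the decoding call. This is a minimal sketch, assuming the sample is a nested dict that maps modality names to dicts of array-like values with a `.shape` attribute. `summarize_sample` is a hypothetical helper, not part of the library, and the toy dict only mimics the assumed layout with NumPy arrays in place of torch tensors.

```python
import numpy as np


def summarize_sample(sample):
    """Map each modality to {field: shape or type name} for quick inspection."""
    summary = {}
    for mod, fields in sample.items():
        if isinstance(fields, dict):
            summary[mod] = {
                name: tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
                for name, value in fields.items()
            }
        else:
            # Non-dict entries are reported by shape if available, else by type name
            summary[mod] = tuple(fields.shape) if hasattr(fields, "shape") else type(fields).__name__
    return summary


# Toy stand-in for a real sample (assumed layout and field names, NumPy instead of torch)
toy_sample = {
    "caption": {
        "tensor": np.zeros((1, 256), dtype=np.int64),
        "input_mask": np.zeros((1, 256), dtype=bool),
    },
    "color_palette": {"tensor": np.zeros((1, 23), dtype=np.int64)},
}
for mod, info in summarize_sample(toy_sample).items():
    print(mod, info)
```

Comparing the printed keys and shapes with the decoders passed to the decoding step may help narrow down where it fails.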
I just want to confirm the best way to extract retrieval features conditioned on both a caption and a color palette. This is what I have so far, based on the input notebook.

# Generation configurations
cond_domains = ["caption", "color_palette"]
target_domains = ["tok_dinov2_global"]
tokens_per_target = [16]
generation_config = {
    "autoregression_schemes": ["roar"],
    "decoding_steps": [1],
    "token_decoding_schedules": ["linear"],
    "temps": [2.0],
    "temp_schedules": ["onex:0.5:0.5"],
    "cfg_scales": [1.0],
    "cfg_schedules": ["constant"],
    "cfg_grow_conditioning": True,
}
top_p, top_k = 0.8, 0.0

schedule = build_chained_generation_schedules(
    cond_domains=cond_domains,
    target_domains=target_domains,
    tokens_per_target=tokens_per_target,
    **generation_config,
)

fm_model = FM.from_pretrained(FM_MODEL_PATH).eval().to(DEVICE)
sampler = GenerationSampler(fm_model)

# Start from an empty sample dict, as in the first snippet
# (assumed: it is not defined anywhere else in this snippet)
batched_sample = {}

# Initialize target modalities
for target_mod, ntoks in zip(target_domains, tokens_per_target):
    batched_sample = init_empty_target_modality(
        batched_sample, MODALITY_INFO, target_mod, 1, ntoks, DEVICE
    )

# Add the caption and the color palette as tokenized text conditions
batched_sample = custom_text(
    batched_sample,
    input_text=caption,
    eos_token="[EOS]",
    key="caption",
    device=DEVICE,
    text_tokenizer=text_tokenizer,
)
batched_sample = custom_text(
    batched_sample,
    input_text=color_palette,
    eos_token="[EOS]",
    key="color_palette",
    device=DEVICE,
    text_tokenizer=text_tokenizer,
)

out_dict = sampler.generate(
    batched_sample,
    schedule,
    text_tokenizer=text_tokenizer,
    verbose=True,
    seed=0,
    top_p=top_p,
    top_k=top_k,
)

# Decode the generated tokens into the global DINOv2 feature
with torch.no_grad():
    dec_dict = decode_dict(
        out_dict,
        {"tok_dinov2_global": vqvae.to(DEVICE)},
        text_tokenizer,
        image_size=IMG_SIZE,
        patch_size=16,
        decoding_steps=1,
    )
combined_features = dec_dict["tok_dinov2_global"]
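
For the retrieval step itself, one possible way to use the decoded global feature is a nearest-neighbour search against precomputed gallery embeddings. The sketch below is not taken from the project's code. It assumes that `combined_features` can be flattened to a single vector and that `gallery` holds global embeddings of the same dimension, computed offline, one row per candidate image. `retrieve_top_k`, `gallery`, `query`, and the 768-dimensional size are all placeholders chosen for illustration.

```python
import numpy as np


def retrieve_top_k(query, gallery, k=5):
    """Return indices and cosine similarities of the k gallery rows closest to query."""
    query = np.asarray(query, dtype=np.float64).reshape(-1)
    gallery = np.asarray(gallery, dtype=np.float64)
    if gallery.ndim != 2 or gallery.shape[1] != query.shape[0]:
        raise ValueError("gallery must have shape (num_images, dim) matching the query")
    eps = 1e-12  # guards against division by zero for all-zero vectors
    q = query / (np.linalg.norm(query) + eps)
    g = gallery / (np.linalg.norm(gallery, axis=1, keepdims=True) + eps)
    sims = g @ q  # cosine similarity of every gallery row with the query
    k = min(k, sims.shape[0])
    order = np.argsort(-sims)[:k]  # highest similarity first
    return order, sims[order]


# Toy usage with random stand-ins for real embeddings (assumed 768-dim vectors)
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 768))
query = gallery[42] + 0.01 * rng.normal(size=768)  # slightly perturbed copy of row 42
indices, scores = retrieve_top_k(query, gallery, k=3)
```

With real data, `query` would be the flattened `combined_features` converted to a NumPy array, and `gallery` the stacked embeddings of the images to search over.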
Thank you, authors, for open-sourcing your amazing work.
What would be the best way to use a color palette for image generation and image retrieval?