
Official implementation for *Language Models as Inductive Reasoners*, accepted by EACL 2024.


Inductive_Reasoning

This repository is the official implementation of the paper *Language Models as Inductive Reasoners* ([Arxiv version]).

With this repository, you can:
(1) generate hypotheses with the CoLM framework, and
(2) reproduce the results reported in the paper.

Generate hypotheses with the CoLM framework

Will be updated soon.

Reproduce the results reported in the paper

GPT-J's few-shot results

Automatic evaluation (part of Table 4, full Table 5, and full Table 6):

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```

Human evaluation (part of Table 4):

```
python final_human_eval_result.py --output_dir ./Checkpoints/gptj_analysis_100test_newdata_newprompt_10 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```

GPT-J's finetuning results

Automatic evaluation (part of Table 4):

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 3 --if_already_fintuned_for_test 1
```

Human evaluation (part of Table 4):

```
python final_human_eval_result.py --output_dir ./Checkpoints/gptj_analysis_100test_newdata_newprompt_10 --setting_selection_M1_forM2M3 1 --setting_selection 3 --if_already_fintuned_for_test 1
```

Ablation on input facts (Table 7)

Long fact, 1 full fact:

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_1fact_long/ --generator_model_type gptj --if_long_or_short_facts 0 --cnt_facts_as_input 1 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```

Short fact, 1 full fact:

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_1fact/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 1 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```

Short fact, 2 full facts:

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_2fact/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 2 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```

Short fact, 3 missing facts:

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_missingfacts/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 1 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```
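The four ablation commands above are identical except for three flags: `--output_dir` (checkpoint directory), `--if_long_or_short_facts`, `--cnt_facts_as_input`, and `--if_full_or_missing_facts`. A small helper like the following (hypothetical, not part of this repository) can generate all four commands, which may be convenient for scripting the ablation runs:

```python
# Hypothetical convenience helper: builds the four Table 7 ablation
# commands from the three flags that vary between them.
BASE = (
    "python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/{ckpt}/ "
    "--generator_model_type gptj --if_long_or_short_facts {long_short} "
    "--cnt_facts_as_input {n_facts} --if_full_or_missing_facts {full_missing} "
    "--setting_selection_M1_forM2M3 1 --setting_selection 2 "
    "--if_already_fintuned_for_test 0"
)

# (checkpoint dir, long(0)/short(1) facts, #facts, full(0)/missing(1) facts)
ABLATIONS = [
    ("new_data_gptj_12_5gene_1fact_long", 0, 1, 0),    # long fact, 1 full fact
    ("new_data_gptj_12_5gene_1fact", 1, 1, 0),         # short fact, 1 full fact
    ("new_data_gptj_12_5gene_2fact", 1, 2, 0),         # short fact, 2 full facts
    ("new_data_gptj_12_5gene_missingfacts", 1, 3, 1),  # short fact, 3 missing facts
]

def ablation_commands():
    """Return the four ablation command lines as strings."""
    return [
        BASE.format(ckpt=c, long_short=ls, n_facts=n, full_missing=fm)
        for c, ls, n, fm in ABLATIONS
    ]

for cmd in ablation_commands():
    print(cmd)
```

Each printed line reproduces the corresponding command listed above verbatim.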

Llama's results (Table 9)

```
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_llama_12_5gene_capitalYesNo/ --generator_model_type llama --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
```
