LLM-Enhanced Text-Attributed Graph (TAG) Representation Learning #9428

Open · wants to merge 10 commits into base: master
Conversation

@devanshamin (Contributor) commented Jun 17, 2024

PR for #9361.

I have skipped the Design Choices section from the README; you can find it here. I can include it if necessary.

Please let me know if any changes are required.


Pre-commit Issue

Due to the pre-commit configuration of the torch_geometric package, the isort hook removes the blank line between the external library imports and the tape package imports. For example, in train.py:

import copy
from dataclasses import is_dataclass
from typing import Optional

import numpy as np
import pandas as pd
import torch
from jsonargparse import ActionConfigFile, ArgumentParser
from tape.config import DatasetName, FeatureType
from tape.dataset.dataset import GraphDataset
from tape.dataset.llm.engine import LlmOfflineEngineArgs, LlmOnlineEngineArgs
from tape.dataset.lm_encoder import LmEncoderArgs
from tape.gnn_model import NodeClassifierArgs
from tape.trainer.gnn_trainer import GnnTrainer, GnnTrainerArgs
from tape.utils import profile_execution

It also disrupts the order of imports. For example, in gnn_trainer.py:

from dataclasses import dataclass
from typing import Literal, Optional

import torch
from tape.dataset.dataset import GraphDataset
from tape.gnn_model import NodeClassifier, NodeClassifierArgs

from torch_geometric.data import Data
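
A possible workaround (my assumption; the PR does not propose this fix) is to declare tape as a first-party package in the isort configuration, so the hook keeps it grouped and separated from third-party imports rather than interleaving it with them. A minimal sketch for a pyproject.toml-style isort config:

```toml
[tool.isort]
# Treat the local `tape` package as first-party so it is grouped
# after external imports, with a blank line between the sections.
known_first_party = ["tape"]
```

The same option can be set in setup.cfg or .isort.cfg; whether it is appropriate here depends on how torch_geometric's shared pre-commit config is meant to treat example subpackages.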

from vllm import LLM, SamplingParams


class LlmOfflineEngine(LlmEngine):

Contributor:
I think there's some overlap with Rishi's current work on adding G-Retriever: #9462

It would be nice to keep the LLM models in a specific directory, as it was done by Rishi in the above PR.

+@puririshi98 for further insights

Contributor:

Yes, I definitely agree here. As it stands, this PR is basically the equivalent of making your own research repository on GitHub that uses PyG; there is no integration into the actual framework. There are 32 changed files, which is far too many to review effectively at once. For adding G-Retriever, I have broken my PR down into smaller PRs first. Before embarking on that, however, I recommend you work out how your work could actually be integrated into torch_geometric itself, rather than remaining what is essentially a standalone example repository.

Contributor Author:

Thank you, @Kh4L and @puririshi98, for your feedback.

@puririshi98, I agree with you. There is no integration into the actual framework; hence, it lives under the examples directory. The proposed TAPE method prompts an LLM with each node's title/abstract text to obtain a prediction and an explanation, converts these responses into node features, trains a separate GNN on each feature type, and combines the results of the GNN models. I have created a repository extending this method: tag-llm.
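
To make the ensembling step concrete, here is a minimal, dependency-free sketch of the idea described above: one GNN is trained per feature type (original text, LLM prediction, LLM explanation) and their per-class probabilities are averaged. All names and numbers are illustrative, not the PR's actual API.

```python
# Hypothetical sketch of TAPE-style ensembling: average the class
# probabilities produced by GNNs trained on different node-feature types.

def average_predictions(*prob_lists):
    """Average per-node, per-class probabilities from several models."""
    n_models = len(prob_lists)
    return [
        [sum(p[i] for p in probs_per_node) / n_models
         for i in range(len(probs_per_node[0]))]
        for probs_per_node in zip(*prob_lists)
    ]

# Made-up class probabilities for two nodes from three GNNs, each trained
# on a different feature type.
probs_text = [[0.9, 0.1], [0.2, 0.8]]  # features from original title/abstract
probs_pred = [[0.7, 0.3], [0.4, 0.6]]  # features from LLM prediction
probs_expl = [[0.8, 0.2], [0.3, 0.7]]  # features from LLM explanation

ensemble = average_predictions(probs_text, probs_pred, probs_expl)
# → approximately [[0.8, 0.2], [0.3, 0.7]]
```

The TAPE paper combines the per-model outputs in a learned way; a plain average is used here only to keep the sketch self-contained.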

@rusty1s, how would you like to move forward with this since you proposed it?

@puririshi98 puririshi98 self-requested a review July 11, 2024 15:27