Issue with Multi-Mutant Sequence Handling #5

Open
aaaaaaaaa21-code opened this issue Aug 14, 2024 · 2 comments
@aaaaaaaaa21-code

aaaaaaaaa21-code commented Aug 14, 2024

Hi, I've been exploring your published model and find it very interesting. I attempted to run it with my data but encountered an error.

My sequences contain multiple amino acid mutations, so I set the single_mutant flag in the read_experimental_data function in top_layer.py to False. That change produced the following error:

Embeddings and labels are aligned
Traceback (most recent call last):
  File "/media/dell/newdisk/EvolvePro-main/top_layer.py", line 355, in <module>
    df_test, df_all = top_layer(
  File "/media/dell/newdisk/EvolvePro-main/top_layer.py", line 205, in top_layer
    y_pred_test = model.predict(X_test)
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 1064, in predict
    X = self._validate_X_predict(X)
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 641, in _validate_X_predict
    X = self._validate_data(
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/base.py", line 633, in _validate_data
    out = check_array(X, input_name="X", **check_params)
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/utils/validation.py", line 1026, in check_array
    raise ValueError(
ValueError: Found array with 0 sample(s) (shape=(0, 1280)) while a minimum of 1 is required by RandomForestRegressor.

If I leave single_mutant at its original value instead, the run completes but every output value is 1. Could you help me resolve this issue?

Thank you for your assistance.

@idmjky
Owner

idmjky commented Aug 14, 2024

Hi, can you attach your input results CSV file so I can check the format? Also, do its entries match the name column in your embedding files? If you can attach both files, I can help debug this.
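The alignment question above can be checked locally before sharing files. The following is a minimal sketch, not EvolvePro code: the helper name `compare_names` and the example variant names are hypothetical, and it assumes the variant names from the results CSV and from the embedding file have already been loaded into plain Python lists.

```python
def compare_names(label_names, embedding_names):
    """Report which variant names appear in only one of the two inputs.

    Hypothetical helper for illustration; both arguments are iterables of
    strings (e.g. the name column of each file).
    """
    label_set = set(label_names)
    embedding_set = set(embedding_names)
    return {
        # Labeled variants that have no embedding row
        "only_in_labels": sorted(label_set - embedding_set),
        # Embedded variants that have no label (candidates for prediction)
        "only_in_embeddings": sorted(embedding_set - label_set),
        # Variants present in both files
        "shared": sorted(label_set & embedding_set),
    }


if __name__ == "__main__":
    # Made-up multi-mutant names, purely for demonstration
    report = compare_names(
        ["A12G", "A12G_L45P"],
        ["A12G", "A12G_L45P", "K7R_L45P"],
    )
    for key, names in report.items():
        print(key, names)
```

If `only_in_labels` is non-empty, the two files use different naming for at least some variants, which is worth fixing before training.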

@aaaaaaaaa21-code
Author

Thank you for your response and the reminder. I realized the error occurred because I had not supplied the sequences to be predicted, so X_test was empty. I have now fixed this and the model runs successfully!
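For anyone hitting the same ValueError, an empty test set can be caught before it reaches scikit-learn. This is a minimal sketch under stated assumptions: `safe_predict` is a hypothetical wrapper, not part of EvolvePro, and it only relies on the model exposing a `predict` method and on `len(X_test)` giving the number of rows (true for lists and NumPy arrays alike). `ConstantModel` is a stand-in so the example runs without any dependencies.

```python
def safe_predict(model, X_test):
    """Call model.predict only when X_test has at least one row.

    Hypothetical guard for illustration; raises a clearer error than the
    one scikit-learn emits for a (0, n_features) array.
    """
    if len(X_test) == 0:
        raise ValueError(
            "X_test is empty: no sequences were supplied for prediction."
        )
    return model.predict(X_test)


class ConstantModel:
    """Stand-in for a fitted regressor; returns one value per input row."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return [self.value for _ in X]


if __name__ == "__main__":
    model = ConstantModel(0.5)
    print(safe_predict(model, [[0.1, 0.2], [0.3, 0.4]]))
    try:
        safe_predict(model, [])
    except ValueError as err:
        print("Caught:", err)
```

With a real fitted RandomForestRegressor in place of `ConstantModel`, the same guard would stop the run at the point where the prediction set is missing rather than deep inside input validation.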
