Issue with Multi-Mutant Sequence Handling #5

aaaaaaaaa21-code · 2024-08-14T08:55:39Z

Hi, I've been exploring your published model and find it very interesting. I attempted to run it with my data but encountered an error.

My sequences contain multiple amino acid mutations, so I set the single_mutant flag in the read_experimental_data function in top_layer.py to False. This change led to the following error:

Embeddings and labels are aligned
Traceback (most recent call last):
  File "/media/dell/newdisk/EvolvePro-main/top_layer.py", line 355, in <module>
    df_test, df_all = top_layer(
  File "/media/dell/newdisk/EvolvePro-main/top_layer.py", line 205, in top_layer
    y_pred_test = model.predict(X_test)
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 1064, in predict
    X = self._validate_X_predict(X)
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 641, in _validate_X_predict
    X = self._validate_data(
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/base.py", line 633, in _validate_data
    out = check_array(X, input_name="X", **check_params)
  File "/home/dell/anaconda3/lib/python3.9/site-packages/sklearn/utils/validation.py", line 1026, in check_array
    raise ValueError(
ValueError: Found array with 0 sample(s) (shape=(0, 1280)) while a minimum of 1 is required by RandomForestRegressor.

When I leave single_mutant unchanged, every output value is 1. Could you help me resolve this issue?

Thank you for your assistance.

idmjky · 2024-08-14T17:56:52Z

Hi, can you attach your input results CSV file so I can check the format? Also, do its entries match the name column in your embedding files? If you can attach both, I can help debug this.
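For anyone who wants to check this themselves first, here is a minimal sketch of the comparison being asked about. It assumes you have already loaded the variant names from the results CSV and from the embedding file into two lists; the function name compare_names and the example variant names are hypothetical and not part of EvolvePro.

```python
def compare_names(label_names, embedding_names):
    """Report which variant names appear in both inputs or in only one."""
    labels = set(label_names)
    embeddings = set(embedding_names)
    return {
        "shared": sorted(labels & embeddings),
        "only_in_labels": sorted(labels - embeddings),
        "only_in_embeddings": sorted(embeddings - labels),
    }


# Hypothetical multi-mutant names; the real naming scheme may differ.
report = compare_names(
    ["A12G", "A12G_L45P", "K7R"],
    ["A12G", "A12G_L45P", "T30S"],
)
for key, names in report.items():
    print(key, names)
```

If only_in_labels or only_in_embeddings is non-empty, the two files probably do not use the same naming for the variants.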

aaaaaaaaa21-code · 2024-08-15T06:03:54Z

Thank you for your response and the reminder. I realized the error occurred because I had not supplied the sequences to be predicted, so X_test was empty. I have now fixed this and the model runs successfully!
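In case it helps others who see the same "Found array with 0 sample(s)" message, below is a minimal sketch of a check that fails early with a clearer error. It assumes the embeddings and measured values are held in plain dicts keyed by variant name; split_for_prediction and the example data are hypothetical and this is not how top_layer.py is actually written.

```python
def split_for_prediction(embeddings, labels):
    """Split variants into a labelled (training) set and an unlabelled (prediction) set.

    embeddings: dict mapping variant name -> feature vector
    labels:     dict mapping variant name -> measured value
    """
    train = {name: vec for name, vec in embeddings.items() if name in labels}
    test = {name: vec for name, vec in embeddings.items() if name not in labels}
    if not test:
        # An empty prediction set is what makes the regressor's predict() fail.
        raise ValueError(
            "No sequences left to predict: every embedded variant already has "
            "a label. Add embeddings for the sequences you want scored."
        )
    return train, test


# Hypothetical data: two measured variants and one variant to predict.
embeddings = {"A12G": [0.1, 0.2], "A12G_L45P": [0.3, 0.4], "K7R_T30S": [0.5, 0.6]}
labels = {"A12G": 1.2, "A12G_L45P": 0.8}

train, test = split_for_prediction(embeddings, labels)
print(sorted(train), sorted(test))
```

Running the same function with labels for every embedded variant raises the ValueError before any model is called.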