This repository has been archived by the owner on May 25, 2024. It is now read-only.

XGBoost Port Code requires access to Temp Files #13

Open
CiprianFlorin-Ifrim opened this issue Apr 17, 2022 · 0 comments

As the title says, the XGBoost port code writes a temporary JSON file to the user's temp folder (AppData\Local\Temp on Windows).
This is not documented anywhere. On the 3 systems I tested, the file was never generated: the Jupyter Notebook does not seem to have access to that folder, and neither running with admin rights nor trusting the notebook made any difference.

This is the type of error generated:
XGBoostError: [14:36:23] C:\Users\Administrator\workspace\xgboost-win64_release_1.0.0\dmlc-core\src\io\local_filesys.cc:209: Check failed: allow_null: LocalFileSystem::Open "C:\Users\ZW\AppData\Local\Temp\tmp_mu9qwkg": Permission denied
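
To check whether the notebook kernel really cannot write to the temp folder, a quick test like the one below can be run in a cell. This is only a minimal sketch using the standard library; can_write_temp_dir is a hypothetical helper name and is not part of XGBoost or of the port library.

```python
import os
import tempfile


def can_write_temp_dir():
    """Return (temp_dir, ok); ok is True if a .json file can be created there."""
    tmp_dir = tempfile.gettempdir()
    try:
        # mkstemp creates the file and returns an open OS-level handle plus its path.
        fd, path = tempfile.mkstemp(suffix='.json', dir=tmp_dir)
    except OSError:
        return tmp_dir, False
    os.close(fd)      # close the handle first so the file can be removed on Windows
    os.remove(path)
    return tmp_dir, True


print(can_write_temp_dir())
```

If this reports that the folder is writable but the port still fails, folder permissions are probably not the whole story, and the way the temporary file is opened may be worth a look.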

I have checked the xgboost.py file. The original code is:

def port_xgboost(clf, tmp_file=None, **kwargs):
    if tmp_file is None:
        with NamedTemporaryFile('w+', suffix='.json', encoding='utf-8') as tmp:
            clf.save_model(tmp.name)
            tmp.seek(0)
            decoded = json.load(tmp)
    else:
        clf.save_model(tmp_file)

        with open(tmp_file, encoding='utf-8') as file:
            decoded = json.load(file)

    trees = [format_tree(tree) for tree in decoded['learner']['gradient_booster']['model']['trees']]

    return jinja('xgboost/xgboost.jinja', {
        'n_classes': int(decoded['learner']['learner_model_param']['num_class']),
        'trees': trees,
    }, {
        'classname': 'XGBClassifier'
    }, **kwargs)

SOLUTION:
Remove the =None default from:
def port_xgboost(clf, tmp_file=None, **kwargs):

The user can then pass tmp_file=None explicitly in their Python script if they prefer a temp file in AppData\Local\Temp (and if that works for them), or they can pass the full path of a file ending in .json:
print(port(xgb, tmp_file = "C:\\Users\\*username*\\Desktop\\test.json"))

They can then write the result to a .h file, using the same code shown for the DecisionTree/RandomForest examples:

with open('XGBoostClassifier.h', 'w') as file:
    file.write(port(xgb, tmp_file = "C:\\Users\\*username*\\Desktop\\test.json"))

Please update the library and document both the temp file and the option to specify its location.
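
One possible library-side alternative would be to avoid NamedTemporaryFile altogether. The Python tempfile documentation notes that on Windows a NamedTemporaryFile generally cannot be opened a second time by name while it is still open, which may be what save_model runs into here. The sketch below is only an illustration under that assumption, not the library's code: load_model_json is a hypothetical helper that takes any save callable (for example clf.save_model), writes into a fresh temporary directory, and reads the JSON back once the file has been closed.

```python
import json
import os
import tempfile


def load_model_json(save_model):
    """Call save_model(path) with a path in a fresh temp dir and return the parsed JSON."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'model.json')
        # No handle to the file is open at this point, so the saver can create it freely.
        save_model(path)
        with open(path, encoding='utf-8') as file:
            return json.load(file)


# Stand-in for clf.save_model, used here only so the example runs without XGBoost.
def fake_save_model(path):
    with open(path, 'w', encoding='utf-8') as file:
        json.dump({'learner': {'learner_model_param': {'num_class': '3'}}}, file)


decoded = load_model_json(fake_save_model)
print(decoded['learner']['learner_model_param']['num_class'])
```

With a real classifier the call would look like load_model_json(clf.save_model); the temporary directory and its contents are removed when the with block exits.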

Furthermore, please list all classes in the documentation, so that users know exactly how to use the namespace.
Example given: Eloquent::ML::Port::RandomForestRegressor regressor;

Correct namespace declarations for the other ML types:

Eloquent::ML::Port::SVM name_to_be_used_in_code;
Eloquent::ML::Port::OneClassSVM name_to_be_used_in_code;
Eloquent::ML::Port::SEFR name_to_be_used_in_code;
Eloquent::ML::Port::DecisionTreeClassifier name_to_be_used_in_code;
Eloquent::ML::Port::DecisionTreeRegressor name_to_be_used_in_code;
Eloquent::ML::Port::RandomForestClassifier name_to_be_used_in_code;
Eloquent::ML::Port::GaussianNB name_to_be_used_in_code;
Eloquent::ML::Port::LogisticRegression name_to_be_used_in_code;
Eloquent::ML::Port::PCA name_to_be_used_in_code;
Eloquent::ML::Port::PrincipalFFT name_to_be_used_in_code;
Eloquent::ML::Port::LinearRegression name_to_be_used_in_code;
Eloquent::ML::Port::XGBClassifier name_to_be_used_in_code;

Thank you and take care!
