Refactoring feature selection #19

Merged 3 commits on Nov 18, 2020


ml-evs (Collaborator) commented Nov 5, 2020

Refactors feature selection with multiple changes:

  • Added MODData.split() method for creating new MODData objects for test/train splits
  • Store target and cross NMI inside MODData by default
  • Loop over unique pairs of features rather than rows and columns (2x speedup)
  • Fix edge case when providing both df_featurized and structures
  • Use same random seed for all features when computing NMI
  • For constant features, explicitly set diagonal to NaN to avoid overflow elsewhere
  • Added a "slow" marker to enable running only fast tests with e.g. pytest -m "not slow".
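
A minimal sketch of the unique-pair loop and the NaN diagonal described in the list above, in plain Python. The names `pair_score` and `cross_matrix` are hypothetical, and absolute Pearson correlation is used as a stand-in for NMI; this is an illustration under those assumptions, not the MODNet implementation.

```python
import math
from itertools import combinations


def pair_score(x, y):
    """Hypothetical symmetric score (absolute Pearson correlation), standing in for NMI."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant feature has no variance, so the score is undefined.
        return math.nan
    return abs(cov) / math.sqrt(var_x * var_y)


def cross_matrix(features):
    """Fill a symmetric feature-feature matrix by visiting each unordered pair once."""
    names = list(features)
    matrix = {a: {b: math.nan for b in names} for a in names}
    calls = 0
    # combinations() yields each unordered pair exactly once, so the score
    # is computed n*(n-1)/2 times instead of n*n times.
    for a, b in combinations(names, 2):
        score = pair_score(features[a], features[b])
        calls += 1
        matrix[a][b] = score
        matrix[b][a] = score  # mirror the value instead of recomputing it
    for name in names:
        values = features[name]
        is_constant = all(v == values[0] for v in values)
        # Explicitly mark constant features with NaN on the diagonal.
        matrix[name][name] = math.nan if is_constant else 1.0
    return matrix, calls
```

Because the score is symmetric, mirroring each value roughly halves the number of evaluations compared with looping over every row and column, which is consistent with the 2x speedup quoted above.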

@ml-evs force-pushed the ml-evs/ft_selection branch 3 times, most recently from c817cb1 to 9f4c290, November 9, 2020 20:50
ml-evs and others added 3 commits November 13, 2020 17:06
- Store target and cross NMI inside MODData by default
- Loop over unique pairs rather than rows and columns
- Fix edge case when providing both df_featurized and structures
- Use same random seed when computing NMI for all features
- For constant features, explicitly set diagonal to NaN to avoid
  overflow elsewhere
@@ -7,17 +7,19 @@

"""

from __future__ import annotations
ml-evs (Collaborator, Author) commented:

Just found out about this: it lets you defer evaluation of type annotations, so that e.g. methods of MODData can use MODData as a type hint.
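
A small sketch of what the deferred evaluation allows. The `Dataset` class and its `split` signature are hypothetical stand-ins for illustration, not the actual MODData API.

```python
from __future__ import annotations


class Dataset:
    """Hypothetical stand-in for a class such as MODData."""

    def __init__(self, rows: list[int]):
        self.rows = rows

    # Without the __future__ import, the name Dataset would not yet be bound
    # when this annotation is evaluated, and defining the method would raise
    # NameError. With it, annotations are stored as strings and not evaluated.
    def split(self, train_idx: list[int], test_idx: list[int]) -> tuple[Dataset, Dataset]:
        train = Dataset([self.rows[i] for i in train_idx])
        test = Dataset([self.rows[i] for i in test_idx])
        return train, test
```

The import has to be the first statement in the module (after the docstring), which matches where it sits in the diff above.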

@ml-evs ml-evs marked this pull request as ready for review November 13, 2020 17:51
@ml-evs ml-evs merged commit be27d9b into ppdebreuck:master Nov 18, 2020
@ml-evs ml-evs deleted the ml-evs/ft_selection branch November 18, 2020 16:53