Fixed index mismatch issue when passing a dataframe to the sankey function which has been sorted in any way #39
Hi there 👋
Two small changes here.
Noticed an issue when using this module the other day: if you pass a dataframe that has been sorted by the weights, the output is incorrect (see the example attached below). I did a bit of digging and noticed that this is because you reindex `left` and `right` if they are passed as a Series, but not `leftWeight` and `rightWeight`, so when you then create the `dataFrame` variable there is an index mismatch and the values basically get jumbled up.

The second thing I noticed is that when you call `check_data_matches_labels` on the right-hand side, you were actually passing in `leftLabels` rather than `rightLabels`, which I think is incorrect.

Example
Sample dataset
Create Sankey's
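
For illustration, here is a minimal pandas sketch of the index mismatch described above. The data is made up (it is not the attached sample dataset), and although the column names echo those mentioned in the description, this is an assumption-laden reproduction of the behaviour, not the module's actual code.

```python
import pandas as pd

# Hypothetical data: three flows, sorted by weight, so the index becomes [1, 2, 0].
df = pd.DataFrame({
    "left": ["a", "b", "c"],
    "right": ["x", "y", "z"],
    "weight": [3.0, 1.0, 2.0],
}).sort_values("weight")

# Mimic the reported behaviour: labels are reindexed to 0..n-1, weights are not.
left = df["left"].reset_index(drop=True)
right = df["right"].reset_index(drop=True)
left_weight = df["weight"]  # still carries the sorted index [1, 2, 0]

# pandas aligns Series on index labels, so weights attach to the wrong rows.
jumbled = pd.DataFrame({"left": left, "right": right, "leftWeight": left_weight})

# Resetting the weight index as well keeps every column positionally consistent.
fixed = pd.DataFrame({
    "left": left,
    "right": right,
    "leftWeight": left_weight.reset_index(drop=True),
})

print(jumbled)
print(fixed)
```

In `jumbled`, label `b` ends up paired with the weight that belongs to `a`, whereas `fixed` preserves the original pairing; this is the kind of mix-up that resetting the weight indices alongside the labels is meant to avoid.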