-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #118 - [Performance] Excel-like filter for huge data sets #119
Conversation
Signed-off-by: Dirk Fauth <[email protected]>
...se.nebula.widgets.nattable.core/src/org/eclipse/nebula/widgets/nattable/widget/NatCombo.java
Show resolved
Hide resolved
Switched from IntList to IntSet in NatCombo#select(int[]) Signed-off-by: Dirk Fauth <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I found a new one, I'd like to add:
ComboBoxFilterUtils.isAllSelected
, l. 69. Temporarily putting the dataCollection
into a set should do the trick (containsAll
calls contains on this
, not the provided collection).
@@ -264,7 +264,7 @@ && getComboBoxDataProvider() instanceof FilterRowComboBoxDataProvider | |||
// available items | |||
List<?> allValues = ((FilterRowComboBoxDataProvider) getComboBoxDataProvider()).getAllValues(getColumnIndex()); | |||
List<?> visibleValues = getComboBoxDataProvider().getValues(getColumnIndex(), getRowIndex()); | |||
List<?> diffValues = new ArrayList<>(allValues); | |||
HashSet<?> diffValues = new HashSet<>(allValues); | |||
diffValues.removeAll(visibleValues); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized: This does not actually solve the issue:
The expensive contains
operation within the removeAll
is not executed on "this" (ie. diffValues
) but the method argument, an ArrayList
in this case. See HashSet
, l. 175 (Java 11).
Sorry, my bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I changed from ArrayList
to HashSet
I tested locally with this code:
@Test
void testListPerformance() {
ArrayList<String> all = new ArrayList<>();
ArrayList<String> flanders = new ArrayList<>();
for (int i = 0; i < 100_000; i++) {
all.add("Simpson_" + i);
all.add("Flanders_" + i);
flanders.add("Flanders_" + i);
}
long start = System.currentTimeMillis();
ArrayList<String> diff = new ArrayList<>(all);
diff.removeAll(flanders);
long end = System.currentTimeMillis();
System.out.println("ArrayList#removeAll(ArrayList) - " + (end - start));
start = System.currentTimeMillis();
HashSet<String> diffSet = new HashSet<>(all);
diffSet.removeAll(flanders);
end = System.currentTimeMillis();
System.out.println("HashSet#removeAll(ArrayList) - " + (end - start));
start = System.currentTimeMillis();
HashSet<String> diffSetSet = new HashSet<>(all);
diffSetSet.removeAll(new HashSet<>(flanders));
end = System.currentTimeMillis();
System.out.println("HashSet#removeAll(HashSet) - " + (end - start));
}
If diff
is a List, the operation takes about 60s. The diffSet
variant takes around 30ms. I would say that the test is similar to the code in discussion. Or what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I found out why my test was working. It strongly depends on the size of the collection passed as parameter. As long as the collection is smaller than the "this" collection, things work fast. If the collection is bigger, the performance is a mess.
I adjusted the collection creation to verify it:
for (int i = 0; i < 100_000; i++) {
all.add("Simpson_" + i);
all.add("Flanders_" + i);
flanders.add("Flanders_" + i);
flanders.add("FlandersO_" + i);
}
for (int i = 0; i < 10_000; i++) {
flanders.add("FlandersXxx_" + i);
}
Now the HashSet#removeAll(ArrayList) takes 123s while HashSet#removeAll(HashSet) is still at about 30ms.
And some more (I just searched "systematically" through the code looking for potentially expensive collection operations in the filter code instead of stopping a running program if it took too long ;-) ):
|
No description provided.