Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time to test equals() and hashCode() depends exponentially on the number of fields #203

Closed
36893488147419103231 opened this issue Aug 21, 2017 · 3 comments
Assignees

Comments

@36893488147419103231
Copy link
Contributor

36893488147419103231 commented Aug 21, 2017

In the project
https://github.com/36893488147419103231/pojo-tester-tester
in the package testertester.equalshashcode there are tests of 3 classes:

public class Small {
  int a,b,c
}
public class Medium {
  int a,b,c,d, e,f,g,h, i,j,k,l, m,n,o,p ,q,r,s,t, u;
}
public class Big {
  int a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z;
}

with equals() and hashCode() generated by IDE.

The first class is tested in no time, the second one takes about a minute, and the test for the third one did not terminate for me (but probably it would in about half an hour, unless it would take too much memory). UPD: bigTest failed after 8m 49s with java.lang.OutOfMemoryError: GC overhead limit exceeded.

It is important to test equals() and hashCode() in linear time.

I propose the following algorithm:

Start with 3 POJO objects, one filled with nulls (zeroes, etc.), and two others filled with values that do not repeat (well, it is difficult to find 20 different boolean values).
Initially for the class MyPojo {int a,b,c,d;}we have:

null, null, null, null
P,Q,R,S
W,X,Y,Z

On the first iteration, we will compare P, null, null, null with:

null, null, null, null
P, null, null, null
W, null, null, null

On the second iteration, we will compare P, Q, null, null with:

P, null, null, null
P, Q, null, null
P, X, null, null

On the 3rd iteration, we will compare P, Q, R, null with:

P, Q, null, null
P, Q, R, null
P, Q, Y, null

and so on.
Of course, you must check that a.equals(b) == b.equals(a), and that objects that are equal have equal hash codes.
You may check that the number of different returned hash codes is at least a half of the the number of objects that participated in comparisons.
I'd recommend performing additional comparisons on each iteration, e.g. that a.equals(null) == false, that a.equals(a) is true, and that a.equals(1) is false.

Note. This issue may be related on unrelated to #201

(UPDATE) Note 2. I believe, it is FieldUtils.permutations() called from ObjectGenerator.generateDifferentObjects() that is responsible for exponential growth. It for N fields, it returns 2^N permutations. If it returned a list of size N, consumption of both time and memory would be linear.

@36893488147419103231
Copy link
Contributor Author

36893488147419103231 commented Aug 22, 2017

I have added a patch to the same https://github.com/36893488147419103231/pojo-tester-tester project.
Basically, after

  @BeforeClass
  public static void init() {
    LinearPatch.patch();
  }

Method.EQUALS and Method.HASH_CODE start to use N sequences of N fields instead of 2^N permutations of N fields.
As a separate project, it is a rather dirty hack; you are welcome to integrate it as, say, Method.EQUALS_FAST and Method.HASH_CODE_FAST (I could not extend a enum; by the way, this likely means that Method should not be a enum).

Strictly speaking, it is quadratic rather than linear, but it is not that much important.

36893488147419103231 pushed a commit to 36893488147419103231/pojo-tester that referenced this issue Aug 26, 2017
Now it is possible to test equals() and hashCode() in quadratic rather
than exponential time (and 20^2 is much better than 2^20 for a POJO
with 20 fields corresponding to 20 columns in a DB table).
The old thorough testers are preserved.
@36893488147419103231
Copy link
Contributor Author

see pull requests

@sta-szek
Copy link
Owner

I will try to make review on Friday, thanks for contributions!

@ghost ghost assigned sta-szek Sep 25, 2017
sta-szek pushed a commit to 36893488147419103231/pojo-tester that referenced this issue Nov 4, 2017
Now it is possible to test equals() and hashCode() in quadratic rather
than exponential time (and 20^2 is much better than 2^20 for a POJO
with 20 fields corresponding to 20 columns in a DB table).
The old thorough testers are preserved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants