Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

define the structure's field order with metadata #954

Closed
wants to merge 1 commit into from

Conversation

idosu
Copy link
Contributor

@idosu idosu commented Apr 28, 2018

define a Structure using an annotation instead of implementing a method.
instead of this:

class Parent extends Structure {
    public int n;
    public String s;
    protected List<String> getFieldOrder() {
        return Arrays.asList("n", "s");
    }
}
class Son extends Parent {
    public double d;
    public char c;
    protected List<String> getFieldOrder() {
        List<String> fields = new LinkedList<String>(super.getFieldOrder());
        fields.addAll(Arrays.asList("d", "c"));
        return fields;
    }
}

use this:

@FieldOrder({ "n", "s" })
class Parent extends Structure {
    public int n;
    public String s;
}
@FieldOrder({ "d", "c" })
class Son extends Parent {
    public double d;
    public char c;
}

In both examples calling new Son().getFieldOrder() will resolve in this list [ "a", "b", "c", "d" ]

It improves readability and makes that the user could not forget to add super's fields because it is added automatically by the library

// TODO(idosu 28 Apr 2018): Maybe deprecate this method to let users know they should use @FieldOrder
protected List<String> getFieldOrder() {
List<String> fields = new LinkedList<String>();
for (Class<?> clazz = getClass(); clazz != Structure.class; clazz = clazz.getSuperclass()) {
Copy link
Member

@matthiasblaesing matthiasblaesing Apr 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you walk the inheritance structure here, you'll miss classes, that extend Structure, but don't use the annotation. So what I would do at this point:

  • fetch the field order for the super class via a super.getFieldOrder() call
  • extract the local order from the annotation and append it

This way you can mix the definitions. As the order is caches, this should not have a big impact on performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not fixed

I thought about it but I don't agree, because if a class overrides this method no sub class can call the default implementation i.e. getting with annotation

class Parent extends Structure {
    @Override
    protected List<String> getFieldOrder() {
        return Arrays.asList("a");
    }
}

class Son extends C {
    @Override
    protected List<String> getFieldOrder() {
        // super.getFieldOrder() returns [ "a" ]
        // Structure.super.getFieldOrder() - this is not valid
    }
}

plus calling getClass() in super.getFieldOrder() will result in the same class as it would if you write it here and therefore will result in a StackOverflowException.

Could you show me an example perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I had another idea to check in this.fieldOrder if we already got fields for the super class and use them instead of iterating the inheritance structure. what do you think, does it worth the effort?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right - subclasses can still call super#getFieldOrder and so build a complete list if a super class uses the @FieldOrder annotation. It will fail if a child class is using the annotation and the super class is not. In this case the member check at runtime will blow. For the 5.0.0 release it would be ok to update all structures in the platform.

Copy link
Contributor Author

@idosu idosu May 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that if the the parent class's getFieldOrder method was overridden there is no way to call the original code of Structure#getFieldOrder therefor if the user wants to extend the parent class he must override getFieldOrder and add the new fields manually.

I can create an helper method to use when using an old library(without the usage of @FieldOrder) so the user could just override getFieldOrder and return getNonAnnotatedAndAnnotatedFieldOrder(), but it seems to be error prone(in this configuration: classWithAnnotation extends classWithout extends classWith extends classWithout)

protected List<String> getFieldOrder() {
    return getAnnotatedFieldOrder();
}

protected final List<String> getAnnotatedFieldOrder() {
    // the code of the current implementation of getFieldOrder
}

protected final List<String> getNonAnnotatedAndAnnotatedFieldOrder() {
    List<String> parent = getFieldOrder();
    List<String> annotated = getAnnotatedFieldOrder();

    List<String> fieldOrder = new ArrayList<String>(parent.size() + annotated.size());
    fieldOrder.addAll(parent);
    fieldOrder.addAll(annotated);

    return fieldOrder;
}

Should I add it anyway?

Copy link
Member

@matthiasblaesing matthiasblaesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments inline. Please have a look at these. In general I like the idea.

*
* When defining a new {@link Structure} you shouldn't override this
* method, but use {@link FieldOrder} annotation to define your field
* order(this also works with inheritance), For example,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not repeat the example here - if some one overlooks it, so be it. But please have another look at the java of the class - there is also a hint for the requirement of implementing getFieldOrder, that text should also be adjusted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

(remove the example) - I left the comment not to override, is that ok?

(fix other javadoc) - fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this sounds good.

@@ -865,20 +871,79 @@ protected void writeField(StructField structField) {
}
}

/** Return this Structure's field names in their proper order. For
* example,
/** Used to decllare fields order as metadata instead of method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: decllare -> declare

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

* protected List getFieldOrder() {
* return Arrays.asList(new String[] { ... });
* // New
* <!-- -->{@literal @}FieldOrder({ "n", "s" })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for these empty html comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Yes... in vscode if a javadoc line starts with a '@' it implodes all the code lines into one line, You are right I should remove that and hope that vscode fix this problem.
microsoft/vscode#48898

* }
* }
* </code></pre>
* @author 28 Apr 2018 idosu
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove - it will be recorded in the git history

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not fixed

Could you please elaborate on that? because if this comment would not be written no one would know(from the code) how to use it.

Or did you want me to remove line 906(@author 28 Apr 2018 idosu)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant only the @author part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -227,6 +228,11 @@ protected void ensureAllocated() {
return FIELDS;
}
}
@FieldOrder({ "field0", "field1" })
public static class EasyTestStructure0 extends FilledStructure {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is not optimal - "Easy" is a personal evaluation - I think "Annotated" might fit better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@matthiasblaesing
Copy link
Member

Thank you so far - I think the change can be merged as is. If a mixed @FieldOrder/getFieldOrder implementation is observed in the wild, we'll deal with it then. One final request, please:

  • add an entry to CHANGES.md regarding this change (see the other entries as a template)
  • commit that change and squash the changes together into one commit

You don't need a new PR, just rewrite the history and force push into the branch you created for this changeset. Please also make sure that you add valid author information (is idosu your full name?).

@idosu
Copy link
Contributor Author

idosu commented May 10, 2018

I've fixed it, and now its one commit

@idosu idosu closed this May 10, 2018
@idosu idosu reopened this May 10, 2018
@matthiasblaesing
Copy link
Member

Merged via:
a59da1e

Thank you for your contribution.

@ncruces
Copy link
Contributor

ncruces commented Jun 13, 2018

I know this has been merged, and I think it is a fine change.

I'm a little concerned though about the usage of LinkedList in the implementation.

I imagine the rational to have been that since you're adding stuff at the head this is better? However, for small lists (dozens of fields), created in a few batches (a couple of inheritance levels), this is likely much worse than an ArrayList.

LinkedList is going to consume more memory (for the wrapper nodes), make iteration slower (non-contiguous memory), indexation slower still. All these effects will persist, whereas even if construction is slightly slower for ArrayList (doubtful) those would be amortized through usage.

Would a PR to replace LinkedList with ArrayList be accepted?

@idosu
Copy link
Contributor Author

idosu commented Jun 13, 2018

I think that we could profit from both worlds if we return Collections.unmodifiableList(new ArrayList<String>(fileds))

If you decide to implement it, please change the type declaration of fields from List to LinkedList

@matthiasblaesing
Copy link
Member

From my POV a reasoned PR would be very nice. @idosu suggestion looks good from my POV. getFieldOrder is not very performance critical, as the hot path goes through fieldOrder, which caches the result of getFieldOrder.

@ncruces
Copy link
Contributor

ncruces commented Jun 14, 2018

I'm of the opinion that LinkedList should simply be replaced by ArrayList.

Are you guys familiar with JMH? If so, check this benchmark.

Small represents a struct with 27 fields and 3 levels of inheritance. Large represents a struct with 110 fields, 7 levels deep. Most structs will have no inheritance, which (regardless of how many fields) should favor ArrayList further. But even for these kinds of structs, this is what I get:

Benchmark                        Mode  Cnt     Score     Error  Units
MyBenchmark.testLinkedListLarge  avgt    6  2536.281 ± 492.503  ns/op
MyBenchmark.testArrayListLarge   avgt    6  1543.860 ± 155.325  ns/op
MyBenchmark.testLinkedListSmall  avgt    6   633.974 ±  92.739  ns/op
MyBenchmark.testArrayListSmall   avgt    6   341.823 ±  49.485  ns/op

ArrayList is probably going to be faster here than LinkedList. I struggled to come up with a test where this wasn't the case.

@matthiasblaesing
Copy link
Member

You are benchmarking the list creation, but that is not the bottle-neck (it happens once, as the result from getFieldOrder is cached (see Structure#fieldOrder). Running with a profiler should show, that Structure#sortFields is the hot-path, that is used for each structure instantiation. So a benchmark should check a code structure as shown there.

@ncruces
Copy link
Contributor

ncruces commented Jun 14, 2018

Sure, but the only thing Structure.sortFields does to the names list is to iterate through it (with List.get). Everything else happens to the fields list.

protected void sortFields(List<Field> fields, List<String> names) {
for (int i=0;i < names.size();i++) {
String name = names.get(i);
for (int f=0;f < fields.size();f++) {
Field field = fields.get(f);
if (name.equals(field.getName())) {
Collections.swap(fields, i, f);
break;
}
}
}
}

Is there any doubt that ArrayList is better than LinkedList at List.get?
I can benchmark list iteration, if you insist.

@matthiasblaesing
Copy link
Member

I would convert to ArrayList at return of getFieldOrder - ArrayList is then backed by an array, that has the exact right size, if you dynamically add elements to the ArrayList, it will grow and you will end up with empty places in the backing array. @idosu suggestion looks sane.

@ncruces
Copy link
Contributor

ncruces commented Jun 15, 2018

This can also be accomplished by ArrayList.trimToSize.

Updated the benchmark:

Benchmark                                            Mode  Cnt     Score     Error Units

MyBenchmark.testArrayListLarge                       avgt    6  1803.994 ± 135.637 ns/op
MyBenchmark.testLinkedListLarge                      avgt    6  2817.097 ± 367.448 ns/op

MyBenchmark.testArrayListSmall                       avgt    6   368.166 ±  44.859 ns/op
MyBenchmark.testLinkedListSmall                      avgt    6   697.853 ±  22.738 ns/op

MyBenchmark.testArrayListTiny                        avgt    6    92.215 ±  11.539 ns/op
MyBenchmark.testLinkedListTiny                       avgt    6   271.574 ±  34.519 ns/op

MyBenchmark.testArrayListLarge:·gc.alloc.rate.norm   avgt    6  2592.000 ±   0.001  B/op
MyBenchmark.testLinkedListLarge:·gc.alloc.rate.norm  avgt    6  3712.000 ±   0.001  B/op

MyBenchmark.testArrayListSmall:·gc.alloc.rate.norm   avgt    6   712.000 ±   0.001  B/op
MyBenchmark.testLinkedListSmall:·gc.alloc.rate.norm  avgt    6   984.000 ±   0.001  B/op

MyBenchmark.testArrayListTiny:·gc.alloc.rate.norm    avgt    6   160.000 ±   0.001  B/op
MyBenchmark.testLinkedListTiny:·gc.alloc.rate.norm   avgt    6   400.000 ±   0.001  B/op

@matthiasblaesing
Copy link
Member

You are still benchmarking creation of the list (that is invoked once per Structure, see fieldOrder for the caching), but not the iteration (which is called for each write+read in getFields/sortFields).

@ncruces
Copy link
Contributor

ncruces commented Jun 16, 2018

I honestly though we agreed that benchmarking iteration was pointless, but here it is:

MyBenchmark2.testArrayListLarge                       avgt    6   139.389 ±    3.757   ns/op
MyBenchmark2.testLinkedListLarge                      avgt    6  5048.858 ± 1362.281   ns/op

MyBenchmark2.testArrayListSmall                       avgt    6    73.229 ±    3.353   ns/op
MyBenchmark2.testLinkedListSmall                      avgt    6   289.084 ±   81.066   ns/op

MyBenchmark2.testArrayListTiny                        avgt    6    36.231 ±    3.698   ns/op
MyBenchmark2.testLinkedListTiny                       avgt    6    74.107 ±    4.701   ns/op

LinkedList is just terrible for smallish lists. Even when used properly (always use iterators), and for its supposed advantages (insert/remove at/near head, through iterators), its disadvantages are just too big to overcome (memory, cache locality, traversal performance are all bad).

Personally, I'd just use ArrayList instead. With trimToSize if you feel like it might waste a lot of memory (hint: LinkedList wastes a lot more).

But if you guys feel otherwise, I'm glad to drop this, and leave it as it is.

@ncruces
Copy link
Contributor

ncruces commented Jun 17, 2018

Adding some context.

I just had to drop JNA (for JNI) in a project because of how slow JNA was at marshaling an array of a few hundred fat structs. Something like 90% of the time was spent marshaling and GCing garbage created for marshaling. This isn't a chatty API (it's a single call, which does some actual work), but there's a lot of marshaling involved (both in and out).

Just creating a "contiguous" array of structs (with Struct.toArray) to pass as an argument was painfully slow. I came up with a workaround, but had to make my structs Cloneable because most of the overhead is from the constructor.

The change in this PR, as it is, makes things measurably worse, because iterating a LinkedList with List.get is obviously slow (I've profiled it, but can't really make an MCVE out of it).

I thought it was a simple, non contentious, improvement. You use an ArrayList here, after all.

I also had other ideas, but I'd need time to mature them, and I've moved on for now.


But, I see you guys tend to use LinkedList in a lot of places.

Another example:

Collection<CallbackReference> refs = new LinkedList<CallbackReference>(allocatedMemory.keySet());

The person who wrote this might be surprised to know that, the first thing the LinkedList constructor in OpenJDK is going to do is call Collection.toArray (see here), so you might as well do that instead.

If my benchmarks are not enough to convince you that, for smallish lists (N<100), ArrayList consumes less memory, creates less garbage, iterates faster, even handles the worse case List.add(0, e) scenario faster, then I'm afraid nothing will.

@matthiasblaesing
Copy link
Member

You are misunderstanding me. I'm fine with a change from LinkedList to ArrayList. You began measuring and I pointed out, that you are measuring the wrong part. So if you argue with numbers, I'll check if they are meaningful. And I also pointed out why they are not meaningful.

Oh and please "I see you guys tend to use LinkedList in a lot of places.": There is no "you" here. Most probably the same guy wrote the code and it ended up in "a lot of places".

I appreciate performance enhancing PR - so you can and should work on it, but maybe we should do it on the requested code change, that would make it easier.

@twall
Copy link
Contributor

twall commented Jun 17, 2018 via email

@idosu idosu deleted the structure-fieldorder branch June 18, 2018 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants