Bump JNA to 5.7.0 #165

emign · 2021-02-11T20:22:15Z

Bump JNA Version to 5.7.0 for Apple Silicon Support
Fixes: korlibs/korge#335

emign · 2021-02-11T20:23:00Z

Only tested on my machine with Azul aarch64 1.8 JDK

soywiz · 2021-02-12T04:10:49Z

Thanks!

And with this, everything works on Apple Silicon? If that's the case, that's amazing! Though it could make sense, since there is only JNA and no JNI at all.

Although I expect they have tested it and have CI, let me check in the coming days that everything still works on the usual targets.

emign · 2021-02-12T05:56:17Z

Well, it compiles a working JVM target for the aarch64 JDK/JVM from Azul.
That fixes the mentioned issue at least :)

What else there is to discover, I don't know :)

I wanted to introduce a GitHub Action for Apple Silicon for korlibs, but none is available at the moment.
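Since the results in this thread depend on whether the JVM is a native aarch64 build or an x64 build running under Rosetta, it can help to confirm which one is actually in use. The following is a minimal sketch based on the standard `os.arch` system property; `ArchCheck` and `isArm64` are hypothetical names, not part of korlibs or JNA. An arm64 JDK reports `aarch64`, while an x64 JDK on macOS typically reports `x86_64`.

```java
// Sketch: report whether the running JVM is a native arm64 build.
// ArchCheck and isArm64 are hypothetical helper names for illustration.
final class ArchCheck {
    // arm64 JDK builds report "aarch64"; "arm64" is accepted defensively.
    static boolean isArm64(String osArch) {
        String arch = osArch.toLowerCase(java.util.Locale.ROOT);
        return arch.equals("aarch64") || arch.equals("arm64");
    }

    public static void main(String[] args) {
        String osArch = System.getProperty("os.arch");
        System.out.println("os.arch=" + osArch + " nativeArm64=" + isArm64(osArch));
    }
}
```

Running this with each JDK before benchmarking avoids accidentally comparing the same runtime against itself.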

soywiz · 2021-02-12T06:45:33Z

For the record, this is the merged PR: java-native-access/jna#1297

BTW Nico, I'm just curious: have you tried the sprites10k sample or the bunnymark one from this repo, and do you have any CPU usage/FPS numbers comparing an emulated x64 Java runtime with the native arm64 version? As far as I know the M1 is lightning fast, and with the GPU and embedded memory on the SoC plus 5nm lithography, this should be a beast.

emign · 2021-02-12T10:27:55Z

Sorry, I don't have anything objective or scientifically accurate. The debug window (F7) cannot be opened on macOS 11.2 because it freezes the JVM app.

JVM x64 (emulated) with 100,000 bunnies: around 42-44% CPU
JVM aarch64 with 100,000 bunnies: around 40-42% CPU

The compile times are SIGNIFICANTLY shorter on a native aarch64 JDK than on the x86 one. I'm talking about a factor of 10 for the bunnymark.

soywiz · 2021-02-12T10:36:12Z

Awesome. And I assume you are talking about a clean build, right? Not one using prebuilt classes and such?

Bunnymark should be both GPU- and CPU-bound, I guess. For JNA I cannot use the native approach, since some of the OpenGL symbols are loaded dynamically via another OpenGL call and I don't know a way to rebind them in code. I guess that overhead depends only on the number of batches, though, and in bunnymark that is pretty small.

But well, it is using alpha blending and other things that could be optimized. Have you tried a non-retina resolution to see if the FPS changes? I usually use this: https://github.com/avibrazil/RDM to switch between retina and non-retina resolutions, to check that everything works. Retina has 4 times the number of pixels, and in the latest versions it also does 4x MSAA, I believe (not sure if that happens on macOS too, but I guess we could drop it when using retina resolutions).

emign · 2021-02-12T15:54:31Z

I could do some more research when it is implemented in KorGE itself. That would make testing much easier for me. IntelliJ in the aarch64 version does not like the HUGE KorGE-next project. Maybe it's Gradle too; aarch64 support is coming in Gradle 7.0.

EDIT: The good news is that the KorGE next repo seems to work well with the 7.0 version of Gradle (gradle-7.0-20210211230048+0000 at least).

soywiz · 2021-02-12T20:50:26Z

Seems to work properly on my machines, thanks Nico! :)

emign · 2021-02-13T09:38:00Z

Azul Apple Silicon JDK: FPS drops below 50 at around 250k bunnies. Under 2 seconds until the window appears on the third build.
AdoptOpenJDK Rosetta/x64: FPS drops below 50 at around 90k bunnies. 8 seconds until the window appears on the third build.

BTW: I can work better now with the KorGE Next project if we want to test stuff out. I had forgotten to increase the IDEA memory limit, since it is a new installation.

soywiz · 2021-02-13T10:27:42Z

That's around 3-4x, nice. There are some improvements pending on both the GPU and the CPU side.
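As a quick check of that estimate, the ratios can be computed directly from the numbers reported in the previous comment (250k vs 90k bunnies, under 2 seconds vs 8 seconds). This is only a sketch; `Speedup` and `ratio` are hypothetical names, and the startup ratio is a lower bound because the native time was reported as "under 2 seconds".

```java
// Sketch: speedup of the native aarch64 JDK over the Rosetta/x64 JDK,
// using only the figures reported in this thread.
final class Speedup {
    static double ratio(double better, double worse) {
        return better / worse;
    }

    public static void main(String[] args) {
        double sprites = ratio(250_000, 90_000); // bunnies sustained before dropping under 50 FPS
        double startup = ratio(8.0, 2.0);        // seconds until window, emulated / native (lower bound)
        System.out.println(String.format(java.util.Locale.ROOT,
                "sprites: %.2fx, startup: at least %.1fx", sprites, startup));
    }
}
```

The sprite-count ratio comes out a little under 3x and the startup ratio at 4x or more, which matches the "around 3-4x" summary.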

soywiz · 2021-02-14T09:37:19Z

IDEA is suffering a lot with korge-next. I have simplified it a bit by not including the IntelliJ plugin here.

Regarding optimizations:

It would be possible to use geometry shaders, emitting just points and generating quads from them, effectively reducing the GPU bandwidth requirements to almost 1/4 and avoiding a lot of floating-point work on the CPU.
But KorGE is not going to support geometry shaders in the mid term because neither WebGL 1 nor WebGL 2 supports them, so until something like WebGPU (similar to Vulkan/Metal) is available and mainstream, I probably won't add support for anything more advanced.

In the meantime there is a trick to achieve a similar effect that theoretically requires two widely available extensions:

https://developer.mozilla.org/en-US/docs/Web/API/ANGLE_instanced_arrays

https://developer.mozilla.org/en-US/docs/Web/API/OES_texture_float

But it would require a value greater than zero for this property: MAX_VERTEX_TEXTURE_IMAGE_UNITS

We could create a quad and render lots of instances of it, then use a texture to read all the properties for each instance, but that requires texture sampling in the vertex shader, which is not always available.
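To make the CPU-side cost concrete: without geometry shaders or instancing, every sprite has to be expanded into four transformed corners on the CPU before upload, instead of a single point. The sketch below shows that expansion for one sprite. It is not KorGE's batching code; `QuadExpansion` and `corners` are hypothetical names, and the sprite is assumed to rotate around its top-left anchor.

```java
// Sketch: expand one sprite (a point plus size and rotation) into four quad corners.
// This is the per-sprite floating-point work that point/instance rendering would move to the GPU.
final class QuadExpansion {
    // Returns {x0,y0, x1,y1, x2,y2, x3,y3} for a w-by-h sprite anchored at (x, y)
    // and rotated by `angle` radians around that anchor.
    static float[] corners(float x, float y, float w, float h, float angle) {
        float cos = (float) Math.cos(angle);
        float sin = (float) Math.sin(angle);
        float[] local = {0f, 0f, w, 0f, w, h, 0f, h}; // unrotated corners relative to the anchor
        float[] out = new float[8];
        for (int i = 0; i < 4; i++) {
            float lx = local[i * 2];
            float ly = local[i * 2 + 1];
            out[i * 2] = x + lx * cos - ly * sin;     // rotated x
            out[i * 2 + 1] = y + lx * sin + ly * cos; // rotated y
        }
        return out;
    }

    public static void main(String[] args) {
        // Unrotated 4x2 sprite at (10, 20)
        System.out.println(java.util.Arrays.toString(corners(10f, 20f, 4f, 2f, 0f)));
    }
}
```

Each sprite uploads eight position floats here versus two for a single point, which is where the "almost 1/4" bandwidth figure comes from; the per-instance properties still have to reach the GPU some other way, for example through the float texture described above.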

We can also try to reorder how the objects' memory is allocated and stored so that it is as contiguous as possible.
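One common way to get contiguous storage on the JVM is a "structure of arrays" layout, where each property lives in its own primitive array instead of in one heap object per sprite. The following is a minimal sketch of that idea under the assumption that each bunny only needs a position and a velocity; `BunnyStore` is a hypothetical class, and nothing here reflects how KorGE actually stores its views.

```java
// Sketch: structure-of-arrays storage so per-frame updates walk contiguous memory.
final class BunnyStore {
    final int count;
    final float[] x, y;   // positions
    final float[] vx, vy; // velocities

    BunnyStore(int count) {
        this.count = count;
        this.x = new float[count];
        this.y = new float[count];
        this.vx = new float[count];
        this.vy = new float[count];
    }

    // Advances every bunny by dt seconds; each array is read sequentially.
    void step(float dt) {
        for (int i = 0; i < count; i++) {
            x[i] += vx[i] * dt;
            y[i] += vy[i] * dt;
        }
    }

    public static void main(String[] args) {
        BunnyStore store = new BunnyStore(100_000);
        java.util.Arrays.fill(store.vx, 60f); // all bunnies move right at 60 units/second
        store.step(1f / 60f);                 // simulate one frame
        System.out.println("x[0] after one frame: " + store.x[0]);
    }
}
```

Compared with an array of sprite objects, this avoids chasing one pointer per bunny, at the cost of a less object-oriented API.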

soywiz · 2021-02-15T18:05:42Z

@emign Can you try master again on your Apple M1? You should be able to place a few more bunnies :)

emign · 2021-02-15T19:31:21Z

This brings some funny problems with recurring frame drops. See these videos:

https://youtu.be/5zZLRCo6vu4

https://youtu.be/TOC3Ez4Wm3k

soywiz · 2021-02-15T19:39:46Z

Can't watch the videos, they are marked as private

emign · 2021-02-15T19:46:10Z

fixed

soywiz · 2021-02-15T19:48:03Z

Is your master up to date?

emign · 2021-02-15T19:48:55Z

It's a completely new git clone.

soywiz · 2021-02-15T19:51:53Z

I had that issue before and fixed it with this commit:
087021e

That's why I asked. So if you perform a git pull it doesn't bring new commits?

emign · 2021-02-15T20:00:46Z

HEAD was on 087021e
I pulled 08314d a6ebf72 now. Retesting

emign · 2021-02-15T20:06:33Z

https://youtu.be/MQkFEppIBd0

soywiz · 2021-02-15T20:13:26Z

Just tried on an Intel Mac, and it seems to work properly. If you reduce it to batchMaxQuads = 4000, does the problem persist, and do sprites stop displaying at 4000?
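For context on what changing that value does to the workload, here is a rough sketch of how the batch count scales. It assumes that batchMaxQuads caps the number of quads submitted per batch and that every sprite is exactly one quad; both are assumptions read off the parameter name rather than confirmed details of KorGE's batcher, and `Batching` and `batchCount` are hypothetical names.

```java
// Sketch: number of batches needed for a given sprite count,
// assuming one quad per sprite and at most batchMaxQuads quads per batch.
final class Batching {
    static int batchCount(int sprites, int batchMaxQuads) {
        if (batchMaxQuads <= 0) throw new IllegalArgumentException("batchMaxQuads must be positive");
        return (sprites + batchMaxQuads - 1) / batchMaxQuads; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(batchCount(100_000, 4000)); // prints 25
    }
}
```

Under those assumptions, a bug that drops everything after the first batch would show up as sprites no longer appearing past the first 4000, which is the symptom the question is probing for.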

emign · 2021-02-15T20:29:55Z

Better:
https://youtu.be/z70qZovsAww

soywiz · 2021-02-15T20:56:07Z

That still shows artifacts, which is a bit strange. Can you try running it on JS? In this effort I tried to optimize all the targets, and each one has its own peculiarities. I would like to see whether the JS target displays those artifacts on your computer with different batchMaxQuads values.

~~Was not able to reproduce your issue myself with master on Windows, macOS intel, JS and Android.~~

This might fix the issue: ee1b313

emign · 2021-02-15T22:02:19Z

JS does not show artifacts. But I believe the z-index of the pouring bunnies does not look right. The stream of new bunnies plops behind the rest, only to reappear.

https://youtu.be/ygGLuXYWg40

(That's with BatchBuilder2D.MAX_BATCH_QUADS)

soywiz · 2021-02-15T22:04:06Z

Have you done a git pull? I think I have fixed the issue you had here: ee1b313

emign · 2021-02-15T22:05:09Z

just saw that after my answer. building as we speak

emign · 2021-02-15T22:20:42Z

Yes, it's fixed.
Woohoo!

aarch64 JDK: FPS drops under 50 at 120k bunnies. But that's on the MacBook Air.
I have to test it on the mini later/tomorrow (the results from above are from the mini).

soywiz · 2021-02-15T22:24:13Z

That's much worse then? Or is that rosetta?

> Azul Apple Silicon JDK: FPS drops below 50 at around 250k bunnies. Under 2 seconds until the window appears on the third build.
> AdoptOpenJDK Rosetta/x64: FPS drops below 50 at around 90k bunnies. 8 seconds until the window appears on the third build.

emign · 2021-02-15T22:28:06Z

Yes, it is worse for the aarch64 JDK.
The good news is that the x86/Rosetta version is at 120k too now.
But the aarch64 version regressed.

soywiz · 2021-02-15T22:39:28Z

If VisualVM works on aarch64, could you profile the old version and the new one when you have time, and send me the .nps snapshot files? I'd like 20 seconds of CPU sampling with 400K sprites on aarch64, for both the old, faster version and the new, slower version.

Press Sampler -> CPU, then Snapshot, then the diskette icon (Export Snapshot Data) to save it.

emign · 2021-02-15T22:41:16Z

Of course. But I cannot do it today.
Well, tomorrow is in 19 minutes, but I mean not tonight :)

emign · 2021-02-16T09:19:57Z

I tested with 7d92c54 on the mini again.

Drop under 50 FPS
Mac mini x86/rosetta JDK: around 200k Sprites
Mac mini aarch64 JDK: around 330k Sprites
MacBook Air x86/rosetta JDK: around 200k Sprites
MacBook Air aarch64 JDK: around 330k Sprites
(I tried it with power attached and on battery)

So the results are consistent again, which makes sense, because they have the exact same CPU/GPU. Did you do any optimizations in between?

soywiz · 2021-04-29T16:54:00Z

Hey @emign ! How are you doing?

Could you by chance try: ./gradlew :samples:bunnymark-fast:runJvm on the Apple M1? And try resizing the window to make it smaller. The -fast sample should now be GPU-bound, and I wonder if it can reach 60fps with 800K bunnies.

emign added 4 commits February 11, 2021 21:09

Bumpded JNA to 5.7.0

5278c2d

Bumpded JNA to 5.7.0

efa41ba

Merge remote-tracking branch 'origin/master'

ee57e10

Bum 5.7.0 fixes

5c4091c

emign requested a review from soywiz February 11, 2021 20:22

soywiz merged commit 846be7d into soywiz-archive:master Feb 12, 2021
