ensure macOS has enough mapped shared memory #881

Merged
merged 6 commits into from
Aug 8, 2019

gdams
Member

@gdams gdams commented Aug 6, 2019

as requested in #840

I've already run this on all the test Macs, but they will need to be disconnected and reconnected in order to pick up the new umask.

@sxa
Member

sxa commented Aug 6, 2019

Why the updated umask? I don't see that documented in #840, and unless there's a good reason to relax the permissions we shouldn't be doing that.

@gdams
Member Author

gdams commented Aug 6, 2019

@sxa555 I've changed the link; it's required in eclipse-openj9/openj9#4375

@sxa
Member

sxa commented Aug 6, 2019

Is mode: 0777 creating a world-writable file? I don't seem to have login access to the machines to check.

@sxa
Member

sxa commented Aug 6, 2019

I'll check with the OpenJ9 team whether that is a hard requirement on the machine setup, as my initial feeling is that anything testing that functionality could override it itself.

@jdekonin
Contributor

jdekonin commented Aug 6, 2019

Default permissions on the file for OpenJ9 systems appear to be: -rw-r--r-- 1 root wheel 72 Jan 24 2019 /etc/sysctl.conf. At least, I don't believe I modified the perms from the default.

There is also a create option for blockinfile so you could collapse the 2 sections into 1.

Not sure if having the test set the umask is an option or not. @hangshao0 or @Mesbah-Alam would be in a better position to answer that question. The commit comment for each change should describe the actual change. "Update main.yml" will not be helpful if you are scanning through commits looking for a problem in 6 months.

@hangshao0

A bit of background info: It is testing a command line option to create a shared cache file that is readable and writable by other users within the same group. The umask is preventing that permission from being set.
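The effect described here can be reproduced with a small sketch. It is not taken from the OpenJ9 test; show_mode is a hypothetical helper, and touch stands in for the JVM creating the cache file. The process umask removes bits from whatever mode the creating program requests, so under 0022 a request for group write is silently dropped, while under 0002 it survives.

```shell
#!/bin/sh
# Show how the process umask strips permission bits from a newly created file.
# show_mode is a hypothetical helper, not part of OpenJ9 or the playbook.
show_mode() {
    mask="$1"
    dir=$(mktemp -d) || return 1
    # Use a subshell so the umask change does not leak into the caller.
    (
        umask "$mask"
        touch "$dir/cache_file"               # touch requests mode 0666
        ls -l "$dir/cache_file" | cut -c1-10  # keep only the permission string
    )
    rm -rf "$dir"
}

show_mode 0022   # -rw-r--r--  (group write removed)
show_mode 0002   # -rw-rw-r--  (group write kept)
```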

Not sure if the test setting umask is an option or not.

@Mesbah-Alam, @llxia

@sxa
Member

sxa commented Aug 6, 2019

@hangshao0 Makes sense (although presumably since the tests are running as a single user it's only verifying the permissions afterwards as opposed to utilising group write?) Do you know if there is scope for the test to set the umask itself before verifying that functionality since that would assist third parties trying to run our tests?

@hangshao0

The JVM that creates the shared cache file will verify if the required access is successfully set or not (right after the file is created). I guess we need the modified umask only when we are running the command that creates the shared cache file.

@llxia

llxia commented Aug 6, 2019

I think we can set umask in the test and set it back. Similar to https://github.com/eclipse/openj9/blob/master/test/functional/cmdLineTests/sigxfszHandlingTest/runSIGXFSZTest.sh

@Mesbah-Alam could you help to set umask in the systemtest? Thanks
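A minimal sketch of the set-and-restore idea suggested above, assuming a POSIX shell wrapper around the command that creates the file. run_with_umask is a hypothetical name, and touch is only a stand-in for the real test command; this is not the code that ended up in the test.

```shell
#!/bin/sh
# run_with_umask is a hypothetical helper: run a command under a temporary
# umask, then restore the previous one and pass the exit status through.
run_with_umask() {
    new_mask="$1"; shift
    old_mask=$(umask)        # remember the current mask
    umask "$new_mask"
    "$@"                     # run the command that creates the file
    status=$?
    umask "$old_mask"        # put the original mask back
    return "$status"
}

workdir=$(mktemp -d)
run_with_umask 0002 touch "$workdir/shared_cache"
ls -l "$workdir/shared_cache" | cut -c1-10   # -rw-rw-r--
```

Running the command in a subshell with its own umask would avoid the explicit restore, at the cost of not being able to set variables in the calling shell.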

@karianna karianna added this to the August 2019 milestone Aug 7, 2019
@karianna karianna requested a review from sxa August 7, 2019 12:24
Member

@sxa sxa left a comment

Rejecting on the basis that the test is to be modified to cover the umask part of it

@gdams
Member Author

gdams commented Aug 8, 2019

I'll back the umask bit out of it

@gdams
Member Author

gdams commented Aug 8, 2019

@sxa555 updated, PTAL

@jdekonin
Contributor

jdekonin commented Aug 8, 2019

@sxa555, is this another example where tests need a specific value? If so, perhaps the tests should set it themselves and fail if they can't increase it.
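One way to picture "fail if the value is too low" is a precondition check at the start of a test script. This is only a sketch: require_at_least is a hypothetical helper, and the 314572800 threshold is borrowed from the values in this thread for illustration rather than being a documented OpenJ9 requirement.

```shell
#!/bin/sh
# require_at_least is a hypothetical helper; the threshold used below is an
# illustrative assumption, not a documented requirement.
require_at_least() {
    name="$1"; actual="$2"; required="$3"
    if [ "$actual" -lt "$required" ]; then
        echo "FAIL: $name=$actual is below required $required" >&2
        return 1
    fi
    echo "OK: $name=$actual"
}

# On macOS the live value could be read with: sysctl -n kern.sysv.shmmax
require_at_least kern.sysv.shmmax 314572800 314572800
```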

@karianna karianna requested a review from sxa August 8, 2019 13:53
@Mesbah-Alam
Contributor

Mesbah-Alam commented Aug 8, 2019

Even with umask 0002 set in the playlist, the SharedClassesAPI test is still failing: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1873/console

from 8.WL4.stderr:

JVMSHRC659E An error has occurred while opening shared memory
JVMSHRC336E Port layer error code = -393970
JVMSHRC337E Platform error message: shmget : Cannot allocate memory
JVMSHRC029E Not enough memory left on the system
JVMSHRC663I Error recovery: destroyed semaphore set with id=65537 associated with shared class cache.
JVMJ9VM015W Initialization error for library j9shr29(11): JVMJ9VM009E J9VMDllMain failed
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

It seems the machine is running low on memory.

Internal OSX machines were updated to the following values
Ref: #840 (comment)

kern.sysv.shmmax=125839605760
kern.sysv.shmall=30722560

@sxa555, @gdams, what values are set for kern.sysv.shmmax and kern.sysv.shmall on the Adopt macOS machines (e.g., on the one where the test is failing: https://ci.adoptopenjdk.net/computer/test-macincloud-macos1013-x64-1/)? Can we update them to match what we have on the internal machines?

@gdams
Member Author

gdams commented Aug 8, 2019

@Mesbah-Alam It's already got these values set as I've run this change on all the test macs:

test-macincloud-macos1013-x64-1:~ admin$ cat /etc/sysctl.conf
# BEGIN ANSIBLE MANAGED BLOCK
kern.sysv.shmmax=314572800
kern.sysv.shmall=76800

Do the Macs need a reboot after making the change? I can also make the numbers the same as the ones you've got, if needed.
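Both pairs of values quoted in this thread are consistent with kern.sysv.shmall being counted in 4096-byte pages and kern.sysv.shmmax in bytes. The short sketch below checks that arithmetic; the 4096-byte page size is an assumption, and pages_to_bytes is a hypothetical helper.

```shell
#!/bin/sh
# Assumes a 4096-byte page size; pages_to_bytes is a hypothetical helper.
PAGE_SIZE=4096

pages_to_bytes() {
    echo $(( $1 * PAGE_SIZE ))
}

pages_to_bytes 76800      # 314572800    (Adopt machine shmmax)
pages_to_bytes 30722560   # 125839605760 (internal machine shmmax)
```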

@gdams
Member Author

gdams commented Aug 8, 2019

Based on https://serverfault.com/a/383990, they do need a reboot. I'll add that to the playbook.
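If the file is only read at boot, a mismatch between the configured value and the live kernel value is a hint that the reboot has not happened yet. A rough sketch of such a check follows, assuming the simple key=value layout shown in the cat output earlier in this thread. conf_value and needs_reboot are hypothetical helpers, not part of the playbook.

```shell
#!/bin/sh
# conf_value and needs_reboot are hypothetical helpers.

# Print the value assigned to a key in a sysctl.conf-style file
# (the last assignment wins; prints nothing if the key is absent).
conf_value() {
    awk -F= -v key="$2" '$1 == key { v = $2 } END { print v }' "$1"
}

# Succeed (exit 0) when the key is configured and the live value differs.
needs_reboot() {
    file="$1"; key="$2"; live="$3"
    configured=$(conf_value "$file" "$key")
    [ -n "$configured" ] && [ "$configured" != "$live" ]
}

# Possible usage on a Mac (assumes the key is present in /etc/sysctl.conf):
#   live=$(sysctl -n kern.sysv.shmmax)
#   needs_reboot /etc/sysctl.conf kern.sysv.shmmax "$live" && echo "reboot needed"
```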

@gdams
Member Author

gdams commented Aug 8, 2019

re-running grinder with change on that machine: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1875/console

@gdams
Member Author

gdams commented Aug 8, 2019

Grinder passed, so I assume this fixes the issue. @sxa555 can you please land this, and I'll deploy to the rest of the Macs.

@Mesbah-Alam
Contributor

Test updated to set umask 0002 when on osx: adoptium/aqa-tests#1262

@hangshao0
