error state of a VM not evaluated correctly #4344

Closed
7 tasks
philfry opened this issue Mar 11, 2020 · 6 comments

@philfry

philfry commented Mar 11, 2020

Description
The "crashed" state is not evaluated correctly by OpenNebula.
For example, I have this VM on which I provoked a kernel panic. Libvirt shows:

$ virsh list
 Id    Name                           State
----------------------------------------------------
 4     one-45                         crashed
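To pick out crashed domains in a script rather than by eye, the listing can be filtered. This is only a minimal sketch, assuming the three-column layout shown above; crashed_domains is a hypothetical helper name, not something libvirt or OpenNebula provides.

```shell
# crashed_domains is a hypothetical helper, not part of libvirt or OpenNebula.
# It reads `virsh list` output on stdin and prints the name of every domain
# whose last column is "crashed". The two header lines are skipped.
crashed_domains() {
    awk 'NR > 2 && $NF == "crashed" { print $2 }'
}

# Sample input copied from the listing above; on a real node you would
# pipe `virsh list` into the function instead.
crashed_domains <<'EOF'
 Id    Name                           State
----------------------------------------------------
 4     one-45                         crashed
EOF
```

The last field is compared instead of the third because some libvirt states consist of two words (for example "shut off").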

The polling scripts recognize that state and return an error state (STATE=e):

$ ssh -n mykvmnode /var/tmp/one/im/run_probes kvm \
  /var/lib/one/datastores 4124 60 0 mykvmnode | sed -n '/^VM=/,/]/p'
VM=[
  ID=45,
  DEPLOY_ID=one-45,
  POLL="STATE=e CPU=0.0 MEMORY=264576" ]
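The state reported by the probe can also be extracted from the POLL attribute programmatically. A rough sketch, assuming the probe output keeps the shape shown above (STATE=... inside a double-quoted POLL string); poll_state is a hypothetical helper name.

```shell
# poll_state is a hypothetical helper, not part of OpenNebula.
# It reads probe output on stdin and prints the STATE value found in
# each POLL="..." attribute, one per line.
poll_state() {
    sed -n 's/.*POLL="[^"]*STATE=\([^ "]*\).*/\1/p'
}

# Sample input copied from the probe output above.
poll_state <<'EOF'
VM=[
  ID=45,
  DEPLOY_ID=one-45,
  POLL="STATE=e CPU=0.0 MEMORY=264576" ]
EOF
```

For the sample above this prints e, the error state mentioned earlier.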

Unfortunately, OpenNebula seems to be ignoring this state and returning STATE=ACTIVE and LCM_STATE=UNKNOWN:

$ onevm show 45 | head
VIRTUAL MACHINE 45 INFORMATION                                                 
ID                  : 45                 
NAME                : example       
USER                : oneadmin           
GROUP               : oneadmin           
STATE               : ACTIVE             
LCM_STATE           : UNKNOWN            
LOCK                : None               
RESCHED             : No                 
HOST                : mykvmhost
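To compare this against the probe result in a script, the LCM_STATE line can be read from the onevm show output. Again only a sketch, assuming the "KEY : VALUE" layout shown above; lcm_state is a hypothetical helper name.

```shell
# lcm_state is a hypothetical helper, not part of the OpenNebula CLI.
# It reads `onevm show` output on stdin and prints the LCM_STATE value.
# With default whitespace splitting the fields are: key, ":", value.
lcm_state() {
    awk '$1 == "LCM_STATE" { print $3 }'
}

# Sample input copied from the output above; on a frontend you would
# pipe `onevm show 45` into the function instead.
lcm_state <<'EOF'
STATE               : ACTIVE
LCM_STATE           : UNKNOWN
LOCK                : None
EOF
```

For the sample above this prints UNKNOWN, which is the mismatch with STATE=e that this report is about.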

As far as I can see, the correct states STATE=FAILED and LCM_STATE=FAILURE are commented out in include/VirtualMachine.h, src/sunstone/public/app/opennebula/vm.js, and probably other files. The code for processing these states also appears to be missing.

Is there a reason for that? Or is it just not implemented yet?

To Reproduce

  1. Apply #4332 ("[feature] implement crash detection for qemu vms") to enable crash detection
  2. Set the crash action to preserve
  3. Fire up a VM
  4. Log in to the VM and run echo c > /proc/sysrq-trigger

Expected behavior
The VM to be shown with an error state instead of an unknown state.

Details

  • Affected Component: core, sunstone
  • Hypervisor: kvm
  • Version: 5.10.3

Additional context

Progress Status

  • Branch created
  • Code committed to development branch
  • Testing - QA
  • Documentation
  • Release notes - resolved issues, compatibility, known issues
  • Code committed to upstream release/hotfix branches
  • Documentation committed to upstream release/hotfix branches
@stale

stale bot commented Jun 11, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. The OpenNebula Dev Team

@philfry
Author

philfry commented Jun 11, 2021

well… "bump", I guess.

@christian7007
Contributor

christian7007 commented Sep 13, 2021

Hi @philfry,

Sorry for the late reply. Have you tried this in an environment with version >= 5.12.0? The monitoring subsystem was redesigned for that version, and it seems the crashed state is already mapped to the FAILURE state:

'crashed' => 'FAILURE',

(Updated link)

@philfry
Copy link
Author

philfry commented Sep 13, 2021

Hi @christian7007, thanks for your reply.
Unfortunately, I cannot see that repo (one-ee? enterprise edition?), but I'll check whether the bug is still valid for 5.12.0.1.

@christian7007
Contributor

Sorry @philfry, here is the link to the public repo:

'crashed' => 'FAILURE',

@rsmontero
Member

It seems this is already fixed. We'll reopen if needed.

THANKS for your feedback!
