
pillars not updated on minions until salt-minion is restarted #31907

Closed
aabognah opened this issue Mar 15, 2016 · 27 comments
Labels
Bug (broken, incorrect, or confusing behavior) · Core (relates to code central or existential to Salt) · Pillar · severity-medium (3rd level, incorrect or bad functionality, confusing and lacks a workaround) · won't-fix (legitimate issue, but won't fix)

Comments

@aabognah
Contributor

aabognah commented Mar 15, 2016

Description of Issue/Question

When using git_pillar, if I make a change to the pillar data in the repo and run:

salt '*' saltutil.refresh_pillar

the pillar data is not updated. Only after restarting the minion do the new pillar values show up when I run

salt '*' pillar.item pillar_name
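
For comparison, a quick diagnostic sketch (pillar_name is a placeholder; pillar.items with no arguments reportedly recompiles pillar data rather than reading the in-memory copy):

    salt '*' saltutil.refresh_pillar
    salt '*' pillar.item pillar_name    # reads the minion's in-memory pillar (stays stale here)
    salt '*' pillar.items               # compiles pillar data fresh from the master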

The git_pillar config in /etc/salt/master.d/pillar_config.conf:

   git_pillar_provider: pygit2
   ext_pillar:
     - git:
       - master gitlab@repository_url/repository_name.git:
         - root: pillar/base
         - env: base
         - privkey: /root/.ssh/id_rsa
         - pubkey: /root/.ssh/id_rsa.pub

Steps to reproduce:

I am not sure how to reproduce this elsewhere. I am using the same repo for gitfs and git_pillar, and all hosts are RHEL6/7 running in a virtual environment (VMware).

Versions Report

Salt Version:
           Salt: 2015.8.7

Dependency Versions:
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
           Mako: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.7.0
         Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
           RAET: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
           cffi: 0.8.6
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: 0.5.4
      gitpython: 0.3.2 RC1
          ioflo: Not Installed
        libgit2: 0.21.0
        libnacl: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
         pygit2: 0.21.4
   python-gnupg: Not Installed
          smmap: 0.8.1
        timelib: Not Installed

System Versions:
           dist: redhat 7.2 Maipo
        machine: x86_64
        release: 3.10.0-229.el7.x86_64
         system: Red Hat Enterprise Linux Server 7.2 Maipo
@jfindlay jfindlay added the info-needed waiting for more info label Mar 15, 2016
@jfindlay jfindlay added this to the Blocked milestone Mar 15, 2016
@jfindlay
Contributor

@aabognah, thanks for reporting. What happens when you do salt '*' pillar.get pillar_name?

@aabognah
Contributor Author

salt '*' pillar.get pillar_name returns the old pillar value.

@jfindlay jfindlay modified the milestones: Approved, Blocked Mar 16, 2016
@jfindlay jfindlay added the Bug (broken, incorrect, or confusing behavior), severity-medium, Core, P2 (Priority 2), and Pillar labels and removed the info-needed label Mar 16, 2016
@jfindlay
Contributor

@aabognah, thanks for confirming. This is possibly related to #23391 and #25160.

@aabognah
Contributor Author

I updated to the new 2015.8.8.2 version, but the problem still exists.

I have another setup with a different master (running on RHEL6) and I don't see the problem on that one.

Here is the versions report of the WORKING master (where minions DO NOT need to be restarted for pillar updates to show up):

salt --versions-report
Salt Version:
       Salt: 2015.8.8.2

Dependency Versions:
     Jinja2: unknown
   M2Crypto: 0.20.2
       Mako: Not Installed
     PyYAML: 3.11
      PyZMQ: 14.5.0
     Python: 2.6.6 (r266:84292, May 22 2015, 08:34:51)
       RAET: Not Installed
    Tornado: 4.2.1
        ZMQ: 4.0.5
       cffi: Not Installed
   cherrypy: 3.2.2
   dateutil: 1.4.1
      gitdb: 0.5.4
  gitpython: 0.3.2 RC1
      ioflo: Not Installed
    libgit2: 0.20.0
    libnacl: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.6
mysql-python: Not Installed
  pycparser: Not Installed
   pycrypto: 2.6.1
     pygit2: 0.20.3
python-gnupg: Not Installed
      smmap: 0.8.1
    timelib: Not Installed

System Versions:
       dist: redhat 6.7 Santiago
    machine: x86_64
    release: 2.6.32-573.12.1.el6.x86_64
     system: Red Hat Enterprise Linux Server 6.7 Santiago

And here is the versions report of the NON-WORKING master (where the minions NEED TO BE RESTARTED after the pillars are updated for changes to take effect):

Salt Version:
       Salt: 2015.8.8.2

Dependency Versions:
     Jinja2: 2.7.2
   M2Crypto: 0.21.1
       Mako: Not Installed
     PyYAML: 3.11
      PyZMQ: 14.7.0
     Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
       RAET: Not Installed
    Tornado: 4.2.1
        ZMQ: 4.0.5
       cffi: 0.8.6
   cherrypy: Not Installed
   dateutil: 1.5
      gitdb: 0.5.4
  gitpython: 0.3.2 RC1
      ioflo: Not Installed
    libgit2: 0.21.0
    libnacl: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.7
mysql-python: Not Installed
  pycparser: 2.14
   pycrypto: 2.6.1
     pygit2: 0.21.4
python-gnupg: Not Installed
      smmap: 0.8.1
    timelib: Not Installed

System Versions:
       dist: redhat 7.2 Maipo
    machine: x86_64
    release: 3.10.0-229.el7.x86_64
     system: Red Hat Enterprise Linux Server 7.2 Maipo

The minions for both masters look similar and are all RHEL6/7 or OEL. Here is a versions report of one minion:

Salt Version:
           Salt: 2015.8.8.2

Dependency Versions:
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
           Mako: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.7.0
         Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
           RAET: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
        libgit2: Not Installed
        libnacl: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
   python-gnupg: Not Installed
          smmap: Not Installed
        timelib: Not Installed

System Versions:
           dist: redhat 7.2 Maipo
        machine: x86_64
        release: 3.10.0-327.10.1.el7.x86_64
         system: Red Hat Enterprise Linux Server 7.2 Maipo

@aabognah
Contributor Author

aabognah commented Apr 1, 2016

@jfindlay is there a workaround that I can implement to fix this?

@jfindlay
Contributor

jfindlay commented Apr 1, 2016

@aabognah, not that I know of.

@cachedout
Contributor

Hmm, my first reaction here is that this might be related to the difference in git provider libs. If you take pygit2 down to the version on the working master, does this problem go away?

@aabognah
Contributor Author

Does the fact that I have two masters in a redundant multi-master setup have anything to do with this?

The setup was based on this walkthrough:
https://docs.saltstack.com/en/latest/topics/tutorials/multimaster.html

The two masters have the same key, the minions are configured to check in with both masters, and both masters look at the same repository for gitfs and git_pillar.

Do I need to keep the local cache files on each master in sync in order to solve this?
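
For reference, the multi-master part of that walkthrough boils down to listing both masters in each minion's config; a minimal sketch with placeholder hostnames:

    # /etc/salt/minion -- hostnames are placeholders
    master:
      - salt-master-1.example.com
      - salt-master-2.example.com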

@davidkarlsen

Same here - here is our minion:

salt-call --versions
Salt Version:
           Salt: 2016.3.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
   python-gnupg: Not Installed
         PyYAML: 3.10
          PyZMQ: 14.5.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5

System Versions:
           dist: redhat 7.2 Maipo
        machine: x86_64
        release: 3.10.0-327.13.1.el7.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 7.2 Maipo

@luxi2001

luxi2001 commented Feb 9, 2017

We're struggling with this issue in a multi-master topology with master_shuffle: True. The masters and minions are running 2016.3.4. We're using a custom pillar backend, with pillar_cache: False and minion_data_cache: False.

Pillar changes are not reflected in pillar.get and pillar.item calls when you target minions from one of the masters, even after executing saltutil.refresh_pillar. The same happens when the pillar.get/pillar.item call is made from within a custom module on the targeted minion. A pillar.items call without arguments does hand you fresh pillar data, and running salt-call pillar.get or pillar.item locally on the minion also works fine.

There seems to be little activity on this and the related issues linked here. Is it something that's being worked on or are there workarounds we can use, perhaps a different multimaster topology?
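
For reference, the settings described above map onto the config files roughly like this (a sketch of the reported setup, not a fix):

    # /etc/salt/master
    pillar_cache: False
    minion_data_cache: False

    # /etc/salt/minion
    master_shuffle: True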

@gravyboat
Contributor

I can confirm this issue is still occurring in 2016.3.8:

Salt Version:
           Salt: 2016.3.8

Dependency Versions:
           cffi: 1.1.2
       cherrypy: 11.0.0
       dateutil: 2.2
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.9.6
        libgit2: 0.22.2
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: 1.2.5
      pycparser: 2.14
       pycrypto: 2.6.1
         pygit2: 0.22.0
         Python: 2.7.10 (default, Oct 14 2015, 16:09:02)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: Not Installed
        timelib: 0.2.4
        Tornado: 4.5.2
            ZMQ: 4.1.6

System Versions:
           dist: Ubuntu 15.10 wily
        machine: x86_64
        release: 4.2.0-42-generic
         system: Linux
        version: Ubuntu 15.10 wily

The only workaround, even in a simple master/minion setup, is to restart the salt-minion. Neither of the associated issues has been addressed.

@angeloudy
Contributor

This issue is still occurring on 2017.7.2

Salt Version:
           Salt: 2017.7.2

Dependency Versions:
           cffi: 1.7.0
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: 2.10
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 3.6.4 (default, Jan  2 2018, 01:25:35)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 16.0.3
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.2
            ZMQ: 4.2.2

System Versions:
           dist:
         locale: US-ASCII
        machine: amd64
        release: 11.1-RELEASE
         system: FreeBSD
        version: Not Installed

@pauldamian

Still reproducible on 2018.3

Salt Version:
           Salt: 2018.3.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.7
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.15rc1 (default, Apr 15 2018, 21:51:34)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: UTF-8
        machine: x86_64
        release: 4.15.0-23-generic
         system: Linux
        version: Ubuntu 18.04 bionic

Is it going to be addressed soon?

@devopsprosiva

I'm hitting the same issue in 2018.3.2.

@cavepopo
Contributor

cavepopo commented Aug 28, 2018

Same here with 2018.3.2

@xbglowx
Contributor

xbglowx commented Sep 28, 2018

I ran into something similar, but a restart of the minion didn't help. My problem was with how we use packer and the salt-masterless provisioner.

The provisioner copies pillar and state files under /srv via remote_pillar_roots and remote_state_tree. Once a salt-master EC2 instance launched from the new AMI, it would merge data from both /srv (left over from the build process) and gitfs/ext_pillar. I could see this by running the salt-master in debug mode.

The fix was to clean up (rm -rf /srv) during the packer build, via a shell provisioner, before the AMI was actually created.

https://www.packer.io/docs/provisioners/salt-masterless.html
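
A sketch of that cleanup step, run as the last shell-provisioner command before the AMI is created (path mirrors the comment above):

    # packer shell provisioner step (assumed placement: after salt-masterless has run)
    sudo rm -rf /srv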

@dereckson
Contributor

One interesting thing: salt-call -l debug saltutil.refresh_pillar and salt-call -l debug pillar.items don't show which files were read for the pillar information.

@maznu

maznu commented Aug 14, 2019

Seeing similarly strange things in 2019.2.0 where, having updated a plain YAML/SLS-style pillar file on the master (and having tried restarting the master, restarting the minions, saltutil.refresh_pillar, etc.), the minions do not pick up the new values.

A bit more debugging, running commands on a minion:

root@elder# salt-call pillar.get key1:key2 pillarenv=base
    primary:
        elder
    secondary:
        gage

root@elder# salt-call pillar.get key1:key2
    primary:
        gage
    secondary:
        elder

The pillar you see when I pass pillarenv=base reflects the "source of truth" (the pillar file on the master). There is only one pillar environment defined on this deployment; it is a very small-scale single-master deployment, not really using many clever SaltStack tricks.

I was expecting to have stumbled onto a weird edge case of my own making, but I'm instead surprised by how long this has been a problem for other users. How can we help you get more information to fix this? It's fundamental to why people use Salt: repeatability. Right now, Salt can literally deploy the wrong things on the wrong hosts.

@maznu

maznu commented Sep 22, 2019

I found that I could work around my problem by setting this on the master:

pillar_cache: False

@DirectRoot
Contributor

DirectRoot commented Oct 18, 2019

I found that I could work around my problem by setting this on the master:

pillar_cache: False

I needed to do the above, but I also had to rename a cache directory under /var/cache/salt/master/git_pillar (and restart the service).
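
Spelled out, that workaround looks roughly like this (a sketch assuming a systemd-managed master and the default cache location; the backup name is arbitrary):

    systemctl stop salt-master
    mv /var/cache/salt/master/git_pillar /var/cache/salt/master/git_pillar.bak
    systemctl start salt-master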

@H20-17

H20-17 commented Apr 16, 2020

I'm having the same problem. I just noticed that one of my (Windows) minions is not refreshing its in-memory pillar data after a saltutil.refresh_pillar. I'm on Salt 3000 on my minions and Salt 3001 on the master. I don't see what setting pillar_cache: False on the master would do, since that is supposed to be the default, but I'm trying it anyway. I have done that, and I have deleted all of the directories in /var/cache/salt/master/minions just to see what happens.

I also notice that pillar-based scheduling stops doing anything on these minions once the refresh stops working.

In my particular case there could be some kind of timeout issue lurking in the background. I schedule a saltutil.refresh_pillar, but in the schedule specification I don't see how to include a timeout value. If the salt master is not available to the minion at the time the function is called, it's possible that the job never returns, which may be the cause of what I'm seeing (somehow).

Sorry for this stream-of-consciousness babble; I'm trying to understand what's going on. What I said about refresh_pillar makes no sense, since that just causes a signal to be sent. I am seeing this happen on machines that I believe are suspending (usually laptops) and then waking up. I also notice that, since the pillar is apparently frozen, schedule.next_fire_time for all of the events specified in the pillar also becomes frozen, and all the times end up in the past.
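
For context, a scheduled pillar refresh on the minion typically looks something like the sketch below (the interval is arbitrary; as noted above, the schedule spec has no obvious timeout option):

    # /etc/salt/minion (or delivered via pillar)
    schedule:
      refresh_pillar:
        function: saltutil.refresh_pillar
        minutes: 30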

@H20-17

H20-17 commented Apr 19, 2020

Alright, I apologise for the previous post; I wasn't really ready to say anything, but now I am (sort of).
AFAICS it only happens on minions that experience network disruptions, or at the very least it happens way more on those machines. In particular it happens a lot with laptops, and I assume that this is because people are closing the lids and putting them to sleep, or they're going to sleep on their own. I don't know if this bug happens on Linux minions because I haven't got Linux installed on any minions that aren't always connected. I don't think that whatever sleep mode a machine goes into would make any difference (assuming the sleep mode is properly implemented), so I think it has to be the network disruption.

Some things I have observed:

  • pillar.items always gives correct, up-to-date pillar data (as we would hope)
  • Pillar schedule events stop firing (or appear to). In some cases the fire times reported by schedule.show_next_fire_time are in the past and stop updating. In some cases the fire times are updating but the events stop firing anyway.

That's all I've got. I have no idea how I could possibly triangulate this. I hope this can be looked at, because I consider it quite a serious problem with core functionality. If it is due to network disruptions and cannot be fixed (for instance because of how ZeroMQ is implemented), then the FAQ should have workarounds for that situation. On Windows machines I believe I can have the scheduler restart minions after waking up (which I will try next, I think). This may be an adequate workaround, if not ideal (fingers crossed).

@H20-17

H20-17 commented Apr 29, 2020

This seems to be 90-100% resolved for Windows minions by having the salt-minion service restart after waking up from sleep. I don't know what the situation is for Linux minions. I now have much more reliability, with minions (specifically the laptops) reporting in regularly and actually carrying out their scheduled events.

@sagetherage sagetherage removed the P2 Priority 2 label Jun 3, 2020
@sriramperumalla

I have the same problem with my salt-minion, version 3000. I couldn't refresh the pillars I created, and I couldn't see my pillar data on my minion even after restarting the minion. Only the static /tmp/deletemeplease.txt state in my init.sls is applied, not the user-creation state that pulls multiple users' data from pillars. Attaching my sample init.sls for the user-creation state, which reads pillar data through a Jinja template, along with my sample QA environment for the users state and its pillar directory structure. I am learning Salt to deploy our infrastructure as code via Jenkins, so your help in expediting my learning process is much appreciated.
my-qa-env

My init.sls code to create multiple users with one state, via Jinja and pillar data:
    {% for user, data in pillar.get('admin_users', {}).items() %}
    user_{{ user }}:
      user.present:
        - name: {{ user }}
        - fullname: {{ data['fullname'] }}
        - shell: {{ data['shell'] }}
        - home: {{ data['home'] }}
        - uid: {{ data['uid'] }}
        - groups: {{ data['groups'] }}

    {{ user }}_key:
      ssh_auth.present:
        - user: {{ user }}
        - name: {{ data['ssh_key'] }}

    {% endfor %}

    /tmp/deletemeplease.txt:
      file.absent
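
For that loop to produce anything, the pillar side needs an admin_users structure along these lines (a hypothetical sketch; the user name, key, and default /srv/pillar root are assumptions):

    # /srv/pillar/top.sls
    base:
      '*':
        - users

    # /srv/pillar/users.sls
    admin_users:
      jdoe:
        fullname: John Doe
        shell: /bin/bash
        home: /home/jdoe
        uid: 1001
        groups:
          - wheel
        ssh_key: ssh-rsa AAAA... jdoe@example.com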

@sagetherage sagetherage modified the milestones: Approved, Aluminium Jul 29, 2020
@sagetherage sagetherage added the Aluminium Release Post Mg and Pre Si label Jul 29, 2020
@mpolinowski

mpolinowski commented Aug 4, 2020

I am new to Salt and was following an older tutorial when I ran into the same issue. It seems that the expected folder structure changed.

The tutorial said that I should store both my state and pillar data in the /srv/salt/ directory. According to the master config /etc/salt/master, the actual default pillar root is /srv/pillar/:

#####         Pillar settings        #####
##########################################
# Salt Pillars allow for the building of global data that can be made selectively
# available to different minions based on minion grain filtering. The Salt
# Pillar is laid out in the same fashion as the file server, with environments,
# a top file and sls files. However, pillar data does not need to be in the
# highstate format, and is generally just key/value pairs.
#pillar_roots:
#  base:
#    - /srv/pillar

Once I moved my files to the correct folders, everything started to work :)

EDIT: Another beginner problem I ran into: when creating sub-folders in /srv/salt, don't forget to set permissions.

@ari
Contributor

ari commented Mar 23, 2021

I can make this happen with 3002.5 master and minions (not multi-master). It is random and hard to reproduce, but when it happens the following things do not help:

  • restarting the master
  • restarting the minion
  • saltutil.refresh_pillar

Eventually, the problem resolves itself. Calling pillar.items repeatedly might help resolve it, but it's hard to say definitively. Certainly calling it once doesn't always fix the problem, but eventually it does.

@sagetherage
Contributor

I am closing this as won't-fix since it is difficult to reproduce on a supported version of Salt. If this comes up again, please cite the supported version of Salt and open a new issue.
