Improve cache and enable caching of TestbedInfo (#2856)
What is the motivation for this PR?
The original cache implementation uses JSON to save cached information to file. Objects that cannot
be serialized to JSON cannot be cached. Another limitation is that the cached facts must be stored
under a sub-folder named after the hostname. These constraints limited the usefulness of the cache.

How did you do it?
This change improves caching in the following ways:
* Replace JSON with pickle to dump and load cached information, so that all object types supported by pickle can be cached.
* Use a more generic 'zone' to specify the sub-folder for cached facts. The zone can be a hostname or any other string.
* When 'testbed-cli.sh deploy-mg' is executed, clean up all cached information, not just the cached information of the current DUT.
* Enable caching of TestbedInfo, which is also frequently used and changes less frequently.
* Enhance the cleanup method to support removing a single cached item, the cached information of a specified zone, or all cached information. Not only the cached file is removed; the cached information in memory is removed as well (see the sketch below).
* Update the related documents.
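
A minimal sketch of the enhanced cleanup interface (the zone and facts names below are only illustrative):
```
from tests.common.cache import FactsCache

cache = FactsCache()

# Remove a single cached item of one zone (both the pickle file and the in-memory copy).
cache.cleanup(zone='vlab-01', key='basic_facts')

# Remove all cached information of one zone.
cache.cleanup(zone='vlab-01')

# Remove all cached information.
cache.cleanup()
```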

How did you verify/test it?
Run test scripts that need cache-enabled tbinfo and minigraph_facts.
Run testbed-cli.sh deploy-mg.

Signed-off-by: Xin Wang <[email protected]>
wangxin authored Jan 27, 2021
1 parent ca14e1b commit 0dee240
Showing 4 changed files with 104 additions and 62 deletions.
6 changes: 3 additions & 3 deletions ansible/config_sonic_basedon_testbed.yml
@@ -90,7 +90,7 @@
when: "'dualtor' not in topo"

- name: gather dual ToR information
dual_tor_facts:
dual_tor_facts:
hostname: "{{ inventory_hostname }}"
testbed_facts: "{{ testbed_facts }}"
hostvars: "{{ hostvars }}"
@@ -341,8 +341,8 @@
shell: config save -y
when: save is defined and save|bool == true

- name: cleanup cached facts
shell: python ../tests/common/cache/facts_cache.py {{ inventory_hostname }}
- name: cleanup all cached facts
shell: python ../tests/common/cache/facts_cache.py
delegate_to: localhost
ignore_errors: true

44 changes: 29 additions & 15 deletions tests/common/cache/facts_cache.md
@@ -4,29 +4,31 @@ To run test scripts, we frequently need to gather facts from various devices aga

# Cache Design

To simplify the design, we use local (sonic-mgmt container) json files to cache information. Although reading from local file is slower than reading from memory, it is still much faster than running commands on remote host through SSH connection and parsing the output. A dedicated folder (by default `tests/_cache`) is used to store the cached json files. The json files are grouped into sub-folders by hostname. For example, file `tests/_cache/vlab-01/basic_facts.json` caches some basic facts of host `vlab-01`.
To simplify the design, we use local (sonic-mgmt container) pickle files to cache information. Although reading from a local file is slower than reading from memory, it is still much faster than running commands on a remote host through an SSH connection and parsing the output. Only the first read of a cached item needs to load from file; subsequent reads come from a runtime dictionary, so the performance is equivalent to reading from memory. A dedicated folder (by default `tests/_cache`) is used to store the cached pickle files. The pickle files are grouped into sub-folders by zone (usually a hostname, but the zone name can also be any other unique string, like a testbed name). For example, file `tests/_cache/vlab-01/basic_facts.pickle` caches some basic facts of host `vlab-01`.

The cache function is mainly implemented in the file below:
```
sonic-mgmt/tests/common/cache/facts_cache.py
```

A singleton class FactsCache is implemented. This class supports these interfaces:
* `read(self, hostname, key)`
* `write(self, hostname, key, value)`
* `cleanup(self, hostname=None)`
* `read(self, zone, key)`
* `write(self, zone, key, value)`
* `cleanup(self, zone=None)`

The FactsCache class has a dictionary for holding the cached facts in memory. When the `read` method is called, it firstly read `self._cache[hostname][key]` from memory. If not found, it will try to load the json file. If anything wrong with the json file, it will return an empty dictionary.
The FactsCache class has a dictionary for holding the cached facts in memory. When the `read` method is called, it first reads `self._cache[zone][key]` from memory. If not found, it tries to load the pickle file. If anything is wrong with the pickle file, it returns `None`.

When the `write` method is called, it will store facts in memory like `self._cache[hostname][key] = value`. Then it will also try to dump the facts to json file `tests/_cache/<hostname>/<key>.json`.
When the `write` method is called, it stores the facts in memory like `self._cache[zone][key] = value`. Then it also tries to dump the facts to the pickle file `tests/_cache/<zone>/<key>.pickle`.

Because the `pickle` library is used for caching, all object types supported by `pickle` can be cached.
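
As an illustration, the sketch below caches an object that `json` could not serialize; the `DeviceInfo` class is made up for this example, but the same property is what allows a `TestbedInfo` instance to be cached in `tests/conftest.py`:
```
from tests.common.cache import FactsCache

cache = FactsCache()

class DeviceInfo(object):
    """An arbitrary Python object; json.dump() would reject it, pickle handles it."""
    def __init__(self, hostname, asic_type):
        self.hostname = hostname
        self.asic_type = asic_type

info = cache.read('vlab-01', 'device_info')
if not info:
    info = DeviceInfo('vlab-01', 'vs')
    cache.write('vlab-01', 'device_info', info)
```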

# Clean up facts

The `cleanup` function is for cleaning the stored json files.
The `cleanup` function is for cleaning up cached facts, both the stored pickle files and the in-memory copies.

When the `facts_cache.py` script is directly executed with an argument, it will call the `cleanup` function to remove stored json files for host specified by the first argument. If it is executed without argument, then all the stored json files will be removed.
When the `facts_cache.py` script is executed directly with an argument, it calls the `cleanup` function to remove the stored pickle files of the zone specified by the first argument. If it is executed without an argument, all the stored pickle files are removed.

When `testbed-cli.sh deploy-mg` is executed for specified testbed, the ansible playbook will run `facts_cache.py` to remove stored json files for current testbed as well.
When `testbed-cli.sh deploy-mg` is executed for a specified testbed, the ansible playbook also runs `facts_cache.py` to remove all the stored pickle files.

# Use cache

@@ -46,7 +48,7 @@ class SonicHost(AnsibleHostBase):
...
```

The `cached` decorator supports name argument which correspond to the `key` argument of `read(self, hostname, key)` and `write(self, hostname, key, value)`.
The `cached` decorator supports a `name` argument which corresponds to the `key` argument of `read(self, zone, key)` and `write(self, zone, key, value)`.
The `cached` decorator can only be used on a bound method of a class that is a subclass of AnsibleHostBase.
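
The rough idea behind the decorator is sketched below; this is an illustration of the pattern, not the exact implementation in `facts_cache.py`:
```
from functools import wraps

from tests.common.cache import FactsCache

cache = FactsCache()

def cached(name):
    """Sketch only: check the cache first, fall back to the wrapped method."""
    def decorator(target):
        @wraps(target)
        def wrapper(*args, **kwargs):
            # args[0] is the bound instance ('self'); its 'hostname' attribute serves as the zone.
            zone = getattr(args[0], 'hostname')
            facts = cache.read(zone, name)
            if facts:
                return facts
            facts = target(*args, **kwargs)
            cache.write(zone, name, facts)
            return facts
        return wrapper
    return decorator
```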

## Explicitly use FactsCache
@@ -66,7 +68,7 @@ cache = FactsCache()
def get_some_facts(self, *args):
cached_facts = cache.read(self.hostname, 'some_facts')
if cached_facts:
return cached facts
return cached_facts
# Code to gather the facts from host.
facts = self._do_stuff_to_gather_facts()
@@ -75,23 +77,35 @@ def get_some_facts(self, *args):
```

* Another example
```
def get_something():
info = cache.read('common', 'some_info')
if info:
return info
# Code to get the info
info = _do_stuff_to_get_info()
cache.write('common', 'some_info', info)
return info
```

# Cached facts lifecycle in nightly test

* During `testbed-cli.sh deploy-mg` step of testbed deployment, all cached json files of current DUT are removed.
* During the `testbed-cli.sh deploy-mg` step of testbed deployment, all cached pickle files are removed.
* Use `pytest test_script1.py test_script2.py` to run one set of test scripts.
* First encounter of cache enabled facts:
* No cache in memory.
* No cache in json file.
* No cache in pickle file.
* Gather from remote host.
* Store in memory.
* Store in json file.
* Store in pickle file.
* Return the facts.
* Subsequent encounter of cache enabled facts.
* Cache in memory, read from memory. Return the facts.
* Use `pytest test_script3.py test_script4.py` to run another set of test scripts.
* First encounter of cache enabled facts:
* No cache in memory.
* Cache in json file. Load from json file.
* Cache in pickle file. Load from pickle file.
* Store in memory.
* Return the facts.
* Subsequent encounter of cache enabled facts.
107 changes: 65 additions & 42 deletions tests/common/cache/facts_cache.py
@@ -1,8 +1,8 @@
from __future__ import print_function, division, absolute_import

import logging
import json
import os
import pickle
import shutil
import sys

@@ -17,7 +17,7 @@
CACHE_LOCATION = os.path.join(CURRENT_PATH, '../../../_cache')

SIZE_LIMIT = 1000000000 # 1G bytes, max disk usage allowed by cache
ENTRY_LIMIT = 1000000 # Max number of json files allowed in cache.
ENTRY_LIMIT = 1000000 # Max number of pickle files allowed in cache.


class Singleton(type):
@@ -61,84 +61,107 @@ def _check_usage(self):
.format(total_size, SIZE_LIMIT, total_entries, ENTRY_LIMIT)
raise Exception(msg)

def read(self, hostname, key):
def read(self, zone, key):
"""Read cached facts.
Args:
hostname (str): Hostname.
zone (str): Cached facts are organized by zones. This argument is to specify the zone name.
The zone name could be hostname.
key (str): Name of cached facts.
Returns:
obj: Cached object, usually a dictionary.
"""
# Lazy load
if hostname in self._cache and key in self._cache[hostname]:
logger.info('Read cached facts "{}.{}"'.format(hostname, key))
return self._cache[hostname][key]
if zone in self._cache and key in self._cache[zone]:
logger.debug('Read cached facts "{}.{}"'.format(zone, key))
return self._cache[zone][key]
else:
facts_file = os.path.join(self._cache_location, '{}/{}.json'.format(hostname, key))
facts_file = os.path.join(self._cache_location, '{}/{}.pickle'.format(zone, key))
try:
with open(facts_file) as f:
self._cache[hostname][key] = json.load(f)
logger.info('Loaded cached facts "{}.{}" from {}'.format(hostname, key, facts_file))
return self._cache[hostname][key]
self._cache[zone][key] = pickle.load(f)
logger.debug('Loaded cached facts "{}.{}" from {}'.format(zone, key, facts_file))
return self._cache[zone][key]
except (IOError, ValueError) as e:
logger.error('Load json file "{}" failed with exception: {}'\
logger.info('Load cache file "{}" failed with exception: {}'\
.format(os.path.abspath(facts_file), repr(e)))
return {}
return None

def write(self, hostname, key, value):
def write(self, zone, key, value):
"""Store facts to cache.
Args:
hostname (str): Hostname.
zone (str): Cached facts are organized by zones. This argument is to specify the zone name.
The zone name could be hostname.
key (str): Name of cached facts.
value (obj): Value of cached facts. Usually a dictionary.
Returns:
boolean: Caching facts is successful or not.
"""
self._check_usage()
facts_file = os.path.join(self._cache_location, '{}/{}.json'.format(hostname, key))
facts_file = os.path.join(self._cache_location, '{}/{}.pickle'.format(zone, key))
try:
host_folder = os.path.join(self._cache_location, hostname)
if not os.path.exists(host_folder):
logger.info('Create cache dir {}'.format(host_folder))
os.makedirs(host_folder)
cache_subfolder = os.path.join(self._cache_location, zone)
if not os.path.exists(cache_subfolder):
logger.info('Create cache dir {}'.format(cache_subfolder))
os.makedirs(cache_subfolder)

with open(facts_file, 'w') as f:
json.dump(value, f, indent=2)
self._cache[hostname][key] = value
logger.info('Cached facts "{}.{}" under {}'.format(hostname, key, host_folder))
pickle.dump(value, f)
self._cache[zone][key] = value
logger.info('Cached facts "{}.{}" to {}'.format(zone, key, facts_file))
return True
except (IOError, ValueError) as e:
logger.error('Dump json file "{}" failed with exception: {}'.format(facts_file, repr(e)))
logger.error('Dump cache file "{}" failed with exception: {}'.format(facts_file, repr(e)))
return False

def cleanup(self, hostname=None):
"""Cleanup cached json files.
def cleanup(self, zone=None, key=None):
"""Cleanup cached files.
Args:
hostname (str, optional): Hostname. Defaults to None.
zone (str): Cached facts are organized by zones. This argument is to specify the zone name.
The zone name could be hostname. Default to None. When zone is not specified, all the cached facts
will be cleaned up.
key (str): Name of cached facts. Default is None.
"""
if hostname:
sub_items = os.listdir(self._cache_location)
if hostname in sub_items:
host_folder = os.path.join(self._cache_location, hostname)
logger.info('Clean up cached facts under "{}"'.format(host_folder))
shutil.rmtree(host_folder)
if zone:
if key:
if zone in self._cache and key in self._cache[zone]:
del self._cache[zone][key]
logger.debug('Removed "{}.{}" from cache.'.format(zone, key))
try:
cache_file = os.path.join(self._cache_location, zone, '{}.pickle'.format(key))
os.remove(cache_file)
logger.debug('Removed cache file "{}.pickle"'.format(cache_file))
except OSError as e:
logger.error('Cleanup cache {}.{}.pickle failed with exception: {}'.format(zone, key, repr(e)))
else:
logger.error('Sub-folder for host "{}" is not found'.format(hostname))
if zone in self._cache:
del self._cache[zone]
logger.debug('Removed zone "{}" from cache'.format(zone))
try:
cache_subfolder = os.path.join(self._cache_location, zone)
shutil.rmtree(cache_subfolder)
logger.debug('Removed cache subfolder "{}"'.format(cache_subfolder))
except OSError as e:
logger.error('Remove cache subfolder "{}" failed with exception: {}'.format(zone, repr(e)))
else:
logger.info('Clean up all cached facts under "{}"'.format(self._cache_location))
shutil.rmtree(self._cache_location)

self._cache = defaultdict(dict)
try:
shutil.rmtree(self._cache_location)
logger.debug('Removed all cache files under "{}"'.format(self._cache_location))
except OSError as e:
logger.error('Remove cache folder "{}" failed with exception: {}'\
.format(self._cache_location, repr(e)))

def cached(name):
"""Decorator for enabling cache for facts.
The cached facts are to be stored by <name>.json. Because the cached json files must be stored under subfolder for
each host, this decorator can only be used for bound method of class which is subclass of AnsibleHostBase.
The cached facts are to be stored by <name>.pickle. Because the cached pickle files must be stored under subfolder
specified by zone, this decorator can only be used for bound method of class which is subclass of AnsibleHostBase.
The classes have attribute 'hostname' that can be used as zone.
Args:
name ([str]): Name of the cached facts.
Expand Down Expand Up @@ -166,7 +189,7 @@ def wrapper(*args, **kwargs):
if __name__ == '__main__':
cache = FactsCache()
if len(sys.argv) == 2:
hostname = sys.argv[1]
zone = sys.argv[1]
else:
hostname = None
cache.cleanup(hostname)
zone = None
cache.cleanup(zone)
9 changes: 7 additions & 2 deletions tests/conftest.py
@@ -22,12 +22,13 @@
from tests.common.testbed import TestbedInfo
from tests.common.utilities import get_inventory_files, get_host_visible_vars
from tests.common.helpers.dut_utils import is_supervisor_node, is_frontend_node

from tests.common.cache import FactsCache

from tests.common.connections import ConsoleHost


logger = logging.getLogger(__name__)
cache = FactsCache()

pytest_plugins = ('tests.common.plugins.ptfadapter',
'tests.common.plugins.ansible_fixtures',
@@ -153,7 +154,11 @@ def get_tbinfo(request):
if tbname is None or tbfile is None:
raise ValueError("testbed and testbed_file are required!")

testbedinfo = TestbedInfo(tbfile)
testbedinfo = cache.read(tbname, 'tbinfo')
if not testbedinfo:
testbedinfo = TestbedInfo(tbfile)
cache.write(tbname, 'tbinfo', testbedinfo)

return tbname, testbedinfo.testbed_topo.get(tbname, {})

@pytest.fixture(scope="session")
