Pushing changes to GitHub Pages.

NVIDIA-Merlin · Jan 4, 2024 · 5edaf23
1 parent 79c03c9
commit 5edaf23

Showing 3 changed files with 7 additions and 2 deletions.
diff --git a/main/hierarchical_parameter_server/profiling_hps.html b/main/hierarchical_parameter_server/profiling_hps.html
@@ -235,7 +235,7 @@ <h2>Use the HPS Profiler to get the measurement results<a class="headerlink" hre
 -h --help                       shows help message and exits [default: false]
 -v --version                    prints version information and exits [default: false]
 --config                        The path of the HPS json configuration file [required]
---powerlaw                      Generate the queried key that  in each iteration based on the power distribution [default: false]
+--distribution                      The distribution of the generated query key in each iteration. Can be &#39;powerlaw&#39;, &#39;hotkey&#39;, or &#39;histogram&#39; [default: &quot;powerlaw&quot;]
 --table_size                    The number of keys in the embedded table [default: 100000]
 --alpha                         Alpha of power distribution [default: 1.2]
 --hot_key_percentage            Percentage of hot keys in embedding tables [default: 0.2]
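
Reviewer note: the new `--distribution` option selects how query keys are drawn in each iteration. The sketch below illustrates what the `powerlaw` and `hotkey` choices could mean for a table of `--table_size` keys. It is not the HPS Profiler's implementation: the function `sample_keys`, the `batch` and `seed` parameters, and the `hot_key_coverage` parameter (the share of queries that land on hot keys) are assumptions made for illustration. The `histogram` mode is left out because the diff does not describe it.

```python
import random
from itertools import accumulate


def sample_keys(distribution, table_size=100_000, batch=1024, alpha=1.2,
                hot_key_percentage=0.2, hot_key_coverage=0.8, seed=0):
    """Draw `batch` query keys in [0, table_size). Illustrative only."""
    rng = random.Random(seed)
    if distribution == "powerlaw":
        # Truncated Zipf-like law: P(key with rank r) is proportional to r^(-alpha).
        cum_weights = list(accumulate(r ** -alpha for r in range(1, table_size + 1)))
        return rng.choices(range(table_size), cum_weights=cum_weights, k=batch)
    if distribution == "hotkey":
        # Assumption: the first `hot_key_percentage` of keys are "hot" and
        # receive `hot_key_coverage` of all queries; the rest go to cold keys.
        n_hot = min(max(1, int(table_size * hot_key_percentage)), table_size - 1)
        return [
            rng.randrange(n_hot) if rng.random() < hot_key_coverage
            else rng.randrange(n_hot, table_size)
            for _ in range(batch)
        ]
    raise ValueError(f"unsupported distribution in this sketch: {distribution!r}")


keys = sample_keys("powerlaw", table_size=1000, batch=500)
print(min(keys), max(keys))
```

With `alpha` above 1, a small number of low-rank keys dominates the queries, which is the access pattern a GPU embedding cache is meant to exploit.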

diff --git a/main/release_notes.html b/main/release_notes.html
@@ -97,6 +97,11 @@ <h1>Release Notes<a class="headerlink" href="#release-notes" title="Permalink to
 <section id="what-s-new-in-version-23-12">
 <h2>What’s New in Version 23.12<a class="headerlink" href="#what-s-new-in-version-23-12" title="Permalink to this heading"></a></h2>
 <ul>
+<li><p><strong>Lock-free Inference Cache in HPS</strong></p>
+<ul class="simple">
+<li><p>We have added a new lock-free GPU embedding cache for the hierarchical parameter server, which can further improve the performance of embedding table lookup in inference. It also doesn’t lead to data inconsistency even if concurrent model updates or missing key insertions are in use. That is because we ensure the cache consistency through the asynchronous stream synchronization mechanism. To enable lock-free GPU embedding cache, a user only needs to set <a class="reference external" href="https://nvidia-merlin.github.io/HugeCTR/main/hierarchical_parameter_server/hps_database_backend.html#configuration">“embedding_cache_type”</a> to <code class="docutils literal notranslate"><span class="pre">dynamic</span></code> and <code class="docutils literal notranslate"><span class="pre">&quot;use_hctr_cache_implementation&quot;</span></code> to <code class="docutils literal notranslate"><span class="pre">false</span></code>.</p></li>
+</ul>
+</li>
 <li><p><strong>Official SOK Release</strong></p>
 <ul class="simple">
 <li><p>The SOK is not an <code class="docutils literal notranslate"><span class="pre">experiment</span></code> package anymore but is now officially supported by HugeCTR. Do <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">sparse_operation_kit</span> <span class="pre">as</span> <span class="pre">sok</span></code> instead of <code class="docutils literal notranslate"><span class="pre">from</span> <span class="pre">sparse_operation_kit</span> <span class="pre">import</span> <span class="pre">experiment</span> <span class="pre">as</span> <span class="pre">sok</span></code></p></li>
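
Reviewer note: the release note names two settings that enable the lock-free GPU embedding cache. A minimal sketch of applying them to one model entry of an HPS JSON configuration follows. The helper `enable_lock_free_cache` and the sample entry `{"model": "example_model"}` are hypothetical, and where exactly these keys sit in a full configuration file is an assumption; the linked configuration page is the authority on that.

```python
import json


def enable_lock_free_cache(model_cfg: dict) -> dict:
    """Return a copy of a model config entry with the two settings named
    in the release note. Hypothetical helper, not part of HugeCTR."""
    updated = dict(model_cfg)
    updated["embedding_cache_type"] = "dynamic"
    updated["use_hctr_cache_implementation"] = False
    return updated


# Placeholder entry; a real configuration carries many more fields.
model_entry = {"model": "example_model"}
patched = enable_lock_free_cache(model_entry)
print(json.dumps(patched, indent=2))
```

Returning a copy rather than mutating the input keeps the original entry available for comparison when profiling both cache implementations.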

diff --git a/main/searchindex.js b/main/searchindex.js