From 66bc2a90d5209b2a78b5ad13d6edcaf4d96dfbfe Mon Sep 17 00:00:00 2001
From: Stephanie Wang
Date: Mon, 28 Nov 2022 16:25:31 -0800
Subject: [PATCH] Revert "[doc] Add a session in ray core doc for tips to run
 large ray cluster. (#30599)"

This reverts commit ba4af8ef16265b2a13cec149d27265516d3884bb.
---
 doc/source/ray-core/miscellaneous.rst | 130 --------------------------
 src/ray/common/ray_config_def.h       |   7 --
 2 files changed, 137 deletions(-)

diff --git a/doc/source/ray-core/miscellaneous.rst b/doc/source/ray-core/miscellaneous.rst
index dcf9092dc9fb..9c20ce31282c 100644
--- a/doc/source/ray-core/miscellaneous.rst
+++ b/doc/source/ray-core/miscellaneous.rst
@@ -216,133 +216,3 @@ To get information about the current available resource capacity of your cluster
 
 .. autofunction:: ray.available_resources
     :noindex:
-
-Running Large Ray Clusters
---------------------------
-
-Here are some tips for running Ray with more than 1k nodes. When running Ray at
-such a scale, several system settings may need to be tuned to enable
-communication between that many machines.
-
-Tuning Operating System Settings
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Because all nodes and workers connect to the GCS, many network connections will
-be created, and the operating system has to support that number of connections.
-
-Maximum open files
-******************
-
-The OS has to be configured to support opening many TCP connections since every
-worker and raylet connects to the GCS. On POSIX systems, the current limit can
-be checked with ``ulimit -n``; if it is too small, increase it according to
-the OS manual.
-
-ARP cache
-*********
-
-Another thing that needs to be configured is the ARP cache. In a large cluster,
-all the worker nodes connect to the head node, which adds many entries to
-the ARP table. Ensure that the ARP cache size is large enough to handle this
-many nodes.
-Failure to do so will cause the head node to hang. When this happens,
-``dmesg`` will show ``neighbor table overflow`` errors.
-
-In Ubuntu, the ARP cache size can be tuned in ``/etc/sysctl.conf`` by increasing
-the values of ``net.ipv4.neigh.default.gc_thresh1`` through ``net.ipv4.neigh.default.gc_thresh3``.
-For more details, please refer to the OS manual.
-
-Tuning Ray Settings
-~~~~~~~~~~~~~~~~~~~
-
-.. note::
-  There is an ongoing `project `_ focusing on
-  improving Ray's scalability and stability. Feel free to share your thoughts and use cases.
-
-To run a large cluster, several parameters need to be tuned in Ray.
-
-Resource broadcasting
-*********************
-
-.. note::
-  There is an ongoing `project `_ changing the
-  algorithm to pull-based, which doesn't require tuning these parameters.
-
-
-Another function the GCS provides is giving each worker node a view of the
-available resources of every other node in the Ray cluster. Each raylet
-pushes its locally available resources to the GCS, and the GCS periodically
-broadcasts them to all raylets. The time complexity is O(N^2). In a large Ray
-cluster, this becomes an issue, since most of the time is spent broadcasting
-resources. There are several settings to tune this:
-
-- ``RAY_resource_broadcast_batch_size`` The maximum number of nodes in a single
-  request sent by the GCS, 512 by default.
-- ``RAY_raylet_report_resources_period_milliseconds`` The interval between two
-  resource reports from a raylet, 100ms by default.
-
-Be aware that this is a trade-off between scheduling performance and GCS load.
-Decreasing the resource broadcasting frequency might make scheduling slower.
-
-gRPC threads for GCS
-********************
-
-.. note::
-  There is an ongoing `PR `_
-  setting it to vCPUs/4 by default. Setting this manually is not necessary in Ray 2.3+.
-
-
-By default, only one gRPC thread is used for server and client polling from the
-completion queue. This might become a bottleneck if the QPS is too high.
-
-- ``RAY_gcs_server_rpc_server_thread_num`` Controls the number of GCS threads
-  polling the server completion queue, 1 by default.
-- ``RAY_gcs_server_rpc_client_thread_num`` Controls the number of GCS threads
-  polling the client completion queue, 1 by default.
-
-
-Benchmark
-~~~~~~~~~
-
-The machine setup:
-
-- 1 head node: m5.4xlarge (16 vCPUs/64GB mem)
-- 2000 worker nodes: m5.large (2 vCPUs/8GB mem)
-
-The OS setup:
-
-- Set the maximum number of open files to 1048576
-- Increase the ARP cache size:
-  - ``net.ipv4.neigh.default.gc_thresh1=2048``
-  - ``net.ipv4.neigh.default.gc_thresh2=4096``
-  - ``net.ipv4.neigh.default.gc_thresh3=8192``
-
-
-The Ray setup:
-
-- ``RAY_gcs_server_rpc_client_thread_num=3``
-- ``RAY_gcs_server_rpc_server_thread_num=3``
-- ``RAY_event_stats=false``
-- ``RAY_gcs_resource_report_poll_period_ms=1000``
-
-Test workload:
-
-- Test script: `code `_
-
-
-
-.. list-table:: Benchmark result
-   :header-rows: 1
-
-   * - Number of actors
-     - Actor launch time
-     - Actor ready time
-     - Total time
-   * - 20k (10 actors / node)
-     - 5.8s
-     - 146.1s
-     - 151.9s
-   * - 80k (40 actors / node)
-     - 21.1s
-     - 583.9s
-     - 605s
diff --git a/src/ray/common/ray_config_def.h b/src/ray/common/ray_config_def.h
index c32e733c39e4..c54512e7c62d 100644
--- a/src/ray/common/ray_config_def.h
+++ b/src/ray/common/ray_config_def.h
@@ -714,16 +714,9 @@ RAY_CONFIG(std::string, testing_asio_delay_us, "")
 
 /// A feature flag to enable pull based health check.
 RAY_CONFIG(bool, pull_based_healthcheck, true)
-/// The following are configs for the health check. They are borrowed
-/// from the k8s health probe (shorturl.at/jmTY3).
-
-/// The delay before sending the first health check.
 RAY_CONFIG(int64_t, health_check_initial_delay_ms, 5000)
-/// The interval between two health checks.
 RAY_CONFIG(int64_t, health_check_period_ms, 3000)
-/// The timeout for a health check.
 RAY_CONFIG(int64_t, health_check_timeout_ms, 10000)
-/// The threshold to consider a node dead.
 RAY_CONFIG(int64_t, health_check_failure_threshold, 5)
 
 /// The pool size for grpc server call.
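
The OS tuning advice in the reverted section (raise the open-file limit, enlarge the ARP cache) can be sanity-checked from Python before launching a large cluster. The sketch below is illustrative only; it is not part of the reverted docs or the Ray codebase. It assumes a Linux node where the ARP thresholds are exposed under /proc/sys, and the 65536 floor for RLIMIT_NOFILE is an arbitrary example value::

    import resource
    from pathlib import Path

    def check_open_file_limit(minimum: int = 65536) -> None:
        """Warn if the per-process open-file limit (ulimit -n) looks too low."""
        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        if soft < minimum:
            print(f"RLIMIT_NOFILE soft limit is {soft}; consider raising it to >= {minimum}")

    def check_arp_gc_thresholds() -> None:
        """Print the ARP cache thresholds; gc_thresh3 caps the neighbor table size."""
        for name in ("gc_thresh1", "gc_thresh2", "gc_thresh3"):
            path = Path("/proc/sys/net/ipv4/neigh/default") / name
            if path.exists():
                print(f"net.ipv4.neigh.default.{name} = {path.read_text().strip()}")

    if __name__ == "__main__":
        check_open_file_limit()
        check_arp_gc_thresholds()

Run this on the head node (and ideally on each worker node) before starting the cluster, and adjust the limits via ``ulimit``/``/etc/sysctl.conf`` as the reverted section describes.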
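Similarly, the ``RAY_*`` overrides listed under "The Ray setup" only take effect if they are present in the environment of the GCS and raylet processes. Below is a minimal sketch of one way to apply them when launching the head node; the values are copied from the benchmark section above, while invoking ``ray start --head`` through ``subprocess`` (and any extra flags your deployment needs) is an assumption, not something prescribed by the patch::

    import os
    import subprocess

    # Values from the benchmark's "The Ray setup" list; treat them as a
    # starting point to tune, not a recommendation for every cluster.
    TUNING = {
        "RAY_gcs_server_rpc_client_thread_num": "3",
        "RAY_gcs_server_rpc_server_thread_num": "3",
        "RAY_event_stats": "false",
        "RAY_gcs_resource_report_poll_period_ms": "1000",
    }

    def start_head_node() -> None:
        """Launch the Ray head node with the tuning variables in its environment."""
        # Merge with the current environment so PATH and other settings survive.
        env = {**os.environ, **TUNING}
        subprocess.run(["ray", "start", "--head"], env=env, check=True)

    if __name__ == "__main__":
        start_head_node()

The same environment variables would need to be set before ``ray start`` on each worker node for the raylet-side settings to apply.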