[Actor] [Code Quality] Add Unit Tests for Actors Sorting #34058

Merged 62 commits on Apr 10, 2023

Commits on Mar 20, 2023

  1. feat

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    e25340b
  2. code review

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    1c0c451
  3. refactor const variables

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    9341845
  4. add util test cases

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    ee504fb
  5. fix

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    9608cfb
  6. lint

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    7ab6687
  7. change to useMemo

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    7c1d2b8
  8. fix

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 20, 2023
    cd9fe35

Commits on Mar 21, 2023

  1. 9b86757
  2. add tests

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Mar 21, 2023
    917c9f5

Commits on Apr 4, 2023

  1. debug

    chaowanggg committed Apr 4, 2023
    9b7a381
  2. feat

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    94283fa
  3. code review

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    4fc9665
  4. refactor const variables

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    74aef46
  5. add util test cases

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    b2526f6
  6. fix

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    cdc99d7
  7. lint

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    eefc919
  8. change to useMemo

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    49e268b
  9. fix

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    6d33c08
  10. ab70669
  11. [Dashboard] Delete old IA (ray-project#33308)

    Deletes the old IA.
    
    Deleted all routes from old IA that were no longer used. (One remained: CmdResult)
    Moved all the new IA routes from /new/<route> to /<route>
    Deleted all components that were no longer used as a result of removing old IA routes
    Deleted all parameters for "newIA" and removed the code paths where newIA is false
    Deleted the dark theme from the app (theme was only adjustable in the old IA)
    I manually tested every single button to make sure links still worked. Our testing on the dashboard is not quite good enough yet (it's getting better) to trust the tests to catch all possible bugs here. The earlier we get this into nightly, the more manual testing from users we can get.
    
    Signed-off-by: chaowang <[email protected]>
    alanwguo authored and chaowanggg committed Apr 4, 2023
    1ffcd92
  12. [Docs] Final polish batch_tuning and batch_forecasting examples (ray-project#31578)
    
    Why are these changes needed?
    Changes made to batch_tuning.ipynb and batch_forecasting.ipynb:
    
    Update notebook texts, make steps clearer.
    Tune outputs only showing SMOKE_TEST outputs.
    
    ---------
    
    Signed-off-by: Christy Bergman <[email protected]>
    Signed-off-by: Antoni Baum <[email protected]>
    Co-authored-by: Antoni Baum <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    2 people authored and chaowanggg committed Apr 4, 2023
    b6c43be
  13. [RLlib] Remove all default config objects and rllib/agents (ray-project#33242)
    
    Signed-off-by: Artur Niederfahrenhorst <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    ArturNiederfahrenhorst authored and chaowanggg committed Apr 4, 2023
    e610960
  14. [core][ci] Fix test_task_events.py (ray-project#33343)

    Fix the test run.
    
    
    Signed-off-by: chaowang <[email protected]>
    rickyyx authored and chaowanggg committed Apr 4, 2023
    396fb3c
  15. [ci] Only pin typeguard for Python <=3.7 (ray-project#33402)

    70dbf41 pinned typeguard to make ax-platform work again on latest master, but a8a1ed0 updated ax-platform for python >= 3.8. The typeguard pin is incompatible with this pinned version. So we need to pin typeguard only for Python 3.7 to make the image builds for 3.8+ work again.
    
    Signed-off-by: Kai Fricke <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    krfricke authored and chaowanggg committed Apr 4, 2023
    466d55e
  16. [tune] Cleanup path-related properties in experiment classes (ray-project#33370)
    
    The naming of different path-related components in Ray Tune is currently messy. For instance, `Experiment.local_dir` refers to the main results directory, e.g. `~/ray_results`, while `Trial.local_dir` refers to the experiment directory, e.g. `~/ray_results/experiment`. The same is true for properties, where it's unclear whether a property refers to the object's subdirectory or to its parent directory.
    
    To disentangle this information, this PR introduces a new naming convention.
    
    - All entities receiving a "parent path" receive it in an unambiguous naming scheme.
    - For instance, `Experiment(storage_path)`, `TrialRunner(experiment_path)`, `Trial(experiment_path)`
    - Outputs are also normalized. E.g. `Trial.remote_experiment_path` and `Trial.local_experiment_path`
    - We keep existing arguments and properties for backwards compatibility
    
    Signed-off-by: Kai Fricke <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    krfricke authored and chaowanggg committed Apr 4, 2023
    403abb9
  17. 1256bbe
  18. [Release Test] Remove runtime env usage from release tests (ray-project#33288)
    
    Use SDK commands for all core tests.
    
    This is needed because there was a big regression after migrating to the V2 Anyscale job runner.
    
    Signed-off-by: chaowang <[email protected]>
    rkooo567 authored and chaowanggg committed Apr 4, 2023
    f5daa34
  19. [data] [streaming] Fix inability to pickle Dataset in the middle of streaming execution (ray-project#33406)
    
    Signed-off-by: chaowang <[email protected]>
    ericl authored and chaowanggg committed Apr 4, 2023
    f56a93b
  20. [ci] Display Core prerelease dependencies (ray-project#33408)

    Signed-off-by: Matthew Deng <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    matthewdeng authored and chaowanggg committed Apr 4, 2023
    48ce37f
  21. [RLlib] Delete duplicate test_vtrace_v2 file (ray-project#33315)

    Signed-off-by: Avnish <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    avnishn authored and chaowanggg committed Apr 4, 2023
    44a9f88
  22. [RLlib] Add option for running multiple sgd iters for impala learner api (ray-project#33316)
    
    Signed-off-by: Avnish <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    avnishn authored and chaowanggg committed Apr 4, 2023
    9516184
  23. [RLlib] Move Learner Hp assignment to validate (ray-project#33392)

    * Move adding params to learner hps to validate in order to be compatible with rllib yaml files
    * Move learner_hp assignment from builder functions to validate
    
    Signed-off-by: Avnish <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    avnishn authored and chaowanggg committed Apr 4, 2023
    db3beed
  24. [Core][deps] pin json-schema < 4.18 (ray-project#33412)

    See ray-project#33411: there is a risk that the Ray client_server will crash with jsonschema >= 4.18.
    
    
    Signed-off-by: chaowang <[email protected]>
    scv119 authored and chaowanggg committed Apr 4, 2023
    597b169
  25. [CI][runtime-env] fix runtime-env test (ray-project#33400)

    This PR adds 3.11 in the allowed runtime env conda version which was made available during the recent conda 2.23.0 release. conda/conda#11170 (comment)
    
    It also fixes a wheel rename for mac osx wheels from "*_intel.whl" to "*x86_64.whl" in recent builds.
    
    A previous intel wheel build for mac: https://buildkite.com/ray-project/oss-ci-build-branch/builds/1268#0184a81c-6dd7-4cd2-8534-b6f763b8ce24
    A more recent x86_64 name wheel build: https://buildkite.com/ray-project/oss-ci-build-branch/builds/2599#0186badf-4b95-44fd-b57d-6e42d5462c1d
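
A test that needs to recognize wheels from both old and new builds could match either suffix. The following is a minimal sketch, assuming only the two glob patterns quoted above; the helper name `matches_x86_wheel_suffix` and the wheel filenames are hypothetical and are not taken from the Ray test suite.

```python
from fnmatch import fnmatch

# The two suffix patterns quoted in the commit message.
OLD_PATTERN = "*_intel.whl"
NEW_PATTERN = "*x86_64.whl"


def matches_x86_wheel_suffix(filename: str) -> bool:
    """Accept a wheel named under either the old or the new scheme.

    Note that the new pattern only checks the suffix, so it is not
    specific to macOS on its own.
    """
    return fnmatch(filename, OLD_PATTERN) or fnmatch(filename, NEW_PATTERN)


# Hypothetical filenames, for illustration only.
old_name = "ray-3.0.0.dev0-cp39-cp39-macosx_10_15_intel.whl"
new_name = "ray-3.0.0.dev0-cp39-cp39-macosx_10_15_x86_64.whl"
other_name = "ray-3.0.0.dev0-cp39-cp39-manylinux2014_aarch64.whl"
```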
    
    Signed-off-by: rickyyx <[email protected]>
    Co-authored-by: scv119 <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    2 people authored and chaowanggg committed Apr 4, 2023
    8f9c4fd
  26. [Core] Python worker startup benchmark (ray-project#33281)

    $ ./benchmark_worker_startup.py --help
    usage: benchmark_worker_startup.py [-h] --num_cpus_in_cluster
                                       NUM_CPUS_IN_CLUSTER
                                       --num_tasks_or_actors_per_run
                                       NUM_TASKS_OR_ACTORS_PER_RUN
                                       --num_measurements_per_configuration
                                       NUM_MEASUREMENTS_PER_CONFIGURATION
    This release test measures Ray worker startup time. Specifically, it measures
    the time to start N different tasks or actors, where each task or actor imports
    a large library (currently PyTorch). N is configurable. The test runs under a
    few different configurations: {task, actor} x {runtime env, no runtime env} x
    {GPU, no GPU} x {cold start, warm start}.
    options:
      -h, --help            show this help message and exit
      --num_cpus_in_cluster NUM_CPUS_IN_CLUSTER
                            The number of CPUs in the cluster. This determines how
                            many CPU resources each actor/task requests.
      --num_tasks_or_actors_per_run NUM_TASKS_OR_ACTORS_PER_RUN
                            The number of tasks or actors per 'run'. A run starts
                            this many tasks/actors and consitutes a single
                            measurement. Several runs can be composed within a
                            single job for measure warm start, or spread across
                            different jobs to measure cold start.
      --num_measurements_per_configuration NUM_MEASUREMENTS_PER_CONFIGURATION
                            The number of measurements to record per configuration.
    This script uses test_single_configuration.py to run the actual measurements.
    
    Signed-off-by: chaowang <[email protected]>
    cadedaniel authored and chaowanggg committed Apr 4, 2023
    c5013f8
  27. [serve] Add replica info to metadata rest api (ray-project#33292)

    This adds details about the live replicas for each deployment to be fetched from the new GET endpoint.
    
    Sample replica detail from running a test application:
    ```
    replica_id: app2_BasicDriver#KhlXQe
    state: RUNNING
    pid: 25853
    actor_name: SERVE_REPLICA::app2_BasicDriver#KhlXQe
    actor_id: 2355af670b023966af79501501000000
    node_id: 3631e75fc5312752c54b567ee66491a1e58a0420f0abc5b1c44e70cf
    node_ip: 192.168.0.141
    start_time_s: 1678818083.039281
    ```
    
    Details:
    * `is_allocated` on each replica used to return just the node id for the controller to confirm the replica has been placed on a node and started. Now, it returns a tuple of runtime-context-related info:
      * `pid`
      * `actor_id`
      * `node_id`
      * `node_ip`
    * The four fields listed above that are retrieved from the replica actor may be `None` before the actor is actually scheduled, so they are marked optional in the schema. (The rest of the fields are filled in immediately when the replica is created to be tracked in the controller)
    ```
    class ReplicaDetails(BaseModel, extra=Extra.forbid):
        replica_id: str
        state: ReplicaState
        pid: Optional[int]
        actor_name: str
        actor_id: Optional[str]
        node_id: Optional[str]
        node_ip: Optional[str]
        start_time_s: float
    ```
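
The effect of the optional fields can be illustrated with a stdlib-only sketch. It uses a dataclass rather than the pydantic model shown above, represents `state` as a plain string instead of `ReplicaState`, and the class name `ReplicaDetailsSketch` and the helper `is_placed` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaDetailsSketch:
    """Stand-in for the schema above; not the actual Serve class."""

    replica_id: str
    state: str  # the real schema uses a ReplicaState enum
    actor_name: str
    start_time_s: float
    # Runtime-context fields stay None until the actor is scheduled.
    pid: Optional[int] = None
    actor_id: Optional[str] = None
    node_id: Optional[str] = None
    node_ip: Optional[str] = None

    def is_placed(self) -> bool:
        """True once all four runtime-context fields are filled in."""
        fields = (self.pid, self.actor_id, self.node_id, self.node_ip)
        return all(value is not None for value in fields)


# A replica that is tracked by the controller but not yet scheduled.
pending = ReplicaDetailsSketch(
    replica_id="app2_BasicDriver#KhlXQe",
    state="STARTING",
    actor_name="SERVE_REPLICA::app2_BasicDriver#KhlXQe",
    start_time_s=1678818083.039281,
)

# The same replica after scheduling, using the sample values above.
running = ReplicaDetailsSketch(
    replica_id="app2_BasicDriver#KhlXQe",
    state="RUNNING",
    actor_name="SERVE_REPLICA::app2_BasicDriver#KhlXQe",
    start_time_s=1678818083.039281,
    pid=25853,
    actor_id="2355af670b023966af79501501000000",
    node_id="3631e75fc5312752c54b567ee66491a1e58a0420f0abc5b1c44e70cf",
    node_ip="192.168.0.141",
)
```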
    
    Signed-off-by: chaowang <[email protected]>
    zcin authored and chaowanggg committed Apr 4, 2023
    c09fd76
  28. [serve] Prevent mixing single/multi-app config deployment (ray-project#33340)
    
    To simplify execution and not have to worry about covering all possible conflicts, we want to prevent users from deploying a single-app config (`ServeApplicationSchema`) first and then switching to a multi-app config (`ServeDeploySchema`), or vice versa.
    
    Eventually we want to deprecate deploying using `ServeApplicationSchema`, so this also encourages users to migrate.
    
    If users mix single-app and multi-app:
    - we will raise an error in `controller.deploy_apps`
    - the REST api will also return a `400 Response` with the error message
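
The behavior described above can be sketched as a small guard that remembers which config style was deployed first. This is an illustration under assumptions, not the Serve controller's code: `DeployModeGuard`, `MixedConfigError`, and `handle_deploy` are hypothetical names, and the mode is modeled as a plain string.

```python
from typing import Optional


class MixedConfigError(ValueError):
    """Hypothetical error type for this sketch."""


class DeployModeGuard:
    """Remembers the first config style deployed and rejects the other."""

    def __init__(self) -> None:
        self._mode: Optional[str] = None  # "single" or "multi" once set

    def check(self, mode: str) -> None:
        if mode not in ("single", "multi"):
            raise ValueError(f"unknown mode: {mode!r}")
        if self._mode is None:
            # First deployment fixes the mode.
            self._mode = mode
        elif self._mode != mode:
            raise MixedConfigError(
                f"cannot deploy a {mode}-app config after a {self._mode}-app config"
            )


def handle_deploy(guard: DeployModeGuard, mode: str) -> int:
    """Map the guard's outcome to an HTTP-style status code."""
    try:
        guard.check(mode)
    except MixedConfigError:
        return 400
    return 200
```

Redeploying in the same style keeps succeeding; only a switch between styles is rejected.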
    
    Signed-off-by: chaowang <[email protected]>
    zcin authored and chaowanggg committed Apr 4, 2023
    c19677d
  29. [Data] Ramp up max_tasks_in_flight (ray-project#33379)

    In this PR, we start the max_tasks_in_flight at 1 to make sure we fully utilize all the min_workers to start with. Then once min_workers have been created, we ramp up to the default max_tasks_in_flight of 4.
    
    This allows us to maximize parallelism for datasets with a small number of blocks.
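
The ramp-up rule can be sketched as follows. This is a simplified model, not the Ray Data implementation: the function names are hypothetical, and `workers_kept_busy` assumes that tasks are handed to one worker until it reaches its cap before the next worker receives any.

```python
import math


def max_tasks_in_flight(num_workers: int, min_workers: int, default_cap: int = 4) -> int:
    """Per-worker task cap: 1 until min_workers exist, then the default cap."""
    return 1 if num_workers < min_workers else default_cap


def workers_kept_busy(num_blocks: int, cap: int) -> int:
    """Workers that receive work if each worker is filled up to `cap` first."""
    return math.ceil(num_blocks / cap)
```

Under that assumption, a dataset with 4 blocks keeps a single worker busy at a cap of 4, but spreads across 4 workers at a cap of 1, which is why the lower starting cap helps small datasets.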
    
    ---------
    
    Signed-off-by: amogkam <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    amogkam authored and chaowanggg committed Apr 4, 2023
    3396292
  30. Add Sematic as a tool with a Ray integration (ray-project#33387)

    Signed-off-by: chaowang <[email protected]>
    augray authored and chaowanggg committed Apr 4, 2023
    ca5b00d
  31. [RLlib][Docs] Restructure Policy's API page (ray-project#33344)

    Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    kouroshHakha authored and chaowanggg committed Apr 4, 2023
    eb53ec6
  32. b19fdac
  33. Fix Ray on Spark node options verification (ray-project#33382)

    The current node option verification attempts to convert a string key to a dictionary when constructing an error message for blocked options, resulting in an unclear, unintended exception.
    
    Signed-off-by: dbczumar <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    dbczumar authored and chaowanggg committed Apr 4, 2023
    86c1a52
  34. [core] fifo worker killing policy (ray-project#33430)

    For long-living, memory-leaking actors, it is more desirable to kill the oldest task that is leaking the most. This avoids the situation where we constantly kill actors, which may lead to side effects such as generating a lot of log files or triggering increased memory consumption in the GCS / dashboard.
    
    Co-authored-by: Clarence Ng <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    2 people authored and chaowanggg committed Apr 4, 2023
    17a289e
  35. [nightly] Reduce EBS volume for many nodes tests (ray-project#33432)

    EBS is not necessary for the many nodes tests, and it's 150GB by default. This reduces it to 30GB.
    ---------
    
    Co-authored-by: Chen Shen <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    2 people authored and chaowanggg committed Apr 4, 2023
    12ac2d6
  36. 0ab0525
  37. 3528a0a
  38. Fix linker argument error in cpp example build file. (ray-project#33435)

    CPP generate-bazel-project-template is failing during build due to an extra space in the BUILD.bazel file. This commit fixes that.
    
    Signed-off-by: Soumitra Kumar <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    soumitrak authored and chaowanggg committed Apr 4, 2023
    64262ea
  39. [core][state] Task backend - Profile events capping (ray-project#33321)

    This PR restricts the number of profile events to be sent and aggregates task events from the same task attempt on the worker side to reduce the data sent to GCS.
    This PR also refactors the metrics tracking to reduce lock contention on the core worker.
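
The capping and aggregation idea can be sketched in a few lines. This is an illustration under assumptions rather than the core worker's logic: the commit message does not say whether the cap is global or per task, so this sketch assumes a per-task-attempt cap, and `aggregate_and_cap` and its event tuples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

TaskAttempt = Tuple[str, int]  # (task_id, attempt_number)


def aggregate_and_cap(
    events: List[Tuple[str, int, str]], max_profile_events: int
) -> Tuple[Dict[TaskAttempt, List[str]], int]:
    """Group events by task attempt, keeping at most `max_profile_events` each.

    Returns the grouped events and the number of events dropped by the cap.
    """
    grouped: Dict[TaskAttempt, List[str]] = defaultdict(list)
    dropped = 0
    for task_id, attempt, event in events:
        bucket = grouped[(task_id, attempt)]
        if len(bucket) < max_profile_events:
            bucket.append(event)
        else:
            # Over the cap: count the event instead of sending it.
            dropped += 1
    return dict(grouped), dropped
```

Grouping before sending means one message per task attempt instead of one per event, which is the reduction in data that the commit message describes.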
    
    Signed-off-by: chaowang <[email protected]>
    rickyyx authored and chaowanggg committed Apr 4, 2023
    e03f8ce
  40. [Doc] Workspace template examples (ray-project#32802)

    Signed-off-by: chaowang <[email protected]>
    justinvyu authored and chaowanggg committed Apr 4, 2023
    0cfe1fd
  41. [Doc|Train] Add Pytorch ResNet finetuning starter example (ray-project#32936)
    
    Co-authored-by: angelinalg <[email protected]>
    Co-authored-by: Justin Yu <[email protected]>
    Co-authored-by: Yunxuan Xiao <[email protected]>
    Co-authored-by: Yunxuan Xiao <[email protected]>
    Signed-off-by: chaowang <[email protected]>
    5 people authored and chaowanggg committed Apr 4, 2023
    44eb07d
  42. add tests

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    a15acbb
  43. debug

    Signed-off-by: chaowang <[email protected]>
    chaowanggg committed Apr 4, 2023
    1d3ca84
  44. Merge branch 'cw/component_test_actors' of github.com:chaowanggg/ray-dev into cw/component_test_actors

    chaowanggg committed Apr 4, 2023
    a7ff603
  45. merge

    chaowanggg committed Apr 4, 2023
    82f0f5c

Commits on Apr 5, 2023

  1. review

    chaowanggg committed Apr 5, 2023
    8e5b7c7
  2. typo

    chaowanggg committed Apr 5, 2023
    f2cf9a6
  3. const

    chaowanggg committed Apr 5, 2023
    572b7b7

Commits on Apr 6, 2023

  1. add tests

    chaowanggg committed Apr 6, 2023
    fe20402
  2. 3591d3b

Commits on Apr 10, 2023

  1. fix

    chaowanggg committed Apr 10, 2023
    a1c1b5a
  2. revert NodeRow

    chaowanggg committed Apr 10, 2023
    c723b11