-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core][2/n] Core structured logging: add Text formatter and pass log config to worker process #45344
Conversation
}, | ||
"handlers": { | ||
"console": { | ||
"level": "INFO", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default log level is Ray Serve is INFO
.
}, | ||
"filters": { | ||
"core_context": { | ||
"()": CoreContextFilter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The key '()' has been used as the special key because it is not a valid keyword parameter name, and so will not clash with the names of the keyword arguments used in the call. The '()' also serves as a mnemonic that the corresponding value is a callable."
Ref: https://docs.python.org/3/library/logging.config.html#logging-config-dictschema
python/ray/job_config.py
Outdated
@@ -50,6 +51,7 @@ def __init__( | |||
ray_namespace: Optional[str] = None, | |||
default_actor_lifetime: str = "non_detached", | |||
_py_driver_sys_path: Optional[List[str]] = None, | |||
py_log_config: Optional[Union[dict, str]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's create a LoggingConfig class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class LoggingConfig:
dict_config: Union[dict, str] = "TEXT"
log_level: int = INFO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to create a LoggingConfig
class? With LoggingConfig
, user's code will look like:
ray.init(job_config=ray.job_config.JobConfig(py_log_config=ray.xxx.yyy.zzz.LoggingConfig("TEXT")))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to create a LoggingConfig class?
Maybe this is because we might add new fields to the LoggingConfig in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated bba3b09
# A dictionary to map encoding types to their corresponding logging configurations. | ||
LOG_MODE_DICT = { | ||
"TEXT": { | ||
"version": 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"version - to be set to an integer value representing the schema version. The only valid value at present is 1, but having this key allows the schema to evolve while still preserving backwards compatibility."
def generate_record_format_attrs( | ||
formatter: logging.Formatter, | ||
record: logging.LogRecord, | ||
exclude_standard_attrs=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have decided to use exclude_standard_attrs
instead of include_standard_attrs
or include_extra_attrs
because I think exclude_standard_attrs
is more accurate.
include_standard_attrs
: This function doesn't include all standard LogRecrod attributes.include_extra_attrs
: This function not only include attributes specified bylogger.info(..., extra={...})
but also Ray context.
LOG_MODE_DICT = { | ||
"TEXT": { | ||
"version": 1, | ||
"disable_existing_loggers": False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disable_existing_loggers – If specified as False, loggers which exist when this call is made are left enabled. The default is True because this enables old behaviour in a backward-compatible way. This behaviour is to disable any existing non-root loggers unless they or their ancestors are explicitly named in the logging configuration.
python/ray/job_config.py
Outdated
@@ -226,4 +233,5 @@ def from_json(cls, job_config_json): | |||
ray_namespace=job_config_json.get("ray_namespace", None), | |||
_client_job=job_config_json.get("client_job", False), | |||
_py_driver_sys_path=job_config_json.get("py_driver_sys_path", None), | |||
py_log_config=job_config_json.get("py_log_config", None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what we get from job_config_json won't have type LoggingConfig. I think we don't need to support json for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
class LoggingConfig: | ||
def __init__(self, log_config: Union[dict, str]): | ||
if isinstance(log_config, str): | ||
assert log_config in LOG_MODE_DICT, ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should raise ValueError to the user instead of check fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated a38ec10
|
||
|
||
class LoggingConfig: | ||
def __init__(self, log_config: Union[dict, str]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed, let's do
class LoggingConfig:
dict_config: Union[dict, str] = "TEXT"
# Only effective if dict_config is not a dict
log_level: int = INFO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated a38ec10
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
9ba4e95
to
1b37390
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lg
@rkooo567 can you also check the API to see if it makes sense? |
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
from ray._private.structured_logging.formatters import TextFormatter | ||
from ray._private.structured_logging.filters import CoreContextFilter | ||
|
||
LOG_MODE_DICT = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we can just define this in the constants.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving to constants.py
will cause a circular import.
LOG_MODE_DICT
requiresTextFormatter
andCoreContextFilter
fromformatters.py
andfilters.py
.- Both
filters.py
andformatters.py
also importconstants.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 2ec1b4f
python/ray/job_config.py
Outdated
from ray.util.annotations import PublicAPI | ||
|
||
if TYPE_CHECKING: | ||
from ray.runtime_env import RuntimeEnv | ||
|
||
|
||
@PublicAPI(stability="alpha") | ||
class LoggingConfig: | ||
def __init__(self, log_config: Union[dict, str] = "TEXT", log_level: str = "INFO"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log_config -> dict_config? to make it clear that we use python dictConfig to configure the logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated d9e63aa
python/ray/_raylet.pyx
Outdated
py_logging_config = pickle.loads(serialized_py_logging_config) | ||
log_config_dict = py_logging_config.get_dict_config() | ||
if log_config_dict: | ||
logging.config.dictConfig(log_config_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to try catch. See
if worker_process_setup_hook_key:
error = load_and_execute_setup_hook(worker_process_setup_hook_key)
if error is not None:
worker.core_worker.drain_and_exit_worker("system", error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated d9e63aa
Signed-off-by: kaihsun <[email protected]>
python/ray/_raylet.pyx
Outdated
except Exception as e: | ||
backtrace = \ | ||
"".join(traceback.format_exception(type(e), e, e.__traceback__)) | ||
print(backtrace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't add print
here, users can't see any error message from the driver process. The following screenshot is an example of backtrace
.
import ray
import logging
from ray.job_config import LoggingConfig
ray.init(
job_config=ray.job_config.JobConfig(py_logging_config=LoggingConfig({"abc": "123"}))
)
def init_logger():
return logging.getLogger()
logger = logging.getLogger("ray")
logger.info("Driver process", extra={"username": "johndoe"})
@ray.remote
def f():
logger = init_logger()
logger.info("A Ray task")
task_obj_ref = f.remote()
ray.get(task_obj_ref)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this print()
for now to be consistent with the error handling of worker_process_startup_hook. Users are able to know the worker crash reason from ray list workers
and ActorDiedException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this print() for now to be consistent with the error handling of worker_process_startup_hook. Users are able to know the worker crash reason from ray list workers and ActorDiedException
I don't think any user can troubleshoot this kind of issue on their own before we configure the root logger of the driver process. It's OK to remove print
here because:
- We currently only support
encoding
and verify it during its initialization process. Hence, it is impossible to go into thisexcept
statement. - We have currently decided to configure the root logger of the driver process, so the error message will also be printed in the driver process, even though we don't add
print(backtrace)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 36dd127
|
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
…ctured-logging-2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lg
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
python/ray/_raylet.pyx
Outdated
except Exception as e: | ||
backtrace = \ | ||
"".join(traceback.format_exception(type(e), e, e.__traceback__)) | ||
print(backtrace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this print()
for now to be consistent with the error handling of worker_process_startup_hook. Users are able to know the worker crash reason from ray list workers
and ActorDiedException
Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
|
||
|
||
@PublicAPI(stability="alpha") | ||
@dataclass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kw_only=True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
python/ray/_private/worker.py
Outdated
@@ -1332,6 +1335,8 @@ def init( | |||
timestamp, filename, line number, and message. See the source file | |||
ray_constants.py for details. Ignored unless "configure_logging" | |||
is true. | |||
logging_config: [Experimental] Logging configuration will be applied to both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:class:`~ray._private...`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 83cf819. I don't build the doc locally. I will wait for the CI to verify whether :class:
~LoggingConfig`` works or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed logging_config: Optional[LoggingConfig] = None
to logging_config: Optional["LoggingConfig"] = None
. This change generates a hyperlink after rendering. However, the link points to Ray Serve's LoggingConfig
instead of the LoggingConfig
for structured logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used
logging_config: Optional["ray._private.structured_logging.logging_config.LoggingConfig"] = None
no hyperlink is generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible solution that I can come up with is to use a different class name. Honestly, I don’t have any clue what happens under the hood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll fix it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solution: #45736 (comment)
Signed-off-by: kaihsun <[email protected]>
…config to worker process (ray-project#45344) Signed-off-by: kaihsun <[email protected]> Signed-off-by: Richard Liu <[email protected]>
Why are these changes needed?
Example (outdated)
Example
Output
Example
Example
Output
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.