Add Live debugging & Remote Configuration #398

Closed (6 commits)

@mellon85 commented Apr 19, 2024

This adds live debugging support. Live debugging requires remote configuration to work at all, so remote configuration is also part of this big PR, which makes proper integration tests of the functionality possible.

Dynamic configuration is a small part, just 50 lines of code, describing the structures it contains.

More specific changes to existing code:

  • The sidecar is now per user (on Windows it already uses the ConsoleSession), instead of per system.
    • On Windows it was necessary as memory can only easily be shared within a ConsoleSession.
    • On Linux it's now necessary because notifications of remote config changes are delivered via signals, which require either the same euid or root-like capabilities.
  • Note AsBytes in ddcommon currently requires &'a AsBytes<'a>, while only &AsBytes<'a> (independent lifetime) is required.
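
To illustrate the lifetime point in the last bullet, here is a minimal sketch. `Wrapper`, `bytes`, `bytes_tied` and `through_temporary` are hypothetical stand-ins, not the actual ddcommon trait; the only assumption is a type that borrows its data for `'a`.

```rust
/// Hypothetical stand-in for a type that borrows bytes for `'a`.
struct Wrapper<'a> {
    data: &'a [u8],
}

/// Independent lifetime: the borrow of the wrapper itself may be shorter than `'a`.
fn bytes<'a>(w: &Wrapper<'a>) -> &'a [u8] {
    w.data
}

/// Tied lifetime: the wrapper must stay borrowed for the whole of `'a`.
#[allow(dead_code)]
fn bytes_tied<'a>(w: &'a Wrapper<'a>) -> &'a [u8] {
    w.data
}

/// The returned slice outlives the short-lived local wrapper.
fn through_temporary(data: &[u8]) -> &[u8] {
    let w = Wrapper { data };
    // `bytes_tied(&w)` would be rejected here: `w` does not live long enough.
    bytes(&w)
}
```

With the tied form, callers cannot build a short-lived wrapper around long-lived data and hand the data back out, which is why the independent lifetime is the less restrictive requirement.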

General notes:

  • Code in remote-config/ is meant to be reusable and also accessible via FFI as needed. For most users the bare ConfigFetcher should be enough, and for some the SharedFetcher is needed. Very few should actually need the MultiTargetFetcher. An FFI interface hasn't been defined yet, as it wasn't needed:
    • ConfigFetcher in its simplest usage is essentially just one function with two callbacks (store, update). StoredFile should be a wrapper for a type native to the FFI user (probably a pointer).
    • SharedFetcher is similar, but provides a run loop. It is meant for applications that are multi-threaded anyway and takes three callbacks: store, update and on_fetched.
  • The sidecar will notify participating processes whenever remote config changes.
  • The sidecar uses shared memory to transmit the data received by the remote-config code.
  • live_debugger is:
    • a lot of manual parsing logic (If some value is X, then these fields are applicable. If this field is present, then it means this, otherwise that etc.; things which aren't trivially described by just deserializing it into a struct).
    • an evaluator for parsed expressions, i.e. walking the parsed AST and requesting values from the runtime.
    • a log probe sender
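
As a rough illustration of the evaluator bullet above (walking a parsed AST and requesting values from the runtime), here is a minimal sketch. The `Expr` variants, the `Runtime` trait and `MapRuntime` are all hypothetical and far smaller than what this PR's live_debugger would need; they only show the shape of the technique.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
    Str(String),
}

/// Hypothetical, heavily reduced expression AST.
enum Expr {
    Literal(Value),
    /// Reference to a variable that only the runtime can resolve.
    Ref(String),
    Eq(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
    Len(Box<Expr>),
}

/// What the evaluator asks of the host language runtime (assumed interface).
trait Runtime {
    fn lookup(&self, name: &str) -> Option<Value>;
}

/// Toy runtime backed by a map, for demonstration only.
struct MapRuntime(HashMap<String, Value>);

impl Runtime for MapRuntime {
    fn lookup(&self, name: &str) -> Option<Value> {
        self.0.get(name).cloned()
    }
}

/// Recursively walks the AST, asking the runtime for referenced values.
fn eval(expr: &Expr, rt: &dyn Runtime) -> Result<Value, String> {
    match expr {
        Expr::Literal(v) => Ok(v.clone()),
        Expr::Ref(name) => rt
            .lookup(name)
            .ok_or_else(|| format!("unknown reference: {name}")),
        Expr::Eq(a, b) => Ok(Value::Bool(eval(a, rt)? == eval(b, rt)?)),
        Expr::Not(inner) => match eval(inner, rt)? {
            Value::Bool(b) => Ok(Value::Bool(!b)),
            other => Err(format!("not: expected bool, got {other:?}")),
        },
        Expr::Len(inner) => match eval(inner, rt)? {
            Value::Str(s) => Ok(Value::Int(s.chars().count() as i64)),
            other => Err(format!("len: expected string, got {other:?}")),
        },
    }
}
```

Keeping the runtime behind a trait means the evaluator itself never needs to know how a given language runtime stores its variables.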

TODOs:

  • Windows support for notify on remote config change is still outstanding.
  • There is an unused timeout in ConfigFetcher. Drop it, or do we need it? No idea.
  • Make used products configurable
  • Figure out ddtags sending for live debugger sender.
  • More /// docblocks on the sidecar side.
  • Tests for ConfigFetcher
  • Tests for SharedFetcher
  • Tests for MultiTargetFetcher
  • Tests for live debugger parsing.
  • Tests for expr_eval.rs

@bwoebi force-pushed the bob/live-debugger branch 6 times, most recently from 67d1d75 to 6808d58, on May 14, 2024 23:50
@bwoebi changed the title from "Bob/live debugger" to "Add Live debugging & Remote Configuration" on May 14, 2024
Contributor

@pierotibou left a comment

Skimmed through the PR and have a few questions/comments:

First, are you handling agentless, and if so, why? It wasn't mandatory for the first implementation, right? MVP FTW.

I don't understand the sidecar code well yet, but I assume there are some constants that should be attached to the session and taken from there. This could avoid passing this data (service name, env, version...) around for new use cases.

Then, you could have made it easier on the reviewer by splitting this into more commits or a few separate PRs. I assume it's because it's in draft, but you're missing a lot of tests. Doing the refactoring must have been extra complicated without tests.

Very exciting work though, that opens a lot of doors. Thanks for doing it.

trace-protobuf/src/remoteconfig.rs (outdated, resolved)
@@ -0,0 +1,23 @@
[package]
Contributor

What's the size impact of having RC? Are all those dependencies already used in libdatadog?

Contributor

Only sha2 isn't. The heavy ones, like tokio, serde, serde_json and hyper, are already used in packages other than the sidecar too.

}

/// Quite generic fetching implementation:
/// - runs a request against the Remote Config Server,
Contributor

Is this not going through the agent?

Contributor

Server as in "a remote endpoint which serves things": it might be the Datadog backend, it might be the agent. The Datadog backend isn't supported yet, but it would eventually be handled here.


sidecar/src/log.rs (outdated, resolved)

#[no_mangle]
#[allow(clippy::missing_safety_doc)]
pub unsafe extern "C" fn ddog_sidecar_set_remote_config_data(
Contributor

I wonder if this should be orthogonal to RC.
Data used for RC could also be used for dogstatsdclient, telemetry... So I assume it should be linked to a session rather than to RC itself. This couples things a bit, but I think it simplifies the API.

Contributor

Yes, it should be orthogonal.


const PROD_INTAKE_SUBDOMAIN: &str = "config";

/// Manages files.
Author

Talking about files when referring to configurations can be misleading. It's a ConfigurationStore; please rename file to config.

Contributor

I took the name from message File in the proto definition. It's literally the contents of these File messages which are meant to be processed here.

type StoredFile;

/// A new, currently unknown file was received.
fn store(&self, version: u64, path: RemoteConfigPath, contents: Vec<u8>) -> anyhow::Result<Arc<Self::StoredFile>>;
Author

I wouldn't make a distinction between the two calls: the store knows whether the file is already present or not. What's the use case for keeping them separate?

Contributor

Actually, the ConfigFileStorage impl in shm_remote_config.rs does not maintain it. The implementation is very straightforward there. It's done that way to avoid tracking state and expiration manually in the FileStorage implementations - that task is up to the ConfigFetcherState.
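
To make the split described above more concrete, here is a minimal sketch of a storage that only implements `store` and `update`, while a separate state object decides which one to call and expires files that are no longer sent. It is simplified on purpose: paths are plain strings and errors are `String` rather than `RemoteConfigPath` and `anyhow::Result`, and `MemFile`, `MemStorage`, `FetcherState` and `apply` are hypothetical names, not the PR's types.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified stand-in for the storage trait: no bookkeeping of its own.
trait FileStorage {
    type StoredFile;
    /// A new, currently unknown file was received.
    fn store(&self, version: u64, path: &str, contents: Vec<u8>)
        -> Result<Arc<Self::StoredFile>, String>;
    /// An already known file received new contents.
    fn update(&self, file: &Arc<Self::StoredFile>, version: u64, contents: Vec<u8>)
        -> Result<(), String>;
}

/// Hypothetical in-memory file holding (version, contents).
struct MemFile {
    data: Mutex<(u64, Vec<u8>)>,
}

struct MemStorage;

impl FileStorage for MemStorage {
    type StoredFile = MemFile;

    fn store(&self, version: u64, _path: &str, contents: Vec<u8>)
        -> Result<Arc<MemFile>, String> {
        Ok(Arc::new(MemFile { data: Mutex::new((version, contents)) }))
    }

    fn update(&self, file: &Arc<MemFile>, version: u64, contents: Vec<u8>)
        -> Result<(), String> {
        let mut guard = file.data.lock().map_err(|e| e.to_string())?;
        *guard = (version, contents);
        Ok(())
    }
}

/// Hypothetical fetcher-side state: it alone knows which paths are live.
struct FetcherState<S: FileStorage> {
    storage: S,
    known: HashMap<String, Arc<S::StoredFile>>,
}

impl<S: FileStorage> FetcherState<S> {
    fn new(storage: S) -> Self {
        FetcherState { storage, known: HashMap::new() }
    }

    /// Applies one response given as (path, version, contents) entries.
    fn apply(&mut self, response: Vec<(String, u64, Vec<u8>)>) -> Result<(), String> {
        let mut next = HashMap::new();
        for (path, version, contents) in response {
            let file = match self.known.get(&path).cloned() {
                Some(existing) => {
                    self.storage.update(&existing, version, contents)?;
                    existing
                }
                None => self.storage.store(version, &path, contents)?,
            };
            next.insert(path, file);
        }
        // Paths missing from this response are dropped here, i.e. they expire.
        self.known = next;
        Ok(())
    }
}
```

In this shape the storage implementation stays trivial, and the question of whether a file is already present is answered in exactly one place.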

pub struct ConfigFetcher<S: FileStorage> {
pub file_storage: S,
state: Arc<ConfigFetcherState<S::StoredFile>>,
timeout: AtomicU32,
Author

unused?

remote-config/src/fetch/fetcher.rs: 7 resolved review threads (3 outdated)
}
}

pub trait RefcountedFile {
Author

Can't this be achieved by using weak and strong references?

Contributor

No, using Weak references would imply that, once the fetchers no longer hold a reference to it, the underlying memory can no longer be accessed. This may not be desired; refcounting allows careful control of only relinquishing it from fetchers, without affecting live references.
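
A small sketch of the difference being discussed, using only the standard library. `CountedFile`, `acquire` and `release` are hypothetical; the methods of the PR's `RefcountedFile` trait are not shown in this excerpt, so this is not its actual interface.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Weak};

/// With `Weak`, access ends as soon as the last strong reference is dropped.
fn weak_is_dead_after_drop() -> bool {
    let strong = Arc::new(vec![1u8, 2, 3]);
    let weak: Weak<Vec<u8>> = Arc::downgrade(&strong);
    drop(strong);
    weak.upgrade().is_none()
}

/// Hypothetical file with a count of fetchers that is separate from the
/// `Arc` strong count keeping the memory alive.
struct CountedFile {
    contents: Vec<u8>,
    fetcher_refs: AtomicU32,
}

impl CountedFile {
    fn new(contents: Vec<u8>) -> Arc<Self> {
        Arc::new(CountedFile { contents, fetcher_refs: AtomicU32::new(0) })
    }

    /// A fetcher starts using the file.
    fn acquire(&self) {
        self.fetcher_refs.fetch_add(1, Ordering::AcqRel);
    }

    /// A fetcher stops using the file; returns true if it was the last one.
    fn release(&self) -> bool {
        self.fetcher_refs.fetch_sub(1, Ordering::AcqRel) == 1
    }
}
```

Here the fetcher count reaching zero only signals that no fetcher needs the file any more; any other holder of the `Arc` can still read `contents`.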

Author

I would give strong references to the shared clients and weak ones to everything else, so that it's self-managed and doesn't need setter methods for a counter.

Contributor

Nitpick: remember to update the codeowners for the new crates (remote-config and live-debugger). I assume common components could take ownership of the RC one.

@bwoebi force-pushed the bob/live-debugger branch 3 times, most recently from 9562b23 to 1470cc9, on May 30, 2024 12:35
Contributor

@bwoebi commented Jun 19, 2024

Closing this PR in favor of #488.

@bwoebi closed this on Jun 19, 2024
4 participants