-
Notifications
You must be signed in to change notification settings - Fork 998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Structured Audit Logging #891
Conversation
try { | ||
super.onHalfClose(); | ||
} catch (AuthenticationException e) { | ||
// TODO: figure out of catching exception breaks spring security |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@woop Does the current E2E auth tests verify that there are no false positives for authentication? (ie should fail to authenticate due to wrong credentials. )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#892 These tests will fail your authentication if you try to sign in with a JWT that cannot be verified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer relevant as this was deemed out of scope for this PR.
/retest |
import org.apache.logging.log4j.Marker; | ||
import org.apache.logging.log4j.MarkerManager; | ||
|
||
@Log4j2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Log4j2 be hardcoded here? #277
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used SLF4J
instead.
@Getter | ||
public class AuditLogger { | ||
/** Defines the output formats that the Audit Logger can produce logs in */ | ||
public static enum LogFormat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the format be a part of the logger configuration xml instead of a part of the code base?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to expose the ability to configure the logger with the most common use cases via application.yml
without interacting with the logging configuration XML as it inaccessible from our Helm and Docker-Compose setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in the response above, I think we can use the application.yml for configuring logging without writing our own logformat configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
private Set<LogFormat> formats; | ||
|
||
// Configures the directions of messages of which matchingo messages are logged | ||
// are logged to the audit log. Requires message event kind to be enabled in kinds./ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cant make out this comment. What does the direction represent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direction represents the flow of messages relative to the service:
- incoming - matches requests messages
- outgoing - matches response messages.
Users can use this to configure for which request "direction" message audit logging is enabled. (ie leave in only incoming
to enable request logging)
Will change this to request
and response
as that makes it easier to understand.
* ExceptionInterceptor intercepts exceptions throw when handling GRPC messages. Intercepts uncaught | ||
* exceptions and logs to audit logger and returns the appropriate Grpc statuses. | ||
*/ | ||
public class GrpcExceptionInterceptor implements ServerInterceptor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be in a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, will do.
|
||
// Attempt Extract current authenticated identity. | ||
Authentication authentication = SecurityContextHolder.getContext().getAuthentication(); | ||
String identity = (authentication == null) ? "" : authentication.getPrincipal().toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work for you? I have been pulling the identity from claims.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
call = | ||
new SimpleForwardingServerCall<ReqT, RespT>(call) { | ||
@Override | ||
public void sendMessage(RespT message) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to apply to GetOnlineFeatures as well, because in that case we will probably experience a serious performance hit. Audit logging is normally distinct from request/response logging, since with the former you expect durability and consistency but with the latter you want availability and performance. Logging the complete request/response is still useful as part of the audit log, but we need to have a way to deal with the latency sensitive case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AuditLogger
can be configured via application.yml
via AuditLogProperties
and thus can be disabled for the Online Serving instance from application.yml
for cases where latency is more important that visibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, but we need request/response logging in online serving as well. We just want to make sure that this logging happens in async fashion and doesn't affect latency of the method
|
||
/** LogEvent describes a specific event that a log is capturing */ | ||
@AutoOneOf(LogEvent.Kind.class) | ||
public abstract class LogEvent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial reaction when seeing this code is that it seems quite overwhelming. A lot of decisions have been made here in terms of the structure of our audit logging and I am not sure if we are being too ambitious here. I am quite worried that these abstractions will end up changing quite soon once people start trying to use the audit logger. I was hoping we could start with a simpler key/value API.
Did you follow some kind of convention when structuring the audit logger and its associated types, or was everything custom made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you follow some kind of convention when structuring the audit logger and its associated types, or was everything custom made?
No, Yes.
I was hoping we could start with a simpler key/value API.
Reasons that we should go with a Key Value API.
- Infinitely more flexible and extensible as no constraints are applied.
Reasons that we should not go with a Key Value API:
- Java is statically typed. Going with a generic Key Value would mean
Map<String, Object>
which implies lots of ugly casting. - With no enforced structure in place it will be difficult to parse the logs in upstream systems as there is no expected structure.
- Enforces some rigidity to logging so that we don't break upstream systems unknowing built upon audit logging.
The structured log will be depended on by upstream logging systems and should be treated as a user facing API that should be clearly defined, the same way Core and Serving Service API are clearly defined.
I am quite worried that these abstractions will end up changing quite soon once people start trying to use the audit logger
The important thing is that we are fully aware of changes instead of being blind sighted by a map.put()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion is not to throw away all static typing and just accept arbitrary string/object keys and values. Parts of what you have implemented makes sense, but other parts do not.
From a functional perspective what we want is for specific parts of the Feast code base to create a message that can be serialized and printed/sent to the audit log. so having static methods like ofMessage(MessageLogEvent.Kind kind, String method, Message message, String identity)
makes sense.
However, I question the value of this LogEvent
as a container. It seems like a catchall class that provides little in terms of a contract. When would you have both a MessageLogEvent
and ActionLogEvent
in the same object? It doesnt seem like that should be possible and so its not clear what the scope of this LogEvent
class should be. Will it just grow infinitely as a catchall class that can contain any subclass?
If we agree that maintaining state is a bad thing, then do we need to maintain LogEvents as an object? What is the SLF4J contract that we need to meet. Does it just expect strings?
What prevents us from having an AuditLogger
class with static methods like ofMessage
that converts a tight contract of objects into a Map or string that can be serialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored away LogEvent into MessageAuditLogEntry, ActionAuditLogEntry, TransitionAuditLogEntry subclasses.
@Test | ||
public void shouldReturnStringRepresentationOfAuditLog() { | ||
for (AuditLogEntry auditLog : getTestAuditLogs()) { | ||
System.out.println(auditLog.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this testing? Wouldnt toString() always return something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I guess I was just trying to exercise the code path. Would remove.
responseObserver.onError( | ||
Status.INTERNAL.withDescription(e.getMessage()).withCause(e).asRuntimeException()); | ||
} | ||
GetFeastCoreVersionResponse response = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code looks much better with the exception handling in the interceptor, but I don't think its in scope for this PR is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, moving out of this PR.
public class CreateJobTask extends JobTask { | ||
|
||
public CreateJobTask(Job job, JobManager jobManager, AuditLogger auditLog) { | ||
super(job, jobManager, auditLog); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did we add this constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally this used the @Builder
constructor from lombok and had fields duplicated across all 4 Job Task implementations. I move the fields up to JobTask
abstract class however the @builder
annotation has problems with inherited fields, hence the added constructor.
import lombok.Getter; | ||
import lombok.Setter; | ||
import lombok.extern.log4j.Log4j2; | ||
import org.apache.logging.log4j.Level; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It bothers me a bit that we are pulling in logging dependencies directly into our code. Shouldn't this be abstracted away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with SL4J
.
* name and assumed authenticated identity (if authentication is enabled). | ||
*/ | ||
public class GrpcMessageInterceptor implements ServerInterceptor { | ||
private AuditLogger auditLog; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for us to keep things simpler by using SLF4J for logging instead of having our own AuditLogger? Example: https://github.com/yidongnan/grpc-spring-boot-starter/blob/master/examples/local-grpc-server/src/main/java/net/devh/boot/grpc/examples/local/server/LogGrpcInterceptor.java#L34
I can see some value in having a way to standardize the message format through static methods, especially if we need to have actions/resources/identities, but having to pass around an audit logger when SLF4J is already available seems like an unnecessary overhead to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for us to keep things simpler by using SLF4J for logging instead of having our own AuditLogger? Example: https://github.com/yidongnan/grpc-spring-boot-starter/blob/master/examples/local-grpc-server/src/main/java/net/devh/boot/grpc/examples/local/server/LogGrpcInterceptor.java#L34
Why we have our own Audit Logger:
- Allow us to configure the audit logger from
application.yml
which is valuable as:- Allows us to configure the logger via
application.yml
(ie disable request logging for Online Serving without hard coding something.) - We can inject Feast specific elements into the Audit Log (ie Feast Component, Feast Version, or in the future git hash.)
- Allows us to configure the logger via
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allows us to configure the logger via application.yml (ie disable request logging for Online Serving without hard coding something.)
What have we implemented that can't be implemented using the normal logging configuration https://howtodoinjava.com/spring-boot2/logging/configure-logging-application-yml/
Also, it seems like your answer focuses on configuration but the AuditLogger is more than just a configuration holder right? We can have custom configuration in the FeastProperties for enabling/disabling request/response logging if needed, but that is a separate question to whether we need to maintain the AuditLogger and pass it around.
We can inject Feast specific elements into the Audit Log
Since we have an application context just floating around inside of Feast, we can easily grab these settings at any time without needing to pass around an AuditLogger right? I am just trying to make sure we arent reinventing something that already exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated AuditLogger
's logging methods to static to remove the need of passing around the AuditLogger
instance.
serving/pom.xml
Outdated
<artifactId>grpc-spring-boot-starter</artifactId> | ||
<groupId>net.devh</groupId> | ||
<artifactId>grpc-server-spring-boot-starter</artifactId> | ||
<version>${grpc.spring.boot.starter.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to set this? Shouldnt this be set in the parent-pom?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was set as the interceptors do not work properly under the lognet
grpc starter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed version part to be set in the parent POM.
I think it is unlikely that we will get this PR merged in any time soon in its current state. It's trying to do too much.
The MVP we need for audit logging is not much different from what we have in our existing AuditLogger. We need a way to log incoming requests to track user actions. Tracking internal state changes of jobs are not necessary yet, and even in that case I think it can be tracked at the entity level. I would recommend that you try and separate these into different PRs. The first and most important PR implements a logging interceptor that logs authenticated user requests and the incoming payload using a standardized audit log format (hopefully with SLF4J). This might serve as inspiration https://cloud.google.com/logging/docs/reference/audit/auditlog/rest/Shared.Types/AuditLog |
This already exists with the current Next steps:
|
@@ -46,7 +47,7 @@ | |||
@Getter | |||
@Setter | |||
@Configuration | |||
@ConfigurationProperties(prefix = "feast", ignoreInvalidFields = true) | |||
@ConfigurationProperties(prefix = "feast", ignoreInvalidFields = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted.
/test test-end-to-end-batch |
1 similar comment
/test test-end-to-end-batch |
/test test-end-to-end-batch-dataflow |
private boolean enabled; | ||
|
||
// Whether to enable/disable message level (ie request/response) audit logging. | ||
private boolean messageEnabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call this something more descriptive? enableMessageLogging
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. updated.
/test test-end-to-end |
/test test-end-to-end-redis-cluster |
/test test-end-to-end-batch-dataflow |
1 similar comment
/test test-end-to-end-batch-dataflow |
// Spring runs this constructor when creating the AuditLogger bean, | ||
// which allows us to populate the AuditLogger class with dependencies. | ||
// This allows us to use the dependencies in the AuditLogger's static methods | ||
AuditLogger.properties = loggingProperties.getAudit(); | ||
instance = FeastInstance.of(buildProperties.getArtifact(), buildProperties.getVersion()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not sure why we need to persist with FeastInstance.
/test test-end-to-end-batch-dataflow |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mrzzy, woop The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test test-end-to-end-batch-dataflow |
1 similar comment
/test test-end-to-end-batch-dataflow |
New changes are detected. LGTM label has been removed. |
Why we need this PR
Feast behavior and usage be opaque due to a lack of consistent logging of:
Logging in Feast is also inconsistent making it difficult to parsed by third party logging systems.
What this PR does:
Adds Audit Logging to Feast Core and Feast Serving:
AuditLogger
that exposes structured logging methodslogMessage(), logAction()
etc.AuditLogger
disable/configurable from Core/Serving'sapplication.yml
AuditLogger
are structured JSON and machine parsable.AuditLogEntry
and subclasses that define the structure of each log entries and provides JSON conversion.GrpcMessageInterceptor
to intercept incoming/request or outgoing/response and log them to the Audit Log for both Core and Serving Services.Refactor/Tech debt cleanups:
JobService
perform actions on jobs viaJobTask
instead of directly withJobManager
to be consistent withJobCoordinatorService
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: