-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Benchmarks and String Creation #316
Comments
Not exactly. Here's a useful breakdown: In the specific code you give, Flogger's format arguments are constant, and therefore free as well -- see e.g. the first example
The expensive part is replacing (I'm also looking right now at very positive benchmarks for Flogger, but I don't know if there is anything published...) |
As Louis says, there should be no cost to "string creation" because you express that as "literal format string plus arguments" and let Flogger format it only when the log statement is enabled (and Flogger doesn't even have to format anything if the backend can do structured logging). If what you're wondering is how: Avoids things like the overhead of varargs and auto-boxing, that's all explained in depth in: TL;DR: Flogger is completely zero-allocation for disabled log statements up to 2 arguments of any type (it has overloads to cover all necessary combinations of autoboxing for 1 or 2 arguments) and it's free from varargs up to 10 arguments. Extensive analysis of many thousands of log statements in Google code showed that this would cover the vast majority of cases (2 arguments is the most common case, and above 5 arguments becomes extremely rare). It also supports lazy arguments via simple lambda closures: At some point we should probably publish our benchmarks, though we're also happy if people want to benchmark it for themselves (though you need to be quite careful to mimic real log statements, e.g. at representative stack depths and with a mix of lengths of format strings and types of argument ... it's quite fiddly). |
Well all the bytecode I have ever looked at was doing new String("foo"); for the string literal. Not sure when or if that changed. But if you are both convinced I will get some bytecode from Eclipse and post here ... |
That's already been addressed above:
In the specific code you give, "foo" is free, because it's a class-level
constant. When the class gets loaded, all constant strings in the source
code get loaded, too, so you really don't pay anything extra for this; even
if you have the if as you describe, "foo" would still get loaded as soon as
the class gets loaded.
The referenced link (https://google.github.io/flogger/anatomy.html) shows
the constant string handling (no "new String").
Chris.
…On Wed, 1 Jun 2022 at 15:08, Adligo Inc ***@***.***> wrote:
Well all the bytecode I have ever looked at was doing new String("foo");
for the string literal. Not sure when or if that changed. But if you are
both convinced I will get some bytecode from Eclipse and post here ...
—
Reply to this email directly, view it on GitHub
<#316 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABCCXPQ7Z2M5PNQ4CEXNGMTVM7NMVANCNFSM5XSLJQJA>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
/ Compiled from LogByteCode.java (version 11 : 55.0, super bit) // Method descriptor #11 ([Ljava/lang/String;)V // Method descriptor #20 ()V Inner classes: |
I've been using Java for 20 years and literals have always been loaded from the constant pool. It's required by the language specification: It's possible some JVM implementations don't do this, but it's incorrect (and there's even code in Flogger that assumes literals are interned so we can test via == for a fast test for known special cases). |
Ok thanks, perhaps I was remembering the String concatenation instead of construction. |
That's the "load constant" which gets the "my main message" literal and puts a reference to the interned String on the stack. There is no allocation here (allocation and mapping from constant pool index to a heap allocated instance is done during class loading). |
public static void main(String [] args) { public LogByteCode() { |
Tip (in case it wasn't clear in the docs): You really want to be setting up the logger in a private static final field in the class rather than doing all the work to obtain it many times. And assuming it's the first static field in the class this also lets you use it in any static method and during other static initialization. |
I'm still wondering if there is a noticeable difference between the imperative style to this fluent / functional / chaning style. It seems like things are happening on the stack that wouldn't need to happen. |
Ya, I know you advise the static logger style. Having done that for years (i_log.adligo.org), I'm actually starting to prefer a C# style of passing around a context to get the loggers. I think it's easier to mock. I'm thinking of writing yet another logging api (i_log2.adligo.org?), which is probably why I'm so interested in this minutiae. |
The hot-spot compiler is good at folding stuff like that into efficient code. CPU is cheap compared to allocations and GC pressure. Yes, there are 2 (or more) calls to the logging API instead of 1, but the advantages you get from having a modular, extensible API (and thus being able to get rid of boxing and varargs without combinatorial explosion of many hundreds of method overloads) is well worth it. In Java there's no need to obtain logger contexts and pass them around and no need to ever mock loggers for testing. You test logging by using Needing some "logger provider" context is a serious downside to any logging API since now you cannot add logging to public static methods without changing the API, and you cannot add logging during static initialization of a class. |
As I mentioned I have implemented a logging API. I didn't say it was necessary, I said I preferred it. |
So I guess no one has benchmarked the differences in the extra stack calls anywhere ? |
Oh yes, we certainly have, at least inside Google. After a week of attempting to pull out any reliable measure of the potential overhead, I gave up. Under realistic conditions there was just no observable difference between a "single method with many parameters" API and a "fluent logging API". The hotspot compiler was just figuring everything out as far as we could tell. Remember that the only case which matters is for the "disabled" log statements, and then only under the sort of load that would trigger the hot-spot compiler. Since Flogger has only 2 concrete implementations of the fluent API, the hot-spot compiler can do all sorts of tricks by spotting places where bi-morphic dispatching occurs. Flogger's used everywhere in Google's Java code, including many mission critical systems, so we've measured and checked as much as we can think of with regard to things like this quite carefully. What we haven't done is tidy up the benchmark code, remove internal dependencies, document it and publish it (and the results) to github. We probably should do that at some point. |
I suspect you'd run into cyclic static initialization issues pretty fast. Flogger sets up its "Platform" singleton on first use rather than as an explicit singleton exactly to avoid issues of cyclic dependencies during class loading. You also end up almost back at square one if you make a singleton to inject the factory since the singleton removes the choice of implementation and makes the logging in the code hard to test without explicit knowledge of what you need to do to inject a mock. Since JDK Logger and other common backends all provide ways to capture logs for test purposes without having different implementations of the actual user facing API, I see no reason to try and solve the testability issue using mocks like this. |
Looking forward to the benchmarks. |
Also, you don't need to test the log API in most cases. But you MAY want to test the usage of the log API. There are certain situations where you need to audit things (i.e. don't log the user's passwords, etc). |
Doesn't this style of logging api incur the cost of string creation, even when the logger is off ?
I haven't looked at the byte code, but I'm wondering what exactly is ment by 'essentially free'.
I remember reading in the initial log4j that the whole reason for the usage of the imperative paradigms if, i.e.;
if (log.isDebugEnabled()) {
log.debug("foo");
}
Was to avoid the cost of the creation of the string "foo". The impartive if statement would create branching bytecode, with GOTO statements. IIt would seem to me that this functional stream log API style would result in a lot of extra string creation and garbage collection. Has anyone benchmarked it ? Where are the benchmarks ?
The text was updated successfully, but these errors were encountered: