HDDS-10772. [Ozone-Streaming] Stream write metric is wrong #6610
Conversation
I'm not familiar with streaming, but would it make more sense to move the new code to its base class, StreamDataChannelBase? IMO updating the latency should happen inside writeFileChannel().
Thanks.
@@ -212,7 +219,9 @@ void assertOpen() throws IOException {
  public void close() throws IOException {
    if (closed.compareAndSet(false, true)) {
      try {
        putBlockRequest.set(closeBuffers(buffers, super::writeFileChannel));
        final long l = Time.monotonicNow();
We should use more meaningful variable names, like `start`.
- final long l = Time.monotonicNow();
+ final long start = Time.monotonicNow();
Sure. Thanks @jojochuang .
        writeFileChannelWithMetrics(l));
      }

  private WriteMethod writeFileChannelWithMetrics(long l) {
- private WriteMethod writeFileChannelWithMetrics(long l) {
+ private WriteMethod writeFileChannelWithMetrics(long start) {
Force-pushed from 7d860a3 to 3eb3f89
@guohao-rosicky, thanks for working on this! How about we simply measure the time in writeFileChannel, as below?
diff --git a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/KeyValueStreamDataChannel.java b/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/KeyValueStreamDataChannel.java
index 185ad9c001..030244e0b8 100644
--- a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/KeyValueStreamDataChannel.java
+++ b/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/KeyValueStreamDataChannel.java
@@ -167,11 +167,7 @@ public int write(ReferenceCountedObject<ByteBuffer> referenceCounted)
getMetrics().incContainerOpsMetrics(getType());
assertOpen();
- final long l = Time.monotonicNow();
- int len = writeBuffers(referenceCounted, buffers, super::writeFileChannel);
- getMetrics()
- .incContainerOpsLatencies(getType(), Time.monotonicNow() - l);
- return len;
+ return writeBuffers(referenceCounted, buffers, this::writeFileChannel);
}
static int writeBuffers(ReferenceCountedObject<ByteBuffer> src,
diff --git a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/StreamDataChannelBase.java b/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/StreamDataChannelBase.java
index 810495b2a7..a88f452167 100644
--- a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/StreamDataChannelBase.java
+++ b/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/StreamDataChannelBase.java
@@ -22,6 +22,7 @@
import org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException;
import org.apache.hadoop.ozone.container.common.helpers.ContainerMetrics;
import org.apache.hadoop.ozone.container.common.impl.ContainerData;
+import org.apache.hadoop.util.Time;
import org.apache.ratis.statemachine.StateMachine;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -129,9 +130,11 @@ public void close() throws IOException {
final int writeFileChannel(ByteBuffer src) throws IOException {
try {
+ final long startTime = Time.monotonicNow();
final int writeBytes = getChannel().write(src);
metrics.incContainerBytesStats(getType(), writeBytes);
containerData.updateWriteStats(writeBytes, false);
+ metrics.incContainerOpsLatencies(getType(), Time.monotonicNow() - startTime);
return writeBytes;
} catch (IOException e) {
checkVolume();
  interface WriteMethod {
    int applyAsInt(ByteBuffer src) throws IOException;
  }
This is only used in KeyValueStreamDataChannel. Let's keep it here.
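Since WriteMethod is a single-method interface, a method reference such as `this::writeFileChannel` or `super::writeFileChannel` can be passed wherever a WriteMethod is expected, which is why the interface only needs to live next to its single caller. The following is a hedged, simplified sketch (class and stand-in bodies here are hypothetical, not the real Ozone code) of that pattern:

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Simplified sketch: a local functional interface consumed via a method
// reference, mirroring how writeBuffers takes super::writeFileChannel.
public class WriteMethodDemo {
  interface WriteMethod {
    int applyAsInt(ByteBuffer src) throws IOException;
  }

  // Stand-in for the real writeFileChannel: "writes" by consuming the buffer.
  int writeFileChannel(ByteBuffer src) {
    final int n = src.remaining();
    src.position(src.limit());
    return n;
  }

  // Stand-in for writeBuffers: the caller supplies the write strategy.
  static int writeBuffers(ByteBuffer src, WriteMethod writeMethod)
      throws IOException {
    return writeMethod.applyAsInt(src);
  }

  public static void main(String[] args) throws IOException {
    WriteMethodDemo demo = new WriteMethodDemo();
    // The method reference is adapted to WriteMethod automatically.
    int n = writeBuffers(ByteBuffer.allocate(6), demo::writeFileChannel);
    System.out.println(n);  // prints 6
  }
}
```

Because a method reference can implement any compatible functional interface, moving WriteMethod to the base class would gain nothing while widening its visibility.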
        getMetrics()
            .incContainerOpsLatencies(getType(), Time.monotonicNow() - l);
        return len;
  final long start = Time.monotonicNow();
`writeBuffers` runs in a loop but uses the same `start` time, so the latency will be double-counted when there is more than one buffer.
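To illustrate the point about double counting, here is a hedged sketch (hypothetical class and names, not the actual Ozone code): the timer lives inside the per-buffer write method, so each write contributes exactly one latency sample, whereas a single `start` captured before the loop would be charged once per buffer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Sketch of per-buffer latency timing: writeAll loops, but only writeOne
// measures time, so N buffers yield exactly N latency samples.
public class PerBufferTiming {
  // Stand-in for incContainerOpsLatencies: one sample per actual write.
  static final List<Long> LATENCIES = new ArrayList<>();

  // Analogue of writeFileChannel: times exactly its own write.
  static int writeOne(ByteBuffer src) {
    final long start = System.nanoTime();      // per-write start time
    final int written = src.remaining();       // stand-in for channel.write(src)
    src.position(src.limit());
    LATENCIES.add(System.nanoTime() - start);  // record this write only
    return written;
  }

  // Analogue of writeBuffers: loops over buffers, does no timing itself.
  static int writeAll(List<ByteBuffer> buffers) {
    int total = 0;
    for (ByteBuffer b : buffers) {
      total += writeOne(b);
    }
    return total;
  }

  public static void main(String[] args) {
    List<ByteBuffer> buffers = List.of(
        ByteBuffer.allocate(4), ByteBuffer.allocate(8), ByteBuffer.allocate(16));
    int written = writeAll(buffers);
    System.out.println("bytes=" + written + " samples=" + LATENCIES.size());
    // prints: bytes=28 samples=3
  }
}
```

Had `start` been taken once before the loop, the first buffer's duration would be included again in every later sample, inflating the reported latency.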
Force-pushed from 7eed07e to fb186c5
+1 the change looks good.
+1 merging it now
Thanks @guohao-rosicky and @szetszwo
(cherry picked from commit 7427026)
What changes were proposed in this pull request?
Not every channel write performs I/O; some writes only append data to the Buffers object. When the channel is closed, the data still held in the Buffers object that has not yet been written to disk is flushed, and that part also needs to be recorded in the metrics.
For details, see the JIRA.
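The flush-at-close behavior described above can be sketched as follows. This is a hedged, hypothetical model (the class and field names are illustrative, not the real KeyValueStreamDataChannel): writes only buffer data, and close drains the remaining buffers through the same metered write path so the leftover bytes are counted too.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a channel whose writes only buffer; actual I/O and metrics
// happen in flushOne, which close() also uses for the leftover buffers.
public class BufferedChannelSketch {
  private final Deque<ByteBuffer> buffers = new ArrayDeque<>();
  long bytesMetric;      // stand-in for incContainerBytesStats
  int latencySamples;    // stand-in for incContainerOpsLatencies calls

  // A write that only buffers: no I/O yet, so no metrics yet.
  void write(ByteBuffer src) {
    buffers.add(src);
  }

  // Flushes one buffer and records metrics, like writeFileChannel.
  private int flushOne(ByteBuffer src) {
    final long start = System.nanoTime();
    final int written = src.remaining();   // stand-in for channel.write(src)
    src.position(src.limit());
    bytesMetric += written;
    latencySamples++;                      // latency recorded per actual write
    return written;
  }

  // Close drains whatever is still buffered through the metered path,
  // so data written to disk at close time is not missing from the metrics.
  void close() {
    while (!buffers.isEmpty()) {
      flushOne(buffers.poll());
    }
  }

  public static void main(String[] args) {
    BufferedChannelSketch ch = new BufferedChannelSketch();
    ch.write(ByteBuffer.allocate(10));
    ch.write(ByteBuffer.allocate(20));
    ch.close();  // buffered data is flushed and metered here
    System.out.println("bytes=" + ch.bytesMetric + " samples=" + ch.latencySamples);
    // prints: bytes=30 samples=2
  }
}
```

Without the metering inside the flush path, the 30 bytes written during close would never appear in the write metrics, which is the bug this PR fixes.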
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10772