vigyasharma commented on code in PR #11796:
URL: https://github.com/apache/lucene/pull/11796#discussion_r977305054


##########
lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** An {@link IndexOutput} that wraps another instance and tracks the number 
of bytes written */
+public class ByteTrackingIndexOutput extends IndexOutput {
+
+  private final IndexOutput output;
+  private final AtomicLong byteTracker;

Review Comment:
   I suppose you're using `AtomicLong` because `IndexOutput` instances are not 
thread safe. However, I do see multiple `IndexOutput` implementations that 
track bytes written, in unsynchronized variables.. like 
`RateLimitedIndexOutput` or `OutputStreamIndexOutput`. 
   
   Perhaps it's okay to do the same here? We could work with approximate 
values, and avoid the sync. hit here. I guess, that while IndexOutput doesn't 
provide thread safe guarantees, its consumers try to avoid conflict.



##########
lucene/core/src/test/org/apache/lucene/store/TestWriteAmplificationTrackingDirectoryWrapper.java:
##########
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import org.apache.lucene.tests.store.BaseDirectoryTestCase;
+
+public class TestWriteAmplificationTrackingDirectoryWrapper extends 
BaseDirectoryTestCase {
+
+  public void testEmptyDir() throws Exception {
+    WriteAmplificationTrackingDirectoryWrapper dir =
+        new WriteAmplificationTrackingDirectoryWrapper(new 
ByteBuffersDirectory());
+    assertEquals(1.0, dir.getApproximateWriteAmplificationFactor(), 0.0);
+  }
+
+  public void testRandom() throws Exception {
+    WriteAmplificationTrackingDirectoryWrapper dir =
+        new WriteAmplificationTrackingDirectoryWrapper(new 
ByteBuffersDirectory());
+
+    int flushBytes = random().nextInt(100);
+    int mergeBytes = random().nextInt(100);
+    double expectedBytes = ((double) flushBytes + (double) mergeBytes) / 
(double) flushBytes;

Review Comment:
   rename to `expectedWriteAmplification` ?



##########
lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** An {@link IndexOutput} that wraps another instance and tracks the number 
of bytes written */
+public class ByteTrackingIndexOutput extends IndexOutput {
+
+  private final IndexOutput output;
+  private final AtomicLong byteTracker;
+
+  protected ByteTrackingIndexOutput(IndexOutput output, AtomicLong 
byteTracker) {
+    super(
+        "Byte tracking wrapper for: " + output.getName(),
+        "ByteTrackingIndexOutput{" + output.getName() + "}");
+    this.output = output;
+    this.byteTracker = byteTracker;
+  }
+
+  @Override
+  public void writeByte(byte b) throws IOException {
+    byteTracker.incrementAndGet();
+    output.writeByte(b);
+  }
+
+  @Override
+  public void writeBytes(byte[] b, int offset, int length) throws IOException {
+    byteTracker.addAndGet(length);

Review Comment:
   Let's increment this counter after `output.writeBytes(...)`? So that we 
don't track failed calls.



##########
lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** An {@link IndexOutput} that wraps another instance and tracks the number 
of bytes written */
+public class ByteTrackingIndexOutput extends IndexOutput {
+
+  private final IndexOutput output;
+  private final AtomicLong byteTracker;
+
+  protected ByteTrackingIndexOutput(IndexOutput output, AtomicLong 
byteTracker) {
+    super(
+        "Byte tracking wrapper for: " + output.getName(),
+        "ByteTrackingIndexOutput{" + output.getName() + "}");
+    this.output = output;
+    this.byteTracker = byteTracker;
+  }
+
+  @Override
+  public void writeByte(byte b) throws IOException {
+    byteTracker.incrementAndGet();

Review Comment:
   I don't think `AtomicLong` has an `increment()` only method, does it?



##########
lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** An {@link IndexOutput} that wraps another instance and tracks the number 
of bytes written */
+public class ByteTrackingIndexOutput extends IndexOutput {
+
+  private final IndexOutput output;
+  private final AtomicLong byteTracker;
+
+  protected ByteTrackingIndexOutput(IndexOutput output, AtomicLong 
byteTracker) {
+    super(
+        "Byte tracking wrapper for: " + output.getName(),
+        "ByteTrackingIndexOutput{" + output.getName() + "}");
+    this.output = output;
+    this.byteTracker = byteTracker;
+  }
+
+  @Override
+  public void writeByte(byte b) throws IOException {
+    byteTracker.incrementAndGet();
+    output.writeByte(b);
+  }
+
+  @Override
+  public void writeBytes(byte[] b, int offset, int length) throws IOException {
+    byteTracker.addAndGet(length);
+    output.writeBytes(b, offset, length);
+  }
+
+  @Override
+  public void close() throws IOException {
+    output.close();
+  }
+
+  @Override
+  public long getFilePointer() {
+    return output.getFilePointer();
+  }
+
+  @Override
+  public long getChecksum() throws IOException {
+    return output.getChecksum();
+  }
+
+  public String getWrappedName() {
+    return output.getName();
+  }
+
+  public String getWrappedToString() {
+    return output.toString();
+  }
+}

Review Comment:
   We should also override `DataOutput` methods like `writeShort()`, 
`writeLong` etc. The wrapped IndexOutput may be overriding them and not 
invoking the base class impl. (like in `OutputStreamIndexOutput`)



##########
lucene/core/src/java/org/apache/lucene/store/WriteAmplificationTrackingDirectoryWrapper.java:
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** {@link FilterDirectory} that tracks write amplification factor */
+public final class WriteAmplificationTrackingDirectoryWrapper extends 
FilterDirectory {

Review Comment:
   This class basically lets us track bytes-written by IOContext (flush/merge). 
Write amplification is one application for it. Perhaps, we could make this 
class name a bit more generic.
   
   On similar lines, let's also add getters for the flush and merge byte 
counters.



##########
lucene/core/src/java/org/apache/lucene/store/WriteAmplificationTrackingDirectoryWrapper.java:
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** {@link FilterDirectory} that tracks write amplification factor */
+public final class WriteAmplificationTrackingDirectoryWrapper extends 
FilterDirectory {
+
+  private final AtomicLong flushedBytes = new AtomicLong();
+  private final AtomicLong mergedBytes = new AtomicLong();
+
+  /**
+   * Sole constructor, typically called from sub-classes.
+   *
+   * @param in input Directory
+   */
+  public WriteAmplificationTrackingDirectoryWrapper(Directory in) {
+    super(in);
+  }
+
+  @Override
+  public IndexOutput createOutput(String name, IOContext context) throws 
IOException {
+    IndexOutput output = in.createOutput(name, context);
+    IndexOutput byteTrackingIndexOutput;
+    if (context.context.equals(IOContext.Context.FLUSH)) {
+      byteTrackingIndexOutput = new ByteTrackingIndexOutput(output, 
flushedBytes);
+    } else if (context.context.equals(IOContext.Context.MERGE)) {
+      byteTrackingIndexOutput = new ByteTrackingIndexOutput(output, 
mergedBytes);
+    } else {
+      return output;
+    }
+    return byteTrackingIndexOutput;
+  }
+
+  public double getApproximateWriteAmplificationFactor() {

Review Comment:
   I think we want to measure the write amplification of a specific directory 
instance. In which case, not making it static seems like the right call.



##########
lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+
+/** An {@link IndexOutput} that wraps another instance and tracks the number 
of bytes written */
+public class ByteTrackingIndexOutput extends IndexOutput {
+
+  private final IndexOutput output;
+  private final AtomicLong byteTracker;
+
+  protected ByteTrackingIndexOutput(IndexOutput output, AtomicLong 
byteTracker) {
+    super(
+        "Byte tracking wrapper for: " + output.getName(),
+        "ByteTrackingIndexOutput{" + output.getName() + "}");
+    this.output = output;
+    this.byteTracker = byteTracker;
+  }
+
+  @Override
+  public void writeByte(byte b) throws IOException {
+    byteTracker.incrementAndGet();
+    output.writeByte(b);
+  }
+
+  @Override
+  public void writeBytes(byte[] b, int offset, int length) throws IOException {
+    byteTracker.addAndGet(length);
+    output.writeBytes(b, offset, length);
+  }
+
+  @Override
+  public void close() throws IOException {
+    output.close();
+  }
+
+  @Override
+  public long getFilePointer() {
+    return output.getFilePointer();
+  }
+
+  @Override
+  public long getChecksum() throws IOException {
+    return output.getChecksum();
+  }
+
+  public String getWrappedName() {

Review Comment:
   Parent class `getName()` and `toString()` get overridden to the 
ByteTrackingIndexOutput values when we call super() in the constructor.. I 
think these methods were added to provide a way to get details about the 
IndexOutput that is being wrapped. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to