[GitHub] [lucene] uschindler commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

GitBox Thu, 29 Sep 2022 04:21:30 -0700


uschindler commented on code in PR #11796:
URL: https://github.com/apache/lucene/pull/11796#discussion_r983415631



##########
lucene/misc/src/java/org/apache/lucene/misc/store/ByteWritesTrackingDirectoryWrapper.java:
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import java.io.IOException;
+import java.util.concurrent.atomic.AtomicLong;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.FilterDirectory;
+import org.apache.lucene.store.IOContext;
+import org.apache.lucene.store.IndexOutput;
+
+/** {@link FilterDirectory} that tracks write amplification factor */
+public final class ByteWritesTrackingDirectoryWrapper extends FilterDirectory {
+
+  private final AtomicLong flushedBytes = new AtomicLong();
+  private final AtomicLong mergedBytes = new AtomicLong();
+  private final AtomicLong realTimeFlushedBytes = new AtomicLong();

Review Comment:
   I would add a plain local `int` field to the IndexOutput (an `int` is 
enough here, as we only write small amounts and it matches the parameters 
passed to the write methods) and increment it on every write, completely 
unsynchronized. After every write, call a private method like 
`maybeUpdateGlobal()` that checks whether the plain `int` field exceeds a 
certain threshold and, if so, does something like 
`AtomicLong#addAndGet(localIntValue); localIntValue = 0;`. Of course, if 
somebody writes a large `byte[]`, we can update the atomic long directly (to 
prevent the int from overflowing).
   
   Nevertheless, we should have a benchmark showing how much this affects 
write performance. If the filter directory is just for debugging purposes, we 
can of course go with the fully atomic implementation we have now. Nobody will 
mind if the app gets slower while you are just watching the amplification 
factors during debugging.
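
   A rough sketch of the batching idea described above, for illustration 
only. It is not code from the PR and deliberately does not extend Lucene's 
`IndexOutput`; the class name `BatchedByteCounter`, the methods `onWriteByte`, 
`onWriteBytes` and `flush`, and the threshold of 8192 are all assumptions. The 
wrapper's `writeByte`/`writeBytes` would call into something like this after 
delegating.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of Lucene): batches byte counts into a shared counter. */
final class BatchedByteCounter {
  // Assumed value; the comment above only asks for "a certain threshold".
  private static final int FLUSH_THRESHOLD = 8192;

  private final AtomicLong global; // shared by all outputs of the directory
  private int local;               // plain field, touched only by the thread writing this output

  BatchedByteCounter(AtomicLong global) {
    this.global = global;
  }

  /** Call after a single-byte write. */
  void onWriteByte() {
    local++;
    maybeUpdateGlobal();
  }

  /** Call after writing {@code length} bytes from an array. */
  void onWriteBytes(int length) {
    if (length >= FLUSH_THRESHOLD) {
      // Large write: update the atomic directly so the int cannot overflow.
      global.addAndGet(length);
      return;
    }
    local += length; // stays below 2 * FLUSH_THRESHOLD, so no overflow
    maybeUpdateGlobal();
  }

  private void maybeUpdateGlobal() {
    if (local >= FLUSH_THRESHOLD) {
      flush();
    }
  }

  /** Pushes any pending count to the shared counter; assumed to be called on close. */
  void flush() {
    if (local > 0) {
      global.addAndGet(local);
      local = 0;
    }
  }
}
```

   With this shape the shared counter can lag behind by up to the threshold 
per open output until `flush()` runs, which seems acceptable for a statistic 
like the write amplification factor.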



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


