smuching202 opened a new issue, #15026:
URL: https://github.com/apache/lucene/issues/15026

   # Context
   While implementing `Accountable.ramBytesUsed()`, I noticed a discrepancy 
between the values returned by `RamUsageEstimator.sizeOf(Query, long)` and 
`RamUsageTester.ramUsed(obj)` in Lucene 10.2.2.
   
   For example, given the following test:
   
   ```
   @Test
   void testTermQueryRamUsage() {
       Query query = new TermQuery(new Term("field", "value"));
       long actual = RamUsageTester.ramUsed(query); // 152 bytes
       long expected = RamUsageEstimator.sizeOf(query, 0); // 176 bytes
       assertThat(actual).isEqualTo(expected);
   }
   ```
   
   `RamUsageTester` reports: **152 bytes**
   `RamUsageEstimator` reports: **176 bytes** (using `RamUsageQueryVisitor` 
internally)
   
   # Observed RamUsageEstimator Flow
   1. Invoke `RamUsageEstimator.sizeOf(query, 0)` (see [method](https://github.com/apache/lucene/blob/279eb7aaafe985e5d0552b7f2a10b63185a3f893/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L367C22-L367C28)).
      a) Since `TermQuery` does not implement the `Accountable` interface, `sizeOf` should fall back to `RamUsageQueryVisitor`.
   2. Create an instance of `RamUsageQueryVisitor` (see [constructor](https://github.com/apache/lucene/blob/279eb7aaafe985e5d0552b7f2a10b63185a3f893/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L308)).
      a) No default size was passed, so it invokes `RamUsageEstimator.shallowSizeOf(Query)`, which in my case returned **24 bytes**.
   3. The query is traversed with the `RamUsageQueryVisitor`.
      a) We reach `RamUsageQueryVisitor.consumeTerms` (see [method](https://github.com/apache/lucene/blob/279eb7aaafe985e5d0552b7f2a10b63185a3f893/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L319)). Since `query == root`, we skip that branch and go directly to `sizeOf(terms)`.
      b) `sizeOf(terms)` (see [method](https://github.com/apache/lucene/blob/279eb7aaafe985e5d0552b7f2a10b63185a3f893/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L623)) sets `size` to `shallowSizeOf(accountables)`, which returned **24 bytes**, and then adds the result of each `Accountable.ramBytesUsed()`.
      c) `Term.ramBytesUsed()` (see [method](https://github.com/apache/lucene/blob/279eb7aaafe985e5d0552b7f2a10b63185a3f893/lucene/core/src/java/org/apache/lucene/index/Term.java#L189)) returns a total of 48 + 56 + 24 = **128 bytes**:
         - `BASE_RAM_BYTES` = **48 bytes** (from the shallow sizes of `Term` and `BytesRef`)
         - `RamUsageEstimator.sizeOfObject(field)` = **56 bytes**
         - `RamUsageEstimator.alignObjectSize(bytes.bytes.length + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER)` = aligned(5 + 16) = **24 bytes**
   4. The total returned is **176 bytes** (24 + 24 + 128), which is **24 bytes more** than the 152 bytes reported by `RamUsageTester`.
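
The arithmetic in the steps above can be re-checked with a small standalone sketch. It does not call Lucene; the class and method names are hypothetical, the 8-byte alignment and 16-byte array header are assumptions about the JVM used in the test, and the 24, 48, and 56 byte figures are copied from the measurements reported above.

```java
public class TermQueryRamBreakdown {
    // Assumed JVM layout: 8-byte object alignment and a 16-byte array header.
    static final long ALIGNMENT = 8;
    static final long ARRAY_HEADER = 16;

    // Round a size up to the next multiple of the alignment.
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Step 3c: the three parts summed by Term.ramBytesUsed().
    static long termBytes() {
        long base = 48;  // BASE_RAM_BYTES, as reported in the issue
        long field = 56; // sizeOfObject(field), as reported in the issue
        long bytes = align("value".length() + ARRAY_HEADER); // aligned(5 + 16)
        return base + field + bytes;
    }

    // Steps 2a + 3b + 3c: what RamUsageEstimator.sizeOf(query, 0) adds up.
    static long estimatorTotal() {
        long queryShallow = 24; // shallowSizeOf(query), as reported
        long arrayShallow = 24; // shallowSizeOf(accountables), as reported
        return queryShallow + arrayShallow + termBytes();
    }

    public static void main(String[] args) {
        long testerTotal = 152; // RamUsageTester.ramUsed(query), as reported
        System.out.println("Term.ramBytesUsed(): " + termBytes());
        System.out.println("Estimator total:     " + estimatorTotal());
        System.out.println("Difference:          " + (estimatorTotal() - testerTotal));
    }
}
```

Under these assumptions the difference equals the 24 bytes contributed in step 3b.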
   
   # Problem: Double Counting
   In step 3b, the method sets the initial size to the shallow size of the 
Accountable[] array:
   
   ```
   public static long sizeOf(Accountable[] accountables) {
       long size = shallowSizeOf(accountables); // Term shallow size is 24 bytes
       for (Accountable accountable : accountables) {
           if (accountable != null) {
               size += accountable.ramBytesUsed();
           }
       }
       return size;
   }
   ```
   
   This means the shallow size of the Term is counted twice: once in 
`shallowSizeOf(accountables)` and again within each 
`Accountable.ramBytesUsed()`.
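
To show where the extra bytes enter the sum, here is a minimal standalone model of the loop quoted above. `SizedObject` is a hypothetical stand-in for `Accountable`, and the layout constants (16-byte array header, 4-byte compressed references, 8-byte alignment) are assumptions, not values read from Lucene.

```java
public class ArraySizeOfSketch {
    /** Hypothetical local stand-in for Lucene's Accountable; not the real interface. */
    interface SizedObject {
        long ramBytesUsed();
    }

    // Assumed JVM layout: 16-byte array header, 4-byte compressed references,
    // 8-byte object alignment.
    static final long ARRAY_HEADER = 16;
    static final long REF_SIZE = 4;
    static final long ALIGNMENT = 8;

    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Shallow size of a reference array: header plus one reference per slot.
    static long shallowSizeOfArray(int length) {
        return align(ARRAY_HEADER + REF_SIZE * length);
    }

    // Same shape as the quoted sizeOf(Accountable[]): array shallow size
    // plus whatever each element reports for itself.
    static long sizeOf(SizedObject[] items) {
        long size = shallowSizeOfArray(items.length);
        for (SizedObject item : items) {
            if (item != null) {
                size += item.ramBytesUsed();
            }
        }
        return size;
    }

    public static void main(String[] args) {
        SizedObject term = () -> 128L; // Term.ramBytesUsed() as reported in the issue
        SizedObject[] terms = {term};
        System.out.println("array shallow size: " + shallowSizeOfArray(terms.length));
        System.out.println("sizeOf(terms):      " + sizeOf(terms));
    }
}
```

In this model the one-element array adds 24 bytes on top of the 128 bytes the element reports for itself, which matches the size of the gap described above.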


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

