[GitHub] [lucene] gsmiller commented on a diff in pull request #974: LUCENE-10614: Properly support getTopChildren in RangeFacetCounts

GitBox Thu, 07 Jul 2022 12:35:55 -0700


gsmiller commented on code in PR #974:
URL: https://github.com/apache/lucene/pull/974#discussion_r916216255



##########
lucene/demo/src/java/org/apache/lucene/demo/facet/RangeFacetsExample.java:
##########
@@ -73,6 +76,35 @@ public void index() throws IOException {
       indexWriter.addDocument(doc);
     }
 
+    // Add documents with a fake timestamp, 3600 sec (1 hour) after "now", 
7200 sec (2
+    // hours) after "now", ...:
+    long startTime = 0;
+    // Index error messages since a week (24 * 7 = 168 hours) ago
+    for (int i = 0; i < 168; i++) {
+      long endTime = startTime + (i + 1) * 3600;
+
+      // Choose a relatively larger number, e,g., "35", in order to create 
variation in count for
+      // the top-n children, so that getTopChildren(10) in the 
searchTopChildren functionality
+      // can return children with different counts
+      for (int j = 0; j < i % 35; j++) {
+        Document doc = new Document();
+        // index document at a different timestamp by using endTime - i * j

Review Comment:
   Sorry, I'm sure what you're doing is really obvious to you, but it's just 
confusing to me. I find myself really stuck on things like `endTime - i * j`, 
or `i % 35` as a way to generate different numbers of log events within an hour 
block. What's wrong with just using `Random`? Would that just make it 
impossible to test? Sorry to be a pain with this, but if I were a user just 
trying to understand range faceting and I looked at this code, I'd be spending 
all my time just trying to figure out what we're trying to simulate here 
instead of understanding faceting. There has to be a simpler way right?
   
   As a suggestion, maybe we create a separate Jira issue to add a top-n range 
faceting example and revert out this work for now? That would let us get the 
actual change merged in the meantime.



##########
lucene/demo/src/java/org/apache/lucene/demo/facet/RangeFacetsExample.java:
##########
@@ -73,6 +76,35 @@ public void index() throws IOException {
       indexWriter.addDocument(doc);
     }
 
+    // Add documents with a fake timestamp, 3600 sec (1 hour) after "now", 
7200 sec (2
+    // hours) after "now", ...:
+    long startTime = 0;
+    // Index error messages since a week (24 * 7 = 168 hours) ago
+    for (int i = 0; i < 168; i++) {
+      long endTime = startTime + (i + 1) * 3600;
+
+      // Choose a relatively larger number, e,g., "35", in order to create 
variation in count for
+      // the top-n children, so that getTopChildren(10) in the 
searchTopChildren functionality
+      // can return children with different counts
+      for (int j = 0; j < i % 35; j++) {
+        Document doc = new Document();
+        // index document at a different timestamp by using endTime - i * j
+        doc.add(new NumericDocValuesField("error log", endTime - i * j));

Review Comment:
   Maybe "error timestamp" would be a better name?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] gsmiller commented on a diff in pull request #974: LUCENE-10614: Properly support getTopChildren in RangeFacetCounts

Reply via email to