richardstartin opened a new pull request #8457: URL: https://github.com/apache/pinot/pull/8457
This is a straw man proposal for a tracing API which allows detailed capture of operator statistics, far beyond execution time. The tracing SPI delegates to a default tracing implementation, which can be overridden at startup.

`Tracer` has two operations:

1. Register the `requestId` if tracing is enabled. The tracing implementation is responsible for propagating this to query threads. The default implementation is privileged in its use of `TraceCallable` and `TraceRunnable`, but third-party implementations will use class transformation to add a `requestId` field to `FutureTask`. The implementation is also responsible for maintaining lineage between parent and child spans, stack maintenance, etc.
2. Start an operator span. Operator spans are closable, and are completed when closed.

`OperatorInvocationTrace` will be passed into block evaluation, and various fields can be recorded into it:

```java
public interface OperatorInvocationTrace {

  /**
   * Sets the class of the operator. This allows various class-level properties
   * to be interrogated and cached in a {@link ClassValue}.
   * @param operator the class of the operator
   */
  void setOperatorClass(Class<?> operator);

  /**
   * Sets the number of docs scanned by the operator.
   * @param docsScanned how many docs were scanned.
   */
  void setDocsScanned(long docsScanned);

  /**
   * Sets the number of bytes scanned by the operator if this is possible to compute.
   * @param bytesScanned the number of bytes scanned
   */
  void setBytesProcessed(long bytesScanned);

  /**
   * If the operator is a filter, determines the filter type (scan or index) and the predicate type
   * @param filterType SCAN or INDEX
   * @param predicateType e.g. BETWEEN, REGEXP_LIKE
   */
  void setFilterType(FilterType filterType, String predicateType);

  /**
   * The phase of the query
   * @param phase the phase
   */
  void setPhase(Phase phase);

  /**
   * Records whether type transformation took place during the operator's invocation and what the types were
   * @param inputDataType the input data type
   * @param outputDataType the output data type
   */
  void setDataTypes(FieldSpec.DataType inputDataType, FieldSpec.DataType outputDataType);

  /**
   * Records the range of docIds during the operator invocation. This is useful for implicating a range of records
   * in a slow operator invocation.
   * @param firstDocId the first docId in the block
   * @param lastDocId the last docId in the block
   */
  void setDocIdRange(int firstDocId, int lastDocId);

  /**
   * If known, record the cardinality of the column within the segment this operator is invoked on
   * @param cardinality the number of distinct values
   */
  void setColumnCardinality(int cardinality);
}
```

The default implementation records none of these. Operator implementations will need to be modified to record values into the `OperatorInvocationTrace`. Dead code elimination is relied upon to eliminate overhead when these values are written to default implementations of `OperatorInvocationTrace`. The operator implementations will not be modified until this SPI is agreed upon.

The `Tracer` does not need to attach trace information to the output request, and where the trace information goes is implementation defined; for example, it may be written to a file or to an in-memory circular buffer which can be dumped on demand.
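To make the intended usage more concrete, here is a minimal sketch of how the two `Tracer` operations and a closable operator span might fit together. Only `OperatorInvocationTrace` and the two setters shown come from the proposal. Everything else is an assumption made for illustration: the `Tracer` method names `register` and `beginOperatorInvocation`, having the trace extend `AutoCloseable`, the `NoOpTracer` and `RecordingTracer` classes, and the `ScanSketch` operator are all hypothetical. Propagation of the `requestId` to query threads is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed to two of the proposed setters. Extending AutoCloseable is an
// assumption: the proposal only says that operator spans are closable.
interface OperatorInvocationTrace extends AutoCloseable {
  void setOperatorClass(Class<?> operator);
  void setDocsScanned(long docsScanned);
  @Override
  void close(); // completes the span; declared without checked exceptions
}

// Hypothetical method names for the two Tracer operations.
interface Tracer {
  void register(long requestId);
  OperatorInvocationTrace beginOperatorInvocation();
}

// Records nothing, in the spirit of the default implementation.
final class NoOpTracer implements Tracer {
  private static final OperatorInvocationTrace NO_OP = new OperatorInvocationTrace() {
    @Override public void setOperatorClass(Class<?> operator) { }
    @Override public void setDocsScanned(long docsScanned) { }
    @Override public void close() { }
  };
  @Override public void register(long requestId) { }
  @Override public OperatorInvocationTrace beginOperatorInvocation() { return NO_OP; }
}

// Hypothetical single-threaded implementation that keeps completed spans in memory.
final class RecordingTracer implements Tracer {
  final List<String> completedSpans = new ArrayList<>();
  private long requestId = -1;

  @Override public void register(long requestId) { this.requestId = requestId; }

  @Override public OperatorInvocationTrace beginOperatorInvocation() {
    return new OperatorInvocationTrace() {
      private String operatorName = "unknown";
      private long docs;
      @Override public void setOperatorClass(Class<?> operator) {
        operatorName = operator.getSimpleName();
      }
      @Override public void setDocsScanned(long docsScanned) { docs = docsScanned; }
      @Override public void close() {
        // The span is completed, and only then recorded, when it is closed.
        completedSpans.add(requestId + ":" + operatorName + ":docs=" + docs);
      }
    };
  }
}

// Hypothetical operator: counts values above a threshold and records into the trace.
final class ScanSketch {
  static int countMatches(int[] values, int threshold, Tracer tracer) {
    try (OperatorInvocationTrace trace = tracer.beginOperatorInvocation()) {
      trace.setOperatorClass(ScanSketch.class);
      int matches = 0;
      for (int value : values) {
        if (value > threshold) {
          matches++;
        }
      }
      trace.setDocsScanned(values.length);
      return matches;
    }
  }
}
```

With try-with-resources the span is closed on every exit path, including exceptions, so an operator cannot forget to complete it. The operator code is identical whichever tracer is installed, which is what would let a no-op implementation be optimized away.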