xingyc15 opened a new issue, #8641:
URL: https://github.com/apache/pinot/issues/8641

   A Pinot segment creation failure occurred when I ran a standalone script for 
offline ingestion. The problem is that the failure did not raise an 
exception: it only printed an error, and the task was marked as succeeded. We run 
this data ingestion as an Airflow task, so the missing exception significantly 
delayed our debugging. Our error log is:
   
   > [2022-04-28 00:32:33,981] {pod_launcher.py:149} INFO - Start building 
IndexCreator!
   [2022-04-28 00:32:36,377] {pod_launcher.py:149} INFO - Failed to generate 
Pinot segment for file - 
s3://deepmap-anga-production/metrics/etl_staging/pinot_ingest/map_making_metrics/date=2022-02-22/part-0-0
   [2022-04-28 00:32:36,377] {pod_launcher.py:149} INFO - 
shaded.com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: 
was expecting closing quote for a string value
   [2022-04-28 00:32:36,377] {pod_launcher.py:149} INFO -  at [Source: 
(String)"{"extra":"[13508528,13508529,13508604,13594110,13594112,13508467,13508479,13563695,13594105,13508475,13508489,13508494,13594107,13508483,13594109]","missed":"[6900744,6900745,6900746,6900747,6900748,6900748,6900804,6900804,6900804,6900805,6900806,6901088,6901088,6901089,6901090,6901470,6908481,6908886,6911028,6911030,7647532,7647592,8062355,8062356,8062357,8062358,8062359,8062360,8062364,8062365,8062366,8091813,8091819,8091821,8091822,8091823,8091825,8091827,8091828,8091829,8091830,8091838,80918"[truncated
 1000 chars]; line: 1, column: 3001]
   [2022-04-28 00:32:36,377] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:664)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,377] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.core.json.ReaderBasedJsonParser._finishString2(ReaderBasedJsonParser.java:2051)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.core.json.ReaderBasedJsonParser._finishString(ReaderBasedJsonParser.java:2038)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:293)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:267)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:68)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.databind.ObjectReader._bindAsTree(ObjectReader.java:1770)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.databind.ObjectReader._bindAndCloseAsTree(ObjectReader.java:1735)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,378] {pod_launcher.py:149} INFO -       at 
shaded.com.fasterxml.jackson.databind.ObjectReader.readTree(ObjectReader.java:1422)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
org.apache.pinot.spi.utils.JsonUtils.stringToJsonNode(JsonUtils.java:87) 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
org.apache.pinot.segment.local.segment.creator.impl.inv.json.BaseJsonIndexCreator.add(BaseJsonIndexCreator.java:92)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
org.apache.pinot.segment.local.segment.creator.impl.SegmentColumnarIndexCreator.indexRow(SegmentColumnarIndexCreator.java:402)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:243)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
org.apache.pinot.plugin.ingestion.batch.common.SegmentGenerationTaskRunner.run(SegmentGenerationTaskRunner.java:111)
 
~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$submitSegmentGenTask$1(SegmentGenerationJobRunner.java:263)
 
~[pinot-batch-ingestion-standalone-0.8.0-shaded.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
   [2022-04-28 00:32:36,379] {pod_launcher.py:149} INFO -       at 
java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
   [2022-04-28 00:32:36,380] {pod_launcher.py:149} INFO -       at 
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
   [2022-04-28 00:32:36,380] {pod_launcher.py:149} INFO -       at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
   [2022-04-28 00:32:36,380] {pod_launcher.py:149} INFO -       at 
java.lang.Thread.run(Unknown Source) [?:?]
   [2022-04-28 00:32:36,383] {pod_launcher.py:149} INFO - Trying to create 
instance for class 
org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner
   [2022-04-28 00:32:36,383] {pod_launcher.py:149} INFO - Initializing PinotFS 
for scheme s3, classname org.apache.pinot.plugin.filesystem.S3PinotFS
   
   Here is what I found in the source 
[code](https://github.com/apache/pinot/blob/1e90f141282e40f819de806920cc2a836e0e35ba/pinot-plugins/pinot-batch-ingestion/pinot-batch-ingestion-standalone/src/main/java/org/apache/pinot/plugin/ingestion/batch/standalone/SegmentGenerationJobRunner.java#L284):
 this function does not rethrow the exception; it only prints an 
error. Can you fix this? I think it should raise an error and fail the 
process.
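
To illustrate the behavior I am asking for, here is a minimal sketch of a runner that keeps the `Future` of every submitted task and rethrows the first failure, so the caller (and therefore the process exit code) sees it. This is not Pinot's actual code: `SegmentJobSketch` and `runAll` are hypothetical names, and each task is assumed to be a plain `Callable<Void>` standing in for one segment generation task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not the real SegmentGenerationJobRunner.
final class SegmentJobSketch {

  /** Runs all tasks on a fixed pool and rethrows the first task failure. */
  static void runAll(List<Callable<Void>> tasks, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      // Keep every Future so that failures are not lost inside the pool.
      List<Future<Void>> futures = new ArrayList<>();
      for (Callable<Void> task : tasks) {
        futures.add(executor.submit(task));
      }
      for (Future<Void> future : futures) {
        try {
          future.get(); // blocks until the task finishes or fails
        } catch (ExecutionException e) {
          // Propagate the original exception as the cause instead of only logging it.
          throw new RuntimeException("Segment generation task failed", e.getCause());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new RuntimeException("Interrupted while waiting for tasks", e);
        }
      }
    } finally {
      // Stop any remaining work once we are done or have failed.
      executor.shutdownNow();
    }
  }
}
```

If the entry point lets this exception escape `main`, the JVM exits with a non-zero status, which an orchestrator such as Airflow can then report as a failed task.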
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

