stym06 opened a new issue, #8460: URL: https://github.com/apache/pinot/issues/8460
Hey guys, I've been trying to ingest data stored on S3 in ORC format using the Pinot ingestor with the below command:

`./pinot-admin.sh LaunchDataIngestionJob -jobSpecFile batch-job-standalone-spec.yaml`

### Ingestion job spec

```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentMetadataPushJobRunner'
jobType: SegmentCreationAndMetadataPush
inputDirURI: 's3://test-bucket/dev/pinot-input-new/'
outputDirURI: 's3://test-bucket/dev/pinot/axon_entity.db/segments-v2'
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: ap-southeast-1
recordReaderSpec:
  dataFormat: 'orc'
  className: 'org.apache.pinot.plugin.inputformat.orc.ORCRecordReader'
tableSpec:
  tableName: 'user_base_fact'
  schemaURI: 'http://localhost:9000/tables/user_base_fact/schema'
  tableConfigURI: 'http://localhost:9000/tables/user_base_fact'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
pushJobSpec:
  pushParallelism: 2
  pushAttempts: 2
  pushRetryIntervalMillis: 1000
```

The job is able to complete but leads to all null values in the Pinot table:

<img width="1335" alt="Screenshot 2022-04-04 at 3 13 38 PM" src="https://user-images.githubusercontent.com/20970728/161518329-3fa4f1c0-cced-4294-bbcd-0ff2382a3a3a.png">