sajjad-moradi opened a new issue #6562:
URL: https://github.com/apache/incubator-pinot/issues/6562


   When a consuming segment completes, all the files in its consuming directory 
are supposed to be deleted. However, for realtime tables with a text index, the 
cleanup is incomplete: for each segment, one file remains undeleted.
   I dug into the code a bit and I believe I found the root cause.
   In the “consuming” -> “online” transition on the server, we first delete the 
consuming segment directory and then release the segment:
   
![image](https://user-images.githubusercontent.com/8548220/107400967-3bd05300-6ab7-11eb-8e97-3dfd16ff65b0.png)
   The text index uses a file named “write.lock” as its lock. The lock is 
released when the text index itself gets released.
   So when we try to delete the consuming segment directory in the CONSUMING to 
ONLINE transition, the lock file is still in use and not yet released, and all 
files in the segment directory get deleted except the lock file.
   That’s why you see the following on a server hosting a realtime table with 
a text index enabled:
   
![image](https://user-images.githubusercontent.com/8548220/107401160-73d79600-6ab7-11eb-95d2-d2ccb01f36d6.png)
   Those lock files are all zero bytes in size:
   ```bash
   [smoradi@lva1-app0232 ~]$ ll /export/content/data/pinot-server/i001/pinot/consumerDir/RealtimeFeatureTest1_REALTIME/RealtimeFeatureTest1__0__101__20201229T1103Z/textDim1.lucene.index/
   total 0
   -rw-rw-r-- 1 app app 0 Dec 29 11:03 write.lock
   ```
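
The ordering problem described above suggests one possible direction: release whatever holds `write.lock` first, and only then delete the consuming directory. The following is a minimal, self-contained sketch of that ordering using plain `java.nio`. It is not Pinot's or Lucene's actual code: `LockedIndex`, `deleteRecursively`, and `releaseThenDelete` are hypothetical names introduced only for illustration, and the lock is an ordinary `FileLock` standing in for the text index's lock.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ReleaseThenDelete {

    /** Hypothetical stand-in for an index that holds "write.lock" while it is open. */
    static final class LockedIndex implements AutoCloseable {
        private final FileChannel channel;
        private final FileLock lock;

        LockedIndex(Path indexDir) throws IOException {
            Files.createDirectories(indexDir);
            // Create the lock file and take an exclusive lock on it.
            channel = FileChannel.open(indexDir.resolve("write.lock"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            lock = channel.lock();
        }

        @Override
        public void close() throws IOException {
            // Release the lock before closing the underlying channel.
            lock.release();
            channel.close();
        }
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order puts the deepest entries first.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
    }

    /** Releases the index (and its lock) first, then removes the segment directory. */
    static boolean releaseThenDelete(Path segmentDir) throws IOException {
        // Directory name borrowed from the listing above, purely as an example.
        LockedIndex index = new LockedIndex(segmentDir.resolve("textDim1.lucene.index"));
        index.close();                 // step 1: release the lock holder
        deleteRecursively(segmentDir); // step 2: delete the directory
        return !Files.exists(segmentDir);
    }

    public static void main(String[] args) throws IOException {
        Path segmentDir = Files.createTempDirectory("consumingSegment");
        System.out.println("directory fully removed: " + releaseThenDelete(segmentDir));
    }
}
```

Whether swapping the two steps is safe in the real CONSUMING to ONLINE transition depends on what else the release does, so treat this only as an illustration of the ordering, not as a proposed patch.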




