Hi, I would like to ask about (and confirm) one problem with creating an index using map-reduce (Hadoop contrib, HADOOP-2951).
HADOOP-2951 implements index creation/updating using map-reduce: https://issues.apache.org/jira/browse/HADOOP-2951 However, that package uses Lucene 2.3. I upgraded to Lucene 3.2 and modified some of the source code to match the new Lucene 3.2 classes and methods. Now I am facing a problem with the term-vector files (tvd, tvf, tvx).

*These three term-vector files (tvd, tvf, tvx) are not created correctly when segments are merged.*

The maximum number of segments can be set in the configuration file. If the value is not set, one index segment is created per document. For example, with 12 documents as input, the output index contains 10 segments, and in that case there is no problem. But if the maximum number of segments is set to 5, for example, the result is of course 5 segments. Retrieval still works fine, but in some of the segments that were merged from several documents, *these three files are 0 bytes when checked with Luke.*

I have checked all of my Lucene 3.2-related modifications and cannot find the cause. Has anybody faced the same issue, or can anyone give me some advice?

Thanks in advance.

Yali Hu

--
View this message in context: http://lucene.472066.n3.nabble.com/The-question-about-create-index-using-Map-Reduce-Hadoop-contrib-2951-tp3274205p3274205.html
Sent from the Lucene - General mailing list archive at Nabble.com.
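(Editor's note: for readers following along, a minimal sketch of the two Lucene 3.2 API points involved here. This is my own reconstruction, not the poster's code, and the index contents are illustrative: in Lucene 3.x, term vectors are only written for fields that explicitly request them via `Field.TermVector`, and the "max num of segments" behavior corresponds to `IndexWriter.optimize(int maxNumSegments)`. A field added without a term-vector option would leave the tvd/tvf/tvx data empty for that field after a merge, which is one place worth checking in the ported contrib code.)

```java
// Sketch, assuming Lucene 3.2 on the classpath.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class TermVectorSketch {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriterConfig cfg = new IndexWriterConfig(
                Version.LUCENE_32, new StandardAnalyzer(Version.LUCENE_32));
        IndexWriter writer = new IndexWriter(dir, cfg);

        Document doc = new Document();
        // Term vectors are written only when requested per field; with
        // Field.TermVector.NO (or the shorter constructors that default to it),
        // nothing goes into the tvd/tvf/tvx files for this field.
        doc.add(new Field("content", "some example text",
                Field.Store.NO, Field.Index.ANALYZED,
                Field.TermVector.WITH_POSITIONS_OFFSETS));
        writer.addDocument(doc);

        // The rough equivalent of the "max num of segments" setting:
        // merge the index down to at most 5 segments.
        writer.optimize(5);
        writer.close();
    }
}
```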
