tang-hi commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1553253776
> I tried running luceneutil before/after this change using this command:
>
> ```
> comp = competition.Competition()
>
> index = comp.newIndex('baseline', sourceData,
>                       vectorFile=constants.GLOVE_VECTOR_DOCS_FILE,
>                       vectorDimension=100,
>                       vectorEncoding='FLOAT32')
>
> comp.competitor('baseline', 'baseline',
>                 vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
>                 index = index, concurrentSearches = concurrentSearches)
>
> comp.competitor('candidate', 'candidate',
>                 vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
>                 index = index, concurrentSearches = concurrentSearches)
>
> comp.benchmark("baseline_vs_candidate")
> ```
>
> and I get this error:
>
> ```
>   File "src/python/vector-test.py", line 65, in <module>
>     comp.benchmark("baseline_vs_candidate")
>   File "/local/home/sokolovm/workspace/lbench/luceneutil/src/python/competition.py", line 510, in benchmark
>     searchBench.run(id, base, challenger,
>   File "/local/home/sokolovm/workspace/lbench/luceneutil/src/python/searchBench.py", line 196, in run
>     raise RuntimeError('errors occurred: %s' % str(cmpDiffs))
> RuntimeError: errors occurred: ([], ["query=KnnFloatVectorQuery:vector[0.0223385,...][100] filter=None sort=None groupField=None hitCount=100: hit 51 has wrong field/score value ([994765], '0.9567487') vs ([824922], '0.9567554')", "query=KnnFloatVectorQuery:vector[-0.061654933,...][100] filter=None sort=None groupField=None hitCount=100: hit 16 has wrong field/score value ([813187], '0.87024546') vs ([134050], '0.8707979')", "query=KnnFloatVectorQuery:vector[-0.111742884,...][100] filter=None sort=None groupField=None hitCount=100: hit 27 has wrong field/score value ([724125], '0.8874463') vs ([817731], '0.88757277')"], 1.0)
> ```
>
> maybe it's expected that we changed the results? I think this is what Mike M ran into with the nightly benchmarks

@msokolov I tried to run luceneutil with the command you provided, but it throws this exception:

````
Exception in thread "main" java.lang.IllegalArgumentException: facetDim Date was not indexed
	at perf.TaskParser$TaskBuilder.parseFacets(TaskParser.java:289)
	at perf.TaskParser$TaskBuilder.buildQueryTask(TaskParser.java:154)
	at perf.TaskParser$TaskBuilder.build(TaskParser.java:147)
	at perf.TaskParser.parseOneTask(TaskParser.java:108)
	at perf.LocalTaskSource.loadTasks(LocalTaskSource.java:169)
	at perf.LocalTaskSource.<init>(LocalTaskSource.java:48)
	at perf.SearchPerfTest._main(SearchPerfTest.java:543)
	at perf.SearchPerfTest.main(SearchPerfTest.java:133)
````

My vector-test.py looks like this:

````python
#!/usr/bin/env python

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import competition
import sys
import constants

# simple example that runs benchmark with WIKI_MEDIUM source and task files
# Baseline here is ../lucene_baseline versus ../lucene_candidate
if __name__ == '__main__':
  #sourceData = competition.sourceData('wikivector1m')
  #sourceData = competition.sourceData('wikivector10k')
  sourceData = competition.sourceData('wikimedium10k')

  comp = competition.Competition(verifyScores=False)

  index = comp.newIndex('baseline', sourceData,
                        vectorFile=constants.GLOVE_VECTOR_DOCS_FILE,
                        vectorDimension=100,
                        vectorEncoding='FLOAT32')

  # Warning -- Do not break the order of arguments
  # TODO -- Fix the following by using argparser
  concurrentSearches = True

  # create a competitor named baseline with sources in the ../trunk folder
  comp.competitor('baseline', 'baseline',
                  vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
                  index=index, concurrentSearches=concurrentSearches)

  comp.competitor('candidate', 'candidate',
                  vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
                  index=index, concurrentSearches=concurrentSearches)
  # use a different index
  # create a competitor named my_modified_version with sources in the ../patch folder
  # note that we haven't specified an index here, luceneutil will automatically use the
  # index from the base competitor for searching, while the codec that is used for
  # running this competitor is taken from this competitor.

  # start the benchmark - this can take long depending on your index and machines
  comp.benchmark("baseline_vs_candidate")
````

Could you tell me how I can fix that?
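
In case it helps narrow things down: my guess from the stack trace is that the `wikimedium10k` tasks file contains facet tasks referencing a `Date` facet dimension, while `newIndex` above doesn't configure any facet fields. Below is a rough sketch of what I imagine the index setup might need; the `facets` argument and the `('taxonomy:Date', 'Date')` tuple are assumptions on my part (modeled on luceneutil's `localrun.py` example), so the exact form may well be wrong:

````python
# Hypothetical sketch, not verified: assumes comp.newIndex() accepts a
# 'facets' argument (as in luceneutil's localrun.py example) and that the
# tasks file only needs the 'Date' taxonomy dimension.
index = comp.newIndex('baseline', sourceData,
                      facets=(('taxonomy:Date', 'Date'),),
                      vectorFile=constants.GLOVE_VECTOR_DOCS_FILE,
                      vectorDimension=100,
                      vectorEncoding='FLOAT32')
````

Alternatively, switching back to one of the commented-out `wikivector*` sources might avoid the facet tasks entirely, but I'm not sure which setup you intended.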