jpountz commented on pull request #101:
URL: https://github.com/apache/lucene/pull/101#issuecomment-837909869


   Actually, you don't need `nightlyBench.py`; the standard Python script is 
enough. I think the following should work for testing on larger documents:
    - Download https://home.apache.org/~mikemccand/enwiki-20130102-lines.txt.lzma and put it in the same directory as the other data files, such as `enwiki-20120502-lines-1k.txt`.
    - Decompress it with `unlzma enwiki-20130102-lines.txt.lzma`.
    - Add the following to your `localconstants.py`:
   
   ```
   WIKI_BIG_DOCS_LINE_FILE = '%s/data/enwiki-20130102-lines.txt' % BASE_DIR
   WIKI_BIG_DOCS_COUNT = 6647577
   WIKI_BIG_TASKS_FILE = '%s/tasks/wikinightly.tasks' % BENCH_BASE_DIR
   ```
    - Run the benchmark with `-source wikibig10k` to make sure everything works.
    - Finally, run the benchmark with `-source wikibigall`.
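
If `unlzma` is not available on your machine, the decompression step can probably be done from Python instead. This is only a minimal sketch using the standard-library `lzma` module. The helper name `decompress_lzma` is made up for illustration and is not part of luceneutil. Unlike `unlzma`, it leaves the compressed file in place.

```python
import lzma
import shutil
from pathlib import Path


def decompress_lzma(src, dest=None):
    """Decompress `src` (a .lzma file) and return the path of the output file.

    Hypothetical helper, not part of luceneutil. By default the output is
    written next to the input with the trailing `.lzma` suffix removed.
    """
    src = Path(src)
    dest = Path(dest) if dest is not None else src.with_suffix("")
    # lzma.open auto-detects the container, so both legacy .lzma and .xz work.
    with lzma.open(src, "rb") as fin, open(dest, "wb") as fout:
        # Stream in 1 MiB chunks so the large line file never sits in memory.
        shutil.copyfileobj(fin, fout, length=1024 * 1024)
    return dest
```

For example, `decompress_lzma('/path/to/data/enwiki-20130102-lines.txt.lzma')` (adjust the path to your own data directory) should write `enwiki-20130102-lines.txt` into the same directory, which is where `WIKI_BIG_DOCS_LINE_FILE` above expects it.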
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



