Hi, I have 100,000 indexes on a Hadoop grid. About 90% of them are inactive at any given time, so I can distribute the active ones across machines based on load. Per-index scoring also works better this way, but that is not my concern right now.
What optimisations do I need in order to scale better?

1. Commits: today I commit after every write. Should I instead keep the IndexWriter of each active index open and commit periodically, with a write-ahead log of my own to recover from failures?

2. Updates: update calls dominate the load (around 80%). For each update I read the document's stored fields and reindex the document with the new value.

3. Stored-field compression: I currently avoid stored-field compression, because reading a single document forces decompressing a whole block of documents. Given the read-modify-write pattern above, should I reconsider compression?

4. Scale: hundreds of indexes will be active at a time on a single machine with 16 GB RAM. Should I switch to a shard-based architecture? I can see some benefits there: more batching, and fewer threads loading the system at once. What other benefits would it give me?

Rough, untested sketches of what I mean in points 1-3 follow at the end of this mail.

Please share your ideas, or any links on running Lucene in a multi-user environment.
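For point 1, this is roughly what I have in mind: one long-lived writer per active index plus a background commit. A minimal, untested sketch; the index path and the 30-second interval are placeholders, and the WAL itself would be a layer I build on top, since Lucene only gives me the commit point to replay from:

import java.io.IOException;
import java.nio.file.Paths;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class PeriodicCommit {
    public static void main(String[] args) throws IOException {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        // one long-lived writer per active index, instead of open/commit/close per call
        IndexWriter writer = new IndexWriter(FSDirectory.open(Paths.get("/data/idx-42")), cfg);

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // durable point every 30s; writes after it would be replayed from my WAL
                writer.commit();
            } catch (IOException e) {
                e.printStackTrace(); // real code: stop acking writes for this index
            }
        }, 30, 30, TimeUnit.SECONDS);
    }
}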
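For point 2, my read-modify-write per update looks roughly like this. A sketch assuming Lucene's near-real-time reader opened from the writer; the field names "id" and "counter" and the updateCounter helper are made-up for illustration (and IndexSearcher.doc is deprecated in newer Lucene in favour of storedFields()):

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;

public class ReadModifyWrite {
    static void updateCounter(IndexWriter writer, String id, long newValue) throws IOException {
        // NRT reader from the open writer, so I see my own uncommitted updates
        try (DirectoryReader reader = DirectoryReader.open(writer)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            TopDocs hits = searcher.search(new TermQuery(new Term("id", id)), 1);
            if (hits.scoreDocs.length == 0) {
                return; // nothing to update
            }
            Document old = searcher.doc(hits.scoreDocs[0].doc); // the stored-fields read

            Document doc = new Document();
            doc.add(new StringField("id", id, Field.Store.YES));
            doc.add(new StoredField("counter", newValue));
            // ...carry the remaining stored fields over from `old` here...

            writer.updateDocument(new Term("id", id), doc); // atomic delete-by-term + add
        }
    }
}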
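For point 3, as far as I understand the trade-off is exposed through the codec's stored-fields mode rather than an on/off switch. A sketch assuming a Lucene 8.7-style codec; the codec class name tracks the Lucene version (older releases expose the same choice through Lucene50StoredFieldsFormat.Mode):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.lucene87.Lucene87Codec;
import org.apache.lucene.index.IndexWriterConfig;

public class StoredFieldsTradeoff {
    static IndexWriterConfig configFor(boolean updateHeavy) {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        // BEST_SPEED: LZ4, smaller blocks, cheap per-document decompression
        // BEST_COMPRESSION: DEFLATE, larger blocks, smaller index but costlier reads
        cfg.setCodec(new Lucene87Codec(updateHeavy
                ? Lucene87Codec.Mode.BEST_SPEED
                : Lucene87Codec.Mode.BEST_COMPRESSION));
        return cfg;
    }
}

Given my 80% update load, my instinct is BEST_SPEED, but I would like to hear whether the block-decompression cost is really significant in practice.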
Regards,
Suriya