All,
I've got a single-core Solr instance with roughly 1M small documents in it. It contains user information for fast lookups, and it gets updated any time relevant user info changes. Here's the basic info from the Core Dashboard:

  Last Modified:      less than a minute ago
  Num Docs:           1011023
  Max Doc:            1095364
  Heap Memory Usage:  -1
  Deleted Docs:       84341
  Version:            2582476
  Segment Count:      15
  Current:            Ø

  Replication (Master)
                        Version         Gen      Size
  Master (Searching):   1543329227929   491727   277.23 MB

Each document add/update operation is followed by an immediate explicit "commit", which may be unnecessary, but I mention it in case it makes any difference for this question.

I'm wondering how often it makes sense to "optimize" my index, because there is plenty of turnover of existing documents. That is, plenty of existing users update their info, so the Lucene index gets updated as well -- each update amounts to a document-delete plus a document-add. My understanding is that this leaves a lot of dead space over time, and I'm assuming it might even slow things down as the ratio of useful data to total data shrinks. Presumably, optimizing more often will also reduce the time each individual optimization takes, yes?

Anyhow, I'd like to know a few things:

1. Is manually-triggered optimization even worth doing at all?

2. If so, how often? Or perhaps not "how often [in hours/days/months]" but rather "how often [in deletes, etc.]"?

3. During an optimization, can clients still issue (read) queries? If so, will they block until the optimization has completed?

4. During an optimization, can clients still issue writes? If so, will they block until the optimization has completed?

5. Is it possible to abort an optimization that is taking too long and simply discard the new data -- that is, fall back to the previously-existing index?

6. What's a good way to trigger an optimization? I didn't see anything directly in the web UI, but there is an "optimize" method in the Solr/J client. If I could fire off a fire-and-forget "optimize" request via e.g. curl or a similar tool rather than writing a Java client, that would be slightly more convenient for me (see the sketch in the P.S. below).

Thanks,

-chris
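
P.S. For concreteness, here's roughly what I'm doing today and the kind of fire-and-forget request I have in mind for #6. This is only a sketch: the host/port, the core name ("users"), and the sample document are placeholders, and I'm assuming optimize=true / waitSearcher=false on the /update handler behave the way I've read.

  # What I do now, roughly: add/update a document and commit immediately.
  # The JSON body here is just an illustrative placeholder document.
  curl "http://localhost:8983/solr/users/update?commit=true" \
       -H "Content-Type: application/json" \
       -d '[{"id":"12345","name":"example user"}]'

  # What I'd like for #6: kick off an optimize and return right away.
  # waitSearcher=false should (I believe) make the call return before the
  # new searcher is opened, rather than blocking until the merge finishes.
  curl "http://localhost:8983/solr/users/update?optimize=true&waitSearcher=false"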