Our application processes RSS feeds. Its search activity is heavily
concentrated on the most recent 24 hours, with modest searching across
the past few days, and rare (but important) searching across months or
more. So we create a Solr core for each day, and then search the
appropriate set of cores for any given date range.
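
To make "search the appropriate set of cores for any given date range" concrete, here is a minimal sketch of how a date range might be turned into per-day core names. It rests on assumptions not stated above: the class DayCores and its methods are hypothetical, the day number is taken to be days since 1970-01-01 (the excerpt further down only shows that the name is prefix + day), and the last helper assumes the cores are queried together through Solr's distributed-search "shards" parameter rather than one at a time.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps a date range onto per-day core names. */
final class DayCores {
    private DayCores() {}

    /** Assumed numbering scheme: days since 1970-01-01. */
    static int dayNumber(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    /** Core names for every day in [from, to], oldest first. */
    static List<String> coreNames(String prefix, LocalDate from, LocalDate to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("to is before from");
        }
        List<String> names = new ArrayList<>();
        for (LocalDate d = from; !d.isAfter(to); d = d.plusDays(1)) {
            names.add(prefix + dayNumber(d));
        }
        return names;
    }

    /**
     * Builds a comma-separated value for the "shards" request parameter.
     * hostAndContext is given without a scheme, e.g. "localhost:8983/solr".
     */
    static String shardsParam(String hostAndContext, List<String> coreNames) {
        List<String> shards = new ArrayList<>();
        for (String name : coreNames) {
            shards.add(hostAndContext + "/" + name);
        }
        return String.join(",", shards);
    }
}
```

With this numbering, the range 1970-01-01 through 1970-01-03 and the prefix "feeds" yields feeds0, feeds1 and feeds2. A search over the last 24 hours touches one or two names, while a multi-month search produces a long list, which is why cores are only worth opening when a query actually needs them.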
We used to pile up zillions of cores in solr.xml and open all of them
on every Solr restart. But we kept running out of things: memory, open
file descriptors, and threads. So I think I have a better solution.

Now, any time we need a core, we create it on the fly. We have
solr.xml set up to *not* persist new cores, but of course their data
directories are persistent.

So far this appears to work great in QA. I've done only limited
testing, but I believe each core that we create will either
"reconnect" to an existing data directory or create a new data
directory, as appropriate.

Anyone know of problems with this approach?

Here is some of the most important source code (using SolrJ), in case
someone else finds this approach useful, or in case someone feels
motivated to study it for problems.

Dean
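// Imports needed by this excerpt. LOGGER, ExceptionUtil, reportError(),
// SolrMisconfigurationException, solrRootUrl, solrConfigFilename and
// solrSchemaFilename are defined elsewhere in our code.
import java.io.IOException;
import java.net.MalformedURLException;
import java.util.HashSet;
import java.util.Set;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.client.solrj.response.CoreAdminResponse;

/**
 * Keeps track of the names of cores that are known to exist, so
 * we don't have to keep checking.
 */
private Set<String> knownCores = new HashSet<String>(20);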
/**
 * Returns the {@link SolrServer} for the specified {@code prefix}
 * and {@code day}, creating the core first if it does not exist yet.
 */
private SolrServer getSolrServer(String prefix, int day)
        throws SolrServerException, IOException
{
    String coreName = prefix + day;
    String serverUrl = solrRootUrl + "/" + coreName;
    try {
        makeCoreAvailable(coreName);
        return new CommonsHttpSolrServer(serverUrl);
    } catch (MalformedURLException e) {
        String message =
            "Invalid Solr server URL (misconfiguration of solrRootUrl) "
            + serverUrl + ": " + ExceptionUtil.getMessage(e);
        LOGGER.error(message, e);
        reportError();
        throw new SolrMisconfigurationException(message, e);
    }
}
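/**
 * Makes sure the named core is loaded in Solr, creating it if needed.
 * Synchronized so that concurrent callers cannot create the same core
 * twice from this JVM.
 */
private synchronized void makeCoreAvailable(String coreName)
        throws SolrServerException, IOException
{
    if (knownCores.contains(coreName)) {
        return;
    }
    if (solrCoreExists(coreName, solrRootUrl)) {
        knownCores.add(coreName);
        return;
    }
    CommonsHttpSolrServer adminServer =
        new CommonsHttpSolrServer(solrRootUrl);
    LOGGER.info("Creating new Solr core " + coreName);
    // The core name doubles as the instanceDir.
    CoreAdminRequest.createCore(coreName, coreName, adminServer,
        solrConfigFilename, solrSchemaFilename);
    LOGGER.info("Successfully created new Solr core " + coreName);
    // Remember the new core so later calls skip the status request.
    knownCores.add(coreName);
}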
/**
 * Asks the core admin handler for the status of the named core; a
 * loaded core reports an instanceDir.
 */
private static boolean solrCoreExists(String coreName, String solrRootUrl)
        throws IOException, SolrServerException
{
    CommonsHttpSolrServer adminServer =
        new CommonsHttpSolrServer(solrRootUrl);
    CoreAdminResponse status =
        CoreAdminRequest.getStatus(coreName, adminServer);
    return status.getCoreStatus(coreName).get("instanceDir") != null;
}
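
For anyone who wants to exercise the create-on-demand logic without a running Solr, one option is to put the two admin calls behind a small interface and test against a fake. This is only a sketch under assumptions: CoreAdmin and LazyCoreRegistry are hypothetical names, not part of SolrJ or of our code, and a real implementation of CoreAdmin would wrap the CoreAdminRequest calls shown above and deal with their checked exceptions.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical seam around the two core admin operations used above. */
interface CoreAdmin {
    boolean exists(String coreName);
    void create(String coreName);
}

/** Caches which cores are known to exist and creates missing ones once. */
final class LazyCoreRegistry {
    private final Set<String> knownCores = new HashSet<>();
    private final CoreAdmin admin;

    LazyCoreRegistry(CoreAdmin admin) {
        this.admin = admin;
    }

    /** After this returns, the core exists and is cached as known. */
    synchronized void makeCoreAvailable(String coreName) {
        if (knownCores.contains(coreName)) {
            return; // already verified or created earlier
        }
        if (!admin.exists(coreName)) {
            admin.create(coreName);
        }
        // Cache only after exists/create succeeded, so a failed attempt
        // is retried on the next call.
        knownCores.add(coreName);
    }
}
```

A fake CoreAdmin that counts calls is enough to check that a second request for the same core makes no admin requests at all. The cache is per JVM, so it does not notice a core that something else unloads later.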