On 7/7/2015 10:51 AM, Steven White wrote:
> What I am faced with is this. I have to create my own crawler, similar to
> DIH. I have to deploy this on the same server as Solr (this is given, I
> cannot change it). I have to manage this crawler just like I have to
> manage my Solr deployment using Solr API through HTTP request. I figured
> if I deploy my application under Jetty, with Solr, then problem is solved.

At some point in the future, Jetty is expected to go away, with Solr
becoming a true standalone application. There is no set timeframe for
this to happen. It will hopefully happen before 6.0, but the work needs
to be *started* before any kind of guess can be made.

> The other option I looked at is writing my own handler for my crawler and
> plugging it into Solr's solrconfig.xml. If I do this, then my crawler will
> run in the same JVM space as Solr, this is something I want to avoid.

If you install another webapp into the same Jetty as Solr, then it will
be running in the same JVM as Solr. Jetty is the application that the
JVM runs, not Solr. This is not very different from a handler in
solrconfig.xml.

> Yet another option is for me deploy a second instance of Jetty on the Solr
> server just for my crawler. This is over kill in my opinion.
>
> What do folks think about this and what's the best way to approach this
> issue? Deploy my crawler on a separate server is not an option and for my
> use case Solr will be used in a lightweight so there is plenty of CPU / RAM
> on this one server to host Solr and my crawler.

As you've already been told, it's a very strong recommendation that you
treat Solr as a standalone application and forget that it's running in a
standard servlet container. That means that any other webapps, like the
crawler you mention, should be installed completely separately.

In my previous reply, I told you how you *could* install another
application into the Jetty included with Solr, but we don't recommend
it, because eventually you won't have that option.

Thanks,
Shawn
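
To illustrate the "installed completely separately" approach, here is a minimal sketch of a crawler that runs in its own process (and so its own JVM-independent runtime) and talks to Solr only through its HTTP update handler. This is an illustration under assumptions, not something from the thread: the base URL uses Solr's default port 8983, and the core name "mycore", the function names, and the example field names are all hypothetical.

```python
import json
import urllib.request

# Assumptions: Solr listens on its default port on the same host,
# and "mycore" is a hypothetical core name. Adjust both as needed.
SOLR_BASE_URL = "http://localhost:8983/solr"
CORE_NAME = "mycore"


def build_update_request(docs, base_url=SOLR_BASE_URL, core=CORE_NAME, commit=True):
    """Build (but do not send) a JSON update request for Solr's /update handler.

    `docs` is a list of dicts, one per document. The field names must match
    whatever schema the target core actually uses.
    """
    url = f"{base_url}/{core}/update"
    if commit:
        # Ask Solr to commit right away so the documents become searchable.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_solr(docs, timeout=10):
    """Send crawled documents to Solr and return the HTTP status code."""
    request = build_update_request(docs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A crawler built this way would call something like send_to_solr([{"id": "doc-1", "title": "Example page"}]) after fetching each batch. Because the only coupling is the HTTP endpoint, the crawler should keep working unchanged if Solr later stops shipping inside Jetty.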