What servlet container are you putting your Solr in? Jetty? Tomcat? Something else? Are you fronting it with Apache on top of that? (I think maybe you are; otherwise I'm not sure how the phrase 'virtual host' applies.)
In general, Solr of course doesn't care what directory it's in on disk, so long as the process running Solr has the necessary read/write permissions to the necessary directories (and if it doesn't, you'd usually find out right away with an error message). Clients of Solr don't care what directory it's in on disk either; they only care that they can get to it by connecting to a certain port at a certain hostname. If they can't get to it on that port at that hostname, that's something you'd discover right away, not something that would be intermittent. But I'm not familiar with Nutch, so you may want to try connecting manually to the port you have Solr running on (the hostname/port you have told Nutch to find Solr on?), and just make sure it is connectable.

I can't think of any reason that the directory you have Solr in could cause CPU utilization issues. I think it's got nothing to do with that.

If it's Nutch that's taking 100% of your CPU, you might want to find some Nutch experts to ask. Perhaps there's a Nutch listserv? I am also not familiar with Hadoop; you mention just in passing that you're using Hadoop too, and maybe that's an added complication, I don't know. One obvious reason Nutch could be taking 100% CPU would be simply that you've asked it to do a lot of work quickly, and it's trying to.

One reason I have seen Solr take 100% of CPU and become unresponsive is when the Solr process gets caught up in terrible Java garbage collection. If that's what's happening, then giving the Solr JVM a higher maximum heap size can sometimes help (although, confusingly, I've seen people suggest that giving the Solr JVM too MUCH heap can also result in long GC pauses), and if you have a multi-core/multi-CPU machine, I've found the JVM argument -XX:+UseConcMarkSweepGC to be very helpful.

Other than that, it sounds to me like you've got a Nutch/Hadoop issue, not a Solr issue.
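For the manual connectivity check mentioned above, here is a minimal Python sketch, nothing more. It only tests whether a TCP connection to a host/port can be opened at all. The host and port in it (localhost, 8983) are assumptions on my part; 8983 is what Solr's example Jetty setup normally listens on, so substitute whatever hostname/port you have actually told Nutch to use. The function name can_connect is just something I made up for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumed values: replace with the host/port Nutch is configured to use.
    host, port = "localhost", 8983
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

If this reports the port as reachable every time you run it, then connectivity to Solr is probably not your problem. It says nothing about whether Solr is answering queries correctly, only that something is accepting connections on that port.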
________________________________________
From: Eric Martin [e...@makethembite.com]
Sent: Sunday, October 31, 2010 7:16 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr in virtual host as opposed to /lib

Hi,

Thank you. This is more than idle curiosity. I am trying to debug an issue I am having with my installation and this is one step in verifying that I have a setup that does not consume resources. I am trying to debunk my internal myth that having Solr and Nutch in a virtual host would be causing these issues. Here is the main issue that involves Nutch/Solr and Drupal:

/home/mootlaw/lib/solr
/home/mootlaw/lib/nutch
/home/mootlaw/www/<Drupal site>

I'm running a 1333 FSB Dual Socket Xeon 5500 Series @ 2.4ghz, Enterprise Linux - x86_64 - OS, 12 Gig RAM. My Solr and Nutch are running. I am using jetty for my Solr. My server is not rooted. Nutch is using 100% of my cpus. I see this in my CPU utilization in my whm:

/usr/bin/java -Xmx1000m -Dhadoop.log.dir=/home/mootlaw/lib/nutch/logs -Dhadoop.log.file=hadoop.log -Djava.library.path=/home/mootlaw/lib/nutch/lib/native/Linux-amd64-64 -classpath /home/mootlaw/lib/nutch/conf:/usr/lib/tools.jar:/home/mootlaw/lib/nutch/build:/home/mootlaw/lib/nutch/build/test/classes:/home/mootlaw/lib/nutch/build/nutch-1.2.job:/home/mootlaw/lib/nutch/nutch-*.job:/home/mootlaw/lib/nutch/lib/apache-solr-core-1.4.0.jar:/home/mootlaw/lib/nutch/lib/apache-solr-solrj-1.4.0.jar:/home/mootlaw/lib/nutch/lib/commons-beanutils-1.8.0.jar:/home/mootlaw/lib/nutch/lib/commons-cli-1.2.jar:/home/mootlaw/lib/nutch/lib/commons-codec-1.3.jar:/home/mootlaw/lib/nutch/lib/commons-collections-3.2.1.jar:/home/mootlaw/lib/nutch/lib/commons-el-1.0.jar:/home/mootlaw/lib/nutch/lib/commons-httpclient-3.1.jar:/home/mootlaw/lib/nutch/lib/commons-io-1.4.jar:/home/mootlaw/lib/nutch/lib/commons-lang-2.1.jar:/home/mootlaw/lib/nutch/lib/commons-logging-1.0.4.jar:/home/mootlaw/lib/nutch/lib/commons-logging-api-1.0.4.jar:/home/mootlaw/lib/nutch/lib/commons-net-1.4.1.jar:/home/mootlaw/lib/nutch/lib/core-3.1.1.jar:/home/mootlaw/lib/nutch/lib/geronimo-stax-api_1.0_spec-1.0.1.jar:/home/mootlaw/lib/nutch/lib/hadoop-0.20.2-core.jar:/home/mootlaw/lib/nutch/lib/hadoop-0.20.2-tools.jar:/home/mootlaw/lib/nutch/lib/hsqldb-1.8.0.10.jar:/home/mootlaw/lib/nutch/lib/icu4j-4_0_1.jar:/home/mootlaw/lib/nutch/lib/jakarta-oro-2.0.8.jar:/home/mootlaw/lib/nutch/lib/jasper-compiler-5.5.12.jar:/home/mootlaw/lib/nutch/lib/jasper-runtime-5.5.12.jar:/home/mootlaw/lib/nutch/lib/jcl-over-slf4j-1.5.5.jar:/home/mootlaw/lib/nutch/lib/jets3t-0.6.1.jar:/home/mootlaw/lib/nutch/lib/jetty-6.1.14.jar:/home/mootlaw/lib/nutch/lib/jetty-util-6.1.14.jar:/home/mootlaw/lib/nutch/lib/junit-3.8.1.jar:/home/mootlaw/lib/nutch/lib/kfs-0.2.2.jar:/home/mootlaw/lib/nutch/lib/log4j-1.2.15.jar:/home/mootlaw/lib/nutch/lib/lucene-core-3.0.1.jar:/home/mootlaw/lib/nutch/lib/lucene-misc-3.0.1.jar:/home/mootlaw/lib/nutch/lib/oro-2.0.8.jar:/home/mootlaw/lib/nutch/lib/resolver.jar:/home/mootlaw/lib/nutch/lib/serializer.jar:/home/mootlaw/lib/nutch/lib/servlet-api-2.5-6.1.14.jar:/home/mootlaw/lib/nutch/lib/slf4j-api-1.5.5.jar:/home/mootlaw/lib/nutch/lib/slf4j-log4j12-1.4.3.jar:/home/mootlaw/lib/nutch/lib/taglibs-i18n.jar:/home/mootlaw/lib/nutch/lib/tika-core-0.7.jar:/home/mootlaw/lib/nutch/lib/wstx-asl-3.2.7.jar:/home/mootlaw/lib/nutch/lib/xercesImpl.jar:/home/mootlaw/lib/nutch/lib/xml-apis.jar:/home/mootlaw/lib/nutch/lib/xmlenc-0.52.jar:/home/mootlaw/lib/nutch/lib/jsp-2.1/jsp-2.1.jar:/home/mootlaw/lib/nutch/lib/jsp-2.1/jsp-api-2.1.jar org.apache.nutch.fetcher.Fetcher /home/mootlaw/lib/nutch/crawl/segments/20101031144443 -threads 50

My PIDS cannot be traced and my mem usage is at 5%

My hadoop logs show:

2010-10-31 15:44:11,040 INFO fetcher.Fetcher - fetching http://caselaw.findlaw.com/us-5th-circuit/1454354.html
2010-10-31 15:44:11,294 INFO fetcher.Fetcher - fetching http://www.dallastxcriminaldefenseattorney.com/atom.xml
2010-10-31 15:44:11,337 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=48, fetchQueues.totalSize=2499
2010-10-31 15:44:12,339 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2500
2010-10-31 15:44:13,341 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2500
2010-10-31 15:44:14,344 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2500
2010-10-31 15:44:15,346 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2500
2010-10-31 15:44:16,349 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2500
2010-10-31 15:44:16,568 INFO fetcher.Fetcher - fetching http://caselaw.findlaw.com/il-court-of-appeals/1542438.html
2010-10-31 15:44:17,308 INFO fetcher.Fetcher - fetching http://lcweb2.loc.gov/const/const.html
2010-10-31 15:44:17,352 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=2499
2010-10-31 15:44:18,354 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=2500
2010-10-31 15:44:19,356 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=2500
2010-10-31 15:44:20,358 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=2500
2010-10-31 15:44:21,360 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=2500

Can anyone help me out? Did I miss something? Should I be using Tomcat? One interesting part of this is when I try and change the nutch setting post url and urls by score to 1 they stay at 10 no matter what I do.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Sunday, October 31, 2010 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr in virtual host as opposed to /lib

Can you expand on your question? Are you having a problem? Is this idle curiosity? Because I have no idea how to respond when there is so little information.

Best
Erick

On Sun, Oct 31, 2010 at 5:32 PM, Eric Martin <e...@makethembite.com> wrote:

> Is there an issue running Solr in /home/lib as opposed to running it
> somewhere outside of the virtual hosts like /lib?
>
> Eric