SolrIndexWriter holding reference to deleted file?

2007-12-20 Thread amamare

I have an application consisting of three web applications running on JBoss
1.4.2 on a Red Hat Linux server. I'm using Solr/Lucene in embedded mode to
create and maintain a frequently updated index. Once updated, the index is
copied to another directory used for searching, and the old index files in
the search directory are then deleted. The streams used to copy the files are
closed in finally blocks. After a few days an IOException occurs because of
"too many open files". When I run the Linux command

ls -l /proc/26788/fd/

where 26788 is the JBoss process id, it gives me a seemingly ever-increasing
list of deleted files (one per update, since I optimize on every update and
use the compound file format), each marked with '(deleted)'. They are all
located in the search directory. From what I understand, this means that
something still holds a reference to each file, and that a file will only be
permanently deleted once that something releases its reference to it.
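This unlink-while-open behaviour can be reproduced with a small standalone
Java program (a sketch with illustrative names; the temp file stands in for
an index file):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class DeletedFileDemo {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".dat");
        FileOutputStream out = new FileOutputStream(f);
        out.write("still here".getBytes("US-ASCII"));
        out.close();

        FileInputStream in = new FileInputStream(f);
        // On Linux the delete succeeds even though the stream is open;
        // the descriptor now shows up as "(deleted)" under /proc/<pid>/fd.
        if (!f.delete()) throw new IOException("delete failed");
        if (f.exists()) throw new IOException("file still visible by name");

        byte[] buf = new byte[10];
        int n = in.read(buf);
        // The data is still readable through the open descriptor.
        if (n != 10 || !new String(buf, 0, n, "US-ASCII").equals("still here"))
            throw new IOException("unexpected content");
        in.close(); // only now does the kernel release the inode and the fd
        System.out.println("ok");
    }
}
```

On Windows the delete() call fails instead, since Windows does not allow
removing a file that is still open.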

Only SolrIndexSearcher objects are in direct contact with these files in the
search application. The searchers are local objects in the search methods and
are closed after every search operation. In theory, the garbage collector
should collect these objects eventually (though while profiling other
applications I've noticed that it often doesn't run until the allocated
memory starts running out).

The other objects in contact with the files are the FileOutputStreams used to
copy them, but as stated above, these are closed in finally blocks and thus
should hold no references to the files.
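The copy-and-close pattern described here might look like the following (a
sketch with illustrative names; the actual copy code is not shown in the
thread):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class IndexFileCopier {
    /** Copy src to dst, closing both streams in a finally block. */
    public static void copy(File src, File dst) throws IOException {
        FileInputStream in = null;
        FileOutputStream out = null;
        try {
            in = new FileInputStream(src);
            out = new FileOutputStream(dst);
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } finally {
            // Close both streams even if the copy fails, so no descriptor
            // (and no deleted-but-open file) is leaked.
            if (in != null) try { in.close(); } catch (IOException ignored) {}
            if (out != null) out.close();
        }
    }
}
```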

I need to get rid of the "too many open files" problem. I suspect that it is
related to the almost-deleted files in the proc directory, but I know too
little about Linux to be sure. Does this ring a bell with anyone, or do you
have any ideas about how I can get rid of the problem?

All help is greatly appreciated.
-- 
View this message in context: 
http://www.nabble.com/SolrIndexWriter-holding-reference-to-deleted-file--tp14436326p14436326.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrIndexWriter holding reference to deleted file?

2007-12-26 Thread amamare

Yes, I'm using Solr 1.2. I made a typo in the subject of the thread, though:
I believe it is SolrIndexSearcher, not SolrIndexWriter, that is holding the
reference. Did you find a solution? Can you give me the URL to your thread?


Mark Baird wrote:
> 
> Just noticed this thread.  The issue you are seeing looks identical to the
> one I am seeing.  I started a thread about this same issue on Monday.  Are
> you also running Solr 1.2?

-- 
View this message in context: 
http://www.nabble.com/SolrIndexWriter-holding-reference-to-deleted-file--tp14436326p14503231.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrIndexWriter holding reference to deleted file?

2007-12-26 Thread amamare

Thanks, I'll do some profiling and see if I can verify that. I don't think I
mentioned it in the first post, but I'm not able to reproduce the error on
Windows XP, if that is of any help.


Yonik Seeley wrote:
> 
> This is probably related to "using Solr/Lucene embeddedly"
> See the warning at the top of http://wiki.apache.org/solr/EmbeddedSolr
> 
> It does sound like your SolrIndexSearcher objects aren't being closed.
> Solr (via SolrCore) doesn't rely on garbage collection to close the
> searchers (since GC unfortunately can't be triggered by running low on
> file descriptors).  SolrIndexSearcher objects are reference counted and
> closed when no longer in use.  This means that SolrQueryRequest
> objects must always be closed, or the refcount will be off.
> 
> Not sure where you could start except perhaps trying to verify the
> number of live SolrIndexSearcher objects.
> 
> -Yonik

-- 
View this message in context: 
http://www.nabble.com/SolrIndexWriter-holding-reference-to-deleted-file--tp14436326p14503234.html
Sent from the Solr - User mailing list archive at Nabble.com.
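The reference-counting scheme Yonik describes above can be sketched in plain
Java (a simplified illustration with hypothetical names, not Solr's actual
implementation): each user of the searcher bumps a counter, each close
decrements it, and the underlying resource is released only when the count
reaches zero.

```java
public class RefCounted {
    private int refCount = 1;      // the creator holds the first reference
    private boolean released = false;

    /** Called for each new user of the resource (e.g. each request). */
    public synchronized RefCounted incref() {
        refCount++;
        return this;
    }

    /** Called when a user is done; releases the resource at zero. */
    public synchronized void decref() {
        if (--refCount == 0) {
            released = true;   // here Solr would close the index files
        }
    }

    public synchronized boolean isReleased() {
        return released;
    }
}
```

If some code path obtains a reference (for example by opening a
SolrQueryRequest) and never releases it, the count never reaches zero and the
searcher's index files stay open indefinitely, which is consistent with the
'(deleted)' descriptors piling up in /proc.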



Re: SolrIndexWriter holding reference to deleted file?

2008-01-03 Thread amamare

I haven't been able to attach a profiler to the server yet, but I thought I
might show how my code works, because it's quite different from the example
in the link you provided.


public synchronized ResultItem[] search(String query)
    throws CorruptIndexException, IOException {
  SolrIndexSearcher searcher = new SolrIndexSearcher(
      solrCore.getSchema(), "MySearcher", solrCore.getIndexDir(), true);
  Hits hits = search(searcher, query);
  for (int i = 0; i < hits.length(); i++) {
    parse(hits.doc(i));
    // add to result array
  }
  searcher.close();
  // return result array
}

private Hits search(SolrIndexSearcher searcher, String pQuery) {
  try {
    // the default search field is called "text"
    SolrQueryParser parser = new SolrQueryParser(solrCore.getSchema(), "text");
    Query query = parser.parse(pQuery);
    return searcher.search(query);
  }
  // catch exceptions
}


This is the code that does the searching. The searcher is passed as a
parameter to the private search method because it needs to stay open while
I'm parsing the documents in the hits. I know I should move the call to
searcher.close() into a finally block, and I will do that in any case, but I
doubt it will solve the problem because I've never had any exceptions in this
code. Might the problem be that I'm not using SolrQueryRequest objects?
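That finally-block change can be sketched with plain Java; StubSearcher here
is a hypothetical stand-in so the example compiles on its own (the real type
is SolrIndexSearcher):

```java
// Illustrative stand-in for SolrIndexSearcher, so the pattern is
// self-contained; search() throws on a null query to simulate a failure.
class StubSearcher {
    boolean closed = false;
    void close() { closed = true; }
    int search(String q) {
        if (q == null) throw new IllegalArgumentException("no query");
        return 0; // pretend: number of hits
    }
}

public class FinallyClosePattern {
    /** Close the searcher in finally, so an exception mid-search can't leak it. */
    public static int search(StubSearcher searcher, String query) {
        try {
            return searcher.search(query);
        } finally {
            searcher.close();
        }
    }
}
```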

Best regards, 


Yonik Seeley wrote:
> 
> This is probably related to "using Solr/Lucene embeddedly"
> See the warning at the top of http://wiki.apache.org/solr/EmbeddedSolr
> 
> It does sound like your SolrIndexSearcher objects aren't being closed.
> Solr (via SolrCore) doesn't rely on garbage collection to close the
> searchers (since GC unfortunately can't be triggered by running low on
> file descriptors).  SolrIndexSearcher objects are reference counted and
> closed when no longer in use.  This means that SolrQueryRequest
> objects must always be closed, or the refcount will be off.
> 
> Not sure where you could start except perhaps trying to verify the
> number of live SolrIndexSearcher objects.
> 
> -Yonik

-- 
View this message in context: 
http://www.nabble.com/SolrIndexWriter-holding-reference-to-deleted-file--tp14436326p14594325.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrIndexWriter holding reference to deleted file?

2008-01-06 Thread amamare

Hossman, thank you for clearing that up. The reason I create a new searcher
for every search is that the index is frequently updated, and as far as I can
tell from the documentation, a searcher will not detect changes made to the
index after it was opened. I tried using just one searcher, but that did not
work. As for the rest of the code, SolrJ is not available until Solr 1.3, and
I actually never found the example provided at
http://wiki.apache.org/solr/EmbeddedSolr (I can't see that it's linked from
the main wiki); I only found http://wiki.apache.org/solr/SolJava.

Anyway, I'll see if I can convert my code to the logic of the example at
EmbeddedSolr.



hossman wrote:
> 
> 
> : I haven't been able to get a profiler at the server yet, but I thought I
> : might show how my code works, because it's quite different from the
> example
> : in the link you provided...
> 
> i'm not even sure i really understand the origins of this thread, but 
> regardless of what the "main" topic is, regarding the specific topic of 
> the code you posted: this is all a very bad idea.
> 
> Creating a new searcher for every query is a bad idea.  Using the Hits 
> class for any reason is a bad idea.  I say all of this without having 
> any idea what "ResultItem" looks like, or what the code in the "parse" 
> method does ... they may also be bad ideas.
> 
> If you must do "Embedded Solr" then please follow the examples from the 
> wiki (as i recall there is even some solrj.embedded code to make it even 
> easier than that), and bear in mind that this is seriously "expert" level 
> stuff using very low-level APIs that were never really meant for most 
> people to see ... it is very easy to simultaneously shoot yourself in the 
> foot while tripping over all the rope Embedded Solr gives you to hang 
> yourself with.
> -Hoss
> 
> 

-- 
View this message in context: 
http://www.nabble.com/SolrIndexWriter-holding-reference-to-deleted-file--tp14436326p14660123.html
Sent from the Solr - User mailing list archive at Nabble.com.



Error messages in log, but everything seems fine

2008-02-22 Thread amamare

Hi,
Solr apparently logs loads of error messages on every update, commit, search,
etc. Everything seems to be fine (searching and indexing are correct and
fast), but we are concerned that they might affect other parts of the system
if they are in fact symptoms of errors internal to Solr. It seems that one
ERROR line is logged before every INFO message, like this:

12:55:50,816 ERROR [STDERR] 22.feb.2008 12:55:50
org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
12:55:50,816 ERROR [STDERR] 22.feb.2008 12:55:50
org.apache.solr.update.DirectUpdateHandler2 doDeletions
INFO: DirectUpdateHandler2 deleting and removing dups for 1457 ids
12:55:51,269 ERROR [STDERR] 22.feb.2008 12:55:51
org.apache.solr.search.SolrIndexSearcher 
INFO: Opening [EMAIL PROTECTED] DirectUpdateHandler2
12:55:51,300 ERROR [STDERR] 22.feb.2008 12:55:51
org.apache.solr.update.DirectUpdateHandler2 doDeletions
INFO: DirectUpdateHandler2 docs deleted=1457
12:55:51,300 ERROR [STDERR] 22.feb.2008 12:55:51
org.apache.solr.search.SolrIndexSearcher 
INFO: Opening [EMAIL PROTECTED] main
12:55:51,316 ERROR [STDERR] 22.feb.2008 12:55:51
org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main

filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
12:55:51,316 ERROR [STDERR] 22.feb.2008 12:55:51
org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for [EMAIL PROTECTED] main

filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
12:55:51,316 ERROR [STDERR] 22.feb.2008 12:55:51
org.apache.solr.search.SolrIndexSearcher warm

I've looked in the source code of the relevant classes, but I can't find the
source of the log messages (which really don't seem to contain much
information).

Anyone know why this happens and how I can fix it?

Thanks,
Laila
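A plausible explanation of the ERROR [STDERR] lines above (not confirmed in
this thread): Solr 1.2 logs through java.util.logging, whose default
ConsoleHandler writes all records, including INFO ones, to System.err; JBoss
then captures stderr and re-logs each line at ERROR level under its STDERR
category. If that is the cause, one fix is to give java.util.logging its own
configuration, for example (file names and limits are illustrative):

```properties
# logging.properties: route java.util.logging output to a file instead of
# stderr, so JBoss no longer wraps Solr's INFO lines as "ERROR [STDERR]".
# Activate with -Djava.util.logging.config.file=/path/to/logging.properties
handlers = java.util.logging.FileHandler
.level = INFO
java.util.logging.FileHandler.pattern = %h/solr.%g.log
java.util.logging.FileHandler.limit = 10000000
java.util.logging.FileHandler.count = 5
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
```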
-- 
View this message in context: 
http://www.nabble.com/Error-messages-in-log%2C-but-everything-seems-fine-tp15632885p15632885.html
Sent from the Solr - User mailing list archive at Nabble.com.