Loading data to solr from mysql
Can anybody suggest a way to load data from MySQL into Solr directly?
Re: Loading data to solr from mysql
http://wiki.apache.org/solr/DataImportHandler

On Mon, Feb 7, 2011 at 11:16 AM, Bagesh Sharma wrote:
> Can anybody suggest a way to load data from MySQL into Solr directly?
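For reference, a minimal sketch of the DataImportHandler wiring for MySQL (the database, table and column names below are made up for illustration; adjust them to your own schema):

In solrconfig.xml:

  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>

And a data-config.xml along these lines:

  <dataConfig>
    <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost:3306/mydb" user="solr" password="secret"/>
    <document>
      <entity name="item" query="SELECT id, name, description FROM item">
        <field column="id" name="id"/>
        <field column="name" name="name"/>
        <field column="description" name="description"/>
      </entity>
    </document>
  </dataConfig>

A full import can then be triggered with http://localhost:8983/solr/dataimport?command=full-import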
Solr Error
Please share your insights on the error.

Regards,
Prasad

java.lang.OutOfMemoryError: Java heap space
Exception in thread "Timer-1"
        at org.mortbay.util.URIUtil.decodePath(URIUtil.java:285)
        at org.mortbay.jetty.HttpURI.getDecodedPath(HttpURI.java:395)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:486)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
java.lang.OutOfMemoryError: Java heap space
Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.UnicodeUtil.UTF16toUTF8(UnicodeUtil.java:236)
        at org.apache.lucene.store.IndexOutput.writeString(IndexOutput.java:103)
        at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:231)
        at org.apache.lucene.index.FieldsWriter.addDocument(FieldsWriter.java:268)
        at org.apache.lucene.index.SegmentMerger.copyFieldsNoDeletions(SegmentMerger.java:451)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:352)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5112)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4675)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
Re: Solr Error
> Please share your insights on the error.
> java.lang.OutOfMemoryError: Java heap space

What happens if you increase the Java heap space?

java -Xmx1g -jar start.jar
Re: Solr Error
I have already allocated about 2 GB (-Xmx2048m).

Regards,
Prasad

On 7 February 2011 18:17, Ahmet Arslan wrote:
> > Please share your insights on the error.
> > java.lang.OutOfMemoryError: Java heap space
>
> What happens if you increase the Java heap space?
> java -Xmx1g -jar start.jar
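If the OutOfMemoryError persists even with 2 GB, it may help to capture a heap dump and a GC log so you can see what is actually filling the heap. A sketch of the extra JVM flags (the dump and log paths are only examples):

  java -Xmx2048m \
       -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/solr \
       -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/solr/gc.log \
       -jar start.jar

The heap dump can then be opened in a tool such as jhat or Eclipse MAT to see which objects dominate.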
Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene
Heap usage can spike after a commit. Existing caches are still in use and new caches are being generated and/or autowarmed. Can you confirm this is the case?

On Friday 28 January 2011 00:34:42 Simon Wistow wrote:
> On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said:
> > Are you sure you need CMS incremental mode? It's only advised when
> > running on a machine with one or two processors. If you have more you
> > should consider disabling the incremental flags.
>
> I'll test again but we added those to get better performance - not much,
> but there did seem to be an improvement.
>
> The problem seems to not be in average use but that occasionally there's a
> huge spike in load (there doesn't seem to be a particular "killer query")
> and Solr just never recovers.
>
> Thanks,
>
> Simon

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
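If the spikes do line up with commits, the cache sizes and autowarm counts in solrconfig.xml are the first thing to check, since the old and new searcher caches exist side by side while warming runs. A sketch of the relevant entries (the numbers are only examples, not recommendations):

  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

Lowering the sizes and autowarmCount values reduces the post-commit heap spike at the cost of more cold queries.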
Re: Solr Error
What is your index size, and how much RAM do you have?

-
Thanx:
Grijesh
http://lucidimagination.com
Re: Solr Indexing Performance
On Sat, Feb 5, 2011 at 2:06 PM, Darx Oman wrote:
> I indexed 1000 pdf files with the same configuration; it completed in about
> 32 min.

So it seems that your indexing time scales roughly linearly with the number of PDF documents you have. While this might be good news in your case, it is difficult to estimate an "expected" indexing rate when indexing from documents.

Regards,
Gora
DIH keeps felling during full-import
I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this?

Is there an easy way to use DIH to break apart a full-import into multiple pieces, i.e., 3 mini-imports instead of 1 large import?

Thanks.

Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
        at com.mysql.jdbc.Util.getInstance(Util.java:382)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4751)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4345)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1564)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390)
        at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:174)
        at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:165)
        at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:332)
        at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:360)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)

Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1a797305 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:934)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:931)
        at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2724)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1895)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2140)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2620)
        at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:4854)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4737)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4345)
        at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1564)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:399)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:390)
        at org.apache.solr.handler.dataimport.DataConfig$Entity.clearCache(DataConfig.java:174)
        at org.apache.solr.handler.dataimport.DataConfig.clearCaches(DataConfig.java:332)
        at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:360)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)

Feb 7, 2011 7:03:29 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
SEVERE: Ignoring Error when closing connection
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
        at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
        at com.mysql.jdbc.Util.getInstance(Util.java:382)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
        at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4751)
        at com.mysql.jdbc.ConnectionImpl.realCl
Re: DIH keeps failing during full-import
Typo in subject

On 2/7/11 7:59 AM, Mark wrote:
> I'm receiving the following exception when trying to perform a full-import
> (~30 hours). Any idea on ways I could fix this?
>
> Is there an easy way to use DIH to break apart a full-import into multiple
> pieces? IE 3 mini-imports instead of 1 large import?
>
> Thanks.
> [...]
Re: DIH keeps felling during full-import
On Mon, Feb 7, 2011 at 9:29 PM, Mark wrote:
> I'm receiving the following exception when trying to perform a full-import
> (~30 hours). Any idea on ways I could fix this?
>
> Is there an easy way to use DIH to break apart a full-import into multiple
> pieces? IE 3 mini-imports instead of 1 large import?
>
> Thanks.
>
> Feb 7, 2011 5:52:33 AM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
> SEVERE: Ignoring Error when closing connection
> com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException:
> Communications link failure during rollback(). Transaction resolution
> unknown.
[...]

This looks like a network issue, or some other failure in communicating with the MySQL database. Is that a possibility? Also, how many records are you importing, what is the data size, what is the quality of the network connection, etc.?

One way to break up the number of records imported at a time is to shard your data at the database level, but the advisability of this option depends on whether there is a more fundamental issue.

Regards,
Gora
Re: DIH keeps felling during full-import
Full import is around 6M documents, which when completed totals around 30GB in size.

I'm guessing it could be a database connectivity problem because I also see these types of errors on delta-imports, which could be anywhere from 20K to 300K records.

On 2/7/11 8:15 AM, Gora Mohanty wrote:
> This looks like a network issue, or some other failure in communicating with
> the MySQL database. Is that a possibility? Also, how many records are you
> importing, what is the data size, what is the quality of the network
> connection, etc.?
>
> One way to break up the number of records imported at a time is to shard
> your data at the database level, but the advisability of this option depends
> on whether there is a more fundamental issue.
>
> Regards,
> Gora
Re: DIH keeps felling during full-import
On Mon, Feb 7, 2011 at 10:15 PM, Mark wrote:
> Full import is around 6M documents, which when completed totals around 30GB
> in size.
>
> I'm guessing it could be a database connectivity problem because I also see
> these types of errors on delta-imports, which could be anywhere from 20K to
> 300K records.
[...]

In that case, it might be advisable to start by trying to fix these. MySQL, as well as most any modern database, ought to be able to deal with the sizes that you mention above, so my first guess would be issues with network connectivity. Is this an internal network, or does it go over the Internet? In either case, how good is the network supposed to be? Is there any application monitoring the network?

Regards,
Gora
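If flaky connectivity does turn out to be the cause, a few MySQL Connector/J URL parameters can make a long-running import more tolerant of hiccups. A sketch of a DIH dataSource with them set (host, database and credentials are made up; the timeout values are only examples):

  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://db-host:3306/mydb?autoReconnect=true&amp;socketTimeout=0&amp;netTimeoutForStreamingResults=3600"
              batchSize="-1"
              user="solr" password="secret"/>

batchSize="-1" puts the driver into row-streaming mode so the whole result set is not buffered in memory, which matters for a 6M-row query.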
Re: Http Connection is hanging while deleteByQuery
Hi Ravi Kiran,

I am using Solr version 1.4, and the solution suggested by you seems to be there in solrconfig.xml already. But after reading your message again now, I checked the release notes (CHANGES.txt) of Solr 1.4.1 and I found these two entries:

* SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that could halt the streaming of documents. The original patch to fix this (never officially released) introduced another hanging bug due to connections not being released. (Attila Babo, Erik Hetzner via yonik)

* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers retrieved from ContentStreams are not closed in various places, resulting in file descriptor leaks. (Christoff Brill, Mark Miller)

Though I am not completely sure, I suspect that upgrading to Solr 1.4.1 could solve this issue. I am in the process of upgrading my Solr. Will keep you posted when I have some updates.

Cheers
Shan
hl.snippets in solr 3.1
Hi all,

I'm trying to get results like:

  blabla keyword blabla ... blabla keyword blabla ...

so I'd like to show 2 fragments. I've added these settings:

  <str name="hl.simple.pre">...</str>
  <str name="hl.simple.post">...</str>
  <int name="f.content.hl.fragsize">20</int>
  <int name="f.content.hl.snippets">3</int>

but I get only 1 fragment: blabla keyword blabla.

Am I trying to do it the right way? Is it something that can be done via changes in the config file? How do I add a separator between fragments (like "..." in this example)?

Thanks.
Re: HTTP ERROR 400 undefined field: *
Thanks Otis, I'll give that a try.

Jed.

On 02/06/2011 08:06 PM, Otis Gospodnetic wrote:

Yup, here it is, warning about needing to reindex: http://twitter.com/#!/lucene/status/28694113180192768

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message
From: Erick Erickson
To: solr-user@lucene.apache.org
Sent: Sun, February 6, 2011 9:43:00 AM
Subject: Re: HTTP ERROR 400 undefined field: *

I *think* that there was a post a while ago saying that if you were using trunk or 3_x, one of the recent changes required re-indexing, but don't quote me on that. Have you tried that?

Best
Erick

On Fri, Feb 4, 2011 at 2:04 PM, Jed Glazner wrote:

Sorry for the lack of details. It's all clear in my head.. :)

We checked out the head revision from the 3.x branch a few weeks ago (https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/). We picked up r1058326. We upgraded from a previous checkout (r960098). I am using our customized schema.xml and the solrconfig.xml from the old revision with the new checkout. After upgrading I just copied the data folders from each core into the new checkout (hoping I wouldn't have to re-index the content, as this takes days). Everything seems to work fine, except that now I can't get the score to return. The stack trace is attached.

I also saw this warning in the logs, not sure exactly what it's talking about:

Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24 emulation. You should at some point declare and reindex to at least 3.0, because 2.4 emulation is deprecated and will be removed in 4.0. This parameter will be mandatory in 4.0.

Here is my request handler; the actual fields here are different than what is in mine, but I'm a little uncomfortable publishing how our company's search service works to the world: explicit edismax true field_a^2 field_b^2 field_c^4 field_d^10 0.1 tvComponent

Anyway, hopefully this is enough info, let me know if you need more.

Jed.

On 02/03/2011 10:29 PM, Chris Hostetter wrote:

: I was working on a checkout of the 3.x branch from about 6 months ago.
: Everything was working pretty well, but we decided that we should update and
: get what was at the head. However after upgrading, I am now getting this

FWIW: please be specific. "head" of what? the 3x branch? or trunk? what revision in svn does that correspond to? (the "svnversion" command will tell you)

: HTTP ERROR 400 undefined field: *
:
: If I clear the fl parameter (default is set to *, score) then it works fine
: with one big problem, no score data. If I try and set fl=score I get the same
: error except it says undefined field: score?!
:
: This works great in the older version, what changed? I've googled for about
: an hour now and I can't seem to find anything.

I can't reproduce this using either trunk (r1067044) or 3x (r1067045); all of these queries work just fine...

http://localhost:8983/solr/select/?q=*
http://localhost:8983/solr/select/?q=solr&fl=*,score
http://localhost:8983/solr/select/?q=solr&fl=score
http://localhost:8983/solr/select/?q=solr

...you'll have to provide us with a *lot* more details to help understand why you might be getting an error (like: what your configs look like, what the request looks like, what the full stack trace of your error is in the logs, etc...)

-Hoss
Re: hl.snippets in solr 3.1
--- On Mon, 2/7/11, alex wrote:
> From: alex
> Subject: hl.snippets in solr 3.1
> To: solr-user@lucene.apache.org
> Date: Monday, February 7, 2011, 7:38 PM
>
> hi all,
> I'm trying to get result like :
> blabla keyword blabla ...
> blabla keyword blabla...
>
> so, I'd like to show 2 fragments. I've added these settings
>
>   <str name="hl.simple.pre">...</str>
>   <str name="hl.simple.post">...</str>
>   <int name="f.content.hl.fragsize">20</int>
>   <int name="f.content.hl.snippets">3</int>
>
> but I get only 1 fragment blabla keyword blabla.
> Am I trying to do it right way? Is it what can be done via
> changes in config file?
>
> how do I add separator between fragments (like ... in this
> example)?
> thanks.

These two should be declared under the defaults section of your requestHandler:

  <int name="f.content.hl.fragsize">20</int>
  <int name="f.content.hl.snippets">3</int>

Where did you define them? Under the highlighting section in solrconfig.xml?
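For reference, a sketch of what that looks like in solrconfig.xml with the highlighting defaults on the handler itself (the handler name and field are only examples):

  <requestHandler name="/search" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="hl">true</str>
      <str name="hl.fl">content</str>
      <int name="f.content.hl.fragsize">20</int>
      <int name="f.content.hl.snippets">3</int>
    </lst>
  </requestHandler>

The same parameters can also be passed directly on the query string, e.g. &hl=true&hl.fl=content&f.content.hl.snippets=3, which is a quick way to rule out a config problem.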
Re: HTTP ERROR 400 undefined field: *
: The stack trace is attached. I also saw this warning in the logs not sure

From your attachment...

853 SEVERE: org.apache.solr.common.SolrException: undefined field: score
854 at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
855 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
856 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
857 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1357)

...this is one of the key pieces of info that was missing from your earlier email: that you are using the TermVectorComponent. It's likely that something changed in the TVC on 3x between the two versions you were using, and that change freaks out now on "*" or "score" in the fl.

You still haven't given us an example of the full URLs you are using that trigger this error. (It's possible there is something slightly off in your syntax - we don't know because you haven't shown us.)

All in: this sounds like a newly introduced bug in TVC, please post the details into a new Jira issue.

As to the warning you asked about...

: Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
: WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24
: emulation. You should at some point declare and reindex to at least 3.0,
: because 2.4 emulation is deprecated and will be removed in 4.0. This parameter
: will be mandatory in 4.0.

If you look at the example configs on the 3x branch it should be explained. It's basically just a new "feature" that lets you specify which "quirks" of the underlying Lucene code you want (so on upgrading you are in control of whether you eliminate old quirks or not).

-Hoss
Re: hl.snippets in solr 3.1
Ahmet Arslan wrote:
> These two should be declared under the defaults section of your requestHandler.
> Where did you define them? Under the highlighting section in solrconfig.xml?

Yes, it's in solrconfig.xml: dismax 10 explicit content^0.5 title^1.2 *:* true title content url 20 3 content 0 title 0 url

I don't include the whole config, because there are just default values in it. I can see changes if I change fragsize, but no hl.snippets.

And in schema.xml I have: words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true"/> words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true"/>
How to search for special chars like ä from ae?
Hi! I want to search for special chars like mäcman by giving similarly worded simple characters like maecman. I used ASCIIFoldingFilterFactory and I'm getting mäcman from macman, but I'm not able to get mäcman from maecman. Can this be done using any other filter?

Thanks,
Anithya
Re: hl.snippets in solr 3.1
> I can see changes if I change fragsize, but no
> hl.snippets.

Maybe your text is too short to generate more than one snippet? What happens when you increase the hl.maxAnalyzedChars parameter?

&hl.maxAnalyzedChars=2147483647
Re: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
: While reloading a core I got this following error, when does this
: occur? Prior to this exception I do not see anything wrong in the logs.

Well, there are really two distinct types of "errors" in your log...

: [#|2011-02-01T13:02:36.697-0500|SEVERE|sun-appserver2.1|org.apache.solr.servlet.SolrDispatchFilter|_ThreadID=25;_ThreadName=httpWorkerThread-9001-5;_RequestID=450f6337-1f5c-42bc-a572-f0924de36b56;|org.apache.lucene.store.LockObtainFailedException:
: Lock obtain timed out: NativeFSLock@/data/solr/core/solr-data/index/lucene-7dc773a074342fa21d7d5ba09fc80678-write.lock
:         at org.apache.lucene.store.Lock.obtain(Lock.java:85)
:         at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1565)
:         at org.apache.lucene.index.IndexWriter.(IndexWriter.java:1421)

...this is error #1, indicating that for some reason the IndexWriter Solr was trying to create wasn't able to get a native filesystem lock on your index directory -- is it possible you have two instances of Solr (or two Solr cores) trying to re-use the same data directory? (Diagnosing exactly why you got this error also requires knowing what filesystem you are using.)

: [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter
: was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE
: LEAK!!!|#]
:
: [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter
: was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE
: LEAK!!!|#]

...these errors are warning you that something very unexpected was discovered when the Garbage Collector tried to clean up the SolrIndexWriter -- it found that the SolrIndexWriter had never been formally closed. In normal operation, this might indicate the existence of a bug in code not managing its resources properly -- and in fact, it does indicate the existence of a bug, in that evidently a lock timeout failure doesn't cause the SolrIndexWriter to be closed -- but in your case it's not really something to be worried about -- it's just a cascading effect of the first error.

-Hoss
Indexing a date from a POJO
Hi,

I would like to know if the code below is correct, because the date is not displayed correctly in Luke.

I have a POJO with a date defined as follows:

  public class SolrPositionDTO {

      @Field
      private String address;

      @Field
      private Date beginDate;

And in the schema config file the field is defined as:

Thanks in advance for your help,
JCD

--
Jean-Claude Dauphin
jc.daup...@gmail.com
jc.daup...@afus.unesco.org
http://kenai.com/projects/j-isis/
http://www.unesco.org/isis/
http://www.unesco.org/idams/
http://www.greenstone.org
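For comparison, a sketch of a combination that generally works, assuming the schema line that was lost from this message was a standard Solr date field (the field type name follows the stock example schema):

POJO side - SolrJ's document binder maps java.util.Date directly:

  public class SolrPositionDTO {
      @Field
      private String address;

      @Field
      private Date beginDate;   // java.util.Date; SolrJ sends it as ISO 8601 in UTC
  }

schema.xml side:

  <field name="beginDate" type="date" indexed="true" stored="true"/>

where "date" is the example schema's solr.DateField (or solr.TrieDateField) type. Note that Solr stores dates in UTC with the canonical form 1995-12-31T23:59:59Z, so what Luke shows may simply be the UTC rendering of the local time that was set on the POJO.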
Re: hl.snippets in solr 3.1
Ahmet Arslan wrote:
> Maybe your text is too short to generate more than one snippet?
> What happens when you increase the hl.maxAnalyzedChars parameter?
> &hl.maxAnalyzedChars=2147483647

It's working now. I guess it was a problem with the config file. Thanks!
RE: How to search for special chars like ä from ae?
Hi Anithya,

There is a mapping file for MappingCharFilterFactory that behaves the same as ASCIIFoldingFilterFactory: mapping-FoldToASCII.txt, located in Solr's example conf/ directory in Solr 3.1+. You can rename and then edit this file to map "ä" to "ae", "ü" to "ue", etc. (look for "WITH DIAERESIS" to quickly find characters with umlauts in the mapping file). There is a commented-out example of using MappingCharFilterFactory in Solr's example schema.xml.

If you are using Solr 1.4.X, you can download the mapping-FoldToASCII.txt file here (from the 3.x source tree):

http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/mapping-FoldToASCII.txt

Please consider donating your work back to Solr if you decide to go this route.

Good luck,
Steve

> -Original Message-
> From: Anithya [mailto:surysha...@gmail.com]
> Sent: Monday, February 07, 2011 12:09 PM
> To: solr-user@lucene.apache.org
> Subject: How to search for special chars like ä from ae?
>
> Hi! I want to search for special chars like mäcman by giving similarly
> worded simple characters like maecman. I used ASCIIFoldingFilterFactory and
> I'm getting mäcman from macman, but I'm not able to get mäcman from maecman.
> Can this be done using any other filter?
> Thanks,
> Anithya
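A sketch of how the pieces fit together once the mapping file is in place (the field type shown is just an example analyzer chain; apply it at both index and query time so maecman and mäcman normalize to the same token):

Entries in the renamed mapping file (e.g. conf/mapping-german-umlauts.txt, a hypothetical name):

  "ä" => "ae"
  "ö" => "oe"
  "ü" => "ue"
  "Ä" => "Ae"

Field type in schema.xml:

  <fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-german-umlauts.txt"/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>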
Spatial Solr - Representing a bounding box and searching for it
Hi everyone,

I have been looking for a search solution for spatial data, and since I have worked with Solr before, I wanted to give the spatial features a try.

1. What is the default datum used for the LatLonType? Is it WGS 84?

2. What is the best way to represent a region (a bounding box, to be exact) and search for it? Spatial metadata records usually contain an element that specifies the region that the record is representing. For example, the North American Profile (NAP) has an element with the bounds -95.15605 -74.34407 41.436108 54.61572, which define the bounding box containing the region.

As far as I've seen, spatial fields in Solr are limited to points only. I tried using four LatLonType fields to represent the four corners of the region, but I couldn't get the bbox query to return the correct box: adding another sfield to the query had no effect. I also tried to use the "fq=store:[45,-94 TO 46,-93]" example by changing the store field into a multiValued field and putting the upper-right and lower-left corners into my document and using them as the range, but that also didn't work.

So any suggestions on how to get this working?

Sepehr
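One workaround, sketched here under the assumption that the points-only limitation stands: index the box corners as four ordinary numeric fields and express "intersects the query box" with plain range filters (field names are made up):

  <field name="min_lat" type="tdouble" indexed="true" stored="true"/>
  <field name="max_lat" type="tdouble" indexed="true" stored="true"/>
  <field name="min_lon" type="tdouble" indexed="true" stored="true"/>
  <field name="max_lon" type="tdouble" indexed="true" stored="true"/>

A record's box intersects a query box [qminlat..qmaxlat, qminlon..qmaxlon] when:

  fq=min_lat:[* TO qmaxlat]&fq=max_lat:[qminlat TO *]&fq=min_lon:[* TO qmaxlon]&fq=max_lon:[qminlon TO *]

(with the q* placeholders replaced by actual numbers). This ignores boxes that cross the date line, which would need to be split or handled separately.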
Re: dynamic fields revisited
Just so anyone else can know and save themselves 1/2 hour if they spend 4 minutes searching.

When putting a dynamic field into a document into an index, the name of the field RETAINS the 'constant' part of the dynamic field name.

Example
-
If a dynamic integer field is named '*_i' in the schema.xml file,
__and__
you insert a field named 'my_integer_i', which matches the globbed field name '*_i',
__then__
the name of the field will be 'my_integer_i' in the index and in your GETs/(updating)POSTs to the index on that document, and
__NOT__
'my_integer' like I was kind of hoping that it would be :-(

I.e., the suffix (or prefix, if you set it up that way) will NOT be dropped. I was hoping that everything except the globbing character, '*', would just be a flag to the query processor and disappear after being 'noticed'.

Not so :-)
Re: dynamic fields revisited
It would be quite annoying if it behaved as you were hoping for. This way it is possible to use different field types (and analyzers) for the same field value. In faceting, for example, this can be important because you should use analyzed fields for q and fq but unanalyzed fields for facet.field. The same goes for sorting and range queries, where you can have the same field value end up in different field types, one for sorting and one for a range query. Without the prefix or suffix of the dynamic field, one must statically declare the fields beforehand and lose the dynamic advantage.

> Just so anyone else can know and save themselves 1/2 hour if they spend 4
> minutes searching.
>
> When putting a dynamic field into a document into an index, the name of the
> field RETAINS the 'constant' part of the dynamic field name.
> [...]
> I.e., the suffix (or prefix, if you set it up that way) will NOT be dropped.
> I was hoping that everything except the globbing character, '*', would just
> be a flag to the query processor and disappear after being 'noticed'.
>
> Not so :-)
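To make the trade-off concrete, a sketch of the usual pattern (the names are examples only):

  <dynamicField name="*_i" type="int"    indexed="true" stored="true"/>
  <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  <dynamicField name="*_t" type="text"   indexed="true" stored="true"/>

The same logical value can then be indexed as title_t for analyzed full-text search and title_s for faceting or sorting, and a number as price_i for range queries, all without declaring any of those field names in the schema beforehand. The suffix is what tells Solr which type and analysis to apply, which is exactly why it has to stay part of the stored field name.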
Re: dynamic fields revisited
I have a long way to go to understand all those implications. Mind you, I never -was- whining :-). Just ignorantly surprised.

Dennis Gearon

Signature Warning
It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others' mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life, otherwise we all die.

From: Markus Jelsma
To: solr-user@lucene.apache.org
Cc: gearond
Sent: Mon, February 7, 2011 3:28:18 PM
Subject: Re: dynamic fields revisited

> It would be quite annoying if it behaved as you were hoping for. This way it
> is possible to use different field types (and analyzers) for the same field
> value.
> [...]
Re: DIH keeps failing during full-import
It is not reasonable to expect a database session to work over 30 hours, let alone an app/database operation.

If you can mark a database record as successfully indexed, the incremental feature can be used to only index non-marked records. SOLR-1499 offers a way to check Solr with a sorted query on every field; you could use it to find the most recent indexed record. There is no general way of doing this.

On Mon, Feb 7, 2011 at 7:59 AM, Mark wrote:
> Typo in subject
>
> On 2/7/11 7:59 AM, Mark wrote:
>> I'm receiving the following exception when trying to perform a full-import
>> (~30 hours). Any idea on ways I could fix this?
>>
>> Is there an easy way to use DIH to break apart a full-import into multiple
>> pieces? IE 3 mini-imports instead of 1 large import?
>>
>> Thanks.
>> [...]
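A sketch of the incremental route with DIH's delta support, assuming the source table has a last_modified column to mark changes (table and column names are made up):

  <entity name="item"
          query="SELECT id, name FROM item"
          deltaQuery="SELECT id FROM item WHERE last_modified &gt; '${dataimporter.last_index_time}'"
          deltaImportQuery="SELECT id, name FROM item WHERE id = '${dataimporter.delta.id}'"/>

Running with command=delta-import instead of full-import then only touches rows changed since the previous run, keeping each database session short.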
Re: DIH keeps failing during full-import
You're probably better off in this instance creating your own process based on SolrJ and your jdbc-driver-of-choice. DIH doesn't provide much in the way of fine-grained control over all aspects of the process, and at +30 hours I suspect you want some better control.

FWIW, SolrJ is not very hard at all to use for this kind of thing.

Best
Erick

On Mon, Feb 7, 2011 at 10:59 AM, Mark wrote:
> Typo in subject
>
> On 2/7/11 7:59 AM, Mark wrote:
>> I'm receiving the following exception when trying to perform a full-import
>> (~30 hours). Any idea on ways I could fix this?
>>
>> Is there an easy way to use DIH to break apart a full-import into multiple
>> pieces? IE 3 mini-imports instead of 1 large import?
>>
>> Thanks.
>> [...]
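A rough sketch of what a SolrJ-based loader can look like (table, column and field names are made up; error handling, retries and threading are left out):

  import java.sql.*;
  import java.util.ArrayList;
  import java.util.List;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class MysqlToSolr {
      public static void main(String[] args) throws Exception {
          SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
          Connection db = DriverManager.getConnection("jdbc:mysql://localhost/mydb", "user", "pass");
          Statement st = db.createStatement();
          st.setFetchSize(Integer.MIN_VALUE);      // MySQL streaming mode: don't buffer the whole result set
          ResultSet rs = st.executeQuery("SELECT id, name, description FROM item");

          List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
          while (rs.next()) {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", rs.getString("id"));
              doc.addField("name", rs.getString("name"));
              doc.addField("description", rs.getString("description"));
              batch.add(doc);
              if (batch.size() == 1000) {          // send in chunks; a failure only costs one chunk
                  solr.add(batch);
                  batch.clear();
              }
          }
          if (!batch.isEmpty()) solr.add(batch);
          solr.commit();
          rs.close(); st.close(); db.close();
      }
  }

Because each chunk is an independent HTTP request, it is easy to log progress, retry a failed chunk, or restart from a known primary key instead of redoing the whole 30-hour run.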
Re: geodist and spacial search
Thanks Bill, much simpler :-)

On Sat, Feb 5, 2011 at 3:56 AM, Bill Bell wrote:
> Why not just:
>
> q=*:*
> fq={!bbox}
> sfield=store
> pt=49.45031,11.077721
> d=40
> fl=store
> sort=geodist() asc
>
> http://localhost:8983/solr/select?q=*:*&sfield=store&pt=49.45031,11.077721&d=40&fq={!bbox}&sort=geodist%28%29%20asc
>
> That will sort, and filter up to 40km.
>
> No need for the
>
> fq={!func}geodist()
> sfield=store
> pt=49.45031,11.077721
>
> Bill
>
> On 2/4/11 4:30 AM, "Eric Grobler" wrote:
> >Hi Grant,
> >
> >Thanks for the tip
> >This seems to work:
> >
> >q=*:*
> >fq={!func}geodist()
> >sfield=store
> >pt=49.45031,11.077721
> >
> >fq={!bbox}
> >sfield=store
> >pt=49.45031,11.077721
> >d=40
> >
> >fl=store
> >sort=geodist() asc
> [...]
Re: dynamic fields revisited
You can change the match to be my* and then insert the name you want.

Bill Bell
Sent from mobile

On Feb 7, 2011, at 4:15 PM, gearond wrote:
> Just so anyone else can know and save themselves 1/2 hour if they spend 4
> minutes searching.
>
> When putting a dynamic field into a document into an index, the name of the
> field RETAINS the 'constant' part of the dynamic field name.
> [...]
q.alt=*:* for every request?
Hi,

I use the dismax handler with Solr 1.4. Sometimes my request comes with q and fq, and sometimes it comes without q (only fq and q.alt=*:*). Is it OK to send q.alt=*:* with every request? Does it have side effects on performance?

--
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/
Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene
On Mon, Feb 07, 2011 at 02:06:00PM +0100, Markus Jelsma said:
> Heap usage can spike after a commit. Existing caches are still in use and new
> caches are being generated and/or auto warmed. Can you confirm this is the
> case?

We see spikes after replication, which I suspect is, as you say, because of the ensuing commit.

What we seem to have found is that when we weren't using the concurrent GC, stop-the-world GC runs would kill the app. Now that we're using CMS we occasionally find ourselves in situations where the app still has memory "left over" but the load on the machine spikes, the GC duty cycle goes to 100 and the app never recovers.

Restarting usually helps, but sometimes we have to take the machine out of the load balancer, wait for a number of minutes and then put it back in.

We're working on two hypotheses.

Firstly - we're CPU bound somehow and at some point we cross some threshold and GC or something else is just unable to keep up. So whilst it looks like instantaneous death of the app, it's actually gradual resource exhaustion, where the definition of 'gradual' is 'a very short period of time' (as opposed to some cataclysmic infinite loop bug somewhere).

Either that or ... Secondly - there's some sort of Query Of Death that kills machines. We just haven't found it yet, even when replaying logs.

Or some combination of both. Or other things. It's maddeningly frustrating.

We're also going to try deploying a custom solr.war and try using MMapDirectory to see if that helps with anything.
Re: Searching for negative numbers very slow
On Fri, Jan 28, 2011 at 12:29:18PM -0500, Yonik Seeley said:
> That's odd - there should be nothing special about negative numbers.
> Here are a couple of ideas:
> - if you have a really big index and querying by a negative number
>   is much more rare, it could just be that part of the index wasn't
>   cached by the OS and so the query needs to hit the disk. This can
>   happen with any term and a really big index - nothing special for
>   negatives here.
> - if -1 is a really common value, it can be slower. is fq=uid:\-2 or
>   other negative numbers really slow also?

This was my first thought, but -1 is relatively common and we have other numbers just as common.

Interestingly enough, fq=uid:-1 fq=foo:bar fq=alpha:omega is much (4x) slower than q="uid:-1 AND foo:bar AND alpha:omega", but only when searching for that number.

I'm going to wave my hands here and say something like "Maybe something to do with the field caches?"
Re: q.alt=*:* for every request?
There is no measurable performance penalty when setting the parameter, except maybe the execution of the query with a high value for rows. To make things easy, you can define q.alt=*:* as a default in your request handler. No need to specify it in the URL.

> Hi,
>
> I use the dismax handler with Solr 1.4. Sometimes my request comes with q and
> fq, and sometimes it comes without q (only fq and q.alt=*:*). Is it OK to
> send q.alt=*:* with every request? Does it have side effects on performance?
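A sketch of setting it once as a handler default (the handler name is just an example):

  <requestHandler name="dismax" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="q.alt">*:*</str>
    </lst>
  </requestHandler>

With that in place, requests that arrive without q simply fall back to the match-all query, and requests that do carry q behave as before.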
Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene
Do you have GC logging enabled? tail -f the log file and you'll see what CMS is telling you. Tuning the occupancy fraction of the tenured generation to a lower value than the default, and telling the JVM to only use your value to initiate a collection, can help a lot. The same goes for sizing the young generation and sometimes the survivor ratio. Consult the HotSpot CMS settings and young generation (or new) sizes; they are very important.

If you have multiple slaves under the same load you can easily try different configurations. Keeping an eye on the nodes with a tool like JConsole and at the same time tailing the GC log will help a lot. Don't forget to send updates and frequent commits or you won't be able to replay. I've never seen a Solr instance go down under heavy load and without commits, but they tend to behave badly when commits occur while under heavy load with long cache warming times (and heap consumption).

You might also be suffering from memory fragmentation; this is bad and can lead to failure. You can configure the JVM to force a compaction before a GC - that's nice, but it does consume CPU time.

A query of death can, in theory, also happen when you sort on a very large dataset that isn't optimized; in this case the maxDoc value is too high.

Anyway, try some settings, monitor the nodes, and please report your findings.

> On Mon, Feb 07, 2011 at 02:06:00PM +0100, Markus Jelsma said:
> > Heap usage can spike after a commit. Existing caches are still in use and
> > new caches are being generated and/or auto warmed. Can you confirm this
> > is the case?
>
> We see spikes after replication which I suspect is, as you say, because
> of the ensuing commit.
> [...]
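For reference, the kind of flags being discussed; the values below are starting points to experiment with on one slave, not recommendations:

  java -Xmx2g -Xms2g \
       -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
       -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly \
       -XX:NewSize=256m -XX:MaxNewSize=256m -XX:SurvivorRatio=4 \
       -XX:+UseCMSCompactAtFullCollection \
       -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log \
       -jar start.jar

CMSInitiatingOccupancyFraction plus UseCMSInitiatingOccupancyOnly is the "start collecting the tenured generation earlier" knob mentioned above, and UseCMSCompactAtFullCollection addresses the fragmentation point.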
Solr Analysis Package
I'd like to use the filter factories in the org.apache.solr.analysis package for tokenizing text in a separate application. I need to chain a tokenizer and a couple of filters together, the way Solr does at index and query time. I have looked into the TokenizerChain class for this and have successfully implemented a tokenization chain, but I was wondering whether there is an established way to do it; I just hacked together something that happened to work. Below is a code snippet. Any advice would be appreciated. Dependencies: solr-core-1.4.0, lucene-core-2.9.3, lucene-snowball-2.9.3. I am not tied to these and could use different versions. P.S. Is this more of a question for the solr-dev mailing list?

import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;
import org.apache.solr.analysis.*;

// Whitespace tokenizer, as used by Solr's text field types
TokenizerFactory tokenizer = new WhitespaceTokenizerFactory();

// Snowball (Porter) stemmer; empty args default to English
Map<String, String> args = new HashMap<String, String>();
SnowballPorterFilterFactory porterFilter = new SnowballPorterFilterFactory();
porterFilter.init(args);

// Word delimiter filter configured roughly like a typical "text" field type
args = new HashMap<String, String>();
args.put("generateWordParts", "1");
args.put("generateNumberParts", "1");
args.put("catenateWords", "1");
args.put("catenateNumbers", "1");
args.put("catenateAll", "0");
WordDelimiterFilterFactory wordFilter = new WordDelimiterFilterFactory();
wordFilter.init(args);

LowerCaseFilterFactory lowercaseFilter = new LowerCaseFilterFactory();

// Chain: whitespace tokenizer -> word delimiter -> lowercase -> stemmer
TokenFilterFactory[] filters = new TokenFilterFactory[] { wordFilter, lowercaseFilter, porterFilter };
TokenizerChain chain = new TokenizerChain(tokenizer, filters);

// builder holds the text to analyse; a sample value is used here for illustration
StringBuilder builder = new StringBuilder("Some text to tokenize");
TokenStream stream = chain.tokenStream(null, new StringReader(builder.toString()));
TermAttribute tm = (TermAttribute) stream.getAttribute(TermAttribute.class);
while (stream.incrementToken()) {
    System.out.println(tm.term());
}
Re: Performance optimization of Proximity/Wildcard searches
Hi, Yes, assuming you didn't change the index files, say by optimizing the index, the hot portions of the index should remain in the OS cache unless something else kicked them out. Re the other thread - I don't think I have those messages any more. Otis --- Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Salman Akram > To: solr-user@lucene.apache.org > Sent: Mon, February 7, 2011 2:49:44 AM > Subject: Re: Performance optimization of Proximity/Wildcard searches > > Only a couple of thousand documents are added daily so the old OS cache should > still be useful since old documents remain the same, right? > > Also can you please comment on my other thread related to Term Vectors? > Thanks! > > On Sat, Feb 5, 2011 at 8:40 PM, Otis Gospodnetic > wrote: > > > Yes, OS cache mostly remains (obviously index files that are no longer > > around > > are going to remain in the OS cache for a while, but will be useless and > > gradually > > replaced by new index files). > > How long warmup takes is not relevant here, but what queries you use to > > warm up > > the index and how much you auto-warm the caches. > > > > Otis > > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > > Lucene ecosystem search :: http://search-lucene.com/ > > > > > > > > - Original Message > > > From: Salman Akram > > > To: solr-user@lucene.apache.org > > > Sent: Sat, February 5, 2011 4:06:54 AM > > > Subject: Re: Performance optimization of Proximity/Wildcard searches > > > > > > Correct me if I am wrong. > > > > > > Commit in index flushes SOLR cache but of course OS cache would still be > > > useful? If an index is updated every hour then a warm up that takes > > less > > > than 5 mins should be more than enough, right? > > > > > > On Sat, Feb 5, 2011 at 7:42 AM, Otis Gospodnetic < > > otis_gospodne...@yahoo.com > > > > wrote: > > > > > > > Salman, > > > > > > > > Warming up may be useful if your caches are getting decent hit ratios. > > > > Plus, you > > > > are warming up the OS cache when you warm up. > > > > > > > > Otis > > > > > > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > > > > Lucene ecosystem search :: http://search-lucene.com/ > > > > > > > > > > > > > > > > - Original Message > > > > > From: Salman Akram > > > > > To: solr-user@lucene.apache.org > > > > > Sent: Fri, February 4, 2011 3:33:41 PM > > > > > Subject: Re: Performance optimization of Proximity/Wildcard searches > > > > > > > > > > I know so we are not really using it for regular warm-ups (in any > > case > > > > index > > > > > is updated on hourly basis). Just tried few times to compare > > results. > > > > The > > > > > issue is I am not even sure if warming up is useful for such > > regular > > > > > updates. > > > > > > > > > > > > > > > > > > > > On Fri, Feb 4, 2011 at 5:16 PM, Otis Gospodnetic < > > > > otis_gospodne...@yahoo.com > > > > > > wrote: > > > > > > > > > > > Salman, > > > > > > > > > > > > I only skimmed your email, but wanted to say that this part > > sounds a > > > > little > > > > > > suspicious: > > > > > > > > > > > > > Our warm up script currently executes all distinct queries in > > our > > > > logs > > > > > > > having count > 5. It was run yesterday (with all the indexing > > > > update > > > > > > every > > > > > > > > > > > > It sounds like this will make warmup take a long time, assuming > > > > you > > > > > > have > > > > > > more than a handful of distinct queries in your logs.
> > > > > > > > > > > > Otis > > > > > > > > > > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > > > > > > Lucene ecosystem search :: http://search-lucene.com/ > > > > > > > > > > > > > > > > > > > > > > > > - Original Message > > > > > > > From: Salman Akram > > > > > > > To: solr-user@lucene.apache.org; t...@statsbiblioteket.dk > > > > > > > Sent: Tue, January 25, 2011 6:32:48 AM > > > > > > > Subject: Re: Performance optimization of Proximity/Wildcard > > searches > > > > > > > > > > > > > > By warmed index you only mean warming the SOLR cache or OS > > cache? As > > > > I > > > > > > said > > > > > > > our index is updated every hour so I am not sure how much SOLR > > cache > > > > > > would > > > > > > > be helpful but OS cache should still be helpful, right? > > > > > > > > > > > > > > I haven't compared the results with a proper script but from > > manual > > > > > > testing > > > > > > > here are some of the observations. > > > > > > > > > > > > > > 'Recent' queries which are in cache of course return > > immediately > > > > (only > > > > > > if > > > > > > > they are exactly the same - even if they took 3-4 mins first > > time). I > > > > will > > > > > > need > > > > > > > to test how many recent queri
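For context on the warming being discussed: in Solr this is usually configured in solrconfig.xml through cache autowarming and newSearcher listener queries rather than an external script. A minimal sketch with purely illustrative values (not taken from this thread):

<query>
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">a frequent query from the logs</str>
        <str name="rows">10</str>
      </lst>
    </arr>
  </listener>
</query>

autowarmCount regenerates the most recently used cache entries after a commit, and the newSearcher queries touch the index files so both the Solr caches and the OS cache are warm before the new searcher serves traffic.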
Re: nested faceting ?
I think what you are trying to achieve is called taxonomy faceting. There is a solution for that; check the slides on taxonomy faceting: http://www.lucidimagination.com/solutions/webcasts/faceting However, I don't know whether you can render the whole hierarchy at once. The solution I point to handles one hierarchy level at a time:
devices (100)
accessories (1000)
If "Devices" is selected/clicked, then show --> Samsung (50), Sharp (50)
If "Accessories" is selected/clicked, then show --> Samsung (500), Apple (500)
-- View this message in context: http://lucene.472066.n3.nabble.com/nested-faceting-tp2389841p2449439.html Sent from the Solr - User mailing list archive at Nabble.com.
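For illustration, one common way to model this kind of hierarchy (not necessarily the approach in the linked slides) is to index depth-prefixed path tokens in a multivalued string field and drill down with facet.prefix; the field name and values below are invented:

Indexed values for a Samsung phone document: 0/devices and 1/devices/samsung
Top level: ...&facet=true&facet.field=category_path&facet.prefix=0/
After "devices" is clicked: ...&facet=true&facet.field=category_path&facet.prefix=1/devices/

Each click simply deepens the prefix, so only one level of the hierarchy is faceted per request.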
Solr n00b question: writing a custom QueryComponent
Hi all, I've been a Solr user for a while, and now I need to add some functionality to Solr for which I'm trying to write a custom QueryComponent. I couldn't get much help from web search, so I'm turning to solr-user for help. I'm implementing search for (micro)blog aggregation, and we use Solr 1.4.1. In the current config the title and content fields are both indexed and stored in Solr. Storing them takes up a lot of space, even with compression, so I'd like to store the title and content fields in MySQL instead and retrieve them for the results with an id lookup. Using the DataImportHandler won't work because we store just the title and content fields in MySQL; the rest of the fields are in Solr itself. I wrote a custom component by extending QueryComponent and overriding only the finishStage(ResponseBuilder) function, where I try to retrieve the necessary records from MySQL. The new QueryComponent is registered in solrconfig.xml, and I can see from the component timings in Solr's debug output that it is being loaded. But the strange thing is that finishStage() is not being called before results are returned. What am I missing? Secondly, fields like ResponseBuilder._responseDocs are visible only within the org.apache.solr.handler.component package. How do I access the results from my own package? If you folks can point me to a wiki page or a sample custom QueryComponent, that would be great. -- Thanks in advance. Ishwar. Just another resurrected Neozoic Archosaur comics. http://www.flickr.com/photos/mojosaurus/sets/72157600257724083/
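For readers, registering a custom search component in solrconfig.xml typically looks like the sketch below; the class and handler names are invented for illustration and are not the poster's actual configuration:

<searchComponent name="mysqlFields" class="com.example.MysqlFieldsComponent"/>

<requestHandler name="/select" class="solr.SearchHandler">
  <arr name="last-components">
    <str>mysqlFields</str>
  </arr>
</requestHandler>

Declaring it under last-components leaves the stock query, facet and highlight components in place and runs the custom one after them; alternatively, declaring a searchComponent with name="query" replaces Solr's built-in QueryComponent outright.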
Re: q.alt=*:* for every request?
To be able to "see" this well, it would be lovely to have a switch that activates logging of the query expansion result. The dismax QParserPlugin is particularly powerful in that regard, so it would be nice to see what's happening. Is there a logging category I need to activate? paul On 8 February 2011 at 03:22, Markus Jelsma wrote: > There is no measurable performance penalty from setting the parameter, except > perhaps when the query is executed with a very high value for rows. To make things > easy, you can define q.alt=*:* as a default in your request handler; there is no need to > specify it in the URL. > > >> Hi, >> >> I use the dismax handler with Solr 1.4. >> Sometimes my request comes with q and fq, and sometimes it comes without q >> (only fq and q.alt=*:*). Is it OK to send q.alt=*:* for every >> request? Does it have side effects on performance?
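For what it's worth, the expanded query can also be inspected per request, without extra logging, through Solr's standard debug output; the host, field list and query term below are only examples:

http://localhost:8983/solr/select?q=ipod&defType=dismax&qf=title+content&debugQuery=true

The debug section of the response then includes querystring, parsedquery and parsedquery_toString, which show exactly what the dismax parser expanded the input into.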