Re: Testing Solr4 - first impressions and problems
Hi Shawn, The transaction log is only being used to support near-real-time search at the moment, I think, so it sounds like it's surplus to requirements for your use-case. I'd just turn it off. Alan Woodward www.romseysoftware.co.uk On 15 Oct 2012, at 07:04, Shawn Heisey wrote: > On 10/14/2012 5:45 PM, Erick Erickson wrote: >> About your second point. Try committing more often with openSearcher >> set to false. >> There's a bit here: >> http://wiki.apache.org/solr/SolrConfigXml >> >> >> 1 >> 15000 >> false >> >> >> >> That should keep the size of the transaction log down to reasonable levels... > > I have autocommit turned completely off -- both values set to zero. The DIH > import from MySQL, over 12 million rows per shard, is done in one go on all > my build cores at once, then I swap cores. It takes a little over three > hours and produces a 22GB index. I have batchSize set to -1 so that jdbc > streams the records. > > When I first set this up back on 1.4.1, I had some kind of severe problem > when autocommit was turned on. I can no longer remember what it caused, but > it was a huge showstopper of some kind. > > Thanks, > Shawn >
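The settings Erick points at live in the updateHandler section of solrconfig.xml. A minimal sketch of such a hard-autocommit block, assuming the usual element names (the maxDocs value here is illustrative only):

    <autoCommit>
      <maxDocs>10000</maxDocs>            <!-- illustrative: commit after this many buffered docs -->
      <maxTime>15000</maxTime>            <!-- commit at least every 15 seconds -->
      <openSearcher>false</openSearcher>  <!-- flush without opening a new searcher -->
    </autoCommit>

With openSearcher=false these commits only flush data and roll over the transaction log, keeping it small; they do not make the new documents visible to searches.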
Re: Solr4 - no examples of postingsFormat in schema.xml
The extra codecs are supplied in a separate jar file now (lucene-codecs-4.0.0.jar) - I guess this isn't being packaged into solr.war by default? You should be able to download it here: http://search.maven.org/remotecontent?filepath=org/apache/lucene/lucene-codecs/4.0.0/lucene-codecs-4.0.0-javadoc.jar and drop it into the lib/ directory. On 15 Oct 2012, at 00:49, Shawn Heisey wrote: > On 10/14/2012 3:21 PM, Rafał Kuć wrote: >> Hello! >> >> Try adding the following to solrconfig.xml: >> >> > > I did this and got a little further, but still no go. From what it's saying > now, I don't think it will be possible in the current state of branch_4x to > use anything but the default. > > SEVERE: null:java.lang.IllegalArgumentException: A SPI class of type > org.apache.lucene.codecs.PostingsFormat with name 'Block' does not exist. You > need to add the corresponding JAR file supporting this SPI to your > classpath.The current classpath supports the following names: [Lucene40] > > I saw that LUCENE-4446 was applied to branch_4x a few hours ago. I did 'svn > up' and rebuilt Solr. Trying again, it appears to be using Lucene41, which I > believe is the Block format. But when I tried to change the format for my > unique key fields to Bloom, that still didn't work. Is this something I > should file an issue on? > > SEVERE: null:java.lang.IllegalArgumentException: A SPI class of type > org.apache.lucene.codecs.PostingsFormat with name 'Bloom' does not exist. You > need to add > the corresponding JAR file supporting this SPI to your classpath.The current > classpath supports the following names: [Lucene40, Lucene41] > > Thanks, > Shawn >
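For anyone following along, the solrconfig.xml directive Rafał's suggestion refers to is almost certainly the schema-aware codec factory, which is what lets per-field postingsFormat attributes in schema.xml take effect (this is confirmed further down the thread):

    <codecFactory class="solr.SchemaCodecFactory"/>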
Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7
Hi Rogerio, I can imagine what it is. Tomcat extracts the war files into /var/lib/tomcatXX/webapps. If you previously ran an older Solr version on your server, the old extracted Solr war could still be there (keyword: Tomcat cache). Delete the /var/lib/tomcatXX/webapps/solr folder and restart Tomcat; it should then deploy your new war file. Best regards Vadim 2012/10/14 Rogerio Pereira : > I'll try to be more specific, Jack. > > I just downloaded apache-solr-4.0.0.zip; from this archive I took the core1 and core2 folders from the multicore example and renamed them to collection1 and collection2. I also made all the necessary changes to solr.xml, solrconfig.xml and schema.xml on these two cores to reflect the new names. > > After this step I just tried to deploy the war file on Tomcat, pointing to the directory (solr/home) where these two cores are located; solr.xml is there, with collection1 and collection2 properly configured. > > The question is: no matter what is contained in solr.xml, this file isn't read at Tomcat startup. I tried to cause a parser error in solr.xml by removing closing tags, but even with this change I can't get so much as a parser error. > > I hope to be clear now. > > > 2012/10/14 Jack Krupansky > >> I can't quite parse "the same multicore deployment as we have on apache solr 4.0 distribution archive". Could you rephrase and be more specific. What "archive"? >> >> Were you already using 4.0-ALPHA or BETA (or some snapshot of 4.0) or are you moving from pre-4.0 to 4.0? The directory structure did change in 4.0. Look at the example/solr directory. >> >> -- Jack Krupansky >> >> -Original Message- From: Rogerio Pereira >> Sent: Sunday, October 14, 2012 10:01 AM >> To: solr-user@lucene.apache.org >> Subject: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7 >> >> >> Hi, >> >> I tried to perform the same multicore deployment as we have in the apache solr 4.0 distribution archive. I created a directory for solr/home with solr.xml inside and two subdirectories, collection1 and collection2; these two cores are properly configured with a conf folder and solrconfig.xml and schema.xml. On Tomcat I set up the system property pointing to the solr/home path, but unfortunately when I start Tomcat the solr.xml is ignored and only the default collection1 is loaded. >> >> As a test, I made changes to solr.xml to cause parser errors, and guess what? These errors aren't reported on Tomcat startup. >> >> The same thing doesn't happen with the multicore example that comes in the distribution archive, so now I'm trying to figure out what black magic is happening. >> >> Let me do the same kind of deployment on Windows and Mac OSX; if the problem persists, I'll update this thread. >> >> Regards, >> >> Rogério >> > > > > -- > Regards, > > Rogério Pereira Araújo > > Blogs: http://faces.eti.br, http://ararog.blogspot.com > Twitter: http://twitter.com/ararog > Skype: rogerio.araujo > MSN: ara...@hotmail.com > Gtalk/FaceTime: rogerio.ara...@gmail.com > > (0xx62) 8240 7212 > (0xx62) 3920 2666
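Rogerio mentions setting the solr/home system property; an alternative that ties solr/home to the deployed webapp is a per-context JNDI entry in Tomcat (e.g. conf/Catalina/localhost/solr.xml). A sketch with placeholder paths - adjust docBase and the value to your own layout:

    <Context docBase="/opt/solr/apache-solr-4.0.0.war" debug="0" crossContext="true">
      <Environment name="solr/home" type="java.lang.String"
                   value="/opt/solr/home" override="true"/>
    </Context>

Either way, the solr.xml that Solr reads at startup must sit directly inside that solr/home directory.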
Re: add shard to index
Can you share more, please? I do not know exactly what the formula for calculating the ratio is. If you have something like: (term count in shard 1 + term count in shard 2) / num documents in all shards then just use the shard size as a weight while computing it: (term count in shard 1 * shard1 keyspace size + term count in shard 2 * shard2 keyspace size) / (num documents in all shards * all shards keyspace size)
Solr reports: "Can not read response from server" when running import
Hi, I am trying to import a MySQL database: sampledatabase. When I run the full import command http://localhost:8983/solr/db/dataimport?command=full-import in the browser, I get the following error in the terminal after about 1 minute. Oct 16, 2012 3:49:20 PM org.apache.solr.core.SolrCore execute INFO: [db] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=0 Oct 16, 2012 3:49:20 PM org.apache.solr.common.SolrException log SEVERE: Exception while processing: customer document : SolrInputDocument[{}]:org.apache.solr.handler.dataimport.DataImportHandlerException : Unable to execute query: select contactLastName from customers Processing Document # 1 . . . Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) . . . Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost. at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3039) at com.mysql.jdbc.MysqlIO.readPacket(MysqlIO.java:592) ... 31 more . . . My dataconfig.xml file looks like the following : - - - Relevant portion of schema.xml : Relevant portion of solrconfig.xml : - - data-config.xml Kindly let me know what the issue could be. Thanks and regards, Romita Saha Panasonic R&D Center Singapore Blk 1022 Tai Seng Avenue #06-3530 Tai Seng Ind. Est. Singapore 534415 DID: (65) 6550 5383 FAX: (65) 6550 5459 email: romita.s...@sg.panasonic.com
Re: Solr reports: "Can not read response from server" when running import
Hi, On 15 Oct 2012, at 11:02, Romita Saha wrote: > My dataconfig.xml file looks like the following : > > - > url="jdbc:mysql://localhost:8983/home/demo/snp-comm/sampledatabase" /> > - > - > > > > The error information means that the connection wasn't accepted by the server. I would make sure that a) your connection URL is correct as it looks wrong to me - i.e. your database name in the URL looks like a path[1] - and b) your binding address is correct in your config file (my.cnf) and your associated host/DNS entries would let you resolve it. Cheers, Dave [1] http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html
Re: Solr reports: "Can not read response from server" when running import
Hi Dave, Thank you for your prompt reply. The name of the database I am using is sampledatabase.sql and it is located in the home/demo/snp-comm folder. Hence I have specified the url as url="jdbc:mysql://localhost:8983/home/demo/snp-comm/sampledatabase.sql" /> Could you please specify which conf file I need to look into? Thanks and regards, Romita From: Dave Meikle To: solr-user@lucene.apache.org, Date: 10/15/2012 06:24 PM Subject: Re: Solr reports: "Can not read response from server" when running import Hi, On 15 Oct 2012, at 11:02, Romita Saha wrote: > My dataconfig.xml file looks like the following : > > - > url="jdbc:mysql://localhost:8983/home/demo/snp-comm/sampledatabase" /> > - > - > > > > The error information means that the connection wasn't accepted by the server. I would make sure that a) your connection URL is correct as it looks wrong to me - i.e. your database name in the URL looks like a path[1] - and b) your binding address is correct in your config file (my.cnf) and your associated host/DNS entries would let you resolve it. Cheers, Dave [1] http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html
Re: Solr reports: "Can not read response from server" when running import
Hi Romita, On 15 Oct 2012, at 11:46, Romita Saha wrote: > Thank you for your prompt reply. The name of the database am using is > sampledatabase.sql and it is located in home/demo/snp-comm folder. Hence I > have specified the url as > > url="jdbc:mysql://localhost:8983/home/demo/snp-comm/sampledatabase.sql" /> I suspect this is your problem, in that the MySQL JDBC driver is expecting to connect to a server where this database is hosted, as opposed to the file you have specified. I assume from the name that sampledatabase.sql is just a SQL script, so I suggest you load it into a MySQL server and then connect to the database on that server. Cheers, Dave
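Putting Dave's suggestion together: load the script into a running MySQL server first, then point the DIH dataSource at that server and database. A sketch of the corrected dataSource element - the database name, user and password are assumptions, and 3306 is MySQL's default port (8983 in the original URL is Solr's Jetty port, not MySQL's):

    <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost:3306/sampledatabase"
                user="dbuser" password="dbpass"/>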
Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7
Hi Vadim, In fact tomcat is running in another non standard path, there's no old version deployed on tomcat, I double checked it. Let me try in another environment. -Mensagem Original- From: Vadim Kisselmann Sent: Monday, October 15, 2012 6:01 AM To: solr-user@lucene.apache.org ; rogerio.ara...@gmail.com Subject: Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7 Hi Rogerio, i can imagine what it is. Tomcat extract the war-files in /var/lib/tomcatXX/webapps. If you already run an older Solr-Version on your server, the old extracted Solr-war could still be there (keyword: tomcat cache). Delete the /var/lib/tomcatXX/webapps/solr - folder and restart tomcat, when Tomcat should put your new war-file. Best regards Vadim 2012/10/14 Rogerio Pereira : I'll try to be more specific Jack. I just download the apache-solr-4.0.0.zip, from this archive I took the core1 and core2 folders from multicore example and rename them to collection1 and collection2, I also did all necessary changes on solr.xml and solrconfig.xml and schema.xml on these two correct to reflect the new names. After this step I just tried to deploy and war file on tomcat pointing to the the directory (solr/home) where these two cores are located, solr.xml is there, with collection1 and collection2 properly configured. The question is, now matter what is contained on solr.xml, this file isn't read at Tomcat startup, I tried to cause a parser error on solr.xml by removing closing tags, but even with this change I can't get at least a parser error. I hope to be clear now. 2012/10/14 Jack Krupansky I can't quite parse "the same multicore deployment as we have on apache solr 4.0 distribution archive". Could you rephrase and be more specific. What "archive"? Were you already using 4.0-ALPHA or BETA (or some snapshot of 4.0) or are you moving from pre-4.0 to 4.0? The directory structure did change in 4.0. Look at the example/solr directory. -- Jack Krupansky -Original Message- From: Rogerio Pereira Sent: Sunday, October 14, 2012 10:01 AM To: solr-user@lucene.apache.org Subject: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7 Hi, I tried to perform the same multicore deployment as we have on apache solr 4.0 distribution archive, I created a directory for solr/home with solr.xml inside and two subdirectories collection1 and collection2, these two cores are properly configured with conf folder and solrconfi.xml and schema.xml, on Tomcat I setup the system property pointing to solr/home path, unfortunatelly when I start tomcat the solr.xml is ignored and only the default collection1 is loaded. As a test, I made changes on solr.xml to cause parser errors, and guess what? These errors aren't reported on tomcat startup. The same thing doesn't happens on multicore example that comes on distribution archive, now I'm trying to figure out what's the black magic happening. Let me do the same kind of deployment on Windows and Mac OSX, if persist, I'll update this thread. Regards, Rogério -- Regards, Rogério Pereira Araújo Blogs: http://faces.eti.br, http://ararog.blogspot.com Twitter: http://twitter.com/ararog Skype: rogerio.araujo MSN: ara...@hotmail.com Gtalk/FaceTime: rogerio.ara...@gmail.com (0xx62) 8240 7212 (0xx62) 3920 2666
Selective Sorting in Solr
Hi, I have many documents indexed into Solr. I am now facing a requirement where the search results should be returned sorted based on their scores. In the *case of non-exact matches*, if there is a tie, another level of sorting is to be applied on a field called priority. I am using solr with django-haystack in django 1.4. What can/should I do to achieve my requirement? What I tried: I have ordered the SearchQuerySet method by('-score', 'priority'), but this also applies to exact matches having the same score. What should I try to achieve the above? Is it even possible to achieve what I am trying? I have even posted a question on StackOverflow: http://stackoverflow.com/questions/12890165/selective-sorting-in-solr Hoping for your guidance. -- Regards, Sandip Agarwal
Re: Selective Sorting in Solr
sort=score desc, priority desc Won't that do it? Upayavira On Mon, Oct 15, 2012, at 09:14 AM, Sandip Agarwal wrote: > Hi, > > I have many documents indexed into Solr. I am now facing a requirement > where the search results should be returned sorted based on their scores. > In the *case of non-exact matches*, if there is a tie, another level of > sorting is to be applied on a field called priority. > > I am using solr with django-haystack in django 1.4. > > What can/should I do to achieve my requirement? > What I tried: > I have ordered the SearchQuerySet method by('-score', 'priority'), but > this > also applies to exact matches having the same score. What should I try to > achieve the above? > Is it even possible to achieve what I am trying? > > I have even posted a question on StackOverflow: > http://stackoverflow.com/questions/12890165/selective-sorting-in-solr > > Hoping for your guidance. > > -- > Regards, > Sandip Agarwal
Solr - nested entity need to return two fields
Following entity definition: You see, my problem is that the nested entity 'activity' needs to return 2 fields, namely 'activityNotes', plus a field from a nested entity 'location', namely 'locationName'. How can I realize that? Both fields should be indexed. Is my only option to COALESCE both strings? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-nested-entity-need-to-return-two-fields-tp4013701.html Sent from the Solr - User mailing list archive at Nabble.com.
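One possible shape for such a definition - a rough sketch in which every table, column and field name is an assumption - gives each nested entity its own field mapping, so both values end up in separate indexed Solr fields rather than being coalesced:

    <entity name="activity"
            query="SELECT id, notes FROM activity WHERE parent_id = '${parent.id}'">
      <field column="notes" name="activityNotes"/>
      <entity name="location"
              query="SELECT name FROM location WHERE activity_id = '${activity.id}'">
        <field column="name" name="locationName"/>
      </entity>
    </entity>

The corresponding schema.xml would then declare activityNotes and locationName as indexed fields.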
exception when starting single instance solr-4.0.0
Hi, while starting solr-4.0.0 I get the following exception: SEVERE: null:java.lang.IllegalAccessError: class org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat cannot access its superclass org.apache.lucene.codecs.lucene3x.Lucene3xPostingsFormat Very strange, because some lines earlier in the logs I have: Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrConfig initLibs INFO: Adding specified lib dirs to ClassLoader Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader INFO: Adding 'file:/srv/www/solr/solr-4.0.0/lib/lucene-core-4.0-SNAPSHOT.jar' to classloader Why is solr-4.0.0 thinking that the superclass is not there? Any ideas? Regards Bernd
Re: core.SolrCore - java.io.FileNotFoundException
I have no idea how you managed to get so many files in your index directory, but that's definitely weird. How it relates to your "file not found", I'm not quite sure, but it could be something as simple as you've run out of file handles. So you could try upping the number of file handles as a _temporary_ fix just to see if that's the problem. See your op-system's manuals for how. If it does work, then I'd run an optimize down to one segment and remove all the segment files _other_ than that one segment. NOTE: this means things like .fdt, .fdx, .tii files etc. NOT things like segments.gen and segments_1. Make a backup of course before you try this. But I think that's secondary. To generate this many fiels I suspect you've started a lot of indexing jobs that you then abort (hard kill?). To get this many files I'd guess it's something programmatic, but that's a guess. How are you committing? Autocommit? From a SolrJ (or equivalent) program? Have you implemented any custom merge policies? But to your immediate problem. You can try running CheckIndex (here's a tutorial from 2.9, but I think it's still good): http://java.dzone.com/news/lucene-and-solrs-checkindex If that doesn't help (and you can run it in diagnostic mode, without the --fix flag to see what it _would_ do) then I'm afraid you'll probably have to re-index. And you've got to get to the root of why you have so many segment files. That number is just crazy Best Erick On Sun, Oct 14, 2012 at 11:20 PM, Jun Wang wrote: > PS, I have found that there lots of segment in index directory, and most of > them is empty, like . totoal file number is 35314 in index directory. > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3n.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3o.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3o.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3p.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3p.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3q.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3q.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3r.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3r.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3s.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3s.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3t.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3t.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3u.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3u.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3v.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3v.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3w.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3w.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3x.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3x.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3y.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3y.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3z.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3z.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k40.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k40.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k41.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k41.fdx > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k42.fdt > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k42.fdx > > > > > 2012/10/15 Jun Wang > >> I have encounter the a FileNotFoundException exception occasionally when >> indexing, it's not occur every time. Anyone have some clue? 
Here is >> the traceback: >> >> 2012-10-14 11:37:28,105 ERROR core.SolrCore - >> java.io.FileNotFoundException: >> /home/admin/run/deploy/solr/core_p_shard2/data/index/_cwo.fnm (No such file >> or directory) >> at java.io.RandomAccessFile.open(Native Method) >> at java.io.RandomAccessFile.(RandomAccessFile.java:216) >> at >> org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:218) >> at >> org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:232) >> at >> org.apache.lucene.codecs.lucene40.Lucene40FieldInfosReader.read(Lucene40FieldInfosReader.java:47) >> at >> org.apache.lucene.index.SegmentCoreReaders.(SegmentCoreReaders.java:101) >> at >> org.apache.lucene.index.SegmentReader.(SegmentReader.java:55) >> at >> org.apache.lucene.index.ReadersAndLiveDocs.getReader(ReadersAndLiveDocs.java:120) >> at >> org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:267) >> at >> org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2928) >> at >> org.apache.lucene.index.DocumentsWriter.applyAllDeletes(DocumentsWriter.java:180) >> at >> org.apache.lucene.index.DocumentsWriter.postUpdate(DocumentsWriter.java:310) >> at >> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:386)
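For reference, a typical invocation of the CheckIndex tool Erick mentions above looks roughly like this against a Solr 4.x index (jar name and index path are placeholders; the first form only reports, while -fix actually drops unrecoverable segments, so back up before using it):

    java -cp lucene-core-4.0.0.jar -ea:org.apache.lucene... \
        org.apache.lucene.index.CheckIndex /path/to/solr/data/index

    java -cp lucene-core-4.0.0.jar -ea:org.apache.lucene... \
        org.apache.lucene.index.CheckIndex /path/to/solr/data/index -fix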
Copy Field Question
Can we put a condition on the copyField source? For example, if we want to copy source="product_name" into dest="some_dest", our syntax would become <copyField source="product_name" dest="some_dest"/>. How about copying only those product_names having status=0 AND attribute1=1 AND attribute2=0? Assume status, attribute1, attribute2 and product_name are different attributes of the same table. Can we write something like <copyField source="product_name" source="status:0" AND source="attribute1:1" AND source="attribute2:0" dest="some_dest" maxchar=200>? Thanks in advance
Re: Copy Field Question
Hello, I think you don't have that many tuning possibilities using only the schema.xml file. You will have to write some custom Java code (subclasses of UpdateRequestProcessor and UpdateRequestProcessorFactory), build a Java jar containing your custom code, put that jar in one of the paths declared in your solrconfig.xml (via a <lib dir="..."/> directive) -- or add a new one, and finally tune the update processor chain configuration (still in solrconfig.xml) so your custom update processor is used. See http://wiki.apache.org/solr/UpdateRequestProcessor which uses exactly your use case as an example. I hope this will help you :) -- Tanguy 2012/10/15 Virendra Goswami > Can we limit copyfield source condition? > for example if we want to make lookup in source="product_name" and > dest="some_dest" > so our syntax would become > > How about copying only those product_names having status=0 AND attribute1=1 > AND attribute2=0. > assume status,attribute1,attribute2 and product_name being two different > attribute of a same table. > can we write something like > source="attribute:1" AND source="attribute2:0" dest="some_dest" > maxchar=200> > > Thanks in advance >
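To make that concrete, here is a minimal sketch of such a conditional-copy processor. The class name is made up, the field names and the status/attribute conditions are taken from the question, and the solrconfig.xml wiring (registering the factory in an updateRequestProcessorChain) is described on the wiki page above:

    import java.io.IOException;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.response.SolrQueryResponse;
    import org.apache.solr.update.AddUpdateCommand;
    import org.apache.solr.update.processor.UpdateRequestProcessor;
    import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

    // Copies product_name into some_dest only when the document matches the condition.
    public class ConditionalCopyProcessorFactory extends UpdateRequestProcessorFactory {
      @Override
      public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp,
                                                UpdateRequestProcessor next) {
        return new UpdateRequestProcessor(next) {
          @Override
          public void processAdd(AddUpdateCommand cmd) throws IOException {
            SolrInputDocument doc = cmd.getSolrInputDocument();
            if ("0".equals(String.valueOf(doc.getFieldValue("status")))
                && "1".equals(String.valueOf(doc.getFieldValue("attribute1")))
                && "0".equals(String.valueOf(doc.getFieldValue("attribute2")))) {
              doc.addField("some_dest", doc.getFieldValue("product_name"));
            }
            super.processAdd(cmd); // hand the document on to the rest of the chain
          }
        };
      }
    }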
Solr - Can not set java.sql.Timestamp field …created to java.util.Date
Hi there! I cannot read timestamp data from QueryResponse (I want to cast the result to a POJO). If I'm using SolrDocumentList there are no errors. db-data-config.xml: schema.xml: the field 'created' is a timestamp in my database and after inserting index data a result looks like this (called via the browser admin console): 1 2012-10-05T07:29:23.387Z message test second third Ashley 10 Morgan DISCUSSION headline test ... Now I tried to query for 'all results': public void search(String searchString) { SolrQuery query = new SolrQuery(); QueryResponse rsp; try { query = new SolrQuery(); query.setQuery(DEFAULT_QUERY); query.setRows(246); rsp = getServer().query(query); SolrDocumentList solrDocumentList = rsp.getResults(); // works List beans = rsp.getBeans(SearchRequestResponseObject.class); // IllegalArgumentException } catch (Exception e) { // TODO Auto-generated catch block e.printStackTrace(); } } SearchRequestResponseObject.class: public class SearchRequestResponseObject { @Field private String id; @Field private String title; @Field @Temporal(TemporalType.TIMESTAMP) //@DateTimeFormat(style = "MMdd HH:mm:ss z") //@DateTimeFormat(style = "MMdd") private Timestamp created; ... } Exception: Caused by: java.lang.IllegalArgumentException: Can not set java.sql.Timestamp field com.ebcont.redbull.wtb.solr.SearchRequestResponseObject.created to java.util.Date at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:146) at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:150) at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:63) at java.lang.reflect.Field.set(Field.java:657) at org.apache.solr.client.solrj.beans.DocumentObjectBinder$DocField.set(DocumentObjectBinder.java:374) ... 45 more What am I doing wrong? :( -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Can-not-set-java-sql-Timestamp-field-created-to-java-util-Date-tp4013717.html Sent from the Solr - User mailing list archive at Nabble.com.
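For what it's worth, SolrJ's DocumentObjectBinder hands back a java.util.Date for date fields, and a java.sql.Timestamp field cannot be assigned a plain Date (Timestamp is the subclass, not the superclass). A minimal sketch of one way around it - declare the bean field as java.util.Date and convert only where a Timestamp is really needed:

    import java.sql.Timestamp;
    import java.util.Date;
    import org.apache.solr.client.solrj.beans.Field;

    public class SearchRequestResponseObject {
        @Field
        private String id;

        @Field
        private Date created;   // SolrJ can set this directly

        // convert on demand where JDBC-style code expects a Timestamp
        public Timestamp getCreatedAsTimestamp() {
            return created == null ? null : new Timestamp(created.getTime());
        }
    }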
Re: Selective Sorting in Solr
Here is what I posted on StackOverflow: The boost in edismax can be used for this. It is applied to all scores, but if it is a small value, it will only make a difference for ties or near-ties. Significant differences in the base score will not be reordered. See: http://wiki.apache.org/solr/ExtendedDisMax#boost_.28Boost_Function.2C_multiplicative.29 wunder On Oct 15, 2012, at 5:16 AM, Upayavira wrote: > sort=score desc, priority desc > > Won't that do it? > > Upayavira > > On Mon, Oct 15, 2012, at 09:14 AM, Sandip Agarwal wrote: >> Hi, >> >> I have many documents indexed into Solr. I am now facing a requirement >> where the search results should be returned sorted based on their scores. >> In the *case of non-exact matches*, if there is a tie, another level of >> sorting is to be applied on a field called priority. >> >> I am using solr with django-haystack in django 1.4. >> >> What can/should I do to achieve my requirement? >> What I tried: >> I have ordered the SearchQuerySet method by('-score', 'priority'), but >> this >> also applies to exact matches having the same score. What should I try to >> achieve the above? >> Is it even possible to achieve what I am trying? >> >> I have even posted a question on StackOverflow: >> http://stackoverflow.com/questions/12890165/selective-sorting-in-solr >> >> Hoping for your guidance. >> >> -- >> Regards, >> Sandip Agarwal -- Walter Underwood wun...@wunderwood.org
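A sketch of what that might look like as request parameters - the field name, the 0.0001 constant and the direction are assumptions; the point is that the multiplicative boost stays very close to 1, so it only separates documents whose base scores are equal or nearly equal (invert the function if lower priority values should win the tie):

    defType=edismax
    q=some query
    qf=name^0.5
    boost=sum(1,product(0.0001,field(priority)))
    fl=*,score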
Re: core.SolrCore - java.io.FileNotFoundException
Hi, Erick Thanks for your advice. My mergeFactor is set to 10, so it's impossible have so many segments, specially some .fdx, .fdt file is just empty. And sometime indexing is working fine, ended with 200+ files in data dir. My deployment is having two core and two shard for every core, using autocommit , DIH is used for pull data from DB, merge policies is using TieredMergePolicy. there is nothing customized. I am wondering how could empty .fdx file generated. may be some config in indexConfig is wrong. My final index is about 20G, having 40m+ docs. here is part of my solrconfig.xml - 32 100 10 15000 false - PS, I found an other kind of log, but I am not sure it's the reason or the consequence. I am planing to open debug log, to gather more information tomorrow. 2012-10-14 10:13:19,854 ERROR update.CommitTracker - auto commit error...:java.io.FileNotFoundException: _cwj.fdt at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:266) at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177) at org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:103) at org.apache.lucene.index.IndexWriter.prepareFlushedSegment(IndexWriter.java:2126) at org.apache.lucene.index.DocumentsWriter.publishFlushedSegment(DocumentsWriter.java:495) at org.apache.lucene.index.DocumentsWriter.finishFlush(DocumentsWriter.java:474) at org.apache.lucene.index.DocumentsWriterFlushQueue$SegmentFlushTicket.publish(DocumentsWriterFlushQueue.java:201) at org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(DocumentsWriterFlushQueue.java:119) at org.apache.lucene.index.DocumentsWriterFlushQueue.tryPurge(DocumentsWriterFlushQueue.java:148) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:435) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:551) at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2657) at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2793) at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2773) at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:531) at org.apache.solr.update.CommitTracker.run(CommitTracker.java:214) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2012/10/15 Erick Erickson > I have no idea how you managed to get so many files in > your index directory, but that's definitely weird. How it > relates to your "file not found", I'm not quite sure, but it > could be something as simple as you've run out of file > handles. > > So you could try upping the number of > file handles as a _temporary_ fix just to see if that's > the problem. See your op-system's manuals for > how. > > If it does work, then I'd run an optimize > down to one segment and remove all the segment > files _other_ than that one segment. NOTE: this > means things like .fdt, .fdx, .tii files etc. NOT things > like segments.gen and segments_1. 
Make a > backup of course before you try this. > > But I think that's secondary. To generate this many > fiels I suspect you've started a lot of indexing > jobs that you then abort (hard kill?). To get this > many files I'd guess it's something programmatic, > but that's a guess. > > How are you committing? Autocommit? From a SolrJ > (or equivalent) program? Have you implemented any > custom merge policies? > > But to your immediate problem. You can try running > CheckIndex (here's a tutorial from 2.9, but I think > it's still good): > http://java.dzone.com/news/lucene-and-solrs-checkindex > > If that doesn't help (and you can run it in diagnostic mode, > without the --fix flag to see what it _would_ do) then I'm > afraid you'll probably have to re-index. > > And you've got to get to the root of why you have so > many segment files. That number is just crazy > > Best > Erick > > On Sun, Oct 14, 2012 at 11:20 PM, Jun Wang wrote: > > PS, I have found that there lots of segment in index directory, and most > of > > them is empty, like . totoal file number is 35314 in index directory. > > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37
Re: Solr4 - no examples of postingsFormat in schema.xml
On 10/15/2012 2:47 AM, Alan Woodward wrote: The extra codecs are supplied in a separate jar file now (lucene-codecs-4.0.0.jar) - I guess this isn't being packaged into solr.war by default? You should be able to download it here: http://search.maven.org/remotecontent?filepath=org/apache/lucene/lucene-codecs/4.0.0/lucene-codecs-4.0.0-javadoc.jar and drop it into the lib/ directory. This should not be required, because I am building from source. I compiled Solr from lucene-solr source checked out from branch_4x. I grepped the entire tree for lucene-codec and found nothing. It turns out that running 'ant generate-maven-artifacts' created the jar file -- along with a huge number of other jars that I don't need. It took an extremely long time to run, for a jar that's a little over 300KB. I would argue that the codecs jar should be created by compiling a dist target for Solr. Someone else should determine whether it's appropriate to put it in the .war file, but I think it's important enough to make available without compiling everything in the Lucene universe. ncindex@bigindy5 /index/src/branch_4x $ find . | grep "\.jar$" | grep codec ./solr/core/lib/commons-codec-1.7.jar ./dist/maven/org/apache/lucene/lucene-codecs/4.1-SNAPSHOT/lucene-codecs-4.1-20121015.165734-1.jar ./dist/maven/org/apache/lucene/lucene-codecs/4.1-SNAPSHOT/lucene-codecs-4.1-20121015.165734-1-javadoc.jar ./dist/maven/org/apache/lucene/lucene-codecs/4.1-SNAPSHOT/lucene-codecs-4.1-20121015.165734-1-sources.jar ./lucene/analysis/phonetic/lib/commons-codec-1.7.jar ./lucene/build/codecs/lucene-codecs-4.1-SNAPSHOT.jar ./lucene/build/codecs/lucene-codecs-4.1-SNAPSHOT-javadoc.jar ./lucene/build/codecs/lucene-codecs-4.1-SNAPSHOT-src.jar I put this jar in my lib, and now I get a new error when I try the BloomFilter postingsFormat: SEVERE: null:java.lang.UnsupportedOperationException: Error - org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat has been constructed without a choice of PostingsFormat at org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat.fieldsConsumer(BloomFilteringPostingsFormat.java:139) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:130) at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:335) at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85) at org.apache.lucene.index.TermsHash.flush(TermsHash.java:117) at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53) at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:82) at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:483) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:559) at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2656) at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2792) at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2772) at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:525) at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87) at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64) at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007) at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1750) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
Re: Using
Hi, Thanks for the suggestions. Didn't work for me :( I'm calling which depends on org.eclipse.jetty:jetty-server which depends on org.eclipse.jetty.orbit:jettty-servlet I think I'm experiencing https://jira.codehaus.org/browse/JETTY-1493. The pom file for http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.pom contains orbit, so ivy looks for http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.orbit rather than http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.jar hence my troubles. I'm an IVY newbie so maybe there is something I'm missing here? Is there another 'conf' value other than 'default' I can use? Thanks, Tricia On Fri, Oct 12, 2012 at 4:32 PM, P Williams wrote: > Hi, > > Has anyone tried using name="solr-test-framework" rev="4.0.0" conf="test->default"/> with Apache > IVY in their project? > > rev 3.6.1 works but any of the 4.0.0 ALPHA, BETA and release result in: > [ivy:resolve] :: problems summary :: > [ivy:resolve] WARNINGS > [ivy:resolve] [FAILED ] > org.eclipse.jetty.orbit#javax.servlet;3.0.0.v201112011016!javax.servlet.orbit: > (0ms) > [ivy:resolve] shared: tried > [ivy:resolve] > C:\Users\pjenkins\.ant/shared/org.eclipse.jetty.orbit/javax.servlet/3.0.0.v201112011016/orbits/javax.servlet.orbit > [ivy:resolve] public: tried > [ivy:resolve] > http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.orbit > [ivy:resolve] :: > [ivy:resolve] :: FAILED DOWNLOADS:: > [ivy:resolve] :: ^ see resolution messages for details ^ :: > [ivy:resolve] :: > [ivy:resolve] :: > org.eclipse.jetty.orbit#javax.servlet;3.0.0.v201112011016!javax.servlet.orbit > [ivy:resolve] :: > [ivy:resolve] > [ivy:resolve] > [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS > > Can anybody point me to the source of this error or a workaround? > > Thanks, > Tricia >
Re: Solr Cloud and Hadoop
Thank you very much Otis, regular old Solr distributed search was the piece I was missing. Now it's hands-on time! -- Rui
Re: Any filter to map mutiple tokens into one ?
On 10/14/12 12:19 PM, Jack Krupansky wrote: There's a miscommunication here somewhere. Is Solr 4.0 still passing "*:*" to the analyzer? Show us the parsed query for "*:*", as well as the debugQuery "explain" for the score. I'm not quite sure what you mean by the parsed query for "*:*". This fake analyzer using NGramTokenizer divides "*:*" into three tokens "*", ":", and "*", on purpose to simulate our Tokenizer's behavior. An excerpt of he XML results from the query is pasted in the bottom of this message. I mean, "*:*" (MatchAllDocsQuery) has a "constant score", so there isn't any way for it to be "suboptimal". That's exactly the point I'd like to raise. No matter what analyzers are assigned to fields, the hit score for "*:*" must remain 1.0, but it's not happening when an analyzer that divides "*:*" are in use. Here's an excerpt of the query response. Notice this element, which should not be there, in my opinion: DisjunctionMaxQuery((name:"* : *"^0.5)) There is a space between * and :, and another space between : and *. 0 33 on 2.2 10 edismax name^0.5 *,score on 0 *:* GB18030TEST Test with some GB18030 encoded characters No accents here 这是一个功能 This is a feature (translated) 这份文件是很有光泽 This document is very shiny (translated) 0.0 0,USD true 1415830106215022592 0.14764866 ... *:* *:* (+MatchAllDocsQuery(*:*) DisjunctionMaxQuery((name:"* : *"^0.5)))/no_coord +*:* (name:"* : *"^0.5) 0.14764866 = (MATCH) sum of: 0.14764866 = (MATCH) MatchAllDocsQuery, product of: 0.14764866 = queryNorm ExtendedDismaxQParser ...
Re: Using
Apologies, there was a typo in my last message. org.eclipse.jetty.orbit:jettty-servlet should have been org.eclipse.jetty.orbit:javax.servlet On Mon, Oct 15, 2012 at 11:19 AM, P Williams wrote: > Hi, > > Thanks for the suggestions. Didn't work for me :( > > I'm calling > conf="test->default"/> > > which depends on org.eclipse.jetty:jetty-server > which depends on org.eclipse.jetty.orbit:jettty-servlet > > I think I'm experiencing https://jira.codehaus.org/browse/JETTY-1493. > > The pom file for > http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.pom > contains orbit, so ivy looks for > http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.orbit > rather > than > http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.jar > hence > my troubles. > > I'm an IVY newbie so maybe there is something I'm missing here? Is there > another 'conf' value other than 'default' I can use? > > Thanks, > Tricia > > > > On Fri, Oct 12, 2012 at 4:32 PM, P Williams < > williams.tricia.l...@gmail.com> wrote: > >> Hi, >> >> Has anyone tried using > name="solr-test-framework" rev="4.0.0" conf="test->default"/> with >> Apache IVY in their project? >> >> rev 3.6.1 works but any of the 4.0.0 ALPHA, BETA and release result in: >> [ivy:resolve] :: problems summary :: >> [ivy:resolve] WARNINGS >> [ivy:resolve] [FAILED ] >> org.eclipse.jetty.orbit#javax.servlet;3.0.0.v201112011016!javax.servlet.orbit: >> (0ms) >> [ivy:resolve] shared: tried >> [ivy:resolve] >> C:\Users\pjenkins\.ant/shared/org.eclipse.jetty.orbit/javax.servlet/3.0.0.v201112011016/orbits/javax.servlet.orbit >> [ivy:resolve] public: tried >> [ivy:resolve] >> http://repo1.maven.org/maven2/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.orbit >> [ivy:resolve] :: >> [ivy:resolve] :: FAILED DOWNLOADS:: >> [ivy:resolve] :: ^ see resolution messages for details ^ :: >> [ivy:resolve] :: >> [ivy:resolve] :: >> org.eclipse.jetty.orbit#javax.servlet;3.0.0.v201112011016!javax.servlet.orbit >> [ivy:resolve] :: >> [ivy:resolve] >> [ivy:resolve] >> [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS >> >> Can anybody point me to the source of this error or a workaround? >> >> Thanks, >> Tricia >> > >
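One workaround commonly suggested for JETTY-1493 on the Ivy side - untested here, with the rev taken from the error above - is to pin the orbit bundle explicitly so Ivy fetches it as a plain jar instead of looking for a nonexistent .orbit artifact:

    <dependency org="org.apache.solr" name="solr-test-framework" rev="4.0.0" conf="test->default"/>
    <!-- force the servlet-api orbit bundle to resolve as a plain jar -->
    <dependency org="org.eclipse.jetty.orbit" name="javax.servlet"
                rev="3.0.0.v201112011016" conf="test->default">
      <artifact name="javax.servlet" type="orbit" ext="jar"/>
    </dependency>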
Re: Any filter to map mutiple tokens into one ?
And you're absolutely certain you see "*:*" being passed to your analyzer in the final release of Solr 4.0??? -- Jack Krupansky -Original Message- From: T. Kuro Kurosaka Sent: Monday, October 15, 2012 1:28 PM To: solr-user@lucene.apache.org Subject: Re: Any filter to map mutiple tokens into one ? On 10/14/12 12:19 PM, Jack Krupansky wrote: There's a miscommunication here somewhere. Is Solr 4.0 still passing "*:*" to the analyzer? Show us the parsed query for "*:*", as well as the debugQuery "explain" for the score. I'm not quite sure what you mean by the parsed query for "*:*". This fake analyzer using NGramTokenizer divides "*:*" into three tokens "*", ":", and "*", on purpose to simulate our Tokenizer's behavior. An excerpt of he XML results from the query is pasted in the bottom of this message. I mean, "*:*" (MatchAllDocsQuery) has a "constant score", so there isn't any way for it to be "suboptimal". That's exactly the point I'd like to raise. No matter what analyzers are assigned to fields, the hit score for "*:*" must remain 1.0, but it's not happening when an analyzer that divides "*:*" are in use. Here's an excerpt of the query response. Notice this element, which should not be there, in my opinion: DisjunctionMaxQuery((name:"* : *"^0.5)) There is a space between * and :, and another space between : and *. 0 33 on 2.2 10 edismax name^0.5 *,score on 0 *:* GB18030TEST Test with some GB18030 encoded characters No accents here 这是一个功能 This is a feature (translated) 这份文件是很有光泽 This document is very shiny (translated) 0.0 0,USD true 1415830106215022592 0.14764866 ... *:* *:* (+MatchAllDocsQuery(*:*) DisjunctionMaxQuery((name:"* : *"^0.5)))/no_coord +*:* (name:"* : *"^0.5) 0.14764866 = (MATCH) sum of: 0.14764866 = (MATCH) MatchAllDocsQuery, product of: 0.14764866 = queryNorm ExtendedDismaxQParser ...
Re: Solr4 - no examples of postingsFormat in schema.xml
> > This should not be required, because I am building from source. I compiled > Solr from lucene-solr source checked out from branch_4x. I grepped the > entire tree for lucene-codec and found nothing. > > It turns out that running 'ant generate-maven-artifacts' created the jar file > -- along with a huge number of other jars that I don't need. It took an > extremely long time to run, for a jar that's a little over 300KB. > > I would argue that the codecs jar should be created by compiling a dist > target for Solr. Someone else should determine whether it's appropriate to > put it in the .war file, but I think it's important enough to make available > without compiling everything in the Lucene universe. I agree - it looks as though the codecs module wasn't added to the solr build when it was split off. I've created a JIRA ticket (https://issues.apache.org/jira/browse/SOLR-3947) and added a patch. On the error below, I'll have to defer to someone who knows how this actually works... > > I put this jar in my lib, and now I get a new error when I try the > BloomFilter postingsFormat: > > SEVERE: null:java.lang.UnsupportedOperationException: Error - > org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat has been > constructed without a choice of PostingsFormat >at > org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat.fieldsConsumer(BloomFilteringPostingsFormat.java:139) >at > org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:130) >at > org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:335) >at > org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85) >at org.apache.lucene.index.TermsHash.flush(TermsHash.java:117) >at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53) >at > org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:82) >at > org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:483) >at > org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422) >at > org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:559) >at > org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2656) >at > org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2792) >at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2772) >at > org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:525) >at > org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87) >at > org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64) >at > org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007) >at > org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69) >at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) >at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) >at org.apache.solr.core.SolrCore.execute(SolrCore.java:1750) >at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455) >at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276) > >
Re: Spatial Search response time complexity
Hi TJ. If you use a circle query shape, it's O(N), plus it puts all the points in memory. If you use a rectangle via bbox then I'm not sure but its fast enough that I wouldn't worry about it. If my understanding is correct on Lucene TrieRange fields, it's O(Log(N)). If you want fast filtering no matter what the query shape is, then I suggest Solr 4.0 SpatialRecursivePrefixTreeFieldType ("location_rpt" in the example schema) ~ David Smiley On Oct 9, 2012, at 5:00 PM, TJ Tong wrote: > Hi all, > > Does anyone know the Solr (lucene)spatial search time complexity, such as > geofilt on LatLonType fields? Is it logN? > > Thanks! > TJ > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Spatial-Search-response-time-complexity-tp4012801.html > Sent from the Solr - User mailing list archive at Nabble.com.
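For reference, the field type David mentions is declared in the Solr 4.0 example schema roughly like this (worth double-checking against your own schema.xml):

    <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
               geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees"/>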
Re: Any filter to map mutiple tokens into one ?
On 10/15/12 10:35 AM, Jack Krupansky wrote: And you're absolutely certain you see "*:*" being passed to your analyzer in the final release of Solr 4.0??? I don't have a direct evidence. This is the only theory I have that explains why changing FieldType causes the sub-optimal scores. If you know of a way to tell if a tokenizer is really invoked, let me know. -- Jack Krupansky -Original Message- From: T. Kuro Kurosaka Sent: Monday, October 15, 2012 1:28 PM To: solr-user@lucene.apache.org Subject: Re: Any filter to map mutiple tokens into one ? On 10/14/12 12:19 PM, Jack Krupansky wrote: There's a miscommunication here somewhere. Is Solr 4.0 still passing "*:*" to the analyzer? Show us the parsed query for "*:*", as well as the debugQuery "explain" for the score. I'm not quite sure what you mean by the parsed query for "*:*". This fake analyzer using NGramTokenizer divides "*:*" into three tokens "*", ":", and "*", on purpose to simulate our Tokenizer's behavior. An excerpt of he XML results from the query is pasted in the bottom of this message. I mean, "*:*" (MatchAllDocsQuery) has a "constant score", so there isn't any way for it to be "suboptimal". That's exactly the point I'd like to raise. No matter what analyzers are assigned to fields, the hit score for "*:*" must remain 1.0, but it's not happening when an analyzer that divides "*:*" are in use. Here's an excerpt of the query response. Notice this element, which should not be there, in my opinion: DisjunctionMaxQuery((name:"* : *"^0.5)) There is a space between * and :, and another space between : and *. 0 33 on 2.2 10 edismax name^0.5 *,score on 0 *:* GB18030TEST Test with some GB18030 encoded characters No accents here 这是一个功能 This is a feature (translated) 这份文件是很有光泽 This document is very shiny (translated) 0.0 0,USD true 1415830106215022592 0.14764866 ... *:* *:* (+MatchAllDocsQuery(*:*) DisjunctionMaxQuery((name:"* : *"^0.5)))/no_coord +*:* (name:"* : *"^0.5) 0.14764866 = (MATCH) sum of: 0.14764866 = (MATCH) MatchAllDocsQuery, product of: 0.14764866 = queryNorm ExtendedDismaxQParser ...
solrcloud: what if ZK instances are evanescent?
Hi Folks, I have been looking at solrcloud to solve some of our problems with solr in a distributed environment. As you know, in such an environment, every instance of solr or zookeeper can come into existence and go out of existence - at any time. So what happens if instances of ZK disappear and re-appear with different hostnames and DNS entries? How would solr know about these instances and how would it re-sync with these instances? In essence my question is: what if the hostname and port of the ZK instance no longer exists - how will solrcloud discover the new instance(s)? Thanks, John -- View this message in context: http://lucene.472066.n3.nabble.com/solrcloud-what-if-ZK-instances-are-evanescent-tp4013740.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr4 - no examples of postingsFormat in schema.xml
See discussion on https://issues.apache.org/jira/browse/SOLR-3843, this was apparently intentional. That also links to the following: http://wiki.apache.org/solr/SolrConfigXml#codecFactory, which suggests you need to use solr.SchemaCodecFactory for per-field codecs - this might solve your postingsFormat exception. On 15 Oct 2012, at 18:41, Alan Woodward wrote: > >> >> This should not be required, because I am building from source. I compiled >> Solr from lucene-solr source checked out from branch_4x. I grepped the >> entire tree for lucene-codec and found nothing. >> >> It turns out that running 'ant generate-maven-artifacts' created the jar >> file -- along with a huge number of other jars that I don't need. It took >> an extremely long time to run, for a jar that's a little over 300KB. >> >> I would argue that the codecs jar should be created by compiling a dist >> target for Solr. Someone else should determine whether it's appropriate to >> put it in the .war file, but I think it's important enough to make available >> without compiling everything in the Lucene universe. > > I agree - it looks as though the codecs module wasn't added to the solr build > when it was split off. I've created a JIRA ticket > (https://issues.apache.org/jira/browse/SOLR-3947) and added a patch. > > On the error below, I'll have to defer to someone who knows how this actually > works... > >> >> I put this jar in my lib, and now I get a new error when I try the >> BloomFilter postingsFormat: >> >> SEVERE: null:java.lang.UnsupportedOperationException: Error - >> org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat has been >> constructed without a choice of PostingsFormat >> at >> org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat.fieldsConsumer(BloomFilteringPostingsFormat.java:139) >> at >> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:130) >> at >> org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:335) >> at >> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85) >> at org.apache.lucene.index.TermsHash.flush(TermsHash.java:117) >> at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53) >> at >> org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:82) >> at >> org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:483) >> at >> org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422) >> at >> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:559) >> at >> org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2656) >> at >> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2792) >> at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2772) >> at >> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:525) >> at >> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87) >> at >> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64) >> at >> org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007) >> at >> org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69) >> at >> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) >> at >> 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) >> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1750) >> at >> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455) >> at >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276) >> >> >
Re: Solr 4 spatial search - point intersects polygon
Hi Jorge, Please see the notes on Polygons: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4#JTS_.2BAC8_WKT_.2BAC8_Polygon_notes This bullet in particular is relevant: • The standard way to specify a rectangle in WKT is a Polygon -- WKT doesn't have a rectangle shape. If you want to specify a Rectangle via WKT (instead of the Spatial4j basic non-WKT syntax), you should take care to specify the coordinates in counter-clockwise order, the WKT standard. If this is done wrong then the rectangle will go the opposite direction longitudinally, even if it means one that spans nearly the entire globe (>180 degrees width). OpenLayers seems to not honor the WKT standard here, and depending on the corner you drag the rectangle from, might use a clockwise order. Some systems like PostGIS don't care what the ordering is, but the problem there is that there is then no way to specify a rectangle that has >= 180 width because there would be ambiguity. Spatial4j follows the WKT spec. You aren't the first to have run into this problem. Perhaps I should add a mode in which you cannot specify rectangles with a width >= 180 but in exchange your rectangle will always go the way you intended (assuming always < 180) without having to worry about coordinate order. ~ David Smiley On Oct 8, 2012, at 5:25 AM, Jorge Suja wrote: > Hi everyone, > > I've been playing around with the new spatial search functionalities > included in the newer versions of solr (solr 4.1 and solr trunk 5.0), and > i've found something strange when I try to find a point inside a polygon > (particularly inside a square). > > You can reproduce this problem using the spatial-solr-sandbox project that > has the following config for the fields: > > /[...] > units="degrees" /> > [...] > multiValued="false" /> > [...]/ > > I'm trying to find the following document: > > / > G292223 > Dubai > 55.28 25.252220 > > / > I want to test if this point is located inside a polygon so i'm using the > following query: > > /q=geohash:"Intersects(POLYGON((55.18 25.352220,55.38 > 25.352220,55.38 25.152220,55.18 25.152220,55.18 25.352220)))"/ > > As you can see, it's a small square that contains the point described > before. I get some results, but that document is not there, and the ones > returned are wrong since they are not even inside the square. > > / > > G1809498 > Guilin > 110.286390 25.281940 > > > [...]/ > > However, if i change a little bit the shape of the square (just changed a > little bit one corner), it returns the result as expected > > /q=geohash:"Intersects(POLYGON((55.18 25.352220,*55.48* > 25.352220,55.38 25.152220,55.18 25.152220,55.18 25.352220)))"/ > > Now it returns a single result and it's OK > > / > > G292223 > Dubai > 55.28 25.252220 > > / > > > If i use a bbox with the same size and position than the first square, it > returns correctly the document. > > /q=geohash:"Intersects(55.18 25.152220 55.38 25.352220)" > > > > G292223 > Dubai > 55.28 25.252220 > > / > > If you draw another polygon such a triangle it works well too. > > I've tested this against different points and it's always the same, it seems > that if you draw a straight square (or rectangle), > it can't find the point inside it, and it returns wrong results. > > Am i doing anything wrong? > > Thanks in advance > > Jorge > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-4-spatial-search-point-intersects-polygon-tp4012402.html > Sent from the Solr - User mailing list archive at Nabble.com.
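Applied to the failing query above, the same square with its vertices listed counter-clockwise (lower-left, lower-right, upper-right, upper-left, then back to the start) should behave as intended:

    q=geohash:"Intersects(POLYGON((55.18 25.152220, 55.38 25.152220,
                                   55.38 25.352220, 55.18 25.352220,
                                   55.18 25.152220)))"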
Re: ExternalFileField/FileFloatSource improvements
Hi Alan, I don't have any direct feedback... but I know there is an issue that you may want to be aware of (and incorporate?) - https://issues.apache.org/jira/browse/SOLR-3514 Otis -- Search Analytics - http://sematext.com/search-analytics/index.html Performance Monitoring - http://sematext.com/spm/index.html On Mon, Oct 15, 2012 at 9:37 AM, Alan Woodward wrote: > Hi list, > > I'm having a go at improving the performance of ExternalFileField (relevant > to this thread: > http://lucene.472066.n3.nabble.com/Reloading-ExternalFileField-blocks-Solr-td4012399.html#a4013305), > and thought I'd get some feedback. What do people think of the following? > > - FileFloatSource needs to be updated in three cases: > - when new segments are added > - when segments are merged > - when the external file source is updated > > In our use-case, new documents will not have values in the external file (it > contains things like click-data, which will only appear after the document > has been in the index for a while), so we don't need to reload when new > segments are added. > > My plan is to hook the cache refresh into either newSearcher or postCommit. > I change the FileFloatSource internals to be keyed on individual > SegmentReaders rather than top-level IndexReaders, so existing float caches > don't need to be reloaded for unchanged segments; I (somehow?) detect if > segments with empty caches contain new documents (in which case we can just > give them all default values) or are the result of merges (in which case we > need to reload the external file and repopulate). > > I also plan to modify the reloadCaches update handler so that instead of just > clearing the cache (and hence slowing down the next query to hit, as the new > caches are lazy-loaded), it reloads the file in the background and then cuts > over to the new caches. > > I'll open a JIRA and post patches once I've begun the actual implementation, > but if anybody notices something that would stop this working, it would be > nice to hear about it before I start… :-) > > Thanks, > > Alan Woodward
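For what it's worth, if the newSearcher/postCommit route is taken, the user-facing configuration could end up looking something like the sketch below. This is purely illustrative -- the listener class name is hypothetical and stands in for whatever hook comes out of the eventual patch -- but solrconfig.xml's existing <listener> mechanism (inside the <query> section) seems like a natural place to surface it:

<!-- sketch only: hypothetical listener that re-reads the external file(s)
     and repopulates the per-segment FileFloatSource caches -->
<listener event="newSearcher" class="com.example.ExternalFileFieldReloadListener"/>
<listener event="firstSearcher" class="com.example.ExternalFileFieldReloadListener"/>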
Re: exception when starting single instance solr-4.0.0
My first guess would be a classpath error given this references lucene3x. Since all that's deprecated, is there any chance you're somehow getting a current trunk (5x) jar in there? Because I see no such error when I start 4.0... Best Erick On Mon, Oct 15, 2012 at 8:42 AM, Bernd Fehling wrote: > Hi, > while starting solr-4.0.0 I get the following exception: > > SEVERE: null:java.lang.IllegalAccessError: > class org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat cannot access > its superclass org.apache.lucene.codecs.lucene3x.Lucene3xPostingsFormat > > > Very strange, because some lines earlier in the logs I have: > > Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrConfig initLibs > INFO: Adding specified lib dirs to ClassLoader > Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrResourceLoader > replaceClassLoader > INFO: Adding 'file:/srv/www/solr/solr-4.0.0/lib/lucene-core-4.0-SNAPSHOT.jar' > to classloader > > Why is solr-4.0.0 thinking that the superclass is not there? > > Any ideas? > > Regards > Bernd
Re: exception when starting single instance solr-4.0.0
: SEVERE: null:java.lang.IllegalAccessError: : class org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat cannot access : its superclass org.apache.lucene.codecs.lucene3x.Lucene3xPostingsFormat that sounds like a classpath error. : Very strange, because some lines earlier in the logs I have: : : Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrConfig initLibs : INFO: Adding specified lib dirs to ClassLoader : Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader : INFO: Adding 'file:/srv/www/solr/solr-4.0.0/lib/lucene-core-4.0-SNAPSHOT.jar' to classloader ...and that looks like a mistake. Based on that log line, you either have a copy of the lucene-core jar in the implicit "lib" dir for your solr core, or you have an explicit <lib> directive pointed somewhere that contains a copy of the lucene-core jar -- either way telling Solr to load the lucene-core jar as a plugin. But lucene-core should not be loaded as a plugin. lucene-core is already in the solr.war, and should have been loaded long before SolrConfig started looking for plugin libraries. Which means you probably have two copies of the lucene-core jar ... and if you have two copies of that jar, you probably have two copies of other lucene jars. Which raises the questions: * what is your solr home dir? (I'm guessing maybe it's "/srv/www/solr/solr-4.0.0/" ?) * why do you have a copy of lucene-core in /srv/www/solr/solr-4.0.0/lib ? * what <lib> directives do you have in your solrconfig.xml and why? -Hoss
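A quick way to check for duplicate Lucene jars is to search both the install tree and wherever the servlet container unpacks the war (paths below are illustrative -- substitute your own install and webapp directories):

find /srv/www/solr/solr-4.0.0 -name 'lucene-*.jar' | sort
find /path/to/container/webapps -name 'lucene-*.jar' 2>/dev/null | sort

If the same artifact shows up more than once (for example inside WEB-INF/lib of the extracted war and again in a plugin lib dir), that matches the situation Hoss describes.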
Re: How do I make Soft Commits thru' EmbeddedSolrServer visible to Searcher?
After a bit of research, I realized that if I am using EmbeddedSolrServer then I need to also do a hard commit in the Searcher (which runs in a separate JVM). So I tried that, but I am getting a LockException. It looks like the EmbeddedSolrServer locks the Solr index for writing, and when I try to do a commit in the searcher the LockException is thrown. Is there any way around this? -- View this message in context: http://lucene.472066.n3.nabble.com/How-do-I-make-Soft-Commits-thru-EmbeddedSolrServer-visible-to-Searcher-tp4012776p4013769.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr 4.0 spatial questions
Hi Matt. The documentation is here: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 The sort / relevancy section is a TODO; I've been improving this document a bit at a time lately. My comments are within... On Oct 5, 2012, at 10:10 AM, Matt Mitchell wrote: > Hi, > > Apologies if some of this has been asked before. I searched the list, > found similar questions but the suggestions didn't solve my issues. > > I've been playing with the new spatial features in Solr trunk, very > cool. I successfully indexed a MULTIPOLYGON and could see my query > working using the "Intersects" function <- that is very exciting! My > question is, how can I find out more info on this stuff? Some of the > things I'm looking for, specifically: > > What functions are available? For example, is there a "contains" > function? Is there java source-code I could look at to figure out > what's available? SpatialOperation.java. For what's in Lucene / Solr 4.0, the only operation that is effectively implemented right now is INTERSECTS. WITHIN is supported by the PointVector field type but that is semantically equivalent to INTERSECTS when the indexed data is Points, and PointVector as its name suggests only supports points. In the future, I figure my employer will have the need for a WITHIN and CONTAINS operation, and I know how to add that to the RecursivePrefixTree based field types. It won't be easy. I believe Chris Male has already done this on the ElasticSearch port of Lucene spatial, but I haven't looked at it. > Is there a way to dynamically buffer a geometry, then query on that > buffered geometry? I have this at work but it's not yet in the open-source offering. It's pretty easy thanks to JTS, which does the hard work (it's just a method call). Once we get an open-source extensible WKT parser in Spatial4j (which Chris has already done for ElasticSearch, so it's going to happen in the very near future), we can then add a buffer operation. > Can I get the distance (as a pseudo field) to a stored/indexed > MULTIPOLYGON from a given point? If you are already sorting by it, then see the example below (notice the "distdeg" pseudo-field alias). The solution below will work even if you don't sort by it, but it will trigger RAM requirements that are a bit needless. If you don't want the RAM requirements, then you should perform this calculation yourself at the client. > What about sorting by distance to MULTIPOLYGON from point? Yes... though I'm not happy with the implementation. I recommend you index a field just for the center point. If there is going to be only one per document, then use PointVector or LatLonType. If there are multiple... then you're stuck with the existing implementation, which seems to work but definitely isn't scalable for real-time search nor for millions of documents or more. Here's a comment on the JIRA issue where I left an example: https://issues.apache.org/jira/browse/SOLR-3304?focusedCommentId=13456188&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456188 That query is: http://localhost:8983/solr/select?q=*%3A*&wt=xml&fq={!%20v=$sq}&sq=store:%22Intersects%28Circle%2854.729696,-98.525391%20d=10%29%29%22&debugQuery=on&sort=query%28$sortsq%29+asc&fl=id,store,score,distdeg:query%28$sortsq%29&sortsq={!%20score=distance%20v=$sq} (a decoded version of this query appears after the end of this message) > Can or will it be possible to transform shapes, for example select the > minimum-bounding-box of a complex shape? Another example would be > extracting the center point of a polygon.
BBox of an indexed shape is not really supported so you'd have to index the bbox as a rectangle, probably via Lucene 5 spatial BBoxStrategy. For a query shape... that is one of those operations, like a buffer, that I'd like to add. > I've tried to sort and get the distance using some of the tips on the > Wiki, but couldn't get any of it to work. > > I'd be glad to get some of this into the Wiki too. Just to repeat: The documentation is here: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 ~ David
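For readability, the URL-encoded example above decodes to roughly the following parameters (double-check the escaping when you adapt it, since this is just my reading of the URL):

q=*:*
wt=xml
fq={! v=$sq}
sq=store:"Intersects(Circle(54.729696,-98.525391 d=10))"
sort=query($sortsq) asc
sortsq={! score=distance v=$sq}
fl=id,store,score,distdeg:query($sortsq)
debugQuery=on

In other words, the filter and the sort both reference the same $sq spatial query; the sort variant adds score=distance so the query's score becomes the distance, and distdeg: aliases that same value into the field list.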
Re: Solr4 without slf4j bindings -- apparent catch-22
: I'm trying to get a Solr4 install going, building without slf4j bindings. I ... : If I use the standard .war file, sharedLib works as I would expect it to. The : solr.xml file is found and it finds the sharedLib directory just fine, as you : can see from this log excerpt: ... : INFO: Adding 'file:/index/solr4/lib/slf4j-api-1.7.2.jar' to classloader ... : The problem that I am having is with the -excl-slf4j.war file, which I am : trying to use in order to use log4j instead of jdk logging. When I do that, : it seems to be unable to find the sharedLib folder in solr.xml. Because it : can't find any slf4j bindings at all, I cannot see what's going on in the log. : Entire log included: I think one, or both, of us is confused about how the dist-war-excl-slf4j target is intended to be used. I'm fairly certain you can't try to use slf4j/log4j from the sharedLib -- because at that point a lot of solr has already been initialized and already started doing logging, so slf4j should have already tried to resolve the binding it should use, found nothing, and picked its default NOP implementation -- as you can see in your logs... : SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". : SLF4J: Defaulting to no-operation (NOP) logger implementation : SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further : details. I *believe* the intended way to use the -excl-slf4j.war is by having the servlet container load both the log4j jar and the slf4j binding for log4j -- either by putting them in jetty's lib, or by specifying them in the runtime classpath -- but I think you also need to configure log4j at the servlet container level so that it will be initialized. : I also tried putting the slf4j jars in /opt/solr4/lib (jetty's lib directory). : Unsurprisingly, there was no change. Where can I put the jars to make this did you move them or copy them, because it wouldn't surprise me if having duplicate copies of those slf4j jars in sharedLib broke logging in solr even if things were configured and working properly at the jetty level. -Hoss
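If it helps to see the rough shape of the container-level approach Hoss describes, with Jetty and log4j it would look something like the following. The jar versions and paths are only illustrative (the slf4j binding must match the slf4j-api version in the war), and you should check which directories your start.jar actually puts on the server classpath:

# put the logging jars somewhere Jetty's start.jar picks up, e.g. lib/ext/
cp log4j-1.2.17.jar slf4j-log4j12-1.6.4.jar /opt/solr4/lib/ext/

# point log4j at a config file when starting Jetty
java -Dlog4j.configuration=file:/opt/solr4/etc/log4j.properties -jar start.jar

# /opt/solr4/etc/log4j.properties -- a minimal console config
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %p %c - %m%n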
Re: Solr4 without slf4j bindings -- apparent catch-22
: As an interim measure, I tried putting the jars in a separate directory and : added a commandline option for the classpath. I also downgraded to 1.6.4, : because despite asking for a war without it, the war still contains slf4j-api : version 1.6.4. The log still shows that it failed to find a logger binding - : no difference from above. As a followup to my other comments: I think the reason the slf4j-api jar is left in the war is because the "exclusion" is only of the specific binding used. Users can't arbitrarily drop in any version of slf4j that they want at runtime; the slf4j-api has to match what solr was compiled against so that the logging calls solr makes will still work. : -Djava.util.logging.config.file=etc/logging.properties option. Trying to set : that property in jetty.xml according to the wiki didn't work. I notice that : the example says 'mortbay' ... perhaps Jetty 8 does it differently? Very probably -Hoss
Re: Solr4 without slf4j bindings -- apparent catch-22
> slf4j-api has to match what solr was compiled against so that the logging > calls solr makes will still work. To my knowledge, that's not strictly true: http://www.slf4j.org/faq.html#compatibility Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Mon, Oct 15, 2012 at 3:35 PM, Chris Hostetter wrote: > slf4j-api has > to match what solr was compiled against so that the logging calls solr > makes will still work.
Re: Solr4 - no examples of postingsFormat in schema.xml
On 10/15/2012 12:38 PM, Alan Woodward wrote: See discussion on https://issues.apache.org/jira/browse/SOLR-3843, this was apparently intentional. That also links to the following: http://wiki.apache.org/solr/SolrConfigXml#codecFactory, which suggests you need to use solr.SchemaCodecFactory for per-field codecs - this might solve your postingsFormat exception. I already added this to my solrconfig.xml as a top-level element: <codecFactory class="solr.SchemaCodecFactory"/> Once I added this, I tried Bloom, but I had an incorrect name. That resulted in this error, showing that the codecFactory config element gave me more choices than Lucene40 and Lucene41: SEVERE: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'Bloom' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [Lucene40, Lucene41, Pulsing41, SimpleText, Memory, BloomFilter, Direct] Once I got that, I knew I had made some progress, so I changed it to BloomFilter and got the error in the previous message. Repasting here without the full stacktrace: SEVERE: null:java.lang.UnsupportedOperationException: Error - org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat has been constructed without a choice of PostingsFormat Based on that error message, along with something I remember reading during my Google travels, I suspect that not all codecs (BloomFilter being a prime example) have whatever corresponding Solr bits are required. Thanks, Shawn
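To make the moving parts concrete, the configuration being discussed looks roughly like the sketch below. The field and type names are made up for illustration, and, as described above, the BloomFilter format still fails at this point because the Solr-side integration doesn't yet supply the delegate PostingsFormat it needs, so a format like Pulsing41 is shown instead:

<!-- solrconfig.xml: let the schema pick postings formats per field -->
<codecFactory class="solr.SchemaCodecFactory"/>

<!-- schema.xml: opt a field type into a non-default postings format -->
<fieldType name="string_pulsing" class="solr.StrField" postingsFormat="Pulsing41"/>
<field name="doc_id" type="string_pulsing" indexed="true" stored="true"/>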
Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7
: on Tomcat I setup the system property pointing to solr/home path, : unfortunatelly when I start tomcat the solr.xml is ignored and only the Please elaborate on how exactly you pointed tomcat at your solr/home. You mentioned "system property", but when using system properties to set the Solr Home you want to set "solr.solr.home" ... "solr/home" is the JNDI variable name used as an alternative. If you look at the logging when solr first starts up, you should see several messages about how/where it's trying to locate the Solr Home Dir ... please double check that it's finding the one you intended. Please give us more details about those log messages related to the solr home dir, as well as how you are trying to set it, and what your directory structure looks like in tomcat. If you haven't seen it yet... https://wiki.apache.org/solr/SolrTomcat -Hoss
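Concretely, the two options look roughly like this (paths are illustrative; the SolrTomcat wiki page linked above has the full recipe):

# Option 1: JVM system property, e.g. via Tomcat's setenv.sh / JAVA_OPTS
JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/opt/solr/home"

<!-- Option 2: JNDI variable in the webapp's context fragment,
     e.g. conf/Catalina/localhost/solr.xml -->
<Context docBase="/opt/solr/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/opt/solr/home" override="true"/>
</Context>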
Re: Testing Solr4 - first impressions and problems
: I have autocommit turned completely off -- both values set to zero. The DIH ... : When I first set this up back on 1.4.1, I had some kind of severe problem when : autocommit was turned on. I can no longer remember what it caused, but it was : a huge showstopper of some kind. The key question about using autocommit is whether or not you use "openSearcher" with it and whether you have the updateLog turned on. As I understand it: if you don't care about real time get, or transaction recovery of "uncommitted documents" on hard crash, or any of the SolrCloud features, then you don't need the updateLog -- and you shouldn't add it to your existing configs when upgrading to Solr4. Any existing usage (or non-usage) you had of autocommit should continue to work fine. If you *do* care about things that require the updateLog, then you want to ensure that you are doing "hard commits" (ie: persisting the index to disk) relatively frequently in order to keep the size of the updateLog from growing w/o bound -- but in Solr 4, doing a hard commit no longer requires that you open a new searcher. Opening a new searcher and dealing with the cache loading is one of the main reasons people typically avoided autoCommit in the past. So if you look at the Solr 4 example: it uses the updateLog combined with a 15 second autoCommit that has openSearcher=false -- meaning that the autocommit logic is ensuring that anytime the index has modifications they are written to disk every 15 seconds, but the new documents aren't exposed to search clients as a result of those autocommits, and if a client uses real time get, or if there is a hard crash, the uncommitted docs are still available in the updateLog. For your use case and upgrade: don't add the updateLog to your configs, and don't add autocommit to your configs, and things should work fine. If you decide you want to start using something that requires the updateLog, you should probably add a short autoCommit with openSearcher=false. -Hoss
Re: Solr4 without slf4j bindings -- apparent catch-22
On 10/15/2012 1:32 PM, Chris Hostetter wrote: I think one, or both, of us is confused about how the dist-war-excl-slf4j target is intended to be used. You are very likely correct, and it's probably me that's confused. I'm fairly certain you can't try to use slf4j/log4j from the sharedLib -- because at that point a lot of solr has already been initialized and already started doing logging so slf4j should have already tried to resolve the binding it should use, found nothing, and picked its default NOP implementation -- as you can see in your logs... : SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". : SLF4J: Defaulting to no-operation (NOP) logger implementation : SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further : details. I *believe* the intended way to use the -excl-slf4j.war is by having the servlet container load both the log4j jar and the slf4j binding for log4j -- either by putting them in jetty's lib, or by specifying them in the runtime classpath -- but I think you also need to configure log4j at the servlet container level so that it will be initialized. I tried both of these -- putting them in jetty's lib, as well as putting them in an arbitrary directory and putting the relative path (blah/filename.jar) on the commandline with -cp (and -classpath). I suspect what I will need to do is create the standard war, extract it, fiddle with the contents, and then make a new war. Not terribly automated, but upgrading is not something I will be doing all that often. In my test environment (where multiple back to back compiles may be commonplace) it will be a bit painful, but I suppose I can just build the standard war and use jdk logging there, until I'm ready to deploy to production. : I also tried putting the slf4j jars in /opt/solr4/lib (jetty's lib directory). : Unsurprisingly, there was no change. Where can I put the jars to make this did you move them or copy them, because it wouldn't surprise me if having duplicate copies of those slf4j jars in sharedLib broke logging in solr even if things were configured and working properly at the jetty level. I thought of this, and was using 'mv' for each test iteration. Thanks, Shawn
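The manual repack Shawn describes can be done with the stock jar tool, roughly along these lines (jar names and versions are illustrative -- swap in whichever slf4j binding and log4j jars match your build):

mkdir war-tmp && cd war-tmp
jar xf ../apache-solr-4.0.0.war
# remove whatever slf4j binding the stock war ships with, add the log4j one
rm WEB-INF/lib/slf4j-jdk14-*.jar
cp /path/to/slf4j-log4j12-1.6.4.jar /path/to/log4j-1.2.17.jar WEB-INF/lib/
# -M keeps the existing manifest instead of generating a new one
jar cfM ../solr-log4j.war .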
Re: Testing Solr4 - first impressions and problems
On 10/15/2012 2:51 PM, Chris Hostetter wrote: For your usecase and upgrade: don't add the updateLog to your configs, and don't add autocommit to your configs, and things should work fine. if you decide you wnat to start using something that requires the updateLog, you should probably add a short autoCommit with openSearcher=false. Thank you for your answer. Using updateLog seems to have another downside -- a huge hit to performance. It wouldn't be terrible on incremental updates. These happen once a minute and normally complete extremely quickly - less than a second, followed by a commit that may take 2-3 seconds. If it took 5-10 seconds instead of 3, that's not too bad. But when you are expecting a process to take three hours and it actually takes 8-10 hours, it's another story. Shawn
With Grouping enabled, 0 results yields maxScore of -Infinity
I see that when there are 0 results with the grouping enabled, the max score is -Infinity which causes parsing problems on my client. Without grouping enabled the max score is 0.0. Is there any particular reason for this difference? If not, would there be any resistance to submitting a patch that will set the score to 0 if the numFound is 0 in the grouping component? I see code that sets the max score to -Infinity and then will set it to a different value when iterating over some set of scores. With 0 scores, then it stays as -Infinity and serializes out as such. I'll be more than happy to work on this patch but before I do, I wanted to check that I am not missing something first. Thanks Amit
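In case it helps frame the patch, the gist would presumably be a guard like the sketch below (illustrative only -- these names don't come from the actual Solr grouping code):

static float normalizeMaxScore(float[] groupScores) {
  float maxScore = Float.NEGATIVE_INFINITY;
  for (float score : groupScores) {
    maxScore = Math.max(maxScore, score);
  }
  // with zero matching groups the loop never runs; report 0.0 like the
  // non-grouped response does, instead of serializing -Infinity
  return maxScore == Float.NEGATIVE_INFINITY ? 0.0f : maxScore;
}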
How many documents in each Lucene segment?
Is there any way to easily determine how many documents exist in a Lucene index segment? Ideally I want to check the document counts in segments on an index that is being built by a large MySQL dataimport, before the dataimport completes. If that's not possible, I can take steps to do a smaller import and make sure the changes are committed. Thanks, Shawn
RE: How many documents in each Lucene segment?
Easiest way I know of without parsing any of the index files is to take the size of the fdx file in bytes and divide by 8. This will give you the exact number of documents before 4.0, and a close approximation in 4.0. Though, the fdx file might not be on disk if you haven't committed. -Michael -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Monday, October 15, 2012 9:21 PM To: solr-user@lucene.apache.org Subject: How many documents in each Lucene segment? Is there any way to easily determine how many documents exist in a Lucene index segment? Ideally I want to check the document counts in segments on an index that is being built by a large MySQL dataimport, before the dataimport completes. If that's not possible, I can take steps to do a smaller import and make sure the changes are committed. Thanks, Shawn
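If you want to eyeball that from a shell while the import is running, something like this works (GNU stat syntax; point it at the core's data/index directory):

for f in /path/to/core/data/index/*.fdx; do
  echo "$f: approx $(( $(stat -c%s "$f") / 8 )) docs"
done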
Solr Autocomplete
Hi, I am using mysql for solr indexing data in solr. I have two fields: "name" and "college". How can I add auto suggest based on these two fields? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autocomplete-tp4013859.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Autocomplete
> I am using mysql for solr indexing data in solr. I have two > fields: "name" > and "college". How can I add auto suggest based on these two > fields? Here is a blog post and a code example: http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/
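If you want something minimal to experiment with before reading the blog post, one common approach is to copy both fields into a single edge-ngram analyzed field and query that for suggestions. A sketch only (field and type names are made up; tune the gram sizes to taste):

<!-- schema.xml sketch -->
<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="suggest" type="text_autocomplete" indexed="true" stored="true" multiValued="true"/>
<copyField source="name" dest="suggest"/>
<copyField source="college" dest="suggest"/>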
Re: Solr Autocomplete
http://find.searchhub.org/?q=autosuggest+OR+autocomplete - Original Message - | From: "Rahul Paul" | To: solr-user@lucene.apache.org | Sent: Monday, October 15, 2012 9:01:14 PM | Subject: Solr Autocomplete | | Hi, | I am using mysql for solr indexing data in solr. I have two fields: | "name" | and "college". How can I add auto suggest based on these two fields? | | | | -- | View this message in context: | http://lucene.472066.n3.nabble.com/Solr-Autocomplete-tp4013859.html | Sent from the Solr - User mailing list archive at Nabble.com. |
Re: How many documents in each Lucene segment?
On 10/15/2012 8:06 PM, Michael Ryan wrote: Easiest way I know of without parsing any of the index files is to take the size of the fdx file in bytes and divide by 8. This will give you the exact number of documents before 4.0, and a close approximation in 4.0. Though, the fdx file might not be on disk if you haven't committed. When you are importing 12 million documents from a database, you get LOTS of completed segments even if there is no commit until the end. The ramBuffer fills up pretty quickly. I intend to figure out how many documents are in the segments (ramBufferSizeMB=256) and try out an autoCommit setting a little bit lower than that. I had trouble with autoCommit on previous versions, but with 4.0 I can turn off openSearcher, which may allow it to work right. Thanks, Shawn
Re: exception when starting single instance solr-4.0.0
The solr home dir is as suggested for solr 4.0 to be located below jetty. So my directory structure is:
/srv/www/solr/solr-4.0.0/
-- dist   ** has all apache solr and lucene libs not in .war
-- lib    ** has all other libs not in .war and not in dist, but required
-- jetty  ** the jetty copied from solr/example with context, etc, webapps, ...
jetty/solr       ** solr with its subdirectories
jetty/solr/conf
jetty/solr/data
jetty/solr/solr.xml
Currently lucene-core is also in the lib directory because of the error message. I thought this would fix my problem, but it made no change, and if I remove it the error remains. In solrconfig.xml I have only two <lib> directives: Strange thing is, solr/example starts without problems, and I could also start my solr-4.0.0 development installation from Eclipse with RunJettyRun. Just tested: after removing lucene-core from the lib directory the error remains the same. Seriously a stupid config error, but where? Regards Bernd On 15.10.2012 21:05, Chris Hostetter wrote: > > : SEVERE: null:java.lang.IllegalAccessError: > : class org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat cannot access > : its superclass org.apache.lucene.codecs.lucene3x.Lucene3xPostingsFormat > > that sounds like a classpath error. > > : Very strange, because some lines earlier in the logs I have: > : > : Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrConfig initLibs > : INFO: Adding specified lib dirs to ClassLoader > : Oct 15, 2012 2:30:24 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader > : INFO: Adding 'file:/srv/www/solr/solr-4.0.0/lib/lucene-core-4.0-SNAPSHOT.jar' to classloader > > ...and that looks like a mistake. Based on that log line, you either have a copy of the lucene-core jar in the implicit "lib" dir for your solr core, or you have an explicit <lib> directive pointed somewhere that contains a copy of the lucene-core jar -- either way telling Solr to load the lucene-core jar as a plugin. > > But lucene-core should not be loaded as a plugin. lucene-core is already in the solr.war, and should have been loaded long before SolrConfig started looking for plugin libraries. > > Which means you probably have two copies of the lucene-core jar ... and if you have two copies of that jar, you probably have two copies of other lucene jars. > > Which raises the questions: > > * what is your solr home dir? (I'm guessing maybe it's "/srv/www/solr/solr-4.0.0/" ?) > * why do you have a copy of lucene-core in /srv/www/solr/solr-4.0.0/lib ? > * what <lib> directives do you have in your solrconfig.xml and why? > > -Hoss >