Need solr query help

2013-05-09 Thread Abhishek tiwari
We are doing spatial search with the following logic:
a) There are shops in a city. Each provides home delivery.
b) Each shop has a different max_delivery_distance.

Now suppose someone is searching from point P1 with radius R.

The user wants results for the shops that can deliver to him (the distance d1
between P1 and a shop s1 should be less than that shop's max_delivery_distance,
say md1).

How can I implement this with a Solr spatial query?
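
A hedged sketch of one way to express this in Solr 4.x (assuming a LatLonType
field named "store" holding each shop's location and a numeric
"max_delivery_distance" field; the field names, point, and radius below are
placeholders, not from the original post):

  q=*:*
  &fq={!geofilt sfield=store pt=28.61,77.20 d=10}
  &fq={!frange u=0}sub(geodist(store,28.61,77.20),max_delivery_distance)

The geofilt restricts shops to the user's search radius R around P1; the
frange keeps only shops where geodist(P1, shop) minus max_delivery_distance
is <= 0, i.e. shops whose own delivery radius covers P1.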


More Like This and Caching

2013-05-09 Thread Giammarco Schisani
Hi all,

Could anybody explain which Solr cache (e.g. queryResultCache,
documentCache, fieldCache, etc.) can be used by the More Like This handler?

One of my colleagues had previously suggested that the More Like This
handler does not take advantage of any of the Solr caches.

However, if I issue two identical MLT requests to the same Solr instance,
the second request will execute much faster than the first request (for
example, the first request will execute in 200ms and the second request
will execute in 20ms). This makes me believe that at least one of the Solr
caches is being used by the More Like This handler.

I think the "documentCache" is the cache that is most likely being used,
but would you be able to confirm?

For your information, I am currently using Solr version 3.6.1.

Kind regards,
Giammarco Schisani


Re: Shard update error when using DIH

2013-05-09 Thread heaven
Thank you all, guys.
 
Your advice works great and I don't see any errors in the Solr logs anymore.
 
Best,
Alex

Monday 29 April 2013, you wrote:


On 29 April 2013 14:55, heaven <[hidden email]> wrote:
> Got these errors after switching the field type to long:
> crm-test:
> org.apache.solr.common.SolrException: org.apache.solr.common.SolrException:
> Unknown fieldtype 'long' specified on field _version_

You have probably edited your schema. The default one has <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/> towards the top of the file.

Regards, Gora 










filter result by numFound in Result Grouping

2013-05-09 Thread Shalom Ben-Zvi Kazaz
Hello list
In one of our searches that uses Result Grouping, we need to filter the
results to only groups that have more than one document in the group, or
more specifically to groups that have exactly two documents.
Is this possible in some way?

Thank you


Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml

2013-05-09 Thread Jan Høydahl
My question was: When you move DIH libs to Solr's classloader (e.g. 
instanceDir/lib and refer from solrconfig.xml), and remove solr.war from 
tomcat/lib, what error msg do you then get?

Also make sure to delete the old tomcat/webapps/solr folder, just to make sure 
you're starting from scratch.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 9 May 2013 at 01:54, William Pierce wrote:

> The reason I placed the solr.war in tomcat/lib was -- I guess -- because 
> that's the way I had always done it since the 1.3 days.  Our tomcat instance(s) run 
> nothing other than Solr - so that seemed as good a place as any.
> 
> The DIH jars that I placed in the tomcat/lib are: 
> solr-dataimporthandler-4.3.0.jar and solr-dataimporthandler-extras-4.3.0.jar. 
>  Are there any dependent jars that also need to be added that I am unaware of?
> 
> On the specific errors - I get a stack trace noted in the first email that 
> began this thread but repeated here for convenience:
> 
> ERROR - 2013-05-08 10:43:48.185; org.apache.solr.core.CoreContainer; Unable 
> to create core: collection1
> org.apache.solr.common.SolrException: 
> org/apache/solr/util/plugin/SolrCoreAware
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
>   at 
> org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
>   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
>   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
>   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/solr/util/plugin/SolrCoreAware
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(Unknown Source)
>   at java.security.SecureClassLoader.defineClass(Unknown Source)
>   at java.net.URLClassLoader.defineClass(Unknown Source)
>   at java.net.URLClassLoader.access$100(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Unknown Source)
>   at 
> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1700)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Unknown Source)
>   at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448)
>   at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:396)
>   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:518)
>   at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:592)
>   at 
> org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:154)
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:758)
>   ... 13 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.solr.util.plugin.SolrCoreAware
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   ... 40 more
> ERROR - 2013-05-08 10:43:48.189; org.apache.solr.common.SolrException; 
> null:org.apache.solr.common.SolrException: Unable to create core: collection1
>   at 
> org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
>   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
>   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
>   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
>   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.Executors

Portability of Solr index

2013-05-09 Thread mukesh katariya
I have built a Solr index on Windows 7 Enterprise, 64-bit, and copied the
index to CentOS release 6.2, a 32-bit OS.

The index is readable and the application is able to load data from the
index on Linux. But there are a few fields on which fq queries don't work on
Linux, although the same fq queries work on Windows.

I have a situation where I have to prepare the index on Windows and port it
to Linux, so I need the index to be portable. The only thing that is not
working is the fq queries.

Inside the BlockTreeTermsReader seekExact API, I have enabled debugging and
System.out statements. This is a term query, and these are the target bytes
to match:

scanToTermLeaf: block fp=1705107 prefix=0 nextEnt=0 (of 167)
target=1RD0JIHMr9aw4RPPuS0DVzB2tKf38FfjKaEg7HsYDd7EtAOpE9FYvvj5ryB7679r4KNnlIazevPoh7qabtLhXw==
[31 52 44 30 4a 49 48 4d 72 39 61 77 34 52 50 50 75 53 30 44 56 7a 42 32 74
4b 66 33 38 46 66 6a 4b 61 45 67 37 48 73 59 44 64 37 45 74 41 4f 70 45 39
46 59 76 76 6a 35 72 79 42 37 36 37 39 72 34 4b 4e 6e 6c 49 61 7a 65 76 50
6f d a 68 37 71 61 62 74 4c 68 58 77 3d 3d] term= []

As per the algorithm, it runs through the terms and tries to match; the 6th
term is an exact match except for a few bytes:

cycle: term 6 (of 167)
suffix=1RD0JIHMr9aw4RPPuS0DVzB2tKf38FfjKaEg7HsYDd7EtAOpE9FYvvj5ryB7679r4KNnlIazevPoh7qabtLhXw==
[31 52 44 30 4a 49 48 4d 72 39 61 77 34 52 50 50 75 53 30 44 56 7a 42 32 74
4b 66 33 38 46 66 6a 4b 61 45 67 37 48 73 59 44 64 37 45 74 41 4f 70 45 39
46 59 76 76 6a 35 72 79 42 37 36 37 39 72 34 4b 4e 6e 6c 49 61 7a 65 76 50
6f a 68 37 71 61 62 74 4c 68 58 77 3d 3d] Prefix:=0 Suffix:=89
target.offset:=0 target.length :=90 targetLimit :=89

From the first dump: 50 6f d a 68 37; from the second: 50 6f a 68 37. The
first contains the byte pair 0d 0a (CRLF) where the second has only 0a (LF),
so the two indexes disagree on a newline inside the term. The test scenario
is that the index is built on Linux and I am testing it through the Solr API
on a Windows machine.





SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Hi, I'm observing lots of these errors with SolrCloud.

Here are the commands I use to run the services:
zookeeper:
  1: cd /opt/zookeeper/
  2: sudo bin/zkServer.sh start zoo1.cfg
  3: sudo bin/zkServer.sh start zoo2.cfg
  4: sudo bin/zkServer.sh start zoo3.cfg

shards:
  1: cd /opt/solr-cluster/shard1/
 sudo su solr -c "java -Xmx4096M
-DzkHost=localhost:2181,localhost:2182,localhost:2183
-Dbootstrap_confdir=./solr/conf -Dcollection.configName=Carmen -DnumShards=2
-jar start.jar etc/jetty.xml etc/jetty-logging.xml &"
  2: cd ../shard2/
 sudo su solr -c "java -Xmx4096M
-DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
etc/jetty.xml etc/jetty-logging.xml &"

replicas:
  1: cd ../replica1/
 sudo su solr -c "java -Xmx4096M
-DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
etc/jetty.xml etc/jetty-logging.xml &"
  2: cd ../replica2/
 sudo su solr -c "java -Xmx4096M
-DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar
etc/jetty.xml etc/jetty-logging.xml &"

zoo1.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/opt/zookeeper/data/1
# the port at which the clients will connect
clientPort=2181

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

zoo2.cfg and zoo3.cfg are the same except for dataDir and clientPort.

Also, very often I see org.apache.solr.common.SolrException: No registered
leader was found, and lots of other errors. I just updated jetty.xml, set
org.eclipse.jetty.server.Request.maxFormContentSize to 10MB, and restarted
the cluster; half of the errors are gone, but this one about the IOException
is still here.

I am re-indexing a few models (a Rails application); they have from 1,000,000
to 20,000,000 records. For indexing I have a queue (MongoDB) and a few
workers which process it in batches of 200-500 records.

All Solr and Zookeeper instances are launched on the same server: two Intel
Xeon processors, 8 cores total, 32 GB of memory, and fast RAID storage.

Please help me figure out what could be causing these errors and how I can
fix them. Please tell me if I can provide more information about the server
setup, logs, errors, etc.

Best,
Alex

 

[screenshots of Shard 1, Replica 1, Shard 2, and Replica 2 omitted]





Re: Solr 4.2 rollback not working

2013-05-09 Thread mark12345

So for all current versions of Solr, rollback will not work for SolrCloud? 
Will this change in the future, or will rollback always be unsupported for
SolrCloud?

This did catch me by surprise.  Should the SolrJ documentation be updated to
reflect this behavior?

http://lucene.apache.org/solr/4_3_0/solr-solrj/org/apache/solr/client/solrj/SolrServer.html#rollback%28%29

  






Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Forgot to mention: Solr is 4.2 and Zookeeper is 3.4.5.

I do not do manual commits; I prefer a softCommit every second and an
autoCommit every 3 minutes.
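
For reference, a sketch of what that commit policy looks like in
solrconfig.xml (only the two maxTime values come from the description above;
the rest is assumed):

  <autoCommit>
    <maxTime>180000</maxTime>       <!-- hard commit every 3 minutes -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>         <!-- soft commit every second -->
  </autoSoftCommit>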

The problem happened again: lots of errors in the logs and no description.
The cluster state changed; on shard 2 the replica became the leader, and the
former leader went into recovering mode. The error happened when:
1. Shard1 tried to forward an update to Shard2, and this was the initial
error from Shard2: "ClusterState says we are the leader, but locally we
don't think so"
2. Shard2 forwarded the update to Replica2 and got:
org.apache.solr.common.SolrException: Request says it is coming from
leader, but we are the leader

Please see the attached screenshots.

[screenshots of the topology and of Shard1, Replica1, Shard2, and Replica2 omitted]

All the errors from the screenshots appear whenever the server load gets
higher: as soon as I start a few more queue workers, the load rises and the
cluster becomes unstable. So I have doubts about reliability. Could any docs
be lost during these errors, or should I just ignore them?

I understand that 4 Solr instances and 3 Zookeepers could be too many for a
single machine; there may not be enough resources, etc. But it still should
not cause anything like this. The worst case should be a timeout error when
Solr is not responding, which my queue processors can handle by resending
the request after a while.





Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Zookeeper log:

2013-05-09 03:03:07,177 [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Follower@118] - Got zxid 0x20001 expected 0x1
2013-05-09 03:36:52,918 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
        at org.apache.zookeeper.server.DataTree.setWatches(DataTree.java:1327)
        at org.apache.zookeeper.server.ZKDatabase.setWatches(ZKDatabase.java:384)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:304)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2013-05-09 03:36:52,928 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.s [trace cut off by the next, interleaved entry]
2013-05-09 04:26:04,790 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x23e88bdaf81, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:679)
[remainder of the interleaved trace:]
        ...tionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
        at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:120)
        at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:92)
        at org.apache.zookeeper.server.DataTree.setData(DataTree.java:620)
        at org.apache.zookeeper.server.DataTree.processTxn(DataTree.java:807)
        at org.apache.zookeeper.server.ZKDatabase.processTxn(ZKDatabase.java:329)
        at org.apache.zookeeper.server.ZooKeeperServer.processTxn(ZooKeeperServer.java:965)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:116)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2013-05-09 04:27:04,002 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
        at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
        at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:120)
        at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:92)
        at org.apache.zookeeper.server.DataTree.deleteNode(DataTree.java:591)
        at org.apache.zookeeper.server.DataTree.killSession(DataTree.java:966)
        at org.apache.zookeeper.server.DataTree.processTxn(DataTree.java:818)
        at org.apache.zookeeper.server.ZKDatabase.processTxn(ZKDatabase.java:329)
        at org.apache.zookeeper.server.ZooKeeperServer.processTxn(ZooKeeperServer.java:965)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:116)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2013-05-09 04:36:00,485 [myid:3] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x33e88bdc0 [log truncated]

Re: Solr 4.2 rollback not working

2013-05-09 Thread Mark Miller
At the least it should throw an exception if you try rollback with SolrCloud - 
though now there is discussion about removing it entirely.

But yes, it's not supported and there are no real plans to support it.

- Mark

On May 9, 2013, at 7:21 AM, mark12345  wrote:

> 
> So for all current versions of Solr, rollback will not work for SolrCloud? 
> Will this change in the future, or will rollback always be unsupported for
> SolrCloud?
> 
> This did catch me by surprise.  Should the SolrJ documentation be updated to
> reflect this behavior?
> 
> http://lucene.apache.org/solr/4_3_0/solr-solrj/org/apache/solr/client/solrj/SolrServer.html#rollback%28%29
> 
>   
> 
> 
> 
> 



Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
I can confirm this leads to data loss. I have 1,217,427 records in the
database and only 1,217,216 indexed, which means that Solr gave a successful
response and then did not add some documents to the index.

It seems SolrCloud is not a production-ready solution; it would be good if
there were a warning about that in the Solr wiki.





Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml

2013-05-09 Thread William Pierce
I got this to work (thanks, Jan, and all).  It turns out that the DIH jars need 
to be included explicitly, either by specifying them in solrconfig.xml or by 
placing them in a default path under solr.home.  I placed these jars in 
instanceDir/lib and it worked.  I had previously reported this as not working; 
that was because I had mistakenly left a copy of the jars under tomcat/lib.
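
For anyone hitting the same issue, the solrconfig.xml alternative is a <lib> 
directive; a sketch (the dir value is an example and is resolved relative to 
the core's instanceDir):

  <lib dir="../../dist/" regex="solr-dataimporthandler-.*\.jar" />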


Bill

-Original Message- 
From: Jan Høydahl

Sent: Thursday, May 09, 2013 2:58 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.3 fails in startup when dataimporthandler declaration is 
included in solrconfig.xml


My question was: When you move DIH libs to Solr's classloader (e.g. 
instanceDir/lib and refer from solrconfig.xml), and remove solr.war from 
tomcat/lib, what error msg do you then get?


Also make sure to delete the old tomcat/webapps/solr folder, just to make 
sure you're starting from scratch.


--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 9 May 2013 at 01:54, William Pierce wrote:

The reason I placed the solr.war in tomcat/lib was -- I guess -- because 
that's the way I had always done it since the 1.3 days.  Our tomcat instance(s) 
run nothing other than Solr - so that seemed as good a place as any.


The DIH jars that I placed in the tomcat/lib are: 
solr-dataimporthandler-4.3.0.jar and 
solr-dataimporthandler-extras-4.3.0.jar.  Are there any dependent jars 
that also need to be added that I am unaware of?


On the specific errors - I get a stack trace noted in the first email that 
began this thread but repeated here for convenience:


[stack trace snipped; it is quoted in full in the earlier message in this thread]

Re: Portability of Solr index

2013-05-09 Thread Alexandre Rafalovitch
What is the query/term you are looking for? I wonder if the difference
is due to newline treatment on different platforms.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, May 9, 2013 at 1:49 AM, mukesh katariya
 wrote:
> [quoted message snipped]


Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread britske
Not sure if this has ever come up (or perhaps has even been implemented
without me knowing), but I'm interested in doing fuzzy search over multiple
fields using Solr.

What I mean is the ability to return documents based on some 'distance
calculation' without the documents having to match the query 100%.

Use case: a user is searching for a TV with a couple of filters selected. No
TV matches all filters. How do you come up with a bunch of suggestions that
match the selected filters as closely as possible? The hard part is to
determine what 'closely' means in this context, etc.

This relates to (approximate) nearest neighbor, k-d trees, etc. Has anyone
ever tried to do something similar? Any plugins, etc.? Or reasons Solr/Lucene
would or wouldn't be the correct system to build on?

Thanks





Re: Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread Jack Krupansky
A simple "OR" boolean query will boost documents that have more matches. You 
can also selectively boost individual OR terms to control importance. And do 
and "AND" for the required terms, like "tv".


-- Jack Krupansky
-Original Message- 
From: britske

Sent: Thursday, May 09, 2013 11:21 AM
To: solr-user@lucene.apache.org
Subject: Fuzzy searching documents over multiple fields using Solr

[quoted message snipped]



4.3 logging setup

2013-05-09 Thread richardg
On all prior versions I set up my logging via the logging.properties file in
/usr/local/tomcat/conf; it looked like this:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

handlers = 1catalina.org.apache.juli.FileHandler,
2localhost.org.apache.juli.FileHandler,
3manager.org.apache.juli.FileHandler,
4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler

.handlers = 1catalina.org.apache.juli.FileHandler,
java.util.logging.ConsoleHandler


# Handler specific properties.
# Describes specific configuration info for Handlers.


1catalina.org.apache.juli.FileHandler.level = WARNING
1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.

2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.

3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
3manager.org.apache.juli.FileHandler.prefix = manager.

4host-manager.org.apache.juli.FileHandler.level = FINE
4host-manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
4host-manager.org.apache.juli.FileHandler.prefix = host-manager.

java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter =
java.util.logging.SimpleFormatter



# Facility specific properties.
# Provides extra control for each logger.


org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers =
2localhost.org.apache.juli.FileHandler

org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level
= INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers
= 3manager.org.apache.juli.FileHandler

org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].level
= INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].handlers
= 4host-manager.org.apache.juli.FileHandler

# For example, set the org.apache.catalina.util.LifecycleBase logger to log
# each component that extends LifecycleBase changing state:
#org.apache.catalina.util.LifecycleBase.level = FINE

# To see debug messages in TldLocationsCache, uncomment the following line:
#org.apache.jasper.compiler.TldLocationsCache.level = FINE

After upgrading to 4.3 today, the files defined there aren't being logged to. I
know things have changed for logging with 4.3, but how can I get it set up like
it was before?





Re: Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread Geert-Jan Brits
I didn't mention it, but I'd like individual fields to contribute to the
overall score on a continuum instead of 1 (match) and 0 (no match), which
will lead to more fine-grained scoring.

A contrived example: all other things being equal, a 40-inch TV should score
higher than a 38-inch TV when searching for a 42-inch TV, based on some
distance modeling on the 'size' field (e.g. score(42,40) = 0.6 and
score(42,38) = 0.4). Other qualitative fields may be modeled in the same way
(e.g. restaurants with a 'price' field with values 'budget', 'mid-range',
'expensive', ...).

Any way to incorporate this?



2013/5/9 Jack Krupansky:

> [quoted message snipped]


Re: Use case for storing positions and offsets in index?

2013-05-09 Thread KnightRider
Thanks Jack & Jason



-
Thanks
-K'Rider


Grouping search results by field returning all search results for a given query

2013-05-09 Thread Luis Carlos Guerrero Covo
Hi,

I'm using Solr to maintain an index of items that belong to different
companies. I want the search results to be returned in a way that is fair
to all companies, so I wish to group the results such that each company
has one item in each group, and the groups of results should be returned
sorted by score.

example:
--

20 companies

first 100 results

results 1-20 - (company1's highest-scoring item, company2's highest-scoring
item, etc.)
results 21-40 - (company1's second-highest-scoring item, company2's
second-highest-scoring item, etc.)
...

 --

I'm trying to use the field collapsing feature, but I have only been able to
create the first group of results by using
group.limit=1&group.field=companyid. If I raise the group.limit value, I
would be violating the 'fairness rule' because more than one result from a
company would be returned in the first group of results.

Can I achieve the desired search result using SOLR, or do I have to look at
other options?

thank you,

Luis Guerrero


Re: Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread Jack Krupansky
You can use function queries to boost documents as well. Sorry, but it can 
get messy to figure out.


See:
http://wiki.apache.org/solr/FunctionQuery

See also the edismax "bf" parameter:
http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29
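
For example, a sketch of a boost function for the earlier 42-inch TV case 
(the "size" field name is made up):

  defType=edismax&q=tv&bf=recip(abs(sub(size,42)),1,10,10)

recip(x,m,a,b) computes a/(m*x + b), so documents whose size is closest to 42 
get the largest additive boost, falling off smoothly with distance.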

-- Jack Krupansky

-Original Message- 
From: Geert-Jan Brits

Sent: Thursday, May 09, 2013 12:32 PM
To: solr-user@lucene.apache.org
Subject: Re: Fuzzy searching documents over multiple fields using Solr

[quoted message snipped]





Re: Grouping search results by field returning all search results for a given query

2013-05-09 Thread Jason Hellman
Luis,

I am presuming you do not have an overarching grouping value here, and simply 
wish to show a standard search result that shows one item per company.

You should be able to accomplish your second page of desired items (the second 
item from each of your 20 represented companies) by using the group.offset 
parameter.  This will shift the position in the returned array of documents to 
the value provided.

Thus:

group.limit=1&group.field=companyid&group.offset=1

…would return the second item in each companyid group matching your current 
query.
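
In full, a hedged example request (the core URL and query are assumed):

http://localhost:8983/solr/select?q=*:*&group=true&group.field=companyid&group.limit=1&group.offset=1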

Jason

On May 9, 2013, at 10:30 AM, Luis Carlos Guerrero Covo 
 wrote:

> [quoted message snipped]



RE: More Like This and Caching

2013-05-09 Thread David Parks
I'm not the expert here, but perhaps what you're noticing is actually the
OS's disk cache. The actual Solr index isn't cached by Solr, but as you read
the blocks off disk, the OS disk cache probably cached those blocks for you;
on the second run the index blocks were read out of memory.

There was a very extensive discussion on this list not long back titled:
"Re: SolrCloud loadbalancing, replication, and failover" look that thread up
and you'll get a lot of in-depth on the topic.

David


-Original Message-
From: Giammarco Schisani [mailto:giamma...@schisani.com] 
Sent: Thursday, May 09, 2013 2:59 PM
To: solr-user@lucene.apache.org
Subject: More Like This and Caching

[quoted message snipped]



Re: 4.3 logging setup

2013-05-09 Thread Jason Hellman
From: http://lucene.apache.org/solr/4_3_0/changes/Changes.html#4.3.0.upgrading_from_solr_4.2.0

Slf4j/logging jars are no longer included in the Solr webapp. All logging jars 
are now in example/lib/ext. Changing logging impls is now as easy as updating 
the jars in this folder with those necessary for the logging impl you would 
like. If you are using another webapp container, these jars will need to go in 
the corresponding location for that container. In conjunction, the 
dist-excl-slf4j and dist-war-excl-slf4 build targets have been removed since 
they are redundent. See the Slf4j documentation, SOLR-3706, and SOLR-4651 for 
more details.

It should just require you to provide your preferred logging jars on an 
appropriate classpath.


On May 9, 2013, at 9:24 AM, richardg  wrote:

> On all prior index version I setup my log via the logging.properties file in
> /usr/local/tomcat/conf, it looked like this:
> 
> # Licensed to the Apache Software Foundation (ASF) under one or more
> # contributor license agreements.  See the NOTICE file distributed with
> # this work for additional information regarding copyright ownership.
> # The ASF licenses this file to You under the Apache License, Version 2.0
> # (the "License"); you may not use this file except in compliance with
> # the License.  You may obtain a copy of the License at
> #
> # http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing, software
> # distributed under the License is distributed on an "AS IS" BASIS,
> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> # See the License for the specific language governing permissions and
> # limitations under the License.
> 
> handlers = 1catalina.org.apache.juli.FileHandler,
> 2localhost.org.apache.juli.FileHandler,
> 3manager.org.apache.juli.FileHandler,
> 4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
> 
> .handlers = 1catalina.org.apache.juli.FileHandler,
> java.util.logging.ConsoleHandler
> 
> 
> # Handler specific properties.
> # Describes specific configuration info for Handlers.
> 
> 
> 1catalina.org.apache.juli.FileHandler.level = WARNING
> 1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
> 1catalina.org.apache.juli.FileHandler.prefix = catalina.
> 
> 2localhost.org.apache.juli.FileHandler.level = FINE
> 2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
> 2localhost.org.apache.juli.FileHandler.prefix = localhost.
> 
> 3manager.org.apache.juli.FileHandler.level = FINE
> 3manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
> 3manager.org.apache.juli.FileHandler.prefix = manager.
> 
> 4host-manager.org.apache.juli.FileHandler.level = FINE
> 4host-manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
> 4host-manager.org.apache.juli.FileHandler.prefix = host-manager.
> 
> java.util.logging.ConsoleHandler.level = FINE
> java.util.logging.ConsoleHandler.formatter =
> java.util.logging.SimpleFormatter
> 
> 
> 
> # Facility specific properties.
> # Provides extra control for each logger.
> 
> 
> org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
> org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers =
> 2localhost.org.apache.juli.FileHandler
> 
> org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level
> = INFO
> org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers
> = 3manager.org.apache.juli.FileHandler
> 
> org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].level
> = INFO
> org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].handlers
> = 4host-manager.org.apache.juli.FileHandler
> 
> # For example, set the org.apache.catalina.util.LifecycleBase logger to log
> # each component that extends LifecycleBase changing state:
> #org.apache.catalina.util.LifecycleBase.level = FINE
> 
> # To see debug messages in TldLocationsCache, uncomment the following line:
> #org.apache.jasper.compiler.TldLocationsCache.level = FINE
> 
> After upgrading to 4.3 today the files defined aren't being logged to.  I
> know things have changed for logging w/ 4.3 but how can I get it setup like
> it was before?
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/4-3-logging-setup-tp4061875.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: More Like This and Caching

2013-05-09 Thread Jason Hellman
Purely from empirical observation, both the documentCache and queryResultCache 
are being populated and reused on reloads of a simple MLT search.  You can see 
from the cache inserts how much extra-curricular activity happens to populate 
the MLT data by how many inserts and lookups occur on the first load.
(lifted right out of the MLT wiki http://wiki.apache.org/solr/MoreLikeThis )

http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score

There is no activity in the filterCache, fieldCache, or fieldValueCache - and 
that makes plenty of sense.

On May 9, 2013, at 11:12 AM, David Parks  wrote:

> [quoted message snipped]



Re: 4.3 logging setup

2013-05-09 Thread richardg
Thanks for responding.  My issue is that I've never changed anything with
logging; I have always used the built-in JULI.  I've never messed with any jar
files, just edited the logging.properties file.  I don't know where I would
get the jars for JULI or where to put them, if that is what is needed.  I had
read what you posted before; I just can't make any sense of it.

Thanks





RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2013-05-09 Thread Sergiu Bivol
I have a similar problem. With 5 shards, querying 500K rows fails, but 400K is 
fine. Querying individual shards for 1.5 million rows works.
All Solr instances are v4.2.1 and running on separate Ubuntu VMs.
It is not random; it can always be reproduced by adding &rows=500000 to a query 
where numFound is > 500K.

Is this a configuration issue, where some setting can be increased?



Re: 4.3 logging setup

2013-05-09 Thread Jason Hellman
If you nab the jars in example/lib/ext and place them within the appropriate 
folder in Tomcat (this will depend somewhat on which version of Tomcat you are 
using; let's presume tomcat/lib as a brute-force approach), you should be back 
in business.

On May 9, 2013, at 11:41 AM, richardg  wrote:

> Thanks for responding.  My issue is that I've never changed anything with
> logging; I have always used the built-in JULI.  I've never messed with any jar
> files, just edited the logging.properties file.  I don't know where I would
> get the jars for JULI or where to put them, if that is what is needed.  I had
> read what you posted before; I just can't make any sense of it.
> 
> Thanks
> 
> 
> 



Re: 4.3 logging setup

2013-05-09 Thread Jan Høydahl
Hi,

First of all, to set up logging using Log4j (which is really better than JULI), 
copy all the jars from Jetty's lib/ext over to Tomcat's lib folder; see the 
instructions here: http://wiki.apache.org/solr/SolrLogging#Solr_4.3_and_above. 
You can place your log4j.properties in tomcat/lib as well so it will be read 
automatically.

Now when you start your Tomcat, you will find a file tomcat/logs/solr.log in a 
nicer format than before, with one log entry per line instead of two, and 
automatic log file rotation and cleaning.

However, if you'd like to switch to Java Util logging, do the following:

1. Download slf4j version 1.6.6 (since that's what we use). 
http://www.slf4j.org/dist/slf4j-1.6.6.zip
2. Unpack, and pull out the file slf4j-jdk14-1.6.6.jar
3. Remove tomcat/lib/slf4j-log4j12-1.6.6.jar and copy slf4j-jdk14-1.6.6.jar to 
tomcat/lib instead
4. Use your old logging.properties (either place it on classpath or point to it 
with startup opt)
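
A sketch of those steps as shell commands (assuming a standard Tomcat layout; 
adjust the paths to your install):

  wget http://www.slf4j.org/dist/slf4j-1.6.6.zip
  unzip slf4j-1.6.6.zip                        # contains slf4j-jdk14-1.6.6.jar
  rm tomcat/lib/slf4j-log4j12-1.6.6.jar
  cp slf4j-1.6.6/slf4j-jdk14-1.6.6.jar tomcat/lib/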

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 9 May 2013 at 20:41, richardg wrote:

> Thanks for responding.  My issue is that I've never changed anything with
> logging; I have always used the built-in JULI.  I've never messed with any jar
> files, just edited the logging.properties file.  I don't know where I would
> get the jars for JULI or where to put them, if that is what is needed.  I had
> read what you posted before; I just can't make any sense of it.
> 
> Thanks
> 
> 
> 



RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2013-05-09 Thread Sergiu Bivol
Adding the original message.

Thank you
Sergiu

-Original Message-
From: Sergiu Bivol [mailto:sbi...@blackberry.com]
Sent: Thursday, May 09, 2013 2:50 PM
To: solr-user@lucene.apache.org
Subject: RE: Invalid version (expected 2, but 60) or the data in not in 
'javabin' format

I have a similar problem. With 5 shards, querying 500K rows fails, but 400K is 
fine.
Querying individual shards for 1.5 million rows works.
All solr instances are v4.2.1 and running on separate Ubuntu VMs.
It is not random; it can always be reproduced by adding &rows=50 to a query 
where numFound is > 500K.

Is this a configuration issue, where some setting can be increased?

-
From: Ahmet Arslan 
Subject: Invalid version (expected 2, but 60) or the data in not in 'javabin' 
format
Date: Mon, 21 Jan 2013 22:35:10 GMT

Hi,

I was hitting the following exception when doing a distributed search.
I am faceting on an int field named contentID. For some queries it gives
this error; for some queries it works just fine.

localhost:8080/solr/kanu/select/?shards=localhost:8080/solr/rega,localhost:8080/solr/kanu&indent=true&q=karar&start=0&rows=15&hl=false&wt=xml&facet=true&facet.limit=-1&facet.sort=false&json.nl=arrarr&fq=isXml:false&mm=100%&facet.field=contentID&f.contentID.facet.mincount=2

Same search URL works fine for cores (kanu and rega) individually.

Plus if I use rega core as base search URL it works too. e.g.
localhost:8080/solr/rega/select/?shards=localhost:8080...

I see that rega core has lots of unique values for contentID field.
So my conclusion is, if a shard response is "too big" this happens.

This is a bad usage of faceting and I will remove faceting on that field since 
it was added
accidentally.

I still want to share the stack traces since the error message is somewhat misleading.

Jan 21, 2013 10:36:53 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: java.lang.RuntimeException: 
Invalid version
(expected 2, but 60) or the data in not in 'javabin' format
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:300)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1701)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or 
the data in
not in 'javabin' format
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
at 
org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:166)
at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:133)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
... 1 more


When I add &shards.tolerant=true exception becomes:

Jan 21, 2013 10:51:51

Re: 4.3 logging setup

2013-05-09 Thread richardg
I had already copied those jars over and gotten the app to start (it wouldn't
without them).  I was able to configure slf4j/log4j logging using the
log4j.properties in the /lib folder, but I don't want to switch.  I have alerts
set on the wording that the JULI logging puts out, but everything I've tried to
get it to work has failed.  I have older indexes (4.2 and under) running on the
server that are still able to log correctly; it is just 4.3.  I am obviously
missing something.

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/4-3-logging-setup-tp4061875p4061907.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: 4.3 logging setup

2013-05-09 Thread Shawn Heisey

On 5/9/2013 12:54 PM, Jason Hellman wrote:

If you nab the jars in example/lib/ext and place them within the appropriate 
folder in Tomcat (and this will somewhat depend on which version of Tomcat you 
are using…let's presume tomcat/lib as a brute-force approach) you should be 
back in business.

On May 9, 2013, at 11:41 AM, richardg  wrote:


Thanks for responding.  My issue is I've never changed anything w/ logging; I
have always used the built-in Juli.  I've never messed w/ any jar files,
just had to edit the logging.properties file.  I don't know where I would get
the jars for juli or where to put them, if that is what is needed.  I had
read what you posted before; I just can't make any sense of it.


I've been looking into this a little bit. Tomcat's JULI is an Apache 
reimplementation of java.util.logging.  Solr uses SLF4J, but before 4.3, 
Solr's slf4j was bound to java.util.logging ... which I would bet was 
being intercepted by Tomcat and sent through the JULI config.


With 4.3, SLF4J is bound to log4j by default.  If you stick with this 
binding, then you need to configure log4j instead of juli.


Richard, you could go back to java.util.logging (the way earlier 
versions had it) with this procedure, and this will probably restore the 
ability to configure logging with juli.


- Delete the following jars from Solr's example lib/ext:
-- jul-to-slf4j-1.6.6.jar
-- log4j-1.2.16.jar
-- slf4j-log4j12-1.6.6.jar
- Download slf4j version 1.6.6 from their website.
- Copy the following jars from the download into lib/ext:
-- log4j-over-slf4j-1.6.6.jar
-- slf4j-jdk14-1.6.6.jar
- Copy all jars in lib/ext to tomcat's lib directory.

http://www.slf4j.org/dist/
http://www.slf4j.org

Alternatively, you could copy the jars from lib/ext to a directory in 
your classpath, or add Solr's lib/ext to your classpath.


If you want to upgrade to the newest slf4j, you can, you'll just have to 
use the new version for all slf4j jars.


Please let me know whether this worked for you so we can get a proper 
procedure up on the wiki.


Thanks,
Shawn



Re: Grouping search results by field returning all search results for a given query

2013-05-09 Thread Luis Carlos Guerrero Covo
Thank you for the prompt reply, Jason. The group.offset parameter is working
for me; now I can iterate through all items for each company. The problem
I'm having right now is pagination. Is there a way this can be
implemented out of the box with Solr?

Before, I was using group.main=true for easy pagination of results, but
it seems like I'll have to ditch that and use the standard grouping format
returned by Solr for the group.offset parameter to be useful. Since not all
groups have the same number of items, I'll have to carefully
calculate the results that should be returned for each page of 20 items and
probably make several Solr calls per page rendered.


On Thu, May 9, 2013 at 1:07 PM, Jason Hellman <
jhell...@innoventsolutions.com> wrote:

> Luis,
>
> I am presuming you do not have an overarching grouping value here…and
> simply wish to show a standard search result that shows 1 item per company.
>
> You should be able to accomplish your second page of desired items (the
> second item from each of your 20 represented companies) by using the
> group.offset parameter.  This will shift the position in the returned array
> of documents to the value provided.
>
> Thus:
>
> group.limit=1&group.field=companyid&group.offset=1
>
> …would return the second item in each companyid group matching your
> current query.
>
> Jason
>
> On May 9, 2013, at 10:30 AM, Luis Carlos Guerrero Covo <
> lcguerreroc...@gmail.com> wrote:
>
> > Hi,
> >
> > I'm using solr to maintain an index of items that belong to different
> > companies. I want the search results to be returned in a way that is fair
> > to all companies, thus I wish to group the results such that each company
> > has 1 item in each group, and the groups of results should be returned
> > sorted by score.
> >
> > example:
> > --
> >
> > 20 companies
> >
> > first 100 results
> >
> > 1-20 results - (company1 highest score item, company2 highest score item,
> > etc..)
> > 20-40 results - (company1 second highest score item, company 2 second
> > highest score item, etc..)
> > ...
> >
> > --
> >
> > I'm trying to use the field collapsing feature but I have only been able
> to
> > create the first group of results by using
> > group.limit=1,group.field=companyid. If I raise the group.limit value, I
> > would be violating the 'fairness rule' because more than one result of a
> > company would be returned in the first group of results.
> >
> > Can I achieve the desired search result using SOLR, or do I have to look
> at
> > other options?
> >
> > thank you,
> >
> > Luis Guerrero
>
>


-- 
Luis Carlos Guerrero Covo
M.S. Computer Engineering
(57) 3183542047


Re: Grouping search results by field returning all search results for a given query

2013-05-09 Thread Jason Hellman
I would think pagination is resolved by obtaining the numFound value for your 
returned groups.  If you have numFound=6 then each page of 20 items (one item 
per company) would imply a total of 6 pages.

You'll have to arbitrate for the variance here…but it would seem to me you need 
as many "pages" as the highest value in the numFound field across all groups.  
This shouldn't require requerying, but it will definitely require a little 
intelligence in the web app to handle the groups that are smaller than the 
largest size.

Hope that's useful!
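
For illustration, here is a minimal SolrJ sketch of this approach (the core URL
is made up; the companyid field comes from this thread). It reads each group's
numFound to work out how many group.offset "pages" are needed:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.Group;
import org.apache.solr.client.solrj.response.GroupCommand;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GroupPaging {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/items");
    int page = 1; // zero-based; page 1 asks for the second item of each group
    SolrQuery q = new SolrQuery("*:*");
    q.set("group", true);
    q.set("group.field", "companyid");
    q.set("group.limit", 1);     // one item per company on each page
    q.set("group.offset", page); // which item within each group to return
    QueryResponse rsp = solr.query(q);
    long pagesNeeded = 0;
    for (GroupCommand cmd : rsp.getGroupResponse().getValues()) {
      for (Group g : cmd.getValues()) {
        // The largest per-group numFound is the total number of pages.
        pagesNeeded = Math.max(pagesNeeded, g.getResult().getNumFound());
      }
    }
    System.out.println("pages needed: " + pagesNeeded);
  }
}

Groups shorter than the current offset simply come back empty, which is the
variance the web app has to handle.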

On May 9, 2013, at 12:23 PM, Luis Carlos Guerrero Covo 
 wrote:

> Thank you for the prompt reply, Jason. The group.offset parameter is working
> for me; now I can iterate through all items for each company. The problem
> I'm having right now is pagination. Is there a way this can be
> implemented out of the box with Solr?
> 
> Before, I was using group.main=true for easy pagination of results, but
> it seems like I'll have to ditch that and use the standard grouping format
> returned by Solr for the group.offset parameter to be useful. Since not all
> groups have the same number of items, I'll have to carefully
> calculate the results that should be returned for each page of 20 items and
> probably make several Solr calls per page rendered.
> 
> 
> On Thu, May 9, 2013 at 1:07 PM, Jason Hellman <
> jhell...@innoventsolutions.com> wrote:
> 
>> Luis,
>> 
>> I am presuming you do not have an overarching grouping value here…and
>> simply wish to show a standard search result that shows 1 item per company.
>> 
>> You should be able to accomplish your second page of desired items (the
>> second item from each of your 20 represented companies) by using the
>> group.offset parameter.  This will shift the position in the returned array
>> of documents to the value provided.
>> 
>> Thus:
>> 
>> group.limit=1&group.field=companyid&group.offset=1
>> 
>> …would return the second item in each companyid group matching your
>> current query.
>> 
>> Jason
>> 
>> On May 9, 2013, at 10:30 AM, Luis Carlos Guerrero Covo <
>> lcguerreroc...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> I'm using solr to maintain an index of items that belong to different
>>> companies. I want the search results to be returned in a way that is fair
>>> to all companies, thus I wish to group the results such that each company
>>> has 1 item in each group, and the groups of results should be returned
>>> sorted by score.
>>> 
>>> example:
>>> --
>>> 
>>> 20 companies
>>> 
>>> first 100 results
>>> 
>>> 1-20 results - (company1 highest score item, company2 highest score item,
>>> etc..)
>>> 20-40 results - (company1 second highest score item, company 2 second
>>> highest score item, etc..)
>>> ...
>>> 
>>> --
>>> 
>>> I'm trying to use the field collapsing feature but I have only been able
>> to
>>> create the first group of results by using
>>> group.limit=1,group.field=companyid. If I raise the group.limit value, I
>>> would be violating the 'fairness rule' because more than one result of a
>>> company would be returned in the first group of results.
>>> 
>>> Can I achieve the desired search result using SOLR, or do I have to look
>> at
>>> other options?
>>> 
>>> thank you,
>>> 
>>> Luis Guerrero
>> 
>> 
> 
> 
> -- 
> Luis Carlos Guerrero Covo
> M.S. Computer Engineering
> (57) 3183542047



Re: 4.3 logging setup

2013-05-09 Thread richardg
These are the files I have in my /lib folder:

slf4j-api-1.6.6
log4j-1.2.16
jul-to-slf4j-1.6.6
jcl-over-slf4j-1.6.6
slf4j-jdk14-1.6.6
log4j-over-slf4j-1.6.6

Currently everything seems to be logging like before.  After I followed the
instructions in Jan's post, replacing slf4j-log4j12-1.6.6.jar with
slf4j-jdk14-1.6.6.jar, it all started working.  Shawn, I then removed
everything as you instructed and put in just log4j-over-slf4j-1.6.6.jar and
slf4j-jdk14-1.6.6.jar, but the index showed an error and wouldn't start.  So
that is why I have those six files in there now; I'm not sure whether
log4j-over-slf4j-1.6.6.jar is needed or not.  Let me know if you
need me to test anything else.

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/4-3-logging-setup-tp4061875p4061922.html
Sent from the Solr - User mailing list archive at Nabble.com.


Does a Distributed Search Get Cached Only by the Node That Runs the Query?

2013-05-09 Thread Furkan KAMACI
I have Solr 4.2.1 and run it as SolrCloud. When I do a search on
SolrCloud like this:

ip_of_node_1:8983/solr/select?q=*:*&rows=1

and then check the admin page, I see this:

I have 5 GB Java heap. 616.32 MB is dark gray, 3.13 GB is gray.

Before my search it was something like: 150 MB dark gray, 500 MB gray.

I understand that when I do a search like that, fields are cached. However,
when I look at the other SolrCloud nodes' admin pages there are no differences.
Why is that query cached only by the node that I ran it on?


Re: 4.3 logging setup

2013-05-09 Thread Shawn Heisey

On 5/9/2013 1:41 PM, richardg wrote:

These are the files I have in my /lib folder:

slf4j-api-1.6.6
log4j-1.2.16
jul-to-slf4j-1.6.6
jcl-over-slf4j-1.6.6
slf4j-jdk14-1.6.6
log4j-over-slf4j-1.6.6

Currently everything seems to be logging like before.  After I followed the
instructions in Jan's post replacing slf4j-log4j12-1.6.6.jar with this
slf4j-jdk14-1.6.6.jar it all started working.  Shawn I then removed
everything as you instructed and put in just  log4j-over-slf4j-1.6.6.jar and
slf4j-jdk14-1.6.6.jar but the index showed an error and wouldn't start.  So
that is why I have those 6 files in there now, I'm not sure if
log4j-over-slf4j-1.6.6.jar this file is needed or not.  Let me know if you
need me to test anything else.


You're on the right track.  Your list just has two files that shouldn't 
be there - log4j-1.2.16 and jul-to-slf4j-1.6.6.  They are probably not 
causing any real problems, but they might in the future.


Remove those and you will have the exact list I was looking for.  If 
that doesn't work, use a paste website (pastie.org and others) to send a 
log showing the errors you get.


Thanks,
Shawn



Is the CoreAdmin RENAME method atomic?

2013-05-09 Thread Lan
We need to implement a locking mechanism for a full-reindexing SOLR server
pool. We could use a database or ZooKeeper as our locking mechanism, but that's
a lot of work. Could Solr do it?

I noticed the core admin RENAME function
(http://wiki.apache.org/solr/CoreAdmin#RENAME). Is this a synchronous atomic
operation?

What I'm thinking is we create a Solr core named 'lock', and any process that
wants to obtain a Solr server from the pool tries to rename the 'lock' core
to say 'lock.someuniqueid'. If it fails, it tries another server in the
pool or waits a bit. If it succeeds, it reindexes its data and then
renames 'lock.someuniqueid' back to 'lock' to return the server to the
pool.
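
For what it's worth, a minimal SolrJ sketch of the lock-by-rename idea (the
base URL and unique id are made up, and whether RENAME is actually atomic
under concurrent callers is exactly the open question here):

import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class PoolLock {
  // Try to take a server out of the pool by renaming its 'lock' core.
  public static boolean tryLock(String baseUrl, String uniqueId) {
    HttpSolrServer admin = new HttpSolrServer(baseUrl); // e.g. http://host:8983/solr
    try {
      CoreAdminRequest.renameCore("lock", "lock." + uniqueId, admin);
      return true;
    } catch (SolrServerException | IOException e) {
      return false; // treat any failure as "someone else holds the lock"
    }
  }

  // Return the server to the pool when reindexing is done.
  public static void unlock(String baseUrl, String uniqueId) throws Exception {
    CoreAdminRequest.renameCore("lock." + uniqueId, "lock",
        new HttpSolrServer(baseUrl));
  }
}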









--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-the-CoreAdmin-RENAME-method-atomic-tp4061944.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Frequent OOM - (Unknown source in logs).

2013-05-09 Thread shreejay
We ended up using Solr 4.0 (now 4.2) without the cloud option, and it seems
to be holding up well. 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361p4061945.html
Sent from the Solr - User mailing list archive at Nabble.com.


SolrCloud Sorting Results By Relevance

2013-05-09 Thread Furkan KAMACI
When I make a search on Solr 4.2.1 that runs as SolrCloud, I get the following
boost values back.

First one has this boost:

1.3693064

Second one has this:

1.7501166

and the third one:

1.0387472

Here is the default schema for Nutch:
http://svn.apache.org/viewvc/nutch/tags/release-2.1/conf/schema-solr4.xml?revision=1388536&view=markup

Am I missing something, or are results already sorted by relevance by Solr?


Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Furkan KAMACI
Hi Folks;

I have tested Solr 4.2.1 as SolrCloud and I think to use 4.3.1 when it is
ready at my pre-production environment. I want to learn that does anybody
uses Apache Whirr for SolrCloud with external Zookeeper ensemble? What
folks are using for such kind of purposes?


Status of EDisMax

2013-05-09 Thread André Widhani
Hi,

what is the current status of the Extended DisMax Query Parser? The release 
notes for Solr 3.1 say it was experimental at that time (two years back).

The current wiki page for EDisMax does not contain any such statement. We 
recently ran into the issue described in SOLR-2649 (using q.op=AND) which I 
think is a very fundamental defect making it unusable at least in our case.

Thanks,
André



Negative Boosting at Recent Versions of Solr?

2013-05-09 Thread Furkan KAMACI
I know that whilst Lucene allows negative boosts, Solr does not. However,
did this change in newer versions of Solr (I use Solr 4.2.1), or is it still the same?


Re: Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Otis Gospodnetic
I've never encountered anyone using Whirr to launch Solr even though
that's possible - http://issues.apache.org/jira/browse/WHIRR-465

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI  wrote:
> Hi Folks;
>
> I have tested Solr 4.2.1 as SolrCloud and I am thinking of using 4.3.1 in my
> pre-production environment when it is ready. I want to know whether anybody
> uses Apache Whirr for SolrCloud with an external ZooKeeper ensemble. What
> are folks using for this kind of purpose?


Re: Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Furkan KAMACI
I saw that ticket and wanted to ask the mailing list. I want to give it a
try and send feedback to the list. What do folks use for this kind of purpose?


2013/5/10 Otis Gospodnetic 

> I've never encountered anyone using Whirr to launch Solr even though
> that's possible - http://issues.apache.org/jira/browse/WHIRR-465
>
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
>
>
>
>
>
> On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI 
> wrote:
> > Hi Folks;
> >
> > I have tested Solr 4.2.1 as SolrCloud and I am thinking of using 4.3.1 in
> > my pre-production environment when it is ready. I want to know whether
> > anybody uses Apache Whirr for SolrCloud with an external ZooKeeper
> > ensemble. What are folks using for this kind of purpose?
>


Re: Negative Boosting at Recent Versions of Solr?

2013-05-09 Thread Jack Krupansky
Solr does support both additive and multiplicative boosts. Although Solr 
doesn't support negative multiplicative boosts on query terms, it does 
support fractional multiplicative boosts (0.25) which do allow you to 
de-boost a term.


The boosts for individual query terms and for the edismax "qf" parameter 
cannot be negative, but can be fractional.


The edismax "bf" parameter give a function query that provides an additive 
boost, which could be negative.


The edismax "boost" parameter gives a function query that provides a 
multiplicative boost - which could be negative, so it’s not absolutely true 
that doesn't support negative boosts.
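
To make that concrete, here is a minimal SolrJ sketch; the field names
(title, description, popularity) are hypothetical:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class BoostExample {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrQuery q = new SolrQuery("ipod");
    q.set("defType", "edismax");
    q.set("qf", "title^2.0 description^0.25"); // fractional, non-negative term boosts
    q.set("bf", "sub(popularity,10)");         // additive boost; may evaluate negative
    q.set("boost", "sub(popularity,10)");      // multiplicative boost; may also go negative
    solr.query(q);
  }
}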


-- Jack Krupansky

-Original Message- 
From: Furkan KAMACI

Sent: Thursday, May 09, 2013 6:08 PM
To: solr-user@lucene.apache.org
Subject: Negative Boosting at Recent Versions of Solr?

I know that whilst Lucene allows negative boosts, Solr does not. However
did it change with newer versions of Solr (I use Solr 4.2.1) or still same? 



Re: Index compatibility between Solr releases.

2013-05-09 Thread Erick Erickson
Solr strives to stay backwards-compatible for one major revision, so 4.x
should be able to work with 3.x indexes. One caution though, well,
actually two.

1> If you have a master/slave setup, upgrade the _slaves_ first. If
you upgrade a master first and it merges segments, then the slaves
won't be able to read the 4.x format.

2> Make backups first ...

BTW, when the segments are written, they should be written in 4.x
format. So I've heard of people doing the migration, then forcing an
optimize just to bring all the segments up to the 4.x format.
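
As a sketch of that last step (assuming SolrJ and a made-up core URL):

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class UpgradeSegments {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
    // An optimize merges and rewrites every segment, so after the upgrade
    // the whole index ends up in the running (4.x) format.
    solr.optimize();
  }
}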

Best
Erick

On Tue, May 7, 2013 at 3:28 PM, Skand Gupta  wrote:
> We have a fairly large (in the order of 10s of TB) indices built using Solr
> 3.5. We are considering migrating to Solr 4.3 and was wondering what the
> policy is on maintaining backward compatibility of the indices? Will 4.3
> work with my 3.5 indexes? Because of the large data size, I would ideally
> like to move new data to 4.3 and gradually re-index all the 3.5 indices.
>
> Thanks,
> - Skand.


Re: Index corrupted detection from http get command.

2013-05-09 Thread Erick Erickson
There's no way to do this that I know of. There's the CheckIndex
tool, but it's fairly expensive resource-wise, and there's no HTTP
command to run it.

Best
Erick

On Tue, May 7, 2013 at 8:04 PM, Michel Dion  wrote:
> Hello,
>
> I'm look for a way to detect solr index corruption using a http get
> command. I've look at the /admin/ping and /admin/luke request handlers but
> not sure if the their status provide guarantees that everything is all
> right. The idea is to be able to tell a load balancer to put a given solr
> instance out of rotation if its index is  corrupted.
>
> Thanks
>
> Michel


Re: transientCacheSize doesn't seem to have any effect, except on startup

2013-05-09 Thread Erick Erickson
I'm slammed with stuff and have to leave for vacation Saturday morning,
so I'll be going silent for a while, sorry.

Best
Erick

On Wed, May 8, 2013 at 11:27 AM, didier deshommes  wrote:
> Any idea on this? I still cannot get the combination of transient cores and
> transientCacheSize to work as I think it should: give me the ability to
> create a large number of cores and automatically load and unload them for me
> based on a limit that I set.
>
> If anyone else is using this feature and it is working for you, let me know
> how you got it working!
>
>
> On Fri, May 3, 2013 at 2:11 PM, didier deshommes  wrote:
>
>>
>> On Fri, May 3, 2013 at 11:18 AM, Erick Erickson 
>> wrote:
>>
>>> The cores aren't loaded (or at least shouldn't be) for getting the status.
>>> The _names_ of the cores should be returned, but those are (supposed) to
>>> be
>>> retrieved from a list rather than loaded cores. So are you sure that's
>>> not what
>>> you are seeing? How are you determining whether the cores are actually
>>> loaded
>>> or not?
>>>
>>>
>> I'm looking at the output of :
>>
>> $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status";
>>
>> cores that are loaded have a "startTime" and "upTime" value. Cores that
>> are unloaded don't appear in the output at all. For example, I created 3
>> transient cores with "transientCacheSize=2" . When I asked for a list of
>> all cores, all 3 cores were returned. I explicitly unloaded 1 core and got
>> back 2 cores when I asked for the list again.
>>
>> It would be nice if cores had a "isTransient" and a "isCurrentlyLoaded"
>> value so that one could see exactly which cores are loaded.
>>
>>
>>
>>
>>> That said, it's perfectly possible that the status command is doing
>>> something we
>>> didn't anticipate, but I took a quick look at the code (got to rush to a
>>> plane)
>>> and CoreAdminHandler _appears_ to be just returning whatever info it can
>>> about an unloaded core for status. I _think_ you'll get more info if the
>>> core has ever been loaded though, even though if it's been removed from
>>> the transient cache. Ditto for the create action.
>>>
>>> So let's figure out whether you're really seeing loaded cores or not, and
>>> then
>>> raise a JIRA if so...
>>>
>>> Thanks for reporting!
>>> Erick
>>>
>>> On Thu, May 2, 2013 at 1:27 PM, didier deshommes 
>>> wrote:
>>> > Hi,
>>> > I've been very interested in the transient core feature of solr to
>>> manage a
>>> > large number of cores. I'm especially interested in this use case, that
>>> the
>>> > wiki lists at http://wiki.apache.org/solr/LotsOfCores (looks to be down
>>> > now):
>>> >
>>> >>loadOnStartup=false transient=true: This is really the use-case. There
>>> are
>>> > a large number of cores in your system that are short-duration use. You
>>> > want Solr to load them as necessary, but unload them when the cache gets
>>> > full on an LRU basis.
>>> >
>>> > I'm creating 10 transient cores via core admin like so
>>> >
>>> > $ curl "
>>> >
>>> http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core2&instanceDir=collection1/&dataDir=new_core2&transient=true&loadOnStartup=false
>>> > "
>>> >
>>> > and have "transientCacheSize=2" in my solr.xml file, which I take means
>>> I
>>> > should have at most 2 transient cores loaded at any time. The problem is
>>> > that these cores are still loaded when when I ask solr to list cores:
>>> >
>>> > $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status";
>>> >
>>> > From the explanation in the wiki, it looks like solr would manage
>>> loading
>>> > and unloading transient cores for me without having to worry about them,
>>> > but this is not what's happening.
>>> >
>>> > The situation is different when I restart solr; it does the "right
>>> thing"
>>> > by loading the maximum cores set by transientCacheSize. When I add more
>>> > cores, the old behavior happens again, where all created transient cores
>>> > are loaded in solr.
>>> >
>>> > I'm using the development branch lucene_solr_4_3 to run my example. I
>>> can
>>> > open a jira if need be.
>>>
>>
>>


Re: Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Otis Gospodnetic
Great, let us know how it works for you. Blog post?

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 9, 2013 6:30 PM, "Furkan KAMACI"  wrote:

> I saw that ticket and wanted to ask the mailing list. I want to give it a
> try and send feedback to the list. What do folks use for this kind of purpose?
>
>
> 2013/5/10 Otis Gospodnetic 
>
> > I've never encountered anyone using Whirr to launch Solr even though
> > that's possible - http://issues.apache.org/jira/browse/WHIRR-465
> >
> > Otis
> > --
> > Solr & ElasticSearch Support
> > http://sematext.com/
> >
> >
> >
> >
> >
> > On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI 
> > wrote:
> > > Hi Folks;
> > >
> > > I have tested Solr 4.2.1 as SolrCloud and I am thinking of using 4.3.1
> > > in my pre-production environment when it is ready. I want to know
> > > whether anybody uses Apache Whirr for SolrCloud with an external
> > > ZooKeeper ensemble. What are folks using for this kind of purpose?
> >
>


Re: SolrCloud: IOException occured when talking to server at

2013-05-09 Thread Shawn Heisey
On 5/9/2013 7:31 AM, heaven wrote:
> Can confirm this lead to data loss. I have 1217427 records in database and
> only 1217216 indexed. Which does mean that Solr gave a successful response
> and then did not added some documents to the index.
> 
> Seems like SolrCloud is not a production-ready solution, would be good if
> there was a warning in the Solr wiki about that.

You've got some kind of underlying problem here.  Here are my guesses
about what that might be:

- An improperly configured Linux firewall and/or SELinux is enabled.
- The hardware is already overtaxed by other software.
- Your zkClientTimeout value is extremely small.
- Your GC pauses are large.
- You're running into an open file limit.

Here's what you could do to resolve each of these:

- Disable the firewall and selinux, reboot.
- Stop other software.
- The example zkClientTimeout is 15 seconds. Try 30-60.
- See http://wiki.apache.org/solr/SolrPerformanceProblems for some GC ideas.
- Increase the file and process limits.  For most versions of Linux, in
/etc/security/limits.conf:

solr hard nproc  6144
solr soft nproc  4096
solr hard nofile 65536
solr soft nofile 49152

These numbers should be sufficient for deployments considerably larger
than yours.

SolrCloud is not only production ready, it's being used by many many
people for extremely large indexes.  My own SolrCloud deployment is
fairly small with only 1.5 million docs, but it's extremely stable.  I
also have a somewhat large (77 million docs) non-cloud deployment.

Are you running 4.2.1?  I feel fairly certain based on your screenshots
that you are not running 4.3, but I can't tell which version you are
running.  There are some bugs in the 4.3 release, a 4.3.1 will be
released soon.  If you had planned to upgrade, you should wait for 4.3.1
or 4.4.

NB, and something you might already know: When talking about
production-ready, you can't run everything on the same server.  You need
at least three - two of them can run Solr and zookeeper, and the third
runs zookeeper.  This single-server setup is fine for a proof-of-concept.

Thanks,
Shawn



Re: SolrCloud Sorting Results By Relevance

2013-05-09 Thread Otis Gospodnetic
Hits are sorted by relevance score by default. What you are listing is the boost value.

Otis
Solr & ElasticSearch Support
http://sematext.com/
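
If you want to see the score explicitly, here is a minimal SolrJ sketch (the
core URL is made up, and the sort clause is redundant since score desc is
already the default):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class ScoreCheck {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrQuery q = new SolrQuery("some terms");
    q.setFields("id", "score");               // return the relevance score per doc
    q.addSort("score", SolrQuery.ORDER.desc); // explicit, but this is the default order
    solr.query(q);
  }
}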
On May 9, 2013 5:16 PM, "Furkan KAMACI"  wrote:

> When I make a search on Solr 4.2.1 that runs as SolrCloud, I get the
> following boost values back.
>
> First one has this boost:
>
> 1.3693064
>
> Second one has this:
>
> 1.7501166
>
> and the third one:
>
> 1.0387472
>
> Here is the default schema for Nutch:
> http://svn.apache.org/viewvc/nutch/tags/release-2.1/conf/schema-solr4.xml?revision=1388536&view=markup
>
> Am I missing something, or are results already sorted by relevance by Solr?
>


Re: Status of EDisMax

2013-05-09 Thread Otis Gospodnetic
I didn't check that issue, but edismax is not experimental anymore - most
Solr users use it.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 9, 2013 5:36 PM, "André Widhani"  wrote:

> Hi,
>
> what is the current status of the Extended DisMax Query Parser? The
> release notes for Solr 3.1 say it was experimental at that time (two years
> back).
>
> The current wiki page for EDisMax does not contain any such statement. We
> recently ran into the issue described in SOLR-2649 (using q.op=AND) which I
> think is a very fundamental defect making it unusable at least in our case.
>
> Thanks,
> André
>
>


Re: Does a Distributed Search Get Cached Only by the Node That Runs the Query?

2013-05-09 Thread Otis Gospodnetic
You are looking at the JVM heap but attributing it all to caching. Not quite
right... there are other things in that JVM heap.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 9, 2013 3:55 PM, "Furkan KAMACI"  wrote:

> I have Solr 4.2.1 and run it as SolrCloud. When I do a search on
> SolrCloud like this:
>
> ip_of_node_1:8983/solr/select?q=*:*&rows=1
>
> and then check the admin page, I see this:
>
> I have 5 GB Java heap. 616.32 MB is dark gray, 3.13 GB is gray.
>
> Before my search it was something like: 150 MB dark gray, 500 MB gray.
>
> I understand that when I do a search like that, fields are cached. However,
> when I look at the other SolrCloud nodes' admin pages there are no differences.
> Why is that query cached only by the node that I ran it on?
>


Re: More Like This and Caching

2013-05-09 Thread Otis Gospodnetic
This is correct: the document cache is used for previously read docs regardless
of which query read them, and the query cache for repeated queries. Plus the OS
cache for the actual index files.

Otis
Solr & ElasticSearch Support
http://sematext.com/
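
A minimal SolrJ sketch of such a repeated MLT request (it mirrors the wiki
example quoted below; the second call should be served largely from those
caches):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class MltTiming {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
    SolrQuery q = new SolrQuery("apache");
    q.set("mlt", true);
    q.set("mlt.fl", "manu,cat");
    q.set("mlt.mindf", 1);
    q.set("mlt.mintf", 1);
    q.setFields("id", "score");
    // The first call warms the caches; the second should be much faster.
    int first = solr.query(q).getQTime();
    int second = solr.query(q).getQTime();
    System.out.println(first + "ms vs " + second + "ms");
  }
}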
On May 9, 2013 2:32 PM, "Jason Hellman" 
wrote:

> Purely from empirical observation, both the DocumentCache and
> QueryResultCache are being populated and reused in reloads of a simple MLT
> search.  You can see in the cache inserts how much extra-curricular
> activity is happening to populate the MLT data by how many inserts and
> lookups occur on the first load.
>
> (lifted right out of the MLT wiki http://wiki.apache.org/solr/MoreLikeThis)
>
>
> http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score
>
> There is no activity in the filterCache, fieldCache, or fieldValueCache -
> and that makes plenty of sense.
>
> On May 9, 2013, at 11:12 AM, David Parks  wrote:
>
> > I'm not the expert here, but perhaps what you're noticing is actually the
> > OS's disk cache. The actual solr index isn't cached by solr, but as you
> read
> > the blocks off disk the OS disk cache probably did cache those blocks for
> > you. On the 2nd run the index blocks were read out of memory.
> >
> > There was a very extensive discussion on this list not long back titled:
> > "Re: SolrCloud loadbalancing, replication, and failover" look that
> thread up
> > and you'll get a lot of in-depth on the topic.
> >
> > David
> >
> >
> > -Original Message-
> > From: Giammarco Schisani [mailto:giamma...@schisani.com]
> > Sent: Thursday, May 09, 2013 2:59 PM
> > To: solr-user@lucene.apache.org
> > Subject: More Like This and Caching
> >
> > Hi all,
> >
> > Could anybody explain which Solr cache (e.g. queryResultCache,
> > documentCache, fieldCache, etc.) can be used by the More Like This
> handler?
> >
> > One of my colleagues had previously suggested that the More Like This
> > handler does not take advantage of any of the Solr caches.
> >
> > However, if I issue two identical MLT requests to the same Solr instance,
> > the second request will execute much faster than the first request (for
> > example, the first request will execute in 200ms and the second request
> will
> > execute in 20ms). This makes me believe that at least one of the Solr
> caches
> > is being used by the More Like This handler.
> >
> > I think the "documentCache" is the cache that is most likely being used,
> but
> > would you be able to confirm?
> >
> > As information, I am currently using Solr version 3.6.1.
> >
> > Kind regards,
> > Giammarco Schisani
> >
>
>


Re: Per Shard Replication Factor

2013-05-09 Thread Otis Gospodnetic
Could these just be different collections? Then sharding and replication are
independent. And you can reduce the replication factor as the index ages.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 9, 2013 1:43 AM, "Steven Bower"  wrote:

> Is it currently possible to have per-shard replication factor?
>
> A bit of background on the use case...
>
> If you are hashing content to shards by a known factor (lets say date
> ranges, 12 shards, 1 per month) it might be the case that most of your
> search traffic would be directed to one particular shard (eg. the current
> month shard) and having increased query capacity in that shard would be
> useful... this could be extended to many use cases such as data hashed by
> organization, type, etc.
>
> Thanks,
>
> steve
>


Re: 4.3 logging setup

2013-05-09 Thread Jan Høydahl
I've updated the WIKI: 
http://wiki.apache.org/solr/SolrLogging#Switching_from_Log4J_logging_back_to_Java-util_logging

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

9. mai 2013 kl. 21:57 skrev Shawn Heisey :

> On 5/9/2013 1:41 PM, richardg wrote:
>> These are the files I have in my /lib folder:
>> 
>> slf4j-api-1.6.6
>> log4j-1.2.16
>> jul-to-slf4j-1.6.6
>> jcl-over-slf4j-1.6.6
>> slf4j-jdk14-1.6.6
>> log4j-over-slf4j-1.6.6
>> 
>> Currently everything seems to be logging like before.  After I followed the
>> instructions in Jan's post, replacing slf4j-log4j12-1.6.6.jar with
>> slf4j-jdk14-1.6.6.jar, it all started working.  Shawn, I then removed
>> everything as you instructed and put in just log4j-over-slf4j-1.6.6.jar and
>> slf4j-jdk14-1.6.6.jar, but the index showed an error and wouldn't start.  So
>> that is why I have those six files in there now; I'm not sure whether
>> log4j-over-slf4j-1.6.6.jar is needed or not.  Let me know if you
>> need me to test anything else.
> 
> You're on the right track.  Your list just has two files that shouldn't be 
> there - log4j-1.2.16 and jul-to-slf4j-1.6.6.  They are probably not causing 
> any real problems, but they might in the future.
> 
> Remove those and you will have the exact list I was looking for.  If that 
> doesn't work, use a paste website (pastie.org and others) to send a log 
> showing the errors you get.
> 
> Thanks,
> Shawn
> 



Re: SOLR Error: Document is missing mandatory uniqueKey field

2013-05-09 Thread zaheer.java
Here is the stack trace:

DEBUG - 2013-05-09 18:53:06.411;
org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE
add{,id=(null)} {wt=javabin&version=2}
DEBUG - 2013-05-09 18:53:06.411;
org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE FINISH
{wt=javabin&version=2}
INFO  - 2013-05-09 18:53:06.412;
org.apache.solr.update.processor.LogUpdateProcessor; [orderitemsStage]
webapp=/solr path=/update params={wt=javabin&version=2}
{add=[488653_0_0_141_388 (1434610076088270848), 488653_0_0_141_388
(1434610076090368000), 488653_0_0_141_388 (1434610076091416576),
488653_0_0_141_388 (1434610076091416577), 488653_0_0_141_388
(1434610076092465152), 488653_0_0_141_388 (1434610076093513728),
488653_0_0_141_388 (1434610076094562304), 488653_0_0_141_388
(1434610076094562305), 488653_0_0_141_388 (1434610076095610880),
488653_0_0_141_388 (1434610076096659456), ... (4031 adds)]} 0 2790
ERROR - 2013-05-09 18:53:06.412; org.apache.solr.common.SolrException;
org.apache.solr.common.SolrException: Document is missing mandatory
uniqueKey field: orderItemKey
at
org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:517)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:396)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at
org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Error-Document-is-missing-mandatory-uniqueKey-field-tp4062177p4062178.html
Sent from the Solr - User mailing list archive at Nabble.com.


SOLR Error: Document is missing mandatory uniqueKey field

2013-05-09 Thread zaheer.java
I repeatedly get this error while adding documents to Solr using SolrJ:
"Document is missing mandatory uniqueKey field: orderItemKey". This field
is defined as the uniqueKey in the document schema. I've made sure that I'm
passing this field from Java by logging it upfront.

As suggested somewhere, I've tried upgrading from 4.0 to 4.3, and also set
the field to "required=false".

Please help me debug or find a resolution to this problem.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Error-Document-is-missing-mandatory-uniqueKey-field-tp4062177.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: dataimport handler

2013-05-09 Thread William Bell
It does not work anymore in 4.x.

${dih.last_index_time} does work, but the entity version does not.

Bill



On Tue, May 7, 2013 at 4:19 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> Using ${dih.<entity name>.last_index_time} should work. Make sure you put
> it in quotes in your query.
>
>
> On Tue, May 7, 2013 at 12:07 PM, Eric Myers  wrote:
>
> > In the  data import handler  I have multiple entities.  Each one
> > generates a date in the
> > dataimport.properties i.e. entityname.last_index_time.
> >
> > How do I reference the specific entity time in my delta queries?
> >
> > Thanks
> >
> > Eric
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: SOLR Error: Document is missing mandatory uniqueKey field

2013-05-09 Thread Shawn Heisey
On 5/9/2013 7:44 PM, zaheer.java wrote:
> I repeatedly get this error while adding documents to Solr using SolrJ:
> "Document is missing mandatory uniqueKey field: orderItemKey". This field
> is defined as the uniqueKey in the document schema. I've made sure that I'm
> passing this field from Java by logging it upfront.
> 
> As suggested somewhere, I've tried upgrading from 4.0 to 4.3, and also set
> the field to "required=false".

If you have a uniqueKey defined in your schema, then every document must
define that field or you'll get the error message you're seeing.  That's
the entire point of a uniqueKey.  It is pretty much the same concept as
a primary key on a database table.

There is one main difference between uniqueKey and a DB primary key -
the database will prevent you from inserting a record with the same ID
as an existing record, but Solr uses it to allow easy reindexing.
Sending a document with the same ID as an existing document will cause
Solr to delete the old one before inserting the new one.
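
A minimal SolrJ sketch of both behaviors (the core name comes from the log
above; the field values and the quantity field are made up):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AddWithKey {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/orderitemsStage");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("orderItemKey", "488653_0_0_141_388"); // uniqueKey: required on every doc
    doc.addField("quantity", 2);                        // hypothetical payload field
    solr.add(doc);    // re-adding the same key replaces the existing document
    solr.commit();
  }
}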

Certain Solr features, notably distributed search, require a uniqueKey.
 SolrCloud uses distributed search so it also requires it.

If you're not using features that require uniqueKey, and you don't need
Solr to delete duplicate documents, then you can remove that from your
schema.  It's not recommended, but it should work.

Thanks,
Shawn



Re: SOLR guidance required

2013-05-09 Thread Shawn Heisey
On 5/9/2013 9:41 PM, Kamal Palei wrote:
> I hope there is some mechanism by which I can associate salary,
> experience, age, etc. with a resume document during indexing. And when
> I search for resumes I can apply all the filters accordingly, retrieve
> 100 records, and straight away show those 100 records to the user without
> doing any MySQL query. Please let me know if this is feasible. If so,
> kindly give me some pointers on how to do it.

If you define fields for these values in your schema, then you can send
filter queries to restrict the search.  Solr will filter invalid
documents out and only send the results that match your requirements.
Some examples of the filter queries you can use are below.  You can add
more than one of these; they will be ANDed together.

&fq=age:[21 TO 45]
&fq=experience:[2 TO *]
&fq=salaryReq:[* TO 55000]

If you're using a Solr API (for Java, PHP, etc) rather than constructing
a URL to send directly to Solr, then the API will have a mechanism for
adding filters to your query.
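
For example, with SolrJ (the core name is made up; the field names match the
filters above):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class ResumeSearch {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/resumes");
    SolrQuery q = new SolrQuery("java developer");
    q.addFilterQuery("age:[21 TO 45]");        // filters are ANDed together
    q.addFilterQuery("experience:[2 TO *]");   // and cached independently of q
    q.addFilterQuery("salaryReq:[* TO 55000]");
    q.setRows(100);
    System.out.println(solr.query(q).getResults().getNumFound());
  }
}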

One caveat: unless you can write code that will automatically extract
this information from a resume and/or application, then you will need
someone doing data entry that drives the indexing, or you will need
prospective employees to fill out a computerized form for their application.

Thanks,
Shawn



RE: Is the CoreAdmin RENAME method atomic?

2013-05-09 Thread David Parks
Find the discussion titled "Indexing off the production servers" just a week
ago in this same forum, there is a significant discussion of this feature
that you will probably want to review.


-Original Message-
From: Lan [mailto:dung@gmail.com] 
Sent: Friday, May 10, 2013 3:42 AM
To: solr-user@lucene.apache.org
Subject: Is the CoreAdmin RENAME method atomic?

We need to implement a locking mechanism for a full-reindexing SOLR server
pool. We could use a database or ZooKeeper as our locking mechanism, but that's
a lot of work. Could Solr do it?

I noticed the core admin RENAME function
(http://wiki.apache.org/solr/CoreAdmin#RENAME). Is this a synchronous atomic
operation?

What I'm thinking is we create a Solr core named 'lock', and any process that
wants to obtain a Solr server from the pool tries to rename the 'lock' core
to say 'lock.someuniqueid'. If it fails, it tries another server in the
pool or waits a bit. If it succeeds, it reindexes its data and then
renames 'lock.someuniqueid' back to 'lock' to return the server to the
pool.









--
View this message in context:
http://lucene.472066.n3.nabble.com/Is-the-CoreAdmin-RENAME-method-atomic-tp4
061944.html
Sent from the Solr - User mailing list archive at Nabble.com.



multiValue schema example

2013-05-09 Thread manju16832003
Hi,
I have two table to be indexed.
Table-1: Person [id,name,addr]
Table-2: Docs [id,doc_name,person_id]

Relationship is: One person can have many documents.

I would like to get json as follows
[ { "id" : "name", "address",["doc_id1", "doc_id2", "doc_id3,etc"] }]

How would I configure this in Solr?
How would I configure the queries in data-config.xml?

I tried it this way: a main query, with a sub-query that returns many rows
to the main query.

[The entity definitions from data-config.xml were stripped from the archived
message.]

This is how my schema.xml field configuration looks:

[The field definitions were stripped from the archived message.]

I couldn't get it to work.
What is the way to achieve the above scenario?

Thanks in advance.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/multiValue-schema-example-tp4062209.html
Sent from the Solr - User mailing list archive at Nabble.com.