> {{doc.title}}
> {{doc.text}}
> {{doc.metadata}}
>
>
>
> 3) As for the license- I take your ribbing in the spirit in which it was
> intended :) Seriously though- this is my first open source contribution, so
> I haven't given licensing a lot of thought. What would a more appropriate
> license be?
>
> Fergie
>
> On Sun, Feb 17, 2013 at 12:43 PM, Erik Hatcher wrote:
>
>> Fergie -
>>
>> Nice!
>>
>> I was able to get this working on a Solr 4.1 "example" instance following
>> these steps:
>>
>> * Adjusting SERVERROOT in bootstrap/js/solrstrap.js to
>> http://localhost:8983/solr/collection1/select/
>> * Changed line #38 in the same file to this:
>>
>> rs.append(hitTemplate({title: result.response.docs[i].name,
>> text: result.response.docs[i].text}));
>>
>> Just changing ".title" to ".name" since Solr's exampledocs/*.xml files use
>> "name" not "title".
>>
>> I like projects like this, making it really point and click easy to see
>> and work with Solr. I'll just point out the important caveat that you
>> mention, that it's "Designed for "open" solr instances" and "needs clear
>> access to /select", as this is something easy to overlook at first
>> (beautiful) glance and think we can just go to production without taking
>> the necessary other steps to prevent Solr from being exposed directly.
>>
>> This is a nice start to a fun way to get started with Solr.
>>
>> A few questions:
>>
>> What would it take to get the full document object passed into the hit
>> template? And what would that hit template then look like? (navigating
>> say a "doc" object in the template rather than each field being passed
>> explicitly)
>>
>> Right now it's called from the above line of code (is hitTemplate()
>> mapping to the id="hit-template" in solrstrap.html part of handlebars
>> magic? Or is this explicit somewhere?)
>>
>> Here's the current hit template:
>>
>>
>> is it appropriate to use external cache for whole shards
I'm indexing and searching documents using Solr 6.x. It is quite efficient when there are few shards and few cluster nodes. However, when the number of shards exceeds 30 and each shard is about 30 GB, search performance drops significantly. We already make heavy use of Solr's user cache, so we are planning a queryResultCache that covers the entire set of shards. Is an external cache (for example, Redis, Memcached, Apache Ignite, etc.) the right solution for this?
Re: is it appropriate to use external cache for whole shards
Thank you for the answer. We will improve our system based on what you said.
SolrJ: SolrQuery and ModifiableSolrParams
Hello, What is the difference between setting parameters via SolrQuery vs ModifiableSolrParams? If there is a difference, is there a preferred choice? I'm using Solr 4.6.1.

SolrQuery query = new SolrQuery();
query.setParam("wt", "json");

ModifiableSolrParams params = new ModifiableSolrParams();
params.set("wt", "json");
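For what it's worth, a minimal SolrJ 4.x sketch of the relationship between the two classes: SolrQuery extends ModifiableSolrParams, so both carry the same underlying parameter map and either can be handed to query(); SolrQuery just adds typed convenience setters. The server URL and core name below are assumptions for illustration.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

public class ParamsExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint, for illustration only.
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // SolrQuery extends ModifiableSolrParams: same parameter map,
        // plus typed convenience setters.
        SolrQuery query = new SolrQuery("*:*");
        query.setRows(10);               // convenience setter
        query.setParam("wt", "json");    // raw parameter, identical to set() below

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", "*:*");
        params.set("rows", 10);
        params.set("wt", "json");

        // Either object can be passed to query(); the two requests are equivalent.
        QueryResponse r1 = server.query(query);
        QueryResponse r2 = server.query(params);
        System.out.println(r1.getResults().getNumFound());
        System.out.println(r2.getResults().getNumFound());

        server.shutdown();
    }
}

In practice SolrQuery tends to be the more convenient choice for queries because of the typed setters; ModifiableSolrParams is handy when you just need to pass arbitrary parameters.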
Re: [ANNOUNCE] Apache Solr 4.5.1 released.
Download redirects to 4.5.0
Is there a typo in the server path?

On Thu, Oct 24, 2013 at 9:14 AM, Mark Miller wrote:
> October 2013, Apache Solr™ 4.5.1 available
>
> The Lucene PMC is pleased to announce the release of Apache Solr 4.5.1
>
> Solr is the popular, blazing fast, open source NoSQL search platform
> from the Apache Lucene project. Its major features include powerful
> full-text search, hit highlighting, faceted search, dynamic clustering,
> database integration, rich document (e.g., Word, PDF) handling, and
> geospatial search. Solr is highly scalable, providing fault tolerant
> distributed search and indexing, and powers the search and navigation
> features of many of the world's largest internet sites.
>
> Solr 4.5.1 includes 16 bug fixes as well as Lucene 4.5.1 and its bug
> fixes. The release is available for immediate download at:
>
> http://lucene.apache.org/solr/mirrors-solr-latest-redir.html
>
> See the CHANGES.txt file included with the release for a full list of
> changes and further details.
>
> Please report any feedback to the mailing lists
> (http://lucene.apache.org/solr/discussion.html)
>
> Note: The Apache Software Foundation uses an extensive mirroring network
> for distributing releases. It is possible that the mirror you are using
> may not have replicated the release yet. If that is the case, please try
> another mirror. This also goes for Maven access.
>
> Happy searching,
>
> Lucene/Solr developers
Re: [ANNOUNCE] Apache Solr 4.5.1 released.
Use a different server than default gets 4.5.1

On Thu, Oct 24, 2013 at 9:35 AM, Jack Park wrote:
> Download redirects to 4.5.0
> Is there a typo in the server path?
First test cloud error question...
Background: all testing done on a Win7 platform. This is my first migration from a single Solr server to a simple cloud. Everything is configured exactly as specified in the wiki. I created a simple 3-node client, all localhost with different server URLs, and a lone external zookeeper. The online admin shows they are all up. I then start an agent which sends in documents to "bootstrap" the index. That's when issues start. A clip from the log shows this: First, I create a SolrDocument with this JSON data: DEBUG 2013-10-24 18:00:09,143 [main] - SolrCloudClient.mapToDocument- {"locator":"EssayNodeType","smallIcon":"\/images\/cogwheel.png","subOf":["NodeType"],"details":["The TopicQuests NodeTypes typology essay type."],"isPrivate":"false","creatorId":"SystemUser","label":["Essay Type"],"largeIcon":"\/images\/cogwheel_sm.png","lastEditDate":Thu Oct 24 18:00:09 PDT 2013,"createdDate":Thu Oct 24 18:00:09 PDT 2013} Then, send it in from SolrJ which has a CloudSolrServer initialized with localhost:2181 and an instance of LBHttpSolrServer initialized with http://localhost:8983/solr/ That trace follows INFO 2013-10-24 18:00:09,145 [main] - Initiating client connection, connectString=localhost:2181 sessionTimeout=1 watcher=org.apache.solr.common.cloud.ConnectionManager@e6c INFO 2013-10-24 18:00:09,148 [main] - Waiting for client to connect to ZooKeeper INFO 2013-10-24 18:00:09,150 [main-SendThread(0:0:0:0:0:0:0:1:2181)] - Opening socket connection to server 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration) ERROR 2013-10-24 18:00:09,151 [main-SendThread(0:0:0:0:0:0:0:1:2181)] - Unable to open socket to 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2181 WARN 2013-10-24 18:00:09,151 [main-SendThread(0:0:0:0:0:0:0:1:2181)] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.SocketException: Address family not supported by protocol family: connect at sun.nio.ch.Net.connect(Native Method) at sun.nio.ch.SocketChannelImpl.connect(Unknown Source) at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:266) I can watch the Zookeeper console running; it's mostly complaining about too many connections from /127.0.0.1 ; I am seeing the errors in the agent's log file. Following that trace in the log is this: INFO 2013-10-24 18:00:09,447 [main-SendThread(127.0.0.1:2181)] - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. 
Will not attempt to authenticate using SASL (Unable to locate a login configuration) INFO 2013-10-24 18:00:09,448 [main-SendThread(127.0.0.1:2181)] - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session DEBUG 2013-10-24 18:00:09,449 [main-SendThread(127.0.0.1:2181)] - Session establishment request sent on 127.0.0.1/127.0.0.1:2181 DEBUG 2013-10-24 18:00:09,449 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration INFO 2013-10-24 18:00:09,501 [main-SendThread(127.0.0.1:2181)] - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x141ece7e6160017, negotiated timeout = 1 INFO 2013-10-24 18:00:09,501 [main-EventThread] - Watcher org.apache.solr.common.cloud.ConnectionManager@42bad8a8 name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None INFO 2013-10-24 18:00:09,502 [main] - Client is connected to ZooKeeper DEBUG 2013-10-24 18:00:09,502 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,502 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,503 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,503 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,504 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,504 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,505 [main-SendThread(127.0.0.1:2181)] - Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration DEBUG 2013-10-24 18:00:09,506 [main-SendThread(127.0.0.1:2181)] - Reading reply sessionid:0x141ece7e6160017, packet:: clientPath:null serverPath:null finished:false header:: 1,3 replyHeader:: 1,541,0 request:: '/clus
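Not a confirmed fix for this particular setup, but a minimal SolrJ sketch of the two knobs that most often matter for the errors above: forcing IPv4 so "localhost" does not resolve to the IPv6 address 0:0:0:0:0:0:0:1, and using an explicit 127.0.0.1 connect string with explicit ZooKeeper timeouts. Class and method names are stock SolrJ 4.x; the field values come from the logged document, and the timeout values are illustrative assumptions.

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CloudConnectSketch {
    public static void main(String[] args) throws Exception {
        // Prefer IPv4 so "localhost" is not resolved to 0:0:0:0:0:0:0:1; this must
        // be set before any networking classes load, so it is usually passed as
        // -Djava.net.preferIPv4Stack=true on the JVM command line instead.
        System.setProperty("java.net.preferIPv4Stack", "true");

        // Explicit IPv4 connect string for ZooKeeper.
        CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181");
        server.setDefaultCollection("collection1");
        server.setZkConnectTimeout(15000);   // milliseconds (assumed values)
        server.setZkClientTimeout(15000);

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("locator", "EssayNodeType");
        doc.addField("label", "Essay Type");
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}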
Re: First test cloud error question...
Focus turned to the issue of " Unable to open socket to 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2181" That's apparently been problematic for others as well. It might be at root here. I believe I am able to prove zookeeper is running by asking its status, which reports at least something. I moved the entire SolrCloud installation over to an Ubuntu box. There, I see a new problem: zookeeper doesn't appear to be running, even though it says "STARTED" after booting, but gives no further console messages and returns to the command line as if it ended. If I ask zkServer status, it says "probably not running". In the data directory (/tmp/zookeeper -- same for both windoz and nix), in nix, there is a pid file and nothing else. In the windoz data, there is no pid file, but a /version-2 directory with what appear to be runtime log files -- not the debug ones. Neither installation shows a log4j log file anywhere. I have reason to believe I followed all the instructions in the ZooKeeper Getting Started page accurately. Still, no real cigar... Java on windoz is 1.6.0_31; on ubuntu it is 1.7.0_40 Thanks in advance for any hints. On Thu, Oct 24, 2013 at 6:24 PM, Jack Park wrote: > Background: all testing done on a Win7 platform. This is my first > migration from a single Solr server to a simple cloud. Everything is > configured exactly as specified in the wiki. > > I created a simple 3-node client, all localhost with different server > URLs, and a lone external zookeeper. The online admin shows they are > all up. > > I then start an agent which sends in documents to "bootstrap" the > index. That's when issues start. A clip from the log shows this: > First, I create a SolrDocument with this JSON data: > > DEBUG 2013-10-24 18:00:09,143 [main] - SolrCloudClient.mapToDocument- > {"locator":"EssayNodeType","smallIcon":"\/images\/cogwheel.png","subOf":["NodeType"],"details":["The > TopicQuests NodeTypes typology essay > type."],"isPrivate":"false","creatorId":"SystemUser","label":["Essay > Type"],"largeIcon":"\/images\/cogwheel_sm.png","lastEditDate":Thu Oct > 24 18:00:09 PDT 2013,"createdDate":Thu Oct 24 18:00:09 PDT 2013} > > Then, send it in from SolrJ which has a CloudSolrServer initialized > with localhost:2181 and an instance of LBHttpSolrServer initialized > with http://localhost:8983/solr/ > > That trace follows > > INFO 2013-10-24 18:00:09,145 [main] - Initiating client connection, > connectString=localhost:2181 sessionTimeout=1 > watcher=org.apache.solr.common.cloud.ConnectionManager@e6c > INFO 2013-10-24 18:00:09,148 [main] - Waiting for client to connect > to ZooKeeper > INFO 2013-10-24 18:00:09,150 [main-SendThread(0:0:0:0:0:0:0:1:2181)] > - Opening socket connection to server > 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2181. 
Will not attempt to authenticate > using SASL (Unable to locate a login configuration) > ERROR 2013-10-24 18:00:09,151 [main-SendThread(0:0:0:0:0:0:0:1:2181)] > - Unable to open socket to 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2181 > WARN 2013-10-24 18:00:09,151 [main-SendThread(0:0:0:0:0:0:0:1:2181)] > - Session 0x0 for server null, unexpected error, closing socket > connection and attempting reconnect > java.net.SocketException: Address family not supported by protocol > family: connect > at sun.nio.ch.Net.connect(Native Method) > at sun.nio.ch.SocketChannelImpl.connect(Unknown Source) > at > org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:266) > > I can watch the Zookeeper console running; it's mostly complaining > about too many connections from /127.0.0.1 ; I am seeing the errors in > the agent's log file. > > Following that trace in the log is this: > > INFO 2013-10-24 18:00:09,447 [main-SendThread(127.0.0.1:2181)] - > Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not > attempt to authenticate using SASL (Unable to locate a login > configuration) > INFO 2013-10-24 18:00:09,448 [main-SendThread(127.0.0.1:2181)] - > Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating > session > DEBUG 2013-10-24 18:00:09,449 [main-SendThread(127.0.0.1:2181)] - > Session establishment request sent on 127.0.0.1/127.0.0.1:2181 > DEBUG 2013-10-24 18:00:09,449 [main-SendThread(127.0.0.1:2181)] - > Could not retrieve login configuration: java.lang.SecurityException: > Unable to locate a login configuration > INFO 2013-10-24 18:00:09,501 [main-SendThread(127.0.0.1:2181)] - > Session establishment complete on server 127.0.0.1/127.0.0.1:2181, > sessionid = 0x141ece7e6160017, negotiated timeou
Simple (?) zookeeper question
Latest zookeeper is installed on an Ubuntu server box.
Java is 1.7 latest build.
whereis points to java just fine.
/etc/zookeeper is empty.

boot zookeeper from /bin as sudo ./zkServer.sh start
Console says "Started"
/etc/zookeeper now has a .pid file
In another console, ./zkServer.sh status returns:
"It's probably not running"

An interesting fact: the log4j.properties file says there should be a zookeeper.log file in "."; there is no log file. When I do a text search in the zookeeper source code for where it picks up the log4j.properties, nothing is found.

Fascinating, what? This must be a common beginner's question, not well covered in web-search for my context. Does it ring any bells?

Many thanks.
Jack
Re: Simple (?) zookeeper question
After digging deeper (slow for a *nix newbee), I uncovered issues with the java installation. A step in installation of Oracle Java has it that you -install "java" with the path to /bin/java. That done, zookeeper seems to be running. I booted three cores (on the same box) -- this is the simple one-box 3-node cloud test, and used the test code from the Lucidworks course to send over and read some documents. That failed with this: Unknown document router '{name=compositeId}' Lots more research. Closer... On Thu, Oct 31, 2013 at 5:44 PM, Jack Park wrote: > Latest zookeeper is installed on an Ubuntu server box. > Java is 1.7 latest build. > whereis points to java just fine. > /etc/zookeeper is empty. > > boot zookeeper from /bin as sudo ./zkServer.sh start > Console says "Started" > /etc/zookeeper now has a .pid file > In another console, ./zkServer.sh status returns: > "It's probably not running" > > An interesting fact: the log4j.properties file says there should be a > zookeeper.log file in "."; there is no log file. When I do a text > search in the zookeeper source code for where it picks up the > log4j.properties, nothing is found. > > Fascinating, what? This must be a common beginner's question, not > well covered in web-search for my context. Does it ring any bells? > > Many thanks. > Jack
Re: Simple (?) zookeeper question
Alan, That was brilliant! My test harness was behind a couple of notches. Hah! So, now we open yet another can of strange looking creatures, namely: No live SolrServers available to handle this request:[http://127.0.1.1:8983/solr/collection1] at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:347) 3 times, once for each URL I passed into the server. Here is the code: String zkurl = "10.1.10.178:2181"; String solrurla = "10.1.10.178:8983"; String solrurlb = "10.1.10.178:7574"; String solrurlc = "10.1.10.178:7590"; LBHttpSolrServer sv = new LBHttpSolrServer(solrurla,solrurlb,solrurlc); CloudSolrServer server = new CloudSolrServer(zkurl,sv); server.setDefaultCollection("collection1"); I am struggling to imagine how 10.1.10.178 got translated to 127.0.1.1 and the port assignments ignored for each URL passed in. That error message seems well known to search engines. One suggestion is to check the zookeeper logs. According to the zookeeper's log4j properties, there should be a zookeeper.log in the zookeeper directory. There is no such log. I went to /etc/zookeeper/Version_2 and looked at log.1 (binary) but could see hints that this might be where the 127.0.1.1 is coming from: zookeeper sending such an error message back. This would suggest that, somehow or other, my nodes are not properly registering themselves, though no error messages were tossed when each node was booted. solr.log for node1 only reflects queries from the admin page. That's what I am working on now. Thanks! On Fri, Nov 1, 2013 at 6:03 AM, Alan Woodward wrote: > Unknown document router errors are usually caused by using different solr and > solrj versions - which version of solr and solrj are you using? > > Alan Woodward > www.flax.co.uk > > > On 1 Nov 2013, at 04:19, Jack Park wrote: > >> After digging deeper (slow for a *nix newbee), I uncovered issues with >> the java installation. A step in installation of Oracle Java has it >> that you -install "java" with the path to /bin/java. That done, >> zookeeper seems to be running. >> >> I booted three cores (on the same box) -- this is the simple one-box >> 3-node cloud test, and used the test code from the Lucidworks course >> to send over and read some documents. That failed with this: >> Unknown document router '{name=compositeId}' >> >> Lots more research. >> Closer... >> >> On Thu, Oct 31, 2013 at 5:44 PM, Jack Park wrote: >>> Latest zookeeper is installed on an Ubuntu server box. >>> Java is 1.7 latest build. >>> whereis points to java just fine. >>> /etc/zookeeper is empty. >>> >>> boot zookeeper from /bin as sudo ./zkServer.sh start >>> Console says "Started" >>> /etc/zookeeper now has a .pid file >>> In another console, ./zkServer.sh status returns: >>> "It's probably not running" >>> >>> An interesting fact: the log4j.properties file says there should be a >>> zookeeper.log file in "."; there is no log file. When I do a text >>> search in the zookeeper source code for where it picks up the >>> log4j.properties, nothing is found. >>> >>> Fascinating, what? This must be a common beginner's question, not >>> well covered in web-search for my context. Does it ring any bells? >>> >>> Many thanks. >>> Jack >
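For comparison, a sketch of the two usual ways to construct the client in SolrJ 4.x. LBHttpSolrServer expects full base URLs (scheme plus the /solr context), not bare host:port pairs, and when CloudSolrServer is given only the ZooKeeper address it discovers the live nodes itself. The addresses are the ones from the code above; whether they match the actual deployment is an assumption, and this alone does not explain the 127.0.1.1 substitution that is resolved later in the thread.

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.LBHttpSolrServer;

public class CloudClientSketch {
    public static void main(String[] args) throws Exception {
        String zkurl = "10.1.10.178:2181";

        // Option 1: let CloudSolrServer discover the Solr nodes from ZooKeeper.
        CloudSolrServer server = new CloudSolrServer(zkurl);
        server.setDefaultCollection("collection1");

        // Option 2: supply an explicit load balancer; each entry must be a
        // full base URL (scheme + /solr context), not a bare host:port pair.
        LBHttpSolrServer lb = new LBHttpSolrServer(
                "http://10.1.10.178:8983/solr",
                "http://10.1.10.178:7574/solr",
                "http://10.1.10.178:7590/solr");
        CloudSolrServer server2 = new CloudSolrServer(zkurl, lb);
        server2.setDefaultCollection("collection1");

        server.shutdown();
        server2.shutdown();
    }
}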
Re: Simple (?) zookeeper question
/clusterstate.json seems to clearly state that all 3 nodes are alive, have ranges, and are active. Still, it would seem that java is still not properly installed. ZooKeeper is dropping zookeeper.out in the /bin directory, which says this, among other things: Server environment:java.home=/usr/local/java/jdk1.7.0_40/jre Server environment:java.class.path=/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../build/classes:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../build/lib/*.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../conf: Server environment:java.library.path= /usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib There is no /usr/java/... It's really a mystery where zookeeper is getting these values; everything else seems right. But, for me, here's the amazing chunk of traces (cleaned up a bit) Accepted socket connection from /127.0.0.1:39065 Client attempting to establish new session at /127.0.0.1:39065 Established session 0x1421197e6e90002 with negotiated timeout 15000 for client /127.0.0.1:39065 Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0x1 zxid:0xc0 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0x3 zxid:0xc1 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:delete cxid:0xe zxid:0xc2 txntype:-1 reqpath:n/a Error Path:/live_nodes/127.0.1.1:7590_solr Error:KeeperErrorCode = NoNode for /live_nodes/127.0.1.1:7590_solr Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:delete cxid:0x9f zxid:0xcd txntype:-1 reqpath:n/a Error Path:/collections/collection1/leaders/shard3 Error:KeeperErrorCode = NoNode for /collections/collection1/leaders/shard3 2013-10-31 21:01:19,344 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0xa0 zxid:0xce txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer Got user-level KeeperException when processing sessionid:0x1421197e6e90002 type:create cxid:0xaa zxid:0xd1 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer Accepted socket connection from /10.1.10.180:55528 Client attempting to establish new session at /10.1.10.180:55528 Established session 0x1421197e6e90003 with negotiated timeout 1 for client /10.1.10.180:55528 WARN Exception causing close of session 0x1421197e6e90003 due to java.io.IOException: Connection reset by peer Closed socket connection for client /10.1.10.180:55528 which had sessionid 0x1421197e6e90003 Sockets from 10.1.10.180 are my windoz box shipping solr documents. 
I am not sure how I am using 55528 unless that's a solrj behavior. Connection reset by peer would suggest something in my code, but my code is a clone of code supplied in a Solr training course. Must be good. Right? I also have no clue what is /127.0.0.1:39065 -- that's not one of my nodes. The quest continues. On Fri, Nov 1, 2013 at 9:21 AM, Jack Park wrote: > Alan, > That was brilliant! > My test harness was behind a couple of notches. > > Hah! So, now we open yet another can of strange looking creatures, namely: > > No live SolrServers available to handle this > request:[http://127.0.1.1:8983/solr/collection1] > at > org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:347) > > 3 times, once for each URL I passed into the server. Here is the code: > > String zkurl = "10.1.10.178:2181"; > String solrurla = "10.1.10.178:8983"; > String solrurlb = "10.1.10.178:7574"; > String solrurlc = "10.1.10.178:7590"; > > LBHttpSolrServer sv = new LBHttpSolrServer(solrurla,solrurlb,solrurlc); > CloudSolrServer server = new CloudSolrServer(zkurl,sv); > server.setDefaultCollection("collection1"); > > I am struggling to imagine how 10.1.10.178 got translated to 127.0.1.1 > and the port assignments ignored for each URL passed in. > > That error message seems well known to search engines. One suggest
Re: Simple (?) zookeeper question
The top error message at my test harness is this: No live SolrServers available to handle this request: [http://127.0.1.1:8983/solr/collection1, http://127.0.1.1:7574/solr/collection1, http://127.0.1.1:7590/solr/collection1] I have to assume that error message was somehow shipped by zookeeper, because those servers actually exist, to the test harness, at 10.1.10.178, and if I access any one of them from the browser, /solr/collection1 does not work, but /solr/#/collection1 does work. On Fri, Nov 1, 2013 at 10:34 AM, Jack Park wrote: > /clusterstate.json seems to clearly state that all 3 nodes are alive, > have ranges, and are active. > > Still, it would seem that java is still not properly installed. > ZooKeeper is dropping zookeeper.out in the /bin directory, which says > this, among other things: > > Server environment:java.home=/usr/local/java/jdk1.7.0_40/jre > > Server > environment:java.class.path=/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../build/classes:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../build/lib/*.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/usr/local/lib/SolrCloud/zookeeper/zookeeper-3.4.5/bin/../conf: > > Server environment:java.library.path= > /usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib > > There is no /usr/java/... > It's really a mystery where zookeeper is getting these values; > everything else seems right. 
> > But, for me, here's the amazing chunk of traces (cleaned up a bit) > > Accepted socket connection from /127.0.0.1:39065 > Client attempting to establish new session at /127.0.0.1:39065 > Established session 0x1421197e6e90002 with negotiated timeout 15000 > for client /127.0.0.1:39065 > Got user-level KeeperException when processing > sessionid:0x1421197e6e90002 type:create cxid:0x1 zxid:0xc0 txntype:-1 > reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists > for /overseer > Got user-level KeeperException when processing > sessionid:0x1421197e6e90002 type:create cxid:0x3 zxid:0xc1 txntype:-1 > reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists > for /overseer > Got user-level KeeperException when processing > sessionid:0x1421197e6e90002 type:delete cxid:0xe zxid:0xc2 txntype:-1 > reqpath:n/a Error Path:/live_nodes/127.0.1.1:7590_solr > Error:KeeperErrorCode = NoNode for /live_nodes/127.0.1.1:7590_solr > Got user-level KeeperException when processing > sessionid:0x1421197e6e90002 type:delete cxid:0x9f zxid:0xcd txntype:-1 > reqpath:n/a Error Path:/collections/collection1/leaders/shard3 > Error:KeeperErrorCode = NoNode for > /collections/collection1/leaders/shard3 > 2013-10-31 21:01:19,344 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException > when processing sessionid:0x1421197e6e90002 type:create cxid:0xa0 > zxid:0xce txntype:-1 reqpath:n/a Error Path:/overseer > Error:KeeperErrorCode = NodeExists for /overseer > Got user-level KeeperException when processing > sessionid:0x1421197e6e90002 type:create cxid:0xaa zxid:0xd1 txntype:-1 > reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists > for /overseer > Accepted socket connection from /10.1.10.180:55528 > Client attempting to establish new session at /10.1.10.180:55528 > Established session 0x1421197e6e90003 with negotiated timeout 1 > for client /10.1.10.180:55528 > WARN Exception causing close of session 0x1421197e6e90003 due to > java.io.IOException: Connection reset by peer > Closed socket connection for client /10.1.10.180:55528 which had > sessionid 0x1421197e6e90003 > > Sockets from 10.1.10.180 are my windoz box shipping solr documents. I > am not sure how I am using 55528 unless that's a solrj behavior. > Connection reset by peer would suggest something in my code, but my > code is a clone of code supplied in a Solr training course. Must be > good. Right? > > I also have no clue what is /127.0.0.1:39065 -- that's not one of my nodes. > > The quest continues. > > On Fri, Nov 1, 2013 at 9:21 AM, Jack Park wrote: >> Alan, >> That was brilliant! >> My test harness was behind a couple of notches. >> >> Hah! So, now we open yet another can of strange looking creatures, namely: >> >&
Re: Simple (?) zookeeper question
Thanks. I reviewed clusterstate.json again; those URLs are alive. Why they are not responding seems to be the mystery du jour. I reviewed my test suite: it is using field names in schema.xml, and the server is configured to use the update responders I installed, all of which work fine in a non-cloud mode. Thanks Jack On Fri, Nov 1, 2013 at 11:12 AM, Shawn Heisey wrote: > On 11/1/2013 12:07 PM, Jack Park wrote: >> >> The top error message at my test harness is this: >> >> No live SolrServers available to handle this request: >> [http://127.0.1.1:8983/solr/collection1, >> http://127.0.1.1:7574/solr/collection1, >> http://127.0.1.1:7590/solr/collection1] >> >> I have to assume that error message was somehow shipped by zookeeper, >> because those servers actually exist, to the test harness, at >> 10.1.10.178, and if I access any one of them from the browser, >> /solr/collection1 does not work, but /solr/#/collection1 does work. > > > Those are *base* urls. By themselves, they return 404. For an example of > how a base URL is used, try /solr/collection1/select?q=*:* instead. > > Any URL with /#/ in it is part of the admin UI, which runs mostly in the > browser and accesses Solr handlers to gather information. It is not Solr > itself. > > Thanks, > Shawn >
Cloud issue as an issue with SolrJ?
I now have a single ZK running standalone on 2121. On the same CPU, I have three nodes. I used a curl to send over two documents, one each to two of the three nodes in the cloud. According to a web query, they are both there. My solrconfig.xml file has a custom update response processor chain defined thus: hello where the added process intercepts a SolrDocument after it is processed and sends it out as a JSON object to TCP socket listeners. The instance of SolrJ I have implemented looks like this: LBHttpSolrServer sv = new LBHttpSolrServer(solrurla,solrurlb,solrurlc); sv.getHttpClient().getParams().setParameter("update.chain", "update"); // "merge"); CloudSolrServer server = new CloudSolrServer(zkurl,sv); server.setDefaultCollection("collection1"); where the commented-out code would call my "merge" update chain. In curl tests, /solr/merge?commit=true ... got a jetty error /solr/merge not found. When I changed that to /solr/update?commit=true... the document got indexed. Thus, commenting out "merge" in favor of "update". In any case (merge, update, or no update.chain setting at all), the SolrJ implementation fails, typically at a zookeeper.out nio exception "socket closed by peer". Rewriting my implementation to this: CloudSolrServer server = new CloudSolrServer(zkurl); server.setDefaultCollection("collection1"); makes no change in behavior. Where is the error thrown? The code to build a doc is this (which reflects my field definitions): SolrInputDocument doc = new SolrInputDocument(); doc.addField( "locator", "doc"+i); doc.addField( "label", "document " + i); doc.addField( "details", "This is document " + i); server.add(doc); The error is thrown at server.add(doc) Many thanks in advance for any observations or suggestions. Cheers Jack
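One way to route documents through a specific chain without touching the shared HttpClient parameters is to name the chain on the individual request. A sketch under the assumption that a chain called "update" (or "merge") is defined in solrconfig.xml; the ZooKeeper address is a placeholder, and the field names are the ones from the code above.

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class UpdateChainSketch {
    public static void main(String[] args) throws Exception {
        CloudSolrServer server = new CloudSolrServer("localhost:2181"); // placeholder zk address
        server.setDefaultCollection("collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("locator", "doc1");
        doc.addField("label", "document 1");
        doc.addField("details", "This is document 1");

        // Name the update chain on the request itself; the value must match a
        // chain defined in solrconfig.xml.
        UpdateRequest req = new UpdateRequest();
        req.setParam("update.chain", "update");   // or "merge", if that chain is deployed
        req.add(doc);
        req.process(server);

        server.commit();
        server.shutdown();
    }
}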
Re: Cloud issue as an issue with SolrJ?
Issue resolved, with great thanks to Tim Casey. The issue was based on my own poor understanding of the mechanics of ZooKeeper. The "host" setting in solr.xml must find the correct value and not default to localhost. Simply hard-wiring host to the network address of the computer made everything work. On Sun, Nov 3, 2013 at 12:04 PM, Jack Park wrote: > I now have a single ZK running standalone on 2121. On the same CPU, I > have three nodes. > > I used a curl to send over two documents, one each to two of the three > nodes in the cloud. According to a web query, they are both there. > > My solrconfig.xml file has a custom update response processor chain > defined thus: > > > >class="org.apache.solr.update.TopicQuestsHarvestProcessFactory"> > hello > > > > > where the added process intercepts a SolrDocument after it is > processed and sends it out as a JSON object to TCP socket listeners. > > The instance of SolrJ I have implemented looks like this: > > LBHttpSolrServer sv = new LBHttpSolrServer(solrurla,solrurlb,solrurlc); > sv.getHttpClient().getParams().setParameter("update.chain", > "update"); // "merge"); >CloudSolrServer server = new CloudSolrServer(zkurl,sv); > server.setDefaultCollection("collection1"); > > where the commented-out code would call my "merge" update chain. > > In curl tests, /solr/merge?commit=true ... got a jetty error > /solr/merge not found. > When I changed that to /solr/update?commit=true... the document got > indexed. Thus, commenting out "merge" in favor of "update". > > In any case (merge, update, or no update.chain setting at all), the > SolrJ implementation fails, typically at a zookeeper.out nio exception > "socket closed by peer". > > Rewriting my implementation to this: >CloudSolrServer server = new CloudSolrServer(zkurl); >server.setDefaultCollection("collection1"); > makes no change in behavior. > > Where is the error thrown? > > The code to build a doc is this (which reflects my field definitions): > > SolrInputDocument doc = new SolrInputDocument(); >doc.addField( "locator", "doc"+i); >doc.addField( "label", "document " + i); >doc.addField( "details", "This is document " + i); >server.add(doc); > > The error is thrown at server.add(doc) > > Many thanks in advance for any observations or suggestions. > > Cheers > Jack
Indexing URLs in Solr?
Figuring out a google query to gain an answer seems difficult given the ambiguity; I have a field:

into which I store a URL which, when displayed as a result of a query, looks like this in the admin console:

"resourceURL": "http://someotherserver.org/",

The query "resourceURL:*" will find all of them, but there is this question: What does the query look like to find that specific URL?

Of course, "resourceURL:http://someotherserver.org/" doesn't work.

This one

resourceURL=http%3A%2F%2Fsomeotherserver.org%2F

fails as well. What am I overlooking?

Many thanks in advance.
Jack
Re: Indexing URLs in Solr?
Spoke too soon. Hacking rocks!

Finally landed on this heuristic, and it works:

resourceURL:"http://someotherserver.org/"

On Thu, Nov 7, 2013 at 9:52 AM, Jack Park wrote:
> Figuring out a google query to gain an answer seems difficult given
> the ambiguity;
>
> I have a field:
>
> into which I store a URL
>
> which, when displayed as a result of a query, looks like this in the
> admin console:
>
> "resourceURL": "http://someotherserver.org/",
>
> The query "resourceURL:*" will find all of them, but there is this question:
>
> What does the query look like to find that specific URL?
>
> Of course, "resourceURL:http://someotherserver.org/" doesn't work
>
> This one
> resourceURL=http%3A%2F%2Fsomeotherserver.org%2F
>
> fails as well.
>
> What am I overlooking?
>
> Many thanks in advance.
> Jack
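The same thing expressed through SolrJ, for the record: either phrase-quote the URL (as above) or escape the reserved characters programmatically. The field name and URL are the ones from the post; the server URL is an assumption.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.util.ClientUtils;

public class UrlFieldQuerySketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        String url = "http://someotherserver.org/";

        // Option 1: phrase-quote the URL so ':' and '/' are not treated as query syntax.
        SolrQuery quoted = new SolrQuery("resourceURL:\"" + url + "\"");

        // Option 2: escape the reserved characters instead of quoting.
        SolrQuery escaped = new SolrQuery("resourceURL:" + ClientUtils.escapeQueryChars(url));

        System.out.println(server.query(quoted).getResults().getNumFound());
        System.out.println(server.query(escaped).getResults().getNumFound());

        server.shutdown();
    }
}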
Solr 4.6.1: Core discovery and default core
Hello, I may have missed this, but how do you specify a default core when using the new-style solr.xml? When I view the status of my Solr core setup (http://localhost:8983/solr/admin/cores?action=STATUS) I see an isDefaultCore specification, but I'm not sure where it came from and where it's located so that it may be changed.

false

Also when viewing the status I see:

collection1

I thought that defaultCoreName was not supported when using the new-style solr.xml? Also, not sure why it picked up the value "collection1" as I did not specify a default core. Any help is greatly appreciated. Thanks
Re: solr.DirectUpdateHandler2 failed to instantiate
Wow! That's been a while back, and it appears that my journal didn't carry a good trace of what I did. Here's a reconstruction: >From my earlier attempt, which is reflected in this solrconfig.xml entry notice that I am calling solrDirectUpdateHandler2 directly in defining a requestHandler I don't do that anymore. Now, it's this: which took a lot of fishing to sort out, because, being somewhat dyslexic, it took a long time to figure out that I can use "harvest" as a setting in SolrJ, thus: harvestServer = new HttpSolrServer(solrURL); harvestServer.getHttpClient().getParams().setParameter("update.chain", "harvest"); In short, the original exception was based on a gross misinterpretation of how one goes about equating solrconfig.xml with configurations of SolrJ. Hope that helps more than it confuses! Cheers Jack On Thu, Jun 27, 2013 at 9:45 AM, Mark Bennett wrote: > Jack, > > Did you ever find a fix for this? > > I'm having similar issues (different parts of solrconfig) and my guess is > it's a config issue somewhere, vs. a proper casting problem, some nested init > issue. > > Was curious what you found? > > > On Mar 13, 2013, at 11:52 AM, Jack Park wrote: > >> I can safely say that it is not DirectUpdateHandler2 failing; By >> commenting out my own handlers, the system boots without error. >> >> This means that my handlers are problematic in some way. The moment I >> put back just one of my handlers: >> >> >> >> > class="org.apache.solr.update.TopicQuestsDocumentProcessFactory"> >>hello >> >> >> >> >> > class="solr.DirectUpdateHandler2"> >> >> harvest >> >> >> >> >> The problem returns. It simply appears that I cannot declare a named >> requestHandler using that class. >> >> Jack >> >> On Tue, Mar 12, 2013 at 12:22 PM, Jack Park wrote: >>> Indeed! Perhaps the germane part is this, before the failure to >>> instantiate notice: >>> >>> Caused by: java.lang.ClassCastException: class >>> org.apache.solr.update.DirectUpda >>> teHandler2 >>>at java.lang.Class.asSubclass(Unknown Source) >>>at >>> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader. >>> java:432) >>>at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:507) >>> >>> This suggests that I might be doing something wrong elsewhere in >>> solrconfig.xml. >>> >>> The possibly relevant parts (my contributions) are these: >>> >>> >>> >>> >>> >>> >>> >>> >>> >> class="org.apache.solr.update.TopicQuestsDocumentProcessFactory"> >>>hello >>> >>> >>> >>> >>> >> class="solr.DirectUpdateHandler2"> >>> >>> harvest >>> >>> >>> >>> >>> >> class="solr.DirectUpdateHandler2"> >>> >>> partial >>> >>> >>> >>> Thanks >>> Jack >>> >>> On Tue, Mar 12, 2013 at 12:16 PM, Mark Miller wrote: >>>> There should be a stack trace - also, you shouldn't have to do anything >>>> special to use this class. It's the default and only truly supported >>>> implementation… >>>> >>>> - Mark >>>> >>>> On Mar 12, 2013, at 2:53 PM, Jack Park wrote: >>>> >>>>> That messages gives great, but terrible google. Zillions of hits, >>>>> mostly filled with very long log traces, and zero messages (that I >>>>> could find) about what to do about it. >>>>> >>>>> I switched over to using that handler since it has an update log >>>>> specified, and that's the only place I've found how to use update log. >>>>> But, can't boot now. >>>>> >>>>> All the jars are in place; I'm able to import that class in my code. >>>>> >>>>> Is there any news on that issue? >>>>> >>>>> Many thanks >>>>> Jack >>>> >> FLAGS () >
Question about soft commit and updateRequestProcessorChain
If one allows for a soft commit (rather than a hard commit on each request), when does the updateRequestProcessorChain fire? Does it fire after the commit? Many thanks Jack
Re: Question about soft commit and updateRequestProcessorChain
Ok. So, running the update processor chain *is* the commit process? In answer to Erick's question: my habit, an old and apparently bad one, has been to call a hard commit at the end of each update. My question had to do with allowing soft commits to be controlled by settings in solrconfig.xml, say every 30 seconds or something like that (I really haven't studied such options yet). I ask this question because I add an additional call to the update processor, which, after running Lucene, the document is then sent outside to an agent network for further processing. I needed to know if the document was already committed by that time. I am inferring from here that the document has been committed after the first step in the update processor chain, even if that's based on a soft commit. Thanks! JackP On Wed, Aug 7, 2013 at 4:20 PM, Jack Krupansky wrote: > Most update processor chains will be configured with the Run Update > processor as the last processor of the chain. That's were the Lucene index > update and optional commit would be done. > > -- Jack Krupansky > > -Original Message- From: Jack Park > Sent: Wednesday, August 07, 2013 1:04 PM > To: solr-user@lucene.apache.org > Subject: Question about soft commit and updateRequestProcessorChain > > > If one allows for a soft commit (rather than a hard commit on each > request), when does the updateRequestProcessorChain fire? Does it fire > after the commit? > > Many thanks > Jack
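To make the commit vocabulary concrete, a small SolrJ sketch of the two explicit commit flavors; the autoSoftCommit settings in solrconfig.xml achieve the soft-commit case without any client call at all. The server URL and field names are illustrative assumptions.

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CommitSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("locator", "doc-1");
        doc.addField("details", "soft commit example");
        server.add(doc);   // the update processor chain runs as this add is handled

        // Soft commit: makes the document searchable quickly without a full flush.
        server.commit(true, true, true);   // waitFlush, waitSearcher, softCommit = true

        // Hard commit: flushes the index to disk (the old habit described above).
        server.commit();

        server.shutdown();
    }
}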
Question about plug-in update handler failure
I have an "interceptor" which grabs SolrDocument instances in the update handler chain. It feeds those documents as a JSON string out to an agent system. That system has been running fine all the way up to Solr 4.3.1 I have discovered that, as of 4.4 and now 4.5, the very same config files, agent jar, and test harness shows that no documents are intercepted, even though the index is built. I am wondering if I missed something in changes to Solr beyond 4.3.1 which would invalidate my setup. For the record, earlier trials opened the war and dropped my agent jar into WEB-INF/lib; most recent trials on all systems leaves the war intact and drops the agent jar into collection1/lib -- it still works on 4.3.1, but nothing beyond that. Many thanks in advance for any thoughts. Jack
Re: Question about plug-in update handler failure
Issue resolved. Not a Solr issue; a really hard to discover missing library in my installation. On Thu, Oct 10, 2013 at 7:10 PM, Jack Park wrote: > I have an "interceptor" which grabs SolrDocument instances in the > update handler chain. It feeds those documents as a JSON string out to > an agent system. > > That system has been running fine all the way up to Solr 4.3.1 > I have discovered that, as of 4.4 and now 4.5, the very same config > files, agent jar, and test harness shows that no documents are > intercepted, even though the index is built. > > I am wondering if I missed something in changes to Solr beyond 4.3.1 > which would invalidate my setup. > > For the record, earlier trials opened the war and dropped my agent jar > into WEB-INF/lib; most recent trials on all systems leaves the war > intact and drops the agent jar into collection1/lib -- it still works > on 4.3.1, but nothing beyond that. > > Many thanks in advance for any thoughts. > > Jack
Querying a transitive closure?
This is a question about "isA?"

We want to know if M isA B: isA?(M, B)

For some M, one might be able to look into M to see its type or which class(es) for which it is a subClass. We're talking taxonomic queries now. But, for some M, one might need to ripple up the "transitive closure", looking at all the super classes, etc, recursively.

It seems unreasonable to do that over HTTP; it seems more reasonable to grab a core and write a custom isA query handler. But, how do you do that in a SolrCloud?

Really curious...

Many thanks in advance for ideas.
Jack
Re: Querying a transitive closure?
Hi Otis, I fully expect to grow to SolrCloud -- many shards. For now, it's solo. But, my thinking relates to cloud. I look for ways to reduce the number of HTTP round trips through SolrJ. Maybe you have some ideas? Thanks Jack On Wed, Mar 27, 2013 at 10:04 AM, Otis Gospodnetic wrote: > Hi Jack, > > Is this really about HTTP and Solr vs. SolrCloud or more whether > Solr(Cloud) is the right tool for the job and if so how to structure > the schema and queries to make such lookups efficient? > > Otis > -- > Solr & ElasticSearch Support > http://sematext.com/ > > > > > > On Wed, Mar 27, 2013 at 12:53 PM, Jack Park wrote: >> This is a question about "isA?" >> >> We want to know if M isA B isA?(M,B) >> >> For some M, one might be able to look into M to see its type or which >> class(es) for which it is a subClass. We're talking taxonomic queries >> now. >> But, for some M, one might need to ripple up the "transitive closure", >> looking at all the super classes, etc, recursively. >> >> It seems unreasonable to do that over HTTP; it seems more reasonable >> to grab a core and write a custom isA query handler. But, how do you >> do that in a SolrCloud? >> >> Really curious... >> >> Many thanks in advance for ideas. >> Jack
Re: Querying a transitive closure?
Hi Otis, That's essentially the answer I was looking for: each shard (are we talking master + replicas?) has the plug-in custom query handler. I need to build it to find out. What I mean is that there is a taxonomy, say one with a single root for sake of illustration, which grows all the classes, subclasses, and instances. If I have an object that is somewhere in that taxonomy, then it has a zigzag chain of parents up that tree (I've seen that called a "transitive closure". If class B is way up that tree from M, no telling how many queries it will take to find it. Hmmm... recursive ascent, I suppose. Many thanks Jack On Wed, Mar 27, 2013 at 6:52 PM, Otis Gospodnetic wrote: > Hi Jack, > > I don't fully understand the exact taxonomy structure and your needs, > but in terms of reducing the number of HTTP round trips, you can do it > by writing a custom SearchComponent that, upon getting the initial > request, does everything "locally", meaning that it talks to the > local/specified shard before returning to the caller. In SolrCloud > setup with N shards, each of these N shards could be queried in such a > way in parallel, running query/queries on their local shards. > > Otis > -- > Solr & ElasticSearch Support > http://sematext.com/ > > > > > > On Wed, Mar 27, 2013 at 3:11 PM, Jack Park wrote: >> Hi Otis, >> >> I fully expect to grow to SolrCloud -- many shards. For now, it's >> solo. But, my thinking relates to cloud. I look for ways to reduce the >> number of HTTP round trips through SolrJ. Maybe you have some ideas? >> >> Thanks >> Jack >> >> On Wed, Mar 27, 2013 at 10:04 AM, Otis Gospodnetic >> wrote: >>> Hi Jack, >>> >>> Is this really about HTTP and Solr vs. SolrCloud or more whether >>> Solr(Cloud) is the right tool for the job and if so how to structure >>> the schema and queries to make such lookups efficient? >>> >>> Otis >>> -- >>> Solr & ElasticSearch Support >>> http://sematext.com/ >>> >>> >>> >>> >>> >>> On Wed, Mar 27, 2013 at 12:53 PM, Jack Park >>> wrote: >>>> This is a question about "isA?" >>>> >>>> We want to know if M isA B isA?(M,B) >>>> >>>> For some M, one might be able to look into M to see its type or which >>>> class(es) for which it is a subClass. We're talking taxonomic queries >>>> now. >>>> But, for some M, one might need to ripple up the "transitive closure", >>>> looking at all the super classes, etc, recursively. >>>> >>>> It seems unreasonable to do that over HTTP; it seems more reasonable >>>> to grab a core and write a custom isA query handler. But, how do you >>>> do that in a SolrCloud? >>>> >>>> Really curious... >>>> >>>> Many thanks in advance for ideas. >>>> Jack
Re: Querying a transitive closure?
Thank you for this. I had thought about it but reasoned in a naive way: who would do such a thing? Doing so makes the query local: once the object has been retrieved, no further HTTP queries are required. Implementation perhaps entails one request to fetch the presumed parent in order to harvest its transitive closure. I need to think about that. Many thanks Jack On Thu, Mar 28, 2013 at 5:06 AM, Jens Grivolla wrote: > Exactly, you should usually design your schema to fit your queries, and if > you need to retrieve all ancestors then you should index all ancestors so > you can query for them easily. > > If that doesn't work for you then either Solr is not the right tool for the > job, or you need to rethink your schema. > > The description of doing lookups within a tree structure doesn't sound at > all like what you would use a text retrieval engine for, so you might want > to rethink why you want to use Solr for this. But if that "transitive > closure" is something you can calculate at indexing time then the correct > solution is the one Upayavira provided. > > If you want people to be able to help you you need to actually describe your > problem (i.e. what is my data, and what are my queries) instead of diving > into technical details like "reducing HTTP roundtrips". My guess is that if > you need to "reduce HTTP roundtrips" you're probably doing it wrong. > > HTH, > Jens > > > On 03/28/2013 08:15 AM, Upayavira wrote: >> >> Why don't you index all ancestor classes with the document, as a >> multivalued field, then you could get it in one hit. Am I missing >> something? >> >> Upayavira >> >> On Thu, Mar 28, 2013, at 01:59 AM, Jack Park wrote: >>> >>> Hi Otis, >>> That's essentially the answer I was looking for: each shard (are we >>> talking master + replicas?) has the plug-in custom query handler. I >>> need to build it to find out. >>> >>> What I mean is that there is a taxonomy, say one with a single root >>> for sake of illustration, which grows all the classes, subclasses, and >>> instances. If I have an object that is somewhere in that taxonomy, >>> then it has a zigzag chain of parents up that tree (I've seen that >>> called a "transitive closure". If class B is way up that tree from M, >>> no telling how many queries it will take to find it. Hmmm... >>> recursive ascent, I suppose. >>> >>> Many thanks >>> Jack >>> >>> On Wed, Mar 27, 2013 at 6:52 PM, Otis Gospodnetic >>> wrote: >>>> >>>> Hi Jack, >>>> >>>> I don't fully understand the exact taxonomy structure and your needs, >>>> but in terms of reducing the number of HTTP round trips, you can do it >>>> by writing a custom SearchComponent that, upon getting the initial >>>> request, does everything "locally", meaning that it talks to the >>>> local/specified shard before returning to the caller. In SolrCloud >>>> setup with N shards, each of these N shards could be queried in such a >>>> way in parallel, running query/queries on their local shards. >>>> >>>> Otis >>>> -- >>>> Solr & ElasticSearch Support >>>> http://sematext.com/ >>>> >>>> >>>> >>>> >>>> >>>> On Wed, Mar 27, 2013 at 3:11 PM, Jack Park >>>> wrote: >>>>> >>>>> Hi Otis, >>>>> >>>>> I fully expect to grow to SolrCloud -- many shards. For now, it's >>>>> solo. But, my thinking relates to cloud. I look for ways to reduce the >>>>> number of HTTP round trips through SolrJ. Maybe you have some ideas? 
>>>>> >>>>> Thanks >>>>> Jack >>>>> >>>>> On Wed, Mar 27, 2013 at 10:04 AM, Otis Gospodnetic >>>>> wrote: >>>>>> >>>>>> Hi Jack, >>>>>> >>>>>> Is this really about HTTP and Solr vs. SolrCloud or more whether >>>>>> Solr(Cloud) is the right tool for the job and if so how to structure >>>>>> the schema and queries to make such lookups efficient? >>>>>> >>>>>> Otis >>>>>> -- >>>>>> Solr & ElasticSearch Support >>>>>> http://sematext.com/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Mar 27, 2013 at 12:53 PM, Jack Park >>>>>> wrote: >>>>>>> >>>>>>> This is a question about "isA?" >>>>>>> >>>>>>> We want to know if M isA B isA?(M,B) >>>>>>> >>>>>>> For some M, one might be able to look into M to see its type or which >>>>>>> class(es) for which it is a subClass. We're talking taxonomic queries >>>>>>> now. >>>>>>> But, for some M, one might need to ripple up the "transitive >>>>>>> closure", >>>>>>> looking at all the super classes, etc, recursively. >>>>>>> >>>>>>> It seems unreasonable to do that over HTTP; it seems more reasonable >>>>>>> to grab a core and write a custom isA query handler. But, how do you >>>>>>> do that in a SolrCloud? >>>>>>> >>>>>>> Really curious... >>>>>>> >>>>>>> Many thanks in advance for ideas. >>>>>>> Jack >> >> > >
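A sketch of the approach Upayavira and Jens describe: compute the ancestor set at indexing time, store it in a multivalued field, and isA?(M, B) becomes a single query with no recursive ascent over HTTP. The field names ("locator", "ancestors") and the server URL are hypothetical, not taken from any schema in this thread.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class TransitiveClosureSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // Indexing time: store the full ancestor set with the document
        // (hypothetical multivalued field "ancestors").
        SolrInputDocument m = new SolrInputDocument();
        m.addField("locator", "M");
        m.addField("ancestors", "B");
        m.addField("ancestors", "A");
        m.addField("ancestors", "Root");
        server.add(m);
        server.commit();

        // Query time: isA?(M, B) becomes a single lookup.
        SolrQuery isA = new SolrQuery("locator:M AND ancestors:B");
        boolean mIsAB = server.query(isA).getResults().getNumFound() > 0;
        System.out.println("isA?(M, B) = " + mIsAB);

        server.shutdown();
    }
}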
Re: Flow Chart of Solr
There are three books on Solr, two with that in the title, and one, Taming Text, each of which have been very valuable in understanding Solr. Jack On Wed, Apr 3, 2013 at 5:25 AM, Jack Krupansky wrote: > Sure, yes. But... it comes down to what level of detail you want and need > for a specific task. In other words, there are probably a dozen or more > levels of detail. The reality is that if you are going to work at the Solr > code level, that is very, very different than being a "user" of Solr, and at > that point your first step is to become familiar with the code itself. > > When you talk about "parsing" and "stemming", you are really talking about > the user-level, not the Solr code level. Maybe what you really need is a > cheat sheet that maps a user-visible feature to the main Solr code component > for that implements that user feature. > > There are a number of different forms of "parsing" in Solr - parsing of > what? Queries? Requests? Solr documents? Function queries? > > Stemming? Well, in truth, Solr doesn't even do stemming - Lucene does that. > Lucene does all of the "token filtering". Are you asking for details on how > Lucene works? Maybe you meant to ask how "term analysis" works, which is > split between Solr and Lucene. Or maybe you simply wanted to know when and > where term analysis is done. Tell us your specific problem or specific > question and we can probably quickly give you an answer. > > In truth, NOBODY uses "flow charts" anymore. Sure, there are some user-level > diagrams, but not down to the code level. > > If you could focus on specific questions, we could give you specific > answers. > > "Main steps"? That depends on what level you are working at. Tell us what > problem you are trying to solve and we can point you to the relevant areas. > > In truth, if you become generally familiar with Solr at the user level > (study the wikis), you will already know what the "main steps" are. > > So, it is not "main steps of Solr", but main steps of some specific > "request" of Solr, and for a specified level of detail, and for a specified > area of Solr if greater detail is needed. Be more specific, and then we can > be more specific. > > For now, the general advice for people who need or want to go far beyond the > user level is to "get familiar with the code" - just LOOK at it - a lot of > the package and class names are OBVIOUS, really, and follow the class > hierarchy and code flow using the standard features of any modern Java IDE. > If you are wondering where to start for some specific user-level feature, > please ask specifically about that feature. But... make a diligent effort to > discover and learn on your own before asking open-ended questions. > > Sure, there are lots of things in Lucene and Solr that are rather complex > and seemingly convoluted, and not obvious, but people are more than willing > to help you out if you simply ask a specific question. I mean, not everybody > needs to know the fine detail of query parsing, analysis, building a > Lucene-level stemmer, etc. If we tried to put all of that in a diagram, most > people would be more confused than enlightened. > > At which step are scores calculated? That's more of a Lucene question. Or, > are you really asking what code in Solr invokes Lucene search methods that > calculate basic scores? > > In short, you need to be more specific. Don't force us to guess what problem > you are trying to solve. 
> > -- Jack Krupansky > > -Original Message- From: Furkan KAMACI > Sent: Wednesday, April 03, 2013 6:52 AM > To: solr-user@lucene.apache.org > Subject: Re: Flow Chart of Solr > > > So, all in all, is there anybody who can write down just main steps of > Solr(including parsing, stemming etc.)? > > > 2013/4/2 Furkan KAMACI > >> I think about myself as an example. I have started to make research about >> Solr just for some weeks. I have learned Solr and its related projects. My >> next step writing down the main steps Solr. We have separated learning >> curve of Solr into two main categories. >> First one is who are using it as out of the box components. Second one is >> developer side. >> >> Actually developer side branches into two way. >> >> First one is general steps of it. i.e. document comes into Solr (i.e. >> crawled data of Nutch). which analyzing processes are going to done >> (stamming, hamming etc.), what will be doing after parsing step by step. >> When a search query happens what happens step by step, at which step >> scores >> are calculated so on so forth. >> Second one is more code specific i.e. which handlers takes into account >> data that will going to be indexed(no need the explain every handler at >> this step) . Which are the analyzer, tokenizer classes and what are the >> flow between them. How response handlers works and what are they. >> >> Also explaining about cloud side is other work. >> >> Some of explanations are currently presents at wiki (but some of them are >> at very deep places at
Re: Flow Chart of Solr
Jack, Is that new book up to the 4.+ series? Thanks The other Jack On Wed, Apr 3, 2013 at 9:19 AM, Jack Krupansky wrote: > And another one on the way: > http://www.amazon.com/Lucene-Solr-Definitive-comprehensive-realtime/dp/1449359957 > > Hopefully that help a lot as well. Plenty of diagrams. Lots of examples. > > -- Jack Krupansky > > -Original Message- From: Jack Park > Sent: Wednesday, April 03, 2013 11:25 AM > > To: solr-user@lucene.apache.org > Subject: Re: Flow Chart of Solr > > There are three books on Solr, two with that in the title, and one, > Taming Text, each of which have been very valuable in understanding > Solr. > > Jack > > On Wed, Apr 3, 2013 at 5:25 AM, Jack Krupansky > wrote: >> >> Sure, yes. But... it comes down to what level of detail you want and need >> for a specific task. In other words, there are probably a dozen or more >> levels of detail. The reality is that if you are going to work at the Solr >> code level, that is very, very different than being a "user" of Solr, and >> at >> that point your first step is to become familiar with the code itself. >> >> When you talk about "parsing" and "stemming", you are really talking about >> the user-level, not the Solr code level. Maybe what you really need is a >> cheat sheet that maps a user-visible feature to the main Solr code >> component >> for that implements that user feature. >> >> There are a number of different forms of "parsing" in Solr - parsing of >> what? Queries? Requests? Solr documents? Function queries? >> >> Stemming? Well, in truth, Solr doesn't even do stemming - Lucene does >> that. >> Lucene does all of the "token filtering". Are you asking for details on >> how >> Lucene works? Maybe you meant to ask how "term analysis" works, which is >> split between Solr and Lucene. Or maybe you simply wanted to know when and >> where term analysis is done. Tell us your specific problem or specific >> question and we can probably quickly give you an answer. >> >> In truth, NOBODY uses "flow charts" anymore. Sure, there are some >> user-level >> diagrams, but not down to the code level. >> >> If you could focus on specific questions, we could give you specific >> answers. >> >> "Main steps"? That depends on what level you are working at. Tell us what >> problem you are trying to solve and we can point you to the relevant >> areas. >> >> In truth, if you become generally familiar with Solr at the user level >> (study the wikis), you will already know what the "main steps" are. >> >> So, it is not "main steps of Solr", but main steps of some specific >> "request" of Solr, and for a specified level of detail, and for a >> specified >> area of Solr if greater detail is needed. Be more specific, and then we >> can >> be more specific. >> >> For now, the general advice for people who need or want to go far beyond >> the >> user level is to "get familiar with the code" - just LOOK at it - a lot of >> the package and class names are OBVIOUS, really, and follow the class >> hierarchy and code flow using the standard features of any modern Java >> IDE. >> If you are wondering where to start for some specific user-level feature, >> please ask specifically about that feature. But... make a diligent effort >> to >> discover and learn on your own before asking open-ended questions. >> >> Sure, there are lots of things in Lucene and Solr that are rather complex >> and seemingly convoluted, and not obvious, but people are more than >> willing >> to help you out if you simply ask a specific question. 
I mean, not >> everybody >> needs to know the fine detail of query parsing, analysis, building a >> Lucene-level stemmer, etc. If we tried to put all of that in a diagram, >> most >> people would be more confused than enlightened. >> >> At which step are scores calculated? That's more of a Lucene question. Or, >> are you really asking what code in Solr invokes Lucene search methods that >> calculate basic scores? >> >> In short, you need to be more specific. Don't force us to guess what >> problem >> you are trying to solve. >> >> -- Jack Krupansky >> >> -Original Message- From: Furkan KAMACI >> Sent: Wednesday, April 03, 2013 6:52 AM >> To: solr-user@lucene.apache.org >> Subject: Re: Flow Chart of Solr >> >> >> So, all in all, is
Re: Downloaded Solr 4.2.1 Source: Build Failing
What I learned is that I needed to upgrade Ant, then needed to install Ivy; the build.xml in the outer subversion directory has an ant target to install Ivy, and one to run-maven-build. I ran that, then switched to /solr and ran "ant dist" which finished in under 2 minutes. On Sun, Apr 14, 2013 at 10:14 AM, Steve Rowe wrote: > Hi Umesh, > > I have the exact same Java 1.6 version as you, on OS X v10.8.3. > > I downloaded the source distribution from the same mirror as you did, and ran > 'ant dist' under the solr/ directory, and got "BUILD SUCCESSFUL". > > (FYI, building the source distribution is part of the "smoke testing" we do > as part of validating a release, and this passed for me on my OS X 10.8.3 > machine before I voted to release 4.2.1.) > > What version of Ant are you using? > > What command are you using to build? > > Did you try running 'ant clean' from the top level and then re-building? > > Steve > > On Apr 14, 2013, at 7:41 AM, Umesh Prasad wrote: > >> Further update on same. >> Build on Branch >> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_2_1 succeeds >> fine. >> Build fails only for Source code downloaded from >> http://apache.techartifact.com/mirror/lucene/solr/4.2.1/solr-4.2.1-src.tgz >> >> >> >> >> On Sun, Apr 14, 2013 at 1:05 PM, Umesh Prasad wrote: >> >>> j*ava version "1.6.0_43" >>> Java(TM) SE Runtime Environment (build 1.6.0_43-b01-447-11M4203) >>> Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01-447, mixed mode) >>> * >>> Mac OS X : Version 10.7.5 >>> >>> -- >>> Umesh >>> >>> >>> >>> On Sat, Apr 13, 2013 at 12:08 AM, Chris Hostetter < >>> hossman_luc...@fucit.org> wrote: >>> : /Users/umeshprasad/Downloads/solr-4.2.1/solr/core/src/java/org/apache/solr/handler/c : *omponent/QueryComponent.java:765: cannot find symbol : [javac] symbol : class ShardFieldSortedHitQueue : [javac] location: class org.apache.solr.handler.component.QueryComponent : [javac] ShardFieldSortedHitQueue queue;* Weird ... can you provide us more details about the java compiler you are using? ShardFieldSortedHitQueue is a package protected class declared in ShardDoc.java (in the same package as QueryComponent). That isn't exactly a best practice, but it shouldn't be causing a compilation failure. -Hoss >>> >>> >>> >>> -- >>> --- >>> Thanks & Regards >>> Umesh Prasad >>> >> >> >> >> -- >> --- >> Thanks & Regards >> Umesh Prasad >
Re: Best way to design a "story and comments" schema.
Jack, Why are multi-valued fields considered messy? I think I am about to learn something.. Thanks Another Jack On Mon, May 13, 2013 at 5:29 AM, Jack Krupansky wrote: > Try the simplest, cleanest design first (at least on paper), before you > start resorting to either dynamic fields or multi-valued fields or other > messy approaches. Like, one collection for stories, which would have a story > id and a second collection for comments, each with a comment id and a field > that is the associated story id and user id. And a third collection for > users and their profiles. Identify the user and get their user id. Identify > the story (maybe by keyword search) to get story id. Then identify and facet > user comments by story id and user id and whatever other search criteria, > and then facet on that. > > -- Jack Krupansky > > -Original Message- From: samabhiK > Sent: Monday, May 13, 2013 5:24 AM > To: solr-user@lucene.apache.org > Subject: Best way to design a "story and comments" schema. > > > Hi, I wish to know how to best design a schema to store comments in stories > / > articles posted. > I have a set of fields: > /indexed="true" stored="true"/> > indexed="true" stored="true"/> > indexed="true" stored="true"/> > indexed="false" stored="true" /> / > Users can post their comments on a post and I should be able to retrieve > these comments and show it along side the original post. I only need to show > the last 3 comments and show a facet of the remaining comments which user > can click and see the rest of the comments ( something like facebook does ). > One alternative, I could think of, was adding a dynamic field for all > comments : > / indexed="false" stored="true"/>/ > So, to store each comments, I would send a text to solr of the form -> > For Field Name: /comment_n/ Value:/[Commenter Name]:[Commenter ID]:[Actual > Comment Text]/ > And to keep the count of those comments, I could use another field like so > :/ indexed="true" stored="true"/>/ > With this approach, I will have to do some calculation when a comment is > deleted by the user but I still can manage to show the comments right. > My idea is to find the best solution for this scenario which will be fast > and also be simple. > Kindly suggest. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Best-way-to-design-a-story-and-comments-schema-tp4062867.html > Sent from the Solr - User mailing list archive at Nabble.com.
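A minimal SolrJ sketch of the separate-collections layout described above: pull the last three comments of one story and use numFound for the "N more comments" link. The collection name "comments" and the fields story_id, author, text and created_at are hypothetical placeholders, not anything prescribed by Solr:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class LatestComments {
        public static void main(String[] args) throws SolrServerException {
            // Separate "comments" collection, one document per comment (hypothetical fields).
            HttpSolrServer comments = new HttpSolrServer("http://localhost:8983/solr/comments");

            SolrQuery q = new SolrQuery("*:*");
            q.addFilterQuery("story_id:STORY42");   // the story being displayed
            q.set("sort", "created_at desc");       // newest first
            q.setRows(3);                           // show only the last three

            QueryResponse rsp = comments.query(q);
            long total = rsp.getResults().getNumFound();
            for (SolrDocument d : rsp.getResults()) {
                System.out.println(d.getFieldValue("author") + ": " + d.getFieldValue("text"));
            }
            // numFound gives the total, so no second query is needed for the "N more" count.
            System.out.println(Math.max(0, total - 3) + " more comments...");
        }
    }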
Re: Quick SolrJ query how-to question.
In some sense, if all you want to do is send over a URL, e.g. http://localhost:8993/, it's not out of the question to use the java url stuff as exemplified at http://www.cafeaulait.org/course/week12/22.html or http://stackoverflow.com/questions/7500342/using-sockets-to-fetch-a-webpage-with-java But, that's a trivial case. You might have something else in mind. Jack On Tue, May 14, 2013 at 1:36 PM, Shawn Heisey wrote: > On 5/14/2013 3:13 AM, Luis Cappa Banda wrote: >> I know that, but I was wondering if it exists another way just to set the >> complete query (including q, fq, sort, etc.) embedded in a SolrQuery object >> as the same way that you query using some kind of RequestHandler. That way >> would be more flexible because you don't need to parse the complete query >> checking q, fg, sort... parameters one by one and setting them with >> setFields(), setStart(), setRows(), etcetera. Solr is doing that query >> parse internally when you execute queries with it's REST API and maybe >> there exist a way to re-use that functionality to just set a String to a >> SolrQuery and that SolrQuery does internally all the magic. > > This is a little bit of an odd idea, because it goes against the way a > Java programmer expects to do things. Where does the 'URL parameter' > version of your query come from? If it's possible, it would make more > sense to incorporate that code into your SolrJ app and avoid two steps > -- the need to create the URL syntax and the need to decode the URL syntax. > > In a later message, you said that you are working on a SolrServer > implementation to handle your use case. I'm wondering if SolrJ already > has URL query parameter parsing capability. I'd be slightly surprised > to learn that it does - that code is probably part of the servlet API. > > It's not that your idea is bad, it just sounds like a ton of extra work > that could be better spent elsewhere. > > Thanks, > Shawn >
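If the goal really is to hand SolrJ a ready-made parameter string (q, fq, sort and so on) without decoding each value by hand, one workable sketch is to split and URL-decode the string into a ModifiableSolrParams yourself; SolrJ will query with the result. This is an assumption-laden illustration (the parameter string and field names below are made up), not a built-in SolrJ feature:

    import java.net.URLDecoder;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class RawQueryString {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            // A complete query string, exactly as it would appear after /select?
            String raw = "q=*:*&fq=isPrivate:false&sort=lastEditDate+desc&start=0&rows=10";

            ModifiableSolrParams params = new ModifiableSolrParams();
            for (String pair : raw.split("&")) {
                int eq = pair.indexOf('=');
                String name = URLDecoder.decode(pair.substring(0, eq), "UTF-8");
                String value = URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                params.add(name, value); // add, not set, so repeated fq params survive
            }

            QueryResponse rsp = server.query(params);
            System.out.println("hits: " + rsp.getResults().getNumFound());
        }
    }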
Re: Varnish
I presume you mean https://www.varnish-cache.org/ That's the first I'd heard of it. Thanks Jack On Thu, Jun 20, 2013 at 10:48 PM, William Bell wrote: > Who is using varnish in front of SOLR? > > Anyone have any configs that work with the cache control headers of SOLR? > > -- > Bill Bell > billnb...@gmail.com > cell 720-256-8076
Re: The book: Solr 4.x Deep Dive - Early Access Release #1
As one of the early reviewers of the manuscript, I always had high hopes for this work. I now have the pdf from lulu; do not have time now to dive deeply, but will comment that it seems, to me at least, well worth owning. Jack On Fri, Jun 21, 2013 at 11:41 AM, Jack Krupansky wrote: > Okay, it's DONE. Here's the Lulu link, ready to go: > > http://www.lulu.com/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-1/ebook/product-21079719.html > > (Or, go to Lulu.com and just search for "Solr" - It's the only hit so far.) > > Price is $9.99 for now (I get $8.10 of that, BTW, in case you're wondering > how Lulu works - minus $0.90 (10%) "base price" to host the file, > bandwidth, credit card processing, etc., and minus another $0.90 (10%) for > Lulu's "share, a total of 19% to Lulu.) > > I'll see how the response is over the next two weeks and maybe adjust the > price. I almost went with $14.99 or even $19.99, but I decided this was a > decent introductory special. I mean, if it was complete, I might sell the > e-book for $25 or $29.99 or so. > > This pricing and distribution is all an experiment and subject to change at > any time. > > Thanks for all the feedback! > > Seriously, if you want to wait two weeks or a month for cleanup, go right > ahead. I thought of delaying so that "everything looks right", but I decided > that some us us just want the facts and the "finish" is not as important. > I'll try to cater to both. > > I'll spend another week or so on cleanup, and then decide whether to > intensify "finish" work, or focus on adding more content, like highlighting, > distributed search, DIH, core and collection management, or maybe even > Spatial. > > Here are the topics that are NOT in the current early-access edition: > > - SolrCloud > - Traditional Distributed Solr - shards, master/slave, replication > - Data Import Handler (DIH) > - Core management > - Collection management > - Admin UI > - Admin API > - Luke > - CheckIndex > - Spatial and Geospatial search > - Highlighting > - Query elevation > - Autocomplete deep dive > - SolrJ API > - UI example > - Application layer example > - Terms Component > - Term vectors component > - Javabin format > - Deeper coverage of DocValues (mentioned in Faceting) > > All of those are candidates for work over in the coming months. > > Here are aspects that are NOT under consideration and beyond the current > anticipated scope of the book, for now: > > - Cookbook approach to Solr > - Deployment, such as configuring Tomcat > - Tuning, estimation, performance optimization > - Troubleshooting > - Tips > - Security > - Access control > - Document-level access control > - Relevance Tuning > - Data Modeling > - How to develop custom plugin code > - Lucene API itself > - Diagrams - sorry, I'm a text guy - but contributions are welcome > - Details of Lucene index format > - Details of Lucene document scoring and relevancy > - Non-Java client APIs > > -- Jack Krupansky > > -Original Message- From: Jack Krupansky > Sent: Friday, June 21, 2013 9:04 AM > To: solr-user@lucene.apache.org > Subject: The book: Solr 4.x Deep Dive - Early Access Release #1 > > > I’m expecting to self-publish the first Early Access Release for my book, > Solr 4.x Deep Dive, on lulu.com sometime today. 
It is still far from > finished and needs lots of work and missing a lot of important areas > (SolrCloud and distributed Solr in general, DIH, highlighting, core and > collection API, admin API and UI, query elevation, etc.), but I think there > is a critical mass of useful material that is a decent foundation to build > the rest of the book on. For those who participated in the early chapter > review process for the book’s predecessor (Lucene and Solr: The Definitive > Guide), most of those review chapters (at least the ones authored by me) are > included, plus a bunch more, especially chapters on indexing data, update > processors, and faceting. The new book is Solr-only. Alas, I have not > incorporated most of the reviewer feedback yet as I have been focused on > writing for the indexing and faceting chapters for the past two months. > > It will be e-book (PDF) only for the time being. Don’t even think about > printing it yourself – over 1,100 pages, and counting! Currently a 5MB > download. > > I still haven’t settled on pricing. For early access, the intent is that > people will want to check back every couple weeks or month or two, more like > a subscription. My current thought is to treat it as if it were a $60 to $80 > paper book bought once per year, but on a monthly subscription, say $5 to $8 > per download. My expectation is to update roughly every two weeks, or at > least monthly, as new material is added, issues resolved, and new Solr > releases. In the early going, I’ll probably update the PDF on lulu every > week. > > Given this rough model, what price point has the most appeal: $2.99 (yeah, > who doesn’t want it, but little incentive for me!), $4.99 (seems reasonable, >
New user: version conflict for ....
I am running against a networked Solr4 installation -- but not using any of the cloud apparatus. I wish to update a document (Node) with new information. I send back as a partial update, using SolrJ's add() command: the document id, the new or updated field, and the version number precisely as it was fetched. What I get back is an error message: version conflict for MyFirstNode1356582803755 expected=1422480157168369664 actual=1422480158385766400 When I look at that document in the admin browser, it looks like this: { "TupleListProperty": [ "250d3a66\\-c2cc\\-4c83\\-9b93\\-46cfdde120b8" ], "_version_": 1422480158385766400, "LocatorPropertyType": "MyFirstNode1356582803755" } which means all the other data in that node were replaced. There must be something obvious I am missing. Is there a particular recipe online using SolrJ to do this properly? Many thanks in advance Jack
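For reference, the usual read-modify-write shape of an optimistic-concurrency update in SolrJ looks roughly like the sketch below; a 409 "version conflict" response means another writer got in first and the fetch-and-retry should start over with a fresh copy. The uniqueKey name "locator" and the other field names are taken from messages in this thread and may not match the actual schema:

    import java.util.Collections;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.SolrInputDocument;

    public class OptimisticUpdate {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            // 1. Fetch the current copy of the document (hypothetical uniqueKey "locator").
            SolrQuery q = new SolrQuery("locator:MyFirstNode1356582803755");
            SolrDocument current = server.query(q).getResults().get(0);

            // 2. Build a partial update that carries the *current* version.
            SolrInputDocument update = new SolrInputDocument();
            update.addField("locator", current.getFieldValue("locator"));
            update.addField("_version_", current.getFieldValue("_version_"));
            update.addField("TupleListProperty",
                    Collections.singletonMap("add", "99edfffe-b65c-4b5e-9436-67085ce49c9c"));

            // 3. If someone updated the doc in between, Solr answers 409 "version conflict":
            //    catch it, re-fetch, and retry rather than resending the stale version.
            server.add(update);
            server.commit();
        }
    }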
Re: New user: version conflict for ....
Hi Chris, Your suspicion turned out to be spot on with a code glitch. The history of this has been due to a fairly weak understanding of how partial update works. The first code error was just a simple,stupid one in which I was not working against a "current" copy of the document. But, when I got that single update working, I revised the unit test to increase the complexity. The core unit test creates two "nodes", and then wires them together with a third node which serves as a relation topic between the two nodes; yes, I am building topic maps here. Each actor node (SolrDocument) is constructed, then is updated with an addition to a multi-valued field which contains node identifiers for the TupleNode, a SolrDocument which turns a relationship between two actors into a topic itself. The update is that of adding a value to a previously empty field. The second version of the unit test creates the two actors, and a first relation, then adds a second one. This is where I discovered (lots of trial and error here), that you don't send in the list in the add or set, rather you send in one value at a time. I am imagining that the solution to sending in multivalued updates on the same field might mean a custom update handler which reduces HTTP round trips when dealing with a list of values to add. Perhaps there is a documented way to do multiple updates on the same document/field pair in a single call? Many thanks. Jack On Mon, Dec 31, 2012 at 12:06 PM, Chris Hostetter wrote: > > : any of the cloud apparatus. I wish to update a document (Node) with > : new information. I send back as a partial update using SolrJ's add() > : command > : document id > : the new or updated field > : version number precisely as it was fetched > > can you give us more details about what your client code is doing -- > ideally just include a complate example. > > : What I get back is an error message: > : version conflict for MyFirstNode1356582803755 > : expected=1422480157168369664 actual=1422480158385766400 > > that error suggests that in between the time you downloaded the document, > and when you sent the request to update the document, some other update > was already recieved and changed the version number. > > : When I look at that document in the admin browser, it looks like this: > ... > : which means all the other data in that node were replaced. > > Are you sure the field values were replaced by *your* changes, the ones > that you sent when that error was returned, or is it possible some other > instnace of your code did the exact same update and got a success? > > Checking your server logs to see if multiple update commands were recieved > by solr is one way to help verify this. > > my suspicion is that maybe you have a glitch in your code that results in > the update operation actually happening twice -- and it's the second > update command that is getting the error. > > > > -Hoss
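On the closing question about avoiding one HTTP round trip per value: the atomic-update "add" operation accepts a list, so several values can be attached to a multi-valued field in a single add()/commit. A hedged sketch, assuming Solr 4.1+ and hypothetical field names from the topic-map example:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class AddManyValues {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            // Hypothetical topic-map example: attach two tuple ids to one actor node.
            List<String> newTuples = Arrays.asList("tuple-id-1", "tuple-id-2");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("locator", "MyFirstNode1356582803755");               // uniqueKey of the node
            doc.addField("tuples", Collections.singletonMap("add", newTuples)); // one "add", many values

            server.add(doc);
            server.commit();
        }
    }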
Question about dates and SolrJ
My work engages SolrJ, with which I send documents off to Solr 4 which properly store, as viewed in the admin panel, as this example: 2013-02-04T02:11:39.995Z When I retrieve a document with that date, I use the SolrDocument returned as a Map in which the date now looks like this: Sun Feb 03 18:11:39 PST 2013 I am thinking that I am missing something in the SolrJ configuration, though it could be in how I structure the query; for now, here is the simplistic way I setup SolrJ: HttpSolrServer server = new HttpSolrServer(solrURL); server.setParser(new XMLResponseParser()) Is there something I am missing to retain dates as Solr stores them? Many thanks in advance Jack
Re: Question about dates and SolrJ
Thanks Shawn. I stopped setting the parser as suggested. I found that what I had to do is to just store Date objects in my documents, then, at the last minute, when building a SolrDocument to send, convert with DateField. When I Export to XML, I export to that DateField string, then convert the zulu string back to a Date object as needed. Seems to be working fine now. Many thanks Jack On Sat, Jan 12, 2013 at 10:52 PM, Shawn Heisey wrote: > On 1/12/2013 7:51 PM, Jack Park wrote: >> >> My work engages SolrJ, with which I send documents off to Solr 4 which >> properly store, as viewed in the admin panel, as this example: >> 2013-02-04T02:11:39.995Z >> >> When I retrieve a document with that date, I use the SolrDocument >> returned as a Map in which the date now looks like >> this: >> Sun Feb 03 18:11:39 PST 2013 >> >> I am thinking that I am missing something in the SolrJ configuration, >> though it could be in how I structure the query; for now, here is the >> simplistic way I setup SolrJ: >> >> HttpSolrServer server = new HttpSolrServer(solrURL); >> server.setParser(new XMLResponseParser()) >> >> Is there something I am missing to retain dates as Solr stores them? > > > Quick note: setting the parser is NOT necessary unless you are trying to > connect radically different versions of Solr and SolrJ (1.x and 3.x/later, > to be precise), and will in fact make SolrJ slightly slower when contacting > Solr. Just let it use the default javabin parser -- it's faster. > > If your date field in Solr is an actual date type, then you should be > getting back a Date object in Java which you can manipulate in all the usual > Java ways. The format that you are seeing matches the toString() output > from a Date object: > > http://docs.oracle.com/javase/6/docs/api/java/util/Date.html#toString%28%29 > > You'll almost certainly have to cast the object so it's the right type: > > Date dateField = (Date) doc.get("datefieldname"); > > Thanks, > Shawn >
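A small sketch of the round trip Shawn describes: SolrJ returns java.util.Date for date-typed fields, and the "Zulu" wire form can be regenerated by formatting with an explicitly UTC formatter (Date.toString() uses the local zone, which is why the value printed as PST). The field name is illustrative:

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.TimeZone;
    import org.apache.solr.common.SolrDocument;

    public class DateRoundTrip {
        /** Format a date-typed field back into the form Solr stores, e.g. 2013-02-04T02:11:39.995Z */
        public static String toSolrDateString(SolrDocument doc, String field) {
            Date d = (Date) doc.getFieldValue(field);    // SolrJ returns java.util.Date for date fields
            SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
            iso.setTimeZone(TimeZone.getTimeZone("UTC")); // Date.toString() would use the local zone ("PST")
            return iso.format(d);
        }
    }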
Re: URL encoding problems
Similar thoughts: I used unit tests to explore that issue with SolrJ, originally encoding with ClientUtils; The returned results had "|" many places in the text, with no clear way to un-encode. I eventually ran some tests with no encoding at all, including strings like "hello & goodbye"; such strings were served and fetched without errors. In queries at the admin console, they show up in the JSON results correctly. What's left? I share the confusion about what is really going on. Jack On Thu, Jan 17, 2013 at 2:44 AM, Bruno Dusausoy wrote: > Hi, > > I have some problems related to URL encoding. > I'm using Solr 3.6.1 on a Windows (32 bit) system. > Apache Tomcat is version 6.0.36. > I'm accessing Solr through solrj-3.3.0. > > When using the Solr admin and specifying my request, the URL looks like this > (${SOLR} is there for the sake of brevity) : > ${SOLR}/select?q=rapporteur_name%3A%28John+%2BSmith+%2B%5C%28FOO%5C%29%29 > > But when my app launching the query, the URL looks like this : > ${SOLR}/select?q=rapporteur_name%3A%28John%5C+Smith%5C+%5C%28FOO%5C%29%29 > > My "decoded" query, as entered in the admin interface, is : > rapporteur_name:(John +Smith +\(FOO\)) > > Both request return results, but only the one returns the correct ones. > > The code that escapes the query is : > > SolrQuery query = new SolrQuery(); > query.setQuery("rapporteur_name:(" + ClientUtils.escapeQueryChars("John > Smith (FOO)") + ")"); > > I don't know if it's the right way to encode the query. > > Any ideas or directions ? > > Regards. > -- > Bruno Dusausoy > Software Engineer > YP5 Software > -- > Pensez environnement : limitez l'impression de ce mail. > Please don't print this e-mail unless you really need to.
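For what it is worth, ClientUtils.escapeQueryChars() escapes query-syntax characters (including whitespace) inside a single term, so the escaped form of "John Smith (FOO)" only matches an untokenized string field holding exactly that value; against an analyzed text field a quoted phrase is usually closer to what is wanted. A sketch of both, with the field name borrowed from the message above and purely illustrative:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.util.ClientUtils;

    public class EscapingSketch {
        public static void main(String[] args) {
            String userInput = "John Smith (FOO)";

            // 1. Escape per term: parentheses lose their query meaning, whitespace is escaped too,
            //    so this only matches a string field holding exactly "John Smith (FOO)".
            SolrQuery exact = new SolrQuery("rapporteur_name:" + ClientUtils.escapeQueryChars(userInput));

            // 2. Phrase query: for a tokenized text field, quoting is usually the better fit.
            SolrQuery phrase = new SolrQuery("rapporteur_name:\"" + userInput.replace("\"", "\\\"") + "\"");

            System.out.println(exact.getQuery());
            System.out.println(phrase.getQuery());
        }
    }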
When a URL is a component of a query string's data?
There exists in my Solr index a document (several, actually) which harbor http:// URL values. Trying to find documents with a particular URL fails. The query is like this: ResourceURLPropertyType:http://someserver.org/something It fails due to the second ":". If I substitute %3a into that query, e.g. ResourceURLPropertyType:http%3a//someserver.org/something the query goes through and finds nothing. A fork in the road? Make it a policy to swap %3a into all URL values going to Solr, then use the same format in search; or find another way to get the query to work with the full URL, untouched, in the index. Googling this one has been difficult due to the ambiguity of "url" in query strings. Thoughts? Many thanks in advance Jack
Re: When a URL is a component of a query string's data?
At the admin console, surrounding with "" worked fine. Many thanks Jack On Mon, Jan 21, 2013 at 11:24 AM, Jack Krupansky wrote: > The colons are probably okay. It is probably the slashes causing the > problem. An embedded slash now terminates the preceding term and starts a > regular expression term (that is terminated by a second slash). > > Solution: quote each slash with a backslash. > >ResourceURLPropertyType:http:\/\/someserver.org\/something > > Or, enclose the URL in quotes. > >ResourceURLPropertyType:"http://someserver.org/something"; > > -- Jack Krupansky > > -Original Message- From: Jack Park > Sent: Monday, January 21, 2013 1:41 PM > To: solr-user@lucene.apache.org > Subject: When a URL is a component of a query string's data? > > > There exists in my Solr index a document (several, actually) which > harbor http:// URL values. Trying to find documents with a particular > URL fails. > > The query is like this: > ResourceURLPropertyType:http://someserver.org/something > > Fails due to the second ":" > > If I substitute %3a into that query, e.g. > ResourceURLPropertyType:http$3a//someserver.org/something > the query goes through and finds nothing. > > A fork in the road? > Make it a policy to swap %3a into all URL values going to Solr, then > use the same format in search. > or > Find another way to get the query to work with the full URL, > untouched, in the index. > > Googling this one has been difficult due to the ambiguity of "url" in > query strings. > > Thoughts? > > Many thanks in advance > Jack
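The same quoting trick expressed in SolrJ, plus the term query parser as an alternative for verbatim matches against a string field. The field name comes from the question above; which variant fits depends on the field type:

    import org.apache.solr.client.solrj.SolrQuery;

    public class UrlFieldQuery {
        public static void main(String[] args) {
            String url = "http://someserver.org/something";

            // Quote the value so ":" and "/" are not parsed as query syntax.
            SolrQuery quoted = new SolrQuery("ResourceURLPropertyType:\"" + url + "\"");

            // Equivalent for a string field: the term query parser takes the value verbatim.
            SolrQuery term = new SolrQuery("{!term f=ResourceURLPropertyType}" + url);

            System.out.println(quoted.getQuery());
            System.out.println(term.getQuery());
        }
    }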
Solr and Unicode characters in strings
Here is a situation I now experience: What Solr has: economist and thus …@en What was sent: economist and thus …@en where those are just snippets from what I sent up -- the ellipsis was created by Carrot2, and what comes back when I fetch the document with that passage. There is a hint in the Solr FAQ that the server must support UTF-8; it's not clear how to do that from HTTPSolrServer. Other hints from around the web suggest I should be using a different field than type = "string". I should point out that I am running these developmental tests on the Solr 4 example build with my schema.xml. My question is this: what simple, say, utility call would return the text to its original? (perhaps that's the wrong question...) Many thanks in advance Jack
Re: Solr and Unicode characters in strings
Thanks! On Tue, Jan 22, 2013 at 8:59 AM, Otis Gospodnetic wrote: > Hi, > > When you run your indexing app make sure you treat what you send to Solr as > UTF-8. > Use -Dfile.encoding=UTF8 -Dclient.encoding.override=UTF-8 to the Java > command line. > > Otis > -- > Solr & ElasticSearch Support > http://sematext.com/ > > > > > > On Mon, Jan 21, 2013 at 3:06 PM, Jack Park wrote: > >> Here is a situation I now experience: >> >> What Solr has: >> economist and thus …@en >> What was sent: >> economist and thus …@en >> where those are just snippets from what I sent up -- the ellipsis was >> created by Carrot2, and what comes back when I fetch the document with >> that passage. >> >> There is a hint in the Solr FAQ that the server must support UTF-8; >> it's not clear how to do that from HTTPSolrServer. >> Other hints from around the web suggest I should be using a different >> field than type = "string" >> >> I should point out that I am running these developmental tests on the >> Solr 4 example build with my schema.xml. >> >> My question is this: what simple, say, utility call would return the >> text to its original? >> (perhaps that's the wrong question...) >> >> Many thank in advance >> Jack >>
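Besides the -Dfile.encoding / -Dclient.encoding.override JVM flags, the other common source of mangled ellipses is reading the source text with the platform default charset before it ever reaches SolrJ. A defensive sketch that reads the input explicitly as UTF-8 (the file name, uniqueKey and field names are placeholders):

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import org.apache.solr.common.SolrInputDocument;

    public class Utf8Indexing {
        public static void main(String[] args) throws IOException {
            // Read the source text as UTF-8 explicitly instead of the platform default.
            StringBuilder sb = new StringBuilder();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(args[0]), "UTF-8"));
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    sb.append(line).append('\n');
                }
            } finally {
                in.close();
            }

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("locator", "doc-1");        // hypothetical uniqueKey
            doc.addField("details", sb.toString());  // the "…" survives as a real ellipsis character
            // server.add(doc); server.commit();     // send with an HttpSolrServer as usual
        }
    }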
UpdateResponse agents in a cloud?
Say you have a dozen servers, one core each. Say you wish to add an agent reference inside the solrconfig update response descriptor. Would you do that for every core? Thanks in advance. Jack
Re: Introducing Solrstrap: A blazing fast tool for querying Solr in a Googleish fashion
Hi Fergus, Would it make sense to you to switch to the Apache 2 license so that your project can "play nice" in the apache ecosystem? Thanks Jack On Sun, Feb 17, 2013 at 6:25 AM, Fergus McDowall wrote: > Erik > > Thanks for the great feedback. It fills me with joy to know that another > human being has chosen to use Solrstrap > > 1) I have added a couple more CONST variables to the code to allow the > implementer to specify the names of the hit body and hit title > (re: exampledocs/*.xml) > > 2) In order to pass a full document to the hit-template you could simply to > this: > > rs.append(hitTemplate({doc: result.response.docs[i]})); > > and then change the hit template so that it references each hit as "doc" > and subfields thereof {{doc.somefield}} > > >>> {{title}}>> >> >> And finally... GPL?! ewww, why?! (-1) :) >> >> Well played, Fergus! >> >> Erik >> >> >> On Feb 17, 2013, at 05:35 , Fergus McDowall wrote: >> >> > Solrstrap is a very basic Query-Result interface for Solr. Solrstrap is >> intended to be a starting point for those building web interfaces that talk >> to Solr, or a very lightweight admin tool for querying Solr in a Googleish >> fashion. >> > >> > Cool things about Solrstrap: >> > >> >* Requires only local installation- easy to set up >> >* Access to all Bootstrap functionality. Can be easily extended in a >> Bootstrappy way. >> >* Blazing fast >> >* Uses less bandwidth >> > >> > Use it as you see fit. Merciless criticism and fawning praise equally >> welcome. >> > >> > See http://fergiemcdowall.github.com/solrstrap/ >> > >> > and >> > >> > http://blog.comperiosearch.com/blog/2013/02/17/introducing-solrstrap/ >> > >> > Fergus >> > >> > >> >>
>> {{text}} >>
Document update question
From what I can read about partial updates, it will only work for singleton fields where you can set them to something else, or multi-valued fields where you can add something. I am testing on 4.1. I ran some tests to prove to me that you cannot do anything else to a multi-valued field, like remove a value and do a partial update on the whole list. It flattens the result to a comma-delimited String when I remove a value, from "details": [ "here & there", "Hello there", "Oh Fudge" ], to this "details": [ "[here & there, Oh Fudge]" ], Does this mean that I must remove the entire document and re-index it? Many thanks in advance Jack
Re: Document update question
I am using 4.1. I was not aware of that link. In the absence of being able to do partial updates to multi-valued fields, I just punted to delete and reindex. I'd like to see otherwise. Many thanks Jack On Thu, Feb 21, 2013 at 8:13 AM, Timothy Potter wrote: > Hi Jack, > > There was a bug for this fixed for 4.1 - which version are you on? I > remember this b/c I was on 4.0 and had to upgrade for this exact > reason. > > https://issues.apache.org/jira/browse/SOLR-4134 > > Tim > > On Wed, Feb 20, 2013 at 9:16 PM, Jack Park wrote: >> From what I can read about partial updates, it will only work for >> singleton fields where you can set them to something else, or >> multi-valued fields where you can add something. I am testing on 4.1 >> >> I ran some tests to prove to me that you cannot do anything else to a >> multi-valued field, like remove a value and do a partial update on the >> whole list. It flattens the result to a comma delimited String when I >> remove a value, from >>"details": [ >> "here & there", >> "Hello there", >> "Oh Fudge" >> ], >> to this >>"details": [ >> "[here & there, Oh Fudge]" >> ], >> >> Does this meant that I must remove the entire document and re-index it? >> >> Many thanks in advance >> Jack
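For the record, replacing the whole list (for example after removing one value) is normally expressed as a "set" whose value is a Java List; on 4.1+ with the fix above this should come back as separate values rather than one flattened string. A hedged sketch, reusing the field and id that appear later in this thread:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class ReplaceMultiValued {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            // Re-set "details" to the remaining values after dropping "Hello there".
            List<String> remaining = Arrays.asList("here & there", "Oh Fudge");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("locator", "MySecondNode1361728848603");                 // uniqueKey of the doc
            doc.addField("details", Collections.singletonMap("set", remaining));  // replace the whole list

            server.add(doc);
            server.commit();
        }
    }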
Re: Document update question
Interesting you should say that. Here is my solrj code: public Solr3Client(String solrURL) throws Exception { server = new HttpSolrServer(solrURL); // server.setParser(new XMLResponseParser()); } I cannot recall why I commented out the setParser line; something about someone saying in another thread it's not important. I suppose I should revisit my unit tests with that line uncommented. Or, did I miss something? The JSON results I painted earlier were from reading the document online in the admin query panel. Many thanks Jack On Thu, Feb 21, 2013 at 8:52 AM, Timothy Potter wrote: > Weird - the only difference I see is that we us XML vs. JSON, but > otherwise, doing the following works for us: > > VALU1 > VALU2 > > Result would be: > > > VALU1 > VALU2 > > > > On Thu, Feb 21, 2013 at 9:44 AM, Jack Park wrote: >> I am using 4.1. I was not aware of that link. In the absence of being >> able to do partial updates to multi-valued fields, I just punted to >> delete and reindex. I'd like to see otherwise. >> >> Many thanks >> Jack >> >> On Thu, Feb 21, 2013 at 8:13 AM, Timothy Potter wrote: >>> Hi Jack, >>> >>> There was a bug for this fixed for 4.1 - which version are you on? I >>> remember this b/c I was on 4.0 and had to upgrade for this exact >>> reason. >>> >>> https://issues.apache.org/jira/browse/SOLR-4134 >>> >>> Tim >>> >>> On Wed, Feb 20, 2013 at 9:16 PM, Jack Park wrote: >>>> From what I can read about partial updates, it will only work for >>>> singleton fields where you can set them to something else, or >>>> multi-valued fields where you can add something. I am testing on 4.1 >>>> >>>> I ran some tests to prove to me that you cannot do anything else to a >>>> multi-valued field, like remove a value and do a partial update on the >>>> whole list. It flattens the result to a comma delimited String when I >>>> remove a value, from >>>>"details": [ >>>> "here & there", >>>> "Hello there", >>>> "Oh Fudge" >>>> ], >>>> to this >>>>"details": [ >>>> "[here & there, Oh Fudge]" >>>> ], >>>> >>>> Does this meant that I must remove the entire document and re-index it? >>>> >>>> Many thanks in advance >>>> Jack
Re: If we Open Source our platform, would it be interesting to you?
Marcelo In some sense, it sounds like you are aiming at building a topic map of all your resources. Jack On Thu, Feb 21, 2013 at 11:54 AM, Marcelo Elias Del Valle wrote: > Hello David, > > First of all, thanks for answering! > > 2013/2/21 David Quarterman > >> Looked through your site and the framework looks very powerful as an >> aggregator. We do a lot of data aggregation from many different sources in >> many different formats (XML, JSON, text, CSV, etc) using RDBMS as the main >> repository for eventual SOLR indexing. A 'one-stop-shop' for all this would >> be very appealing. >> > > Actually, just to clarify, it uses Cassandra as repository, not an > RDMS. We want to use it for large scale, so you could import entire company > databases into the repo and relate the data from one another. However, If I > understood you right, you got the idea, an intermediate repo before > indexing, so you could postpone decisions about what to index and how... > > >> Have you looked at products like Talend & Jitterbit? These offer >> transformation from almost anything to almost anything using graphical >> interfaces (Jitterbit is better) and a PHP-like coding format for trickier >> work. If you (or somebody) could add a graphical interface, the world would >> beat a path to your door! > > > This is very interesting, actually! We considered using Talend when we > started our business, but we decided to go ahead with the development of a > new product. The reason was: Talend is great, but it limits a good > programmer, if he is more agile coding than using graphical interfaces. > Have user interfaces as a possibility is nice, but as something you HAVE TO > use is awful. Besides, it has a learning curve and seems to run better and > you hire their own platform, and we wanted to choose the fine grain of our > platform. > However, your question made me think a lot about it. Do you think > integrating to jitterbit or talend could be interesting? Or did you mean > developing a new user interface? The bad thing I see in integrating with a > talend like program is that you start to be dependent on the graphical > interface, I feel it's hard to use my own java code... I might be wrong. > Anyway, I will consider this possibility, but if you could explain > better why you think one or other could be such a good idea would help us a > lot. Would you be interested in using such a tool yourself? > > Best regards, > Marcelo.
Re: semantic search questions
Hi Vinay, Perhaps you could say more about what you are looking for? What use cases, say. Did you see the book _Taming Text_? Thanks Jack On Fri, Feb 22, 2013 at 8:48 AM, Vinay B, wrote: > Hi, > > A few questions, some specific to UIMA, others more general. > 1. The SOLR/UIMA example employs 3rd party (some of which are > commercial) semantic APIs such as AlchemyApi and OpenCalais. This > won't do for our application (semantic analysis of large numbers of > plain text files) . Are there any open source alternatives that work > with SOLR and can achieve the same results. OpenNLP can extract parts > of speech and extract names etc but isn't really meant for concept > extraction. > 2. Regardless of the caveat mentioned above, can someone illustrate a > usecase for UIMA annotations . i.e. what kind of queries can be > performed once a document has been processed via the UIMA plugin > 3. Does (or can) SOLR have any disambiguation functionality (either > native or via a 3rd party plugin) and if so, how can I leverage it. > Once again OpenNLP has a part of speech tagger that could possibly be > used for this. > eg. if doc 1 contains text "This pipe is made of lead" (lead is a > noun) and doc 2 contains text "Lincoln lead by example" (lead is a > verb) , how would I phrase a query intended to return docs that > countain the term "lead" as a verb. If there's a link that explains > how to do this, please do post it. > > Apparantly SIREN (http://siren.sindice.com/index.html) has some of > this functionality (and more) built in but the documentation and use > cases are a bit sketchy. It also hasn't been updated in a year. Does > anyone know if it will be compatable with future SOLR / Lucene > releases. > > Thanks for your responses.
Interesting issue with "special characters" in a string field value
I have a multi-value stored field called "details" I've been deliberately sending it values like If I fetch a document with that field at the admin query console, using XML, I get: If I fetch with JSON, I get: "details": [ "" ], Even more curious, if I use this query at the console: details: I get nothing back. I think I'm having an identity crisis in relation to escaping characters at SolrJ. The values are going up, and when the query is to bring the document back, they come back. But, as individuals values, they don't appear to submit to query. If I actually escape them going up, then the document is full of escaped characters, which can be troublesome when fetching and using. Any thoughts? Many thanks Jack
Re: Interesting issue with "special characters" in a string field value
Michael, I don't think you misunderstood. I will soon give a full response here, but am on the road at the moment. Many thanks Jack On Friday, February 22, 2013, Michael Della Bitta < michael.della.bi...@appinions.com> wrote: > My mistake, I misunderstood the problem. > > Michael Della Bitta > > > Appinions > 18 East 41st Street, 2nd Floor > New York, NY 10017-6271 > > www.appinions.com > > Where Influence Isn’t a Game > > > On Fri, Feb 22, 2013 at 3:55 PM, Chris Hostetter > wrote: >> >> : If you're submitting documents as XML, you're always going to have to >> : escape meaningful XML characters going in. If you ask for them back as >> : XML, you should be prepared to unescape special XML characters as >> >> that still wouldn't explain the discrepency he's claiming to see between >> the json & xml resmonses (the json containing an empty string >> >> Jack: please elaborate with specifics about your solr version, field, >> field type, how you indexed your doc, and what the request urls & raw >> responses that you get are (ie: don't trust the XML you see in your >> browser, it may be unescaping escaped sequences in element text to be >> "helpful" .. use something like curl) >> >> For example... >> >> BEGIN GOOD EXAMPLE OF SPECIFICS--- >> >> I'm using Solr 4.x with the 4.x example schema which has the following >> field... >> >> >> >> >> I indexed a doc like this... >> >> $ curl "http://localhost:8983/solr/update?commit=true"; -H 'Content-type:application/json' -d '[{"id":"hoss", "cat":"" } ]' >> >> And this is what i get from the following requests... >> >> $ curl " http://localhost:8983/solr/select?q=id:hoss&wt=xml&indent=true&omitHeader=true " >> >> >> >> >> >> hoss >> >>>> >> 1427705631375097856 >> >> >> >> $ curl " http://localhost:8983/solr/select?q=id:hoss&wt=json&indent=true&omitHeader=true " >> { >> "response":{"numFound":1,"start":0,"docs":[ >> { >> "id":"hoss", >> "cat":[""], >> "_version_":1427705631375097856}] >> }} >> >> $ curl "http://localhost:8983/solr/select?q=cat:%22 %22&wt=json&indent=true&omitHeader=true" >> { >> "response":{"numFound":1,"start":0,"docs":[ >> { >> "id":"hoss", >> "cat":[""], >> "_version_":1427705631375097856}] >> }} >> >> END GOOD EXAMPLE OF SPECIFICS--- >> >> : > Even more curious, if I use this query at the console: >> : > >> : > details: >> : > >> : > I get nothing back. >> >> note in my last example above the importance of using quotes (or the >> {!term} qparser) to query string fields that contain special characters >> like whitespace -- whitespace is syntacally meaningul to the lucene query >> parser, it seperates clauses of a boolean query. >> >> >> -Hoss >
Re: Interesting issue with "special characters" in a string field value
Ok. I have revisited this issue as deeply as possible using simplistic unit tests, tossing out indexes, and starting fresh. A typical Solr document might have a label, e.g. the string inside the quotes: "Node Type". That would be queried, according to what I've been able to read, as a Phrase Query, which means, include the quotes around the text. When I use the admin query panel with this query: label:"Node Type" A fragment of the full document is returned. it is this: NodeType Node Type In my code using SolrJ, I have printlines just as the "escaped" query string comes in, and one which shows what the SolrQuery looks like after setting it up to go online. I then show what came back: Solr3Client.runQuery- label:"Node Type" 0 10 Solr3Client.runQuery-1 q=label%3A%22Node+Type%22&start=0&rows=10 {numFound=1,start=0,docs=[SolrDocument{locator=NodeType, smallIcon=cogwheel.png, subOf=ClassType, details=The TopicQuests typology node type., isPrivate=false, creatorId=SystemUser, label=Node Type, largeIcon=cogwheel.png, lastEditDate=Sat Feb 23 20:43:22 PST 2013, createdDate=Sat Feb 23 20:43:22 PST 2013, _version_=1427826019119661056}]} What that says is that SolrQuery inserted a + inside the query string, and that it found 1 document, but did not return it. In the largest picture, I have returned to using XMLResponseParser on the theory that I will now be able to take advantage of partialUpdates on multi-valued fields (List) but haven't tested that yet. I am not yet escaping such things as "<" or ">" but just escaping those things mentioned in the Solr documents which are reserved characters. So, the current update is this: learning about phrase queries, and judicious escaping of reserved characters seems to be helping. Next up entails two issues: more robust testing of escaped characters, and trying to discover what is the best approach to dealing with characters that must be escaped to get past XML, e.g. '<', '>', and others. Many thanks Jack On Fri, Feb 22, 2013 at 2:44 PM, Jack Park wrote: > Michael, > I don't think you misunderstood. I will soon give a full response here, but > am on the road at the moment. > > Many thanks > Jack > > > On Friday, February 22, 2013, Michael Della Bitta > wrote: >> My mistake, I misunderstood the problem. >> >> Michael Della Bitta >> >> >> Appinions >> 18 East 41st Street, 2nd Floor >> New York, NY 10017-6271 >> >> www.appinions.com >> >> Where Influence Isn’t a Game >> >> >> On Fri, Feb 22, 2013 at 3:55 PM, Chris Hostetter >> wrote: >>> >>> : If you're submitting documents as XML, you're always going to have to >>> : escape meaningful XML characters going in. If you ask for them back as >>> : XML, you should be prepared to unescape special XML characters as >>> >>> that still wouldn't explain the discrepency he's claiming to see between >>> the json & xml resmonses (the json containing an empty string >>> >>> Jack: please elaborate with specifics about your solr version, field, >>> field type, how you indexed your doc, and what the request urls & raw >>> responses that you get are (ie: don't trust the XML you see in your >>> browser, it may be unescaping escaped sequences in element text to be >>> "helpful" .. use something like curl) >>> >>> For example... >>> >>> BEGIN GOOD EXAMPLE OF SPECIFICS--- >>> >>> I'm using Solr 4.x with the 4.x example schema which has the following >>> field... >>> >>>>> multiValued="true"/> >>>>> /> >>> >>> I indexed a doc like this... 
>>> >>> $ curl "http://localhost:8983/solr/update?commit=true"; -H >>> 'Content-type:application/json' -d '[{"id":"hoss", "cat":">> as a source node>" } ]' >>> >>> And this is what i get from the following requests... >>> >>> $ curl >>> "http://localhost:8983/solr/select?q=id:hoss&wt=xml&indent=true&omitHeader=true"; >>> >>> >>> >>> >>> >>> hoss >>> >>> <Something to use as a source node> >>> >>> 1427705631375097856 >>> >>> >>> >>> $ curl >>> "http://localhost:8983/solr/select?q=id:hoss&wt=json&indent=true&omitHeader=true"
Re: Document update question
I uncommented out the line which sets server to an XMLResponse parser, and used the following code in a tiny test: String sourceNodeLocator = node.getLocator(); Map updateMap = new HashMap(); Map newMap = new HashMap(); Map myMap = node.getProperties(); Listvalues = (List) myMap.get(key); String what = "set"; values.add(newValue); updateMap.put(ITopicQuestsOntology.LOCATOR_PROPERTY, sourceNodeLocator); newMap.put(what,values); updateMap.put(key, newMap); IResult result = solr.partialUpdateData(updateMap);; The printstring fragment from that test look like this: and fetching in JSON from the admin query console looks like this: "locator": "MySecondNode1361728848603", "details": [ "here & there", "Oh Fudge" ], It appears that using the XMLResponseParser and getting the query string right works! Many thanks for all the comments. Cheers Jack On Thu, Feb 21, 2013 at 5:45 PM, Shawn Heisey wrote: > On 2/21/2013 10:00 AM, Jack Park wrote: >> >> Interesting you should say that. Here is my solrj code: >> >> public Solr3Client(String solrURL) throws Exception { >> server = new HttpSolrServer(solrURL); >> // server.setParser(new XMLResponseParser()); >> } >> >> I cannot recall why I commented out the setParser line; something >> about someone saying in another thread it's not important. I suppose I >> should revisit my unit tests with that line uncommented. Or, did I >> miss something? >> >> The JSON results I painted earlier were from reading the document >> online in the admin query panel. > > > Jack, > > SolrJ defaults to the javabin response parser, which offers maximum > efficiency in the communication. Between version 1.4.1 and 3.1.0, the > javabin version changed and became incompatible with the old one. > > The XML parser is a little bit less efficient than javabin, but is the only > way to get Solr/SolrJ to talk when one side is using a different javabin > version than the other side. If you are not mixing 1.x with later versions, > you do not need to worry about changing the response parser. > > Thanks, > Shawn >
Re: Interesting issue with "special characters" in a string field value
I did run attempt queries with and without escaping at the admin query browser; made no difference. I seem to recall that the system did not work without escaping, but it does seem worth blocking escaping and testing again. Many thanks Jack On Sun, Feb 24, 2013 at 1:16 PM, Michael Della Bitta wrote: > Hello Jack, > > I'm not sure if this is an option for you, but if you submit and > retrieve your documents using only SolrJ, you won't have to worry > about escaping them for encoding into a particular document format. > SolrJ would handle that for you. > > Michael Della Bitta > > > Appinions > 18 East 41st Street, 2nd Floor > New York, NY 10017-6271 > > www.appinions.com > > Where Influence Isn’t a Game > > > On Sun, Feb 24, 2013 at 12:29 AM, Jack Park wrote: >> Ok. I have revisited this issue as deeply as possible using simplistic >> unit tests, tossing out indexes, and starting fresh. >> >> A typical Solr document might have a label, e.g. the string inside the >> quotes: "Node Type". That would be queried, according to what I've >> been able to read, as a Phrase Query, which means, include the quotes >> around the text. >> >> When I use the admin query panel with this query: >> label:"Node Type" >> A fragment of the full document is returned. it is this: >> >> >> NodeType >> >> Node Type >> >> >> In my code using SolrJ, I have printlines just as the "escaped" query >> string comes in, and one which shows what the SolrQuery looks like >> after setting it up to go online. I then show what came back: >> >> Solr3Client.runQuery- label:"Node Type" 0 10 >> Solr3Client.runQuery-1 q=label%3A%22Node+Type%22&start=0&rows=10 >> {numFound=1,start=0,docs=[SolrDocument{locator=NodeType, >> smallIcon=cogwheel.png, subOf=ClassType, details=The TopicQuests >> typology node type., isPrivate=false, creatorId=SystemUser, label=Node >> Type, largeIcon=cogwheel.png, lastEditDate=Sat Feb 23 20:43:22 PST >> 2013, createdDate=Sat Feb 23 20:43:22 PST 2013, >> _version_=1427826019119661056}]} >> >> What that says is that SolrQuery inserted a + inside the query string, >> and that it found 1 document, but did not return it. >> >> In the largest picture, I have returned to using XMLResponseParser on >> the theory that I will now be able to take advantage of partialUpdates >> on multi-valued fields (List) but haven't tested that yet. I >> am not yet escaping such things as "<" or ">" but just escaping those >> things mentioned in the Solr documents which are reserved characters. >> >> So, the current update is this: learning about phrase queries, and >> judicious escaping of reserved characters seems to be helping. Next up >> entails two issues: more robust testing of escaped characters, and >> trying to discover what is the best approach to dealing with >> characters that must be escaped to get past XML, e.g. '<', '>', and >> others. >> >> Many thanks >> Jack >> >> >> On Fri, Feb 22, 2013 at 2:44 PM, Jack Park wrote: >>> Michael, >>> I don't think you misunderstood. I will soon give a full response here, but >>> am on the road at the moment. >>> >>> Many thanks >>> Jack >>> >>> >>> On Friday, February 22, 2013, Michael Della Bitta >>> wrote: >>>> My mistake, I misunderstood the problem. 
>>>> >>>> Michael Della Bitta >>>> >>>> >>>> Appinions >>>> 18 East 41st Street, 2nd Floor >>>> New York, NY 10017-6271 >>>> >>>> www.appinions.com >>>> >>>> Where Influence Isn’t a Game >>>> >>>> >>>> On Fri, Feb 22, 2013 at 3:55 PM, Chris Hostetter >>>> wrote: >>>>> >>>>> : If you're submitting documents as XML, you're always going to have to >>>>> : escape meaningful XML characters going in. If you ask for them back as >>>>> : XML, you should be prepared to unescape special XML characters as >>>>> >>>>> that still wouldn't explain the discrepency he's claiming to see between >>>>> the json & xml resmonses (the json containing an empty string >>>>> >>>>> Jack: please elaborat
Re: Query with whitespace
I found a tiny notice about just using quotes; tried it in the admin query console and it works. e.g. label:"car house" would fetch any document for which the label field contained that phrase. Jack On Fri, Mar 1, 2013 at 9:17 AM, Shawn Heisey wrote: > On 3/1/2013 8:50 AM, vsl wrote: >> >> I would like to send query like "car house". My expectation is to have >> resulting documents that contains both car and house. Unfortunately Apache >> Solr out of the box returns documents as if the whitespace between was >> treated as OR. Does anybody know how to fix this? > > > Three solutions come to mind: 1) Set the q.op parameter to AND. 2) Send > "car AND house" instead, or "+car +house". 3) Use the edismax query parser > (defType=edismax) and set the mm parameter to 100%. The wiki should have > info on all these. > > Thanks, > Shawn >
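The three options Shawn lists, spelled out as SolrJ parameters; the qf fields in the edismax variant are placeholders:

    import org.apache.solr.client.solrj.SolrQuery;

    public class AllTermsRequired {
        public static void main(String[] args) {
            // Option 1: default operator AND for the lucene parser.
            SolrQuery q1 = new SolrQuery("car house");
            q1.set("q.op", "AND");

            // Option 2: require each term explicitly.
            SolrQuery q2 = new SolrQuery("+car +house");

            // Option 3: edismax with a 100% minimum-match.
            SolrQuery q3 = new SolrQuery("car house");
            q3.set("defType", "edismax");
            q3.set("qf", "label details");   // hypothetical query fields
            q3.set("mm", "100%");
        }
    }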
Custom update handler?
With 4.1, not in cloud configuration, I have a custom response handler chain which injects an additional handler for studying the documents as they come in. But, when I do partial updates on those documents, I don't want them to be studied again, so I created another version of the same chain, but without my added feature. I named it "/partial". When I create an instance of SolrJ for the url /solr/partial, I get back this error message: Server at http://localhost:8983/solr/partial returned non ok status:404, message:Not Found {locator=2146fd50-fac9-47d5-85c0-47aaeafe177f, tuples={set=99edfffe-b65c-4b5e-9436-67085ce49c9c}} Here is the configuration for that: The normal handler chain is this: hello which runs on a SolrJ set for http://localhost:8983/solr/ What might I be missing? Many thanks Jack
Re: Custom update handler?
Many thanks. Let me record here what I have tried. I have viewed: http://wiki.apache.org/solr/UpdateXmlMessages and this github project which is suggestive: https://github.com/industria/solrprocessors I now have two UpdateRequestChains: hello and the new one (which is "harvest" without the TopicQuestsDocumentProcessFactory): Before I added "partial" ... "harvest" always ran using http://localhost:8983/solr as the base URL. A goal was to use "harvest" only for "updates" and use "partial" for partial updates. I am now feeding partial with this code: UpdateRequest ur = new UpdateRequest(); ur.add(document); ur.setCommitWithin(1000); UpdateResponse response = ur.process(updateServer); where updateServer is a second SolrJ server set to http://localhost:8983/solr/update But, what is now happening, after I made this addition: partial dropping "partial" into /update where nothing was there before, Now, just "partial" is running from the base URL and "harvest" is never called, which means that I never see partial updates to validate that part of the code. At issue is this: I have two "update" pathways: One for when I am adding new documents One for which I am performing partial updates May I ask how I can configure my system to use "harvest" for new documents and "partial" for when partial updates are sent in? Many thanks Jack On Mon, Mar 11, 2013 at 12:23 AM, Upayavira wrote: > You need to refer to your chain in a RequestHandler config. Search for > /update, duplicate that, and change the chain it points to. > > Upayavira > > On Mon, Mar 11, 2013, at 05:22 AM, Jack Park wrote: >> With 4.1, not in cloud configuration, I have a custom response handler >> chain which injects an additional handler for studying the documents >> as they come in. But, when I do partial updates on those documents, I >> don't want them to be studied again, so I created another version of >> the same chain, but without my added feature. I named it "/partial". >> >> When I create an instance of SolrJ for the url /solr/partial, >> I get back this error message: >> >> Server at http://localhost:8983/solr/partial returned non ok >> status:404, message:Not Found >> {locator=2146fd50-fac9-47d5-85c0-47aaeafe177f, >> tuples={set=99edfffe-b65c-4b5e-9436-67085ce49c9c}} >> >> Here is the configuration for that: >> >> >> >> >> >> >> The normal handler chain is this: >> >> >> >> > class="org.apache.solr.update.TopicQuestsDocumentProcessFactory"> >> hello >> >> >> >> >> which runs on a SolrJ set for http://localhost:8983/solr/ >> >> What might I be missing? >> >> Many thanks >> Jack
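One way to avoid wiring a second /partial handler at all, assuming the chains are registered under the names "harvest" and "partial" as above: in Solr 4.x the update.chain request parameter selects the processor chain per request, so both kinds of updates can go through the standard /update handler. A sketch under those assumptions, not a drop-in for the existing code:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.common.SolrInputDocument;

    public class ChainPerRequest {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("locator", "2146fd50-fac9-47d5-85c0-47aaeafe177f");
            // ... partial-update fields as before ...

            UpdateRequest ur = new UpdateRequest();     // posts to /update by default
            ur.add(doc);
            ur.setParam("update.chain", "partial");     // pick the chain for this request
            ur.setCommitWithin(1000);
            ur.process(server);

            // New documents would select the studying chain instead:
            // ur.setParam("update.chain", "harvest");
        }
    }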
Re: Custom update handler? Some progress, new issue
Further progress now hampered by configuring an update log. When I follow instructions found around the web, I get this: SEVERE: Unable to create core: collection1 caused by Caused by: java.lang.NullPointerException at org.apache.solr.common.params.SolrParams.toSolrParams(SolrParams.java:295) Now, the updateLog is configured thus: partial ${solr.data.dir:} I think the issue lies with "solr.data.dir" The wikis just say to drop that into the request handler chain, without any explanation of where "solr.data.dir" comes from. In any case, I might have successfully settled on how to choose which update chain, but now I am deep into the bowels of update logs. What am I missing? Many thanks Jack On Mon, Mar 11, 2013 at 9:45 PM, Jack Park wrote: > Many thanks. > Let me record here what I have tried. > I have viewed: > http://wiki.apache.org/solr/UpdateXmlMessages > > and this github project which is suggestive: > https://github.com/industria/solrprocessors > > > I now have two UpdateRequestChains: > > > >class="org.apache.solr.update.TopicQuestsDocumentProcessFactory"> > hello > > > > > and the new one (which is "harvest" without the > TopicQuestsDocumentProcessFactory): > > > > > > > Before I added "partial" > class="solr.XmlUpdateRequestHandler"> > ... > > "harvest" always ran using http://localhost:8983/solr as the base URL. > > A goal was to use "harvest" only for "updates" and use "partial" for > partial updates. > > I am now feeding partial with this code: > > UpdateRequest ur = new UpdateRequest(); > ur.add(document); > ur.setCommitWithin(1000); > UpdateResponse response = > ur.process(updateServer); > where updateServer is a second SolrJ server set to > http://localhost:8983/solr/update > > But, what is now happening, after I made this addition: > > class="solr.XmlUpdateRequestHandler"> > > partial > > > > dropping "partial" into /update where nothing was there before, > > Now, just "partial" is running from the base URL and "harvest" is > never called, which means that I never see partial updates to validate > that part of the code. > > At issue is this: > > I have two "update" pathways: > One for when I am adding new documents > One for which I am performing partial updates > > May I ask how I can configure my system to use "harvest" for new > documents and "partial" for when partial updates are sent in? > > Many thanks > Jack > > > On Mon, Mar 11, 2013 at 12:23 AM, Upayavira wrote: >> You need to refer to your chain in a RequestHandler config. Search for >> /update, duplicate that, and change the chain it points to. >> >> Upayavira >> >> On Mon, Mar 11, 2013, at 05:22 AM, Jack Park wrote: >>> With 4.1, not in cloud configuration, I have a custom response handler >>> chain which injects an additional handler for studying the documents >>> as they come in. But, when I do partial updates on those documents, I >>> don't want them to be studied again, so I created another version of >>> the same chain, but without my added feature. I named it "/partial". 
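For what it is worth, the stock Solr 4.x solrconfig.xml enables the transaction log inside the <updateHandler> element rather than inside a request handler chain, and solr.data.dir there is just a substitutable property that falls back to the core's own data directory when unset. A sketch of the standard placement:

    <updateHandler class="solr.DirectUpdateHandler2">
      <updateLog>
        <!-- ${solr.data.dir:} resolves to the solr.data.dir system property,
             or to the core's data directory when the property is not set -->
        <str name="dir">${solr.data.dir:}</str>
      </updateLog>
    </updateHandler>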
>>> >>> When I create an instance of SolrJ for the url /solr/partial, >>> I get back this error message: >>> >>> Server at http://localhost:8983/solr/partial returned non ok >>> status:404, message:Not Found >>> {locator=2146fd50-fac9-47d5-85c0-47aaeafe177f, >>> tuples={set=99edfffe-b65c-4b5e-9436-67085ce49c9c}} >>> >>> Here is the configuration for that: >>> >>> >>> >>> >>> >>> >>> The normal handler chain is this: >>> >>> >>> >>> >> class="org.apache.solr.update.TopicQuestsDocumentProcessFactory"> >>> hello >>> >>> >>> >>> >>> which runs on a SolrJ set for http://localhost:8983/solr/ >>> >>> What might I be missing? >>> >>> Many thanks >>> Jack
solr.DirectUpdateHandler2 failed to instantiate
That message gives great, but terrible, Google results: zillions of hits, mostly filled with very long log traces, and zero messages (that I could find) about what to do about it. I switched over to using that handler since it has an update log specified, and that's the only place I've found that shows how to use the update log. But it can't boot now. All the jars are in place; I'm able to import that class in my code. Is there any news on that issue? Many thanks Jack
Re: solr.DirectUpdateHandler2 failed to instantiate
Indeed! Perhaps the germane part is this, before the failure to instantiate notice: Caused by: java.lang.ClassCastException: class org.apache.solr.update.DirectUpda teHandler2 at java.lang.Class.asSubclass(Unknown Source) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader. java:432) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:507) This suggests that I might be doing something wrong elsewhere in solrconfig.xml. The possibly relevant parts (my contributions) are these: hello harvest partial Thanks Jack On Tue, Mar 12, 2013 at 12:16 PM, Mark Miller wrote: > There should be a stack trace - also, you shouldn't have to do anything > special to use this class. It's the default and only truly supported > implementation… > > - Mark > > On Mar 12, 2013, at 2:53 PM, Jack Park wrote: > >> That messages gives great, but terrible google. Zillions of hits, >> mostly filled with very long log traces, and zero messages (that I >> could find) about what to do about it. >> >> I switched over to using that handler since it has an update log >> specified, and that's the only place I've found how to use update log. >> But, can't boot now. >> >> All the jars are in place; I'm able to import that class in my code. >> >> Is there any news on that issue? >> >> Many thanks >> Jack >
Re: solr.DirectUpdateHandler2 failed to instantiate
I can safely say that it is not DirectUpdateHandler2 failing; By commenting out my own handlers, the system boots without error. This means that my handlers are problematic in some way. The moment I put back just one of my handlers: hello harvest The problem returns. It simply appears that I cannot declare a named requestHandler using that class. Jack On Tue, Mar 12, 2013 at 12:22 PM, Jack Park wrote: > Indeed! Perhaps the germane part is this, before the failure to > instantiate notice: > > Caused by: java.lang.ClassCastException: class > org.apache.solr.update.DirectUpda > teHandler2 > at java.lang.Class.asSubclass(Unknown Source) > at > org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader. > java:432) > at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:507) > > This suggests that I might be doing something wrong elsewhere in > solrconfig.xml. > > The possibly relevant parts (my contributions) are these: > > > > > > > > >class="org.apache.solr.update.TopicQuestsDocumentProcessFactory"> > hello > > > > >class="solr.DirectUpdateHandler2"> > > harvest > > > > >class="solr.DirectUpdateHandler2"> > > partial > > > > Thanks > Jack > > On Tue, Mar 12, 2013 at 12:16 PM, Mark Miller wrote: >> There should be a stack trace - also, you shouldn't have to do anything >> special to use this class. It's the default and only truly supported >> implementation… >> >> - Mark >> >> On Mar 12, 2013, at 2:53 PM, Jack Park wrote: >> >>> That messages gives great, but terrible google. Zillions of hits, >>> mostly filled with very long log traces, and zero messages (that I >>> could find) about what to do about it. >>> >>> I switched over to using that handler since it has an update log >>> specified, and that's the only place I've found how to use update log. >>> But, can't boot now. >>> >>> All the jars are in place; I'm able to import that class in my code. >>> >>> Is there any news on that issue? >>> >>> Many thanks >>> Jack >>
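That conclusion fits the stack trace above: DirectUpdateHandler2 extends Solr's UpdateHandler base class, not SolrRequestHandler, so asSubclass() throws a ClassCastException as soon as that class is named in a <requestHandler>. A hedged sketch of the usual shape (names as in the thread) keeps DirectUpdateHandler2 in the <updateHandler> element and uses a request-handler class for the endpoint:

    <!-- DirectUpdateHandler2 belongs here, as the single update handler -->
    <updateHandler class="solr.DirectUpdateHandler2"/>

    <!-- A request handler needs a request-handler class; the chain is chosen via update.chain -->
    <requestHandler name="/update" class="solr.UpdateRequestHandler">
      <lst name="defaults">
        <str name="update.chain">harvest</str>
      </lst>
    </requestHandler>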
Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.
Is there a document that tells how to create multiple threads? Search returns many hits which orbit this idea, but I haven't spotted one which tells how. Thanks Jack On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller wrote: > You def have to use multiple threads with it for it to be fast, but 3 or 4 > docs a second still sounds absurdly slow. > > - Mark > > On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda wrote: > >> And up! :-) >> >> I´ve been wondering if using CloudSolrServer has something to do here. Does >> it have a bad performance when a CloudSolrServer singletong receives >> multiple queries? Is it recommended to have a CloudSolrServer instances >> list and select one of them with a Round Robin criteria? >> >> >> >> 2013/3/14 Luis Cappa Banda >> >>> Hello! >>> >>> Thanks a lot, Erick! I've attached some stack traces during a normal >>> 'engine' running. >>> >>> Cheers, >>> >>> - Luis Cappa >>> >>> >>> 2013/3/13 Erick Erickson >>> Stack traces.. First, jps -l that will give you a the process IDs of your running Java processes. Then: jstack Usually I pipe the output from jstack into a text file... Best Erick On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda wrote: > Uhm, how can I do that... 'cleanly'? I know that with JConsole it´s posible > to output this traces, but with a .war application built on top of Spring I > don´t know how can I do that. In any case, here is my CloudSolrServer > wrapper that is used by other classes. There is no sync method or piece of > code: > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > *public class BinaryLBHttpSolrServer extends LBHttpSolrServer {* > > private static final long serialVersionUID = 3905956120804659445L; >public BinaryLBHttpSolrServer(String[] endpoints) throws > MalformedURLException { >super(endpoints); >} > >@Override >protected HttpSolrServer makeServer(String server) throws > MalformedURLException { >HttpSolrServer solrServer = super.makeServer(server); >solrServer.setRequestWriter(new BinaryRequestWriter()); >return solrServer; >} > } > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > *public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {* > private CloudSolrServer cloudSolrServer; > > private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class); > > public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] > endpoints, int clientTimeout, > int connectTimeout, String cloudCollection) { > try { > BinaryLBHttpSolrServer lbSolrServer = new *BinaryLBHttpSolrServer* > (endpoints); > this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, > lbSolrServer); > this.cloudSolrServer.setZkConnectTimeout(connectTimeout); > this.cloudSolrServer.setZkClientTimeout(clientTimeout); > this.cloudSolrServer.setDefaultCollection(cloudCollection); > } catch (MalformedURLException e) { > log.error(e); > } > } > > @Override > public QueryResponse *search*(SolrQuery query) throws SolrServerException { > return cloudSolrServer.query(query, METHOD.POST); > } > > @Override > public boolean *index*(DocumentBean user) { > boolean indexed = false; > int retries = 0; > do { > indexed = addBean(user); > retries++; > } while(!indexed && retries<4); > return indexed; > } > @Override > public boolean *update*(SolrInputDocument updateDoc) { > boolean update = false; > int retries = 0; > > do { > update = 
addSolrInputDocument(updateDoc); > retries++; > } while(!update && retries<4); > return update; > } > @Override > public void commit() { > try { > cloudSolrServer.commit(); > } catch (SolrServerException e) { > log.error(e); > } catch (IOException e) { > log.error(e); > } > } > > @Override > public boolean *delete*(String ... ids) { > boolean deleted = false; > List idList = Arrays.asList(ids); > try { > this.cloudSolrServer.deleteById(idList); > this.cloudSolrServer.commit(true, true); > deleted = true; > > } catch (SolrServerException e) { > log.error(e); > > } catch (IOException e) { > log.error(e); > } > return deleted; > } > > @Override > public void *optimize*() { > try { > this.cloudSolrServer.optimize(); > } catch (SolrServerException e) { > log.error(e); > } catch (IOException e) { > log.error(e); > } > } >>
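On the multi-threading question Jack asks above: a minimal sketch of the usual pattern is to share one CloudSolrServer (it is thread-safe) across a fixed pool of indexing threads and commit once at the end. Everything below — the ZooKeeper address, collection name, field names, thread count, and batch sizes — is illustrative.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class ParallelIndexer {
        public static void main(String[] args) throws Exception {
            // Share a single CloudSolrServer instance across all worker threads.
            final CloudSolrServer solr = new CloudSolrServer("localhost:9983"); // ZooKeeper address is an assumption
            solr.setDefaultCollection("collection1");

            ExecutorService pool = Executors.newFixedThreadPool(8); // 8 indexing threads
            for (int t = 0; t < 8; t++) {
                final int worker = t;
                pool.submit(new Runnable() {
                    public void run() {
                        for (int i = 0; i < 10000; i++) {
                            try {
                                SolrInputDocument doc = new SolrInputDocument();
                                doc.addField("id", "doc-" + worker + "-" + i);
                                doc.addField("name", "sample document " + i);
                                solr.add(doc); // each worker pushes its own documents
                            } catch (Exception e) {
                                e.printStackTrace();
                            }
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            solr.commit();   // commit once at the end rather than per document
            solr.shutdown();
        }
    }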
RE: what is too large for an indexed field
I get no results back on a search. But I can see the actual word or phrase in the stored doc. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Monday, September 21, 2009 4:18 PM To: solr-user@lucene.apache.org Subject: Re: what is too large for an indexed field On Mon, Sep 21, 2009 at 3:27 PM, Park, Michael wrote: > I am trying to place the value of around 390,000 characters into a > single field. However, my search results have become inaccurate. Do you mean that the document should score higher, or that the document doesn't match a particular query? If the former, keep in mind that length normalization penalizes long documents. -Yonik http://www.lucidimagination.com
RE: what is too large for an indexed field
I'm using the solr.WhitespaceTokenizerFactory and the solr.LowerCaseFilterFactory. Is it safe to assume that a token would be created for each word? I can't imagine anything that would be even close to 16383 chars. Is there a way to dissect the tokens? Thanks, Mike -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, September 21, 2009 3:42 PM To: solr-user@lucene.apache.org Subject: Re: what is too large for an indexed field Park, Michael wrote: > I am trying to place the value of around 390,000 characters into a > single field. However, my search results have become inaccurate. Is > this too large? I tried bumping the maxFieldLength in the > solrconfig.xml file to 500,000 and it hasn't fixed the problem. > > > > Thanks, > > Mike > > > How large is your largest token? There is a hard limit of (I think) 16383 chars. -- - Mark http://www.lucidimagination.com
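For reference, the analysis chain Mike describes would look roughly like the sketch below in schema.xml (the fieldType name is illustrative). The admin Analysis page (analysis.jsp in this era of Solr) is the easiest way to "dissect" the tokens: paste in a field value and it lists every token each stage produces.

    <fieldType name="text_ws_lower" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <!-- one token per whitespace-separated word, then lowercased -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>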
solr home
I already have a handful of solr instances running. However, I'm trying to install solr (1.4) on a new linux server with tomcat using a context file (same way I usually do): However, it throws an exception due to the following: SEVERE: Could not start SOLR. Check solr/home property java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/', cwd=/opt/local/solr/fedora_solr at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader. java:198) at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.ja va:166) Any ideas why this is happening? Thanks, Mike
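The context file itself did not survive in this archive, so the following is only a guess at the usual shape of a Solr-on-Tomcat context fragment from that era; the docBase and solr/home paths are assumptions based on the cwd shown in the exception.

    <Context docBase="/opt/local/solr/solr.war" debug="0" crossContext="true">
      <Environment name="solr/home" type="java.lang.String"
                   value="/opt/local/solr/fedora_solr" override="true"/>
    </Context>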
RE: Solr + autocomplete
Thanks! That's a good suggestion too. I'll look into that. Actually, I was hoping someone had used a reliable JS library that accepted JSON. -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Monday, October 15, 2007 4:44 PM To: solr-user@lucene.apache.org Subject: Re: Solr + autocomplete > > I would imagine there is a library to set up an autocomplete search with > Solr. Does anyone have any suggestions? Scriptaculous has a JavaScript > autocomplete library. However, the server must return an unordered > list. > Solr does not provide an autocomplete UI, but it can return JSON that a JS library can use to populate an autocomplete. Depending on you index size/ query speed, you may be fine with a standard faceting prefix filter. If the index is large, you may want to index using the EdgeNGramFilterFactory. Check the last comment in: https://issues.apache.org/jira/browse/SOLR-357 ryan
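A hedged sketch of the EdgeNGramFilterFactory approach Ryan mentions, as an index-time-only prefix field in schema.xml (the field type name and gram sizes are illustrative; as the rest of this thread notes, it requires a Solr release that actually ships the factory):

    <fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- "solr" is indexed as s, so, sol, solr so typed prefixes match whole terms -->
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>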
RE: Solr + autocomplete
Thx! I remember coming across extjs a ways back. It was very slick. I'll give it a try. -Original Message- From: Bharani [mailto:[EMAIL PROTECTED] Sent: Thursday, October 18, 2007 5:59 AM To: solr-user@lucene.apache.org Subject: RE: Solr + autocomplete You should take a look at http://www.extjs.com. The combo box has got autocomplete functionality. In fact it even has paging built into it. I just did a demo using Solr for autocomplete and I got a very good responsive GUI. I have got about 100,000 documents with 26 fields each and get a response < 1s Hope that helps -Bharani Park, Michael wrote: > > Thanks! That's a good suggestion too. I'll look into that. > > Actually, I was hoping someone had used a reliable JS library that > accepted JSON. > > -Original Message- > From: Ryan McKinley [mailto:[EMAIL PROTECTED] > Sent: Monday, October 15, 2007 4:44 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr + autocomplete > >> >> I would imagine there is a library to set up an autocomplete search > with >> Solr. Does anyone have any suggestions? Scriptaculous has a > JavaScript >> autocomplete library. However, the server must return an unordered >> list. >> > > Solr does not provide an autocomplete UI, but it can return JSON that a > JS library can use to populate an autocomplete. > > Depending on you index size/ query speed, you may be fine with a > standard faceting prefix filter. If the index is large, you may want to > > index using the EdgeNGramFilterFactory. > > Check the last comment in: > https://issues.apache.org/jira/browse/SOLR-357 > > ryan > > > > -- View this message in context: http://www.nabble.com/Solr-%2B-autocomplete-tf4630140.html#a13271445 Sent from the Solr - User mailing list archive at Nabble.com.
RE: Solr + autocomplete
Thanks Ryan, This looks like the way to go. However, when I set up my schema I get, "Error loading class 'solr.EdgeNGramFilterFactory'". For some reason the class is not found. I tried the stable 1.2 build and even tried the nightly build. I'm using "". Any suggestions? Thanks, Mike -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Monday, October 15, 2007 4:44 PM To: solr-user@lucene.apache.org Subject: Re: Solr + autocomplete > > I would imagine there is a library to set up an autocomplete search with > Solr. Does anyone have any suggestions? Scriptaculous has a JavaScript > autocomplete library. However, the server must return an unordered > list. > Solr does not provide an autocomplete UI, but it can return JSON that a JS library can use to populate an autocomplete. Depending on you index size/ query speed, you may be fine with a standard faceting prefix filter. If the index is large, you may want to index using the EdgeNGramFilterFactory. Check the last comment in: https://issues.apache.org/jira/browse/SOLR-357 ryan
RE: Solr + autocomplete
Will I need to use Solr 1.3 with the EdgeNGramFilterFactory in order to get the autosuggest feature? -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Monday, November 12, 2007 1:05 PM To: solr-user@lucene.apache.org Subject: RE: Solr + autocomplete : "Error loading class 'solr.EdgeNGramFilterFactory'". For some reason EdgeNGramFilterFactory didn't exist when Solr 1.2 was released, but the EdgeNGramTokenizerFactory did. (the javadocs that come with each release list all of the various factories in that release) -Hoss
tomcat context fragment
Hello All, I've been working with solr on Tomcat 5.5/Windows and had success setting my solr home using the context fragment. However, I cannot get it to work on Tomcat 5.028/Unix. I've read and re-read the Apache Tomcat documentation and cannot find a solution. Has anyone run into this issue? Is there some Tomcat setting that is preventing this from working? Thanks, Mike
RE: tomcat context fragment
I've found the problem. The Context attribute path needed to be set: -Original Message- From: Park, Michael [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 05, 2007 5:28 PM To: solr-user@lucene.apache.org Subject: tomcat context fragment Hello All, I've been working with solr on Tomcat 5.5/Windows and had success setting my solr home using the context fragment. However, I cannot get it to work on Tomcat 5.028/Unix. I've read and re-read the Apache Tomcat documentation and cannot find a solution. Has anyone run into this issue? Is there some Tomcat setting that is preventing this from working? Thanks, Mike
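The fragment itself was stripped from the archive; a hedged reconstruction of what Mike describes — the same context file with an explicit path attribute — would look something like this (all paths illustrative):

    <Context path="/solr" docBase="/opt/solr/solr.war" debug="0" crossContext="true">
      <Environment name="solr/home" type="java.lang.String"
                   value="/opt/solr/home" override="true"/>
    </Context>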
RE: tomcat context fragment
Hi Chris, No. I set up a separate file, same as the wiki. It's either a tomcat version issue or a difference between how tomcat on my Win laptop is configured vs. the configuration on our tomcat Unix machine. I intend to run multiple instances of solr in production and wanted to use the context fragments. I have 3 test instances of solr running now (with 3 context files) and found that whatever you set the path attribute to becomes the name of the deployed web app (it doesn't have to match the name of the context file, but cleaner to keep the names the same). Here is what I found on the Apache site about this: "The context path of this web application, which is matched against the beginning of each request URI to select the appropriate web application for processing. All of the context paths within a particular Host must be unique. If you specify a context path of an empty string (""), you are defining the default web application for this Host, which will process all requests not assigned to other Contexts." ~Mike -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 06, 2007 2:53 PM To: solr-user@lucene.apache.org Subject: RE: tomcat context fragment : I've found the problem. : : The Context attribute path needed to be set: : :