Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-06 Thread Furkan KAMACI
Hi Mark; You said: "So it's pretty simple - when you lost the ability to talk to ZK, everything keeps working based on the most recent clusterstate - except that updates are blocked and you cannot add new nodes to the cluster." Where do nodes keep the cluster state? When a query comes to a node that is a

action=CREATE

2013-05-06 Thread Peter Kirk
Hi I have a core definition in solr.xml which looks like the following: If I instead want to create this core with a CREATE command, how do I also supply a property - like "language" in the above? For example, some sort of request: http://localhost:8080/solr/admin/cores?acti

Memory problems with HttpSolrServer

2013-05-06 Thread Rogowski, Britta
Hi! When I write from our database to a HttpSolrServer, (using a LinkedBlockingQueue to write just one document at a time), I run into memory problems (due to various constraints, I have to remain on a 32-bit system, so I can use at most 2 GB RAM). If I use an EmbeddedSolrServer (to write loca

without the indexed property is set to true by default?

2013-05-06 Thread joo
Indexed properties in a constant field current to the field, I did not give the search. indexed attribute is set to true by default, does not turn you on? -- View this message in context: http://lucene.472066.n3.nabble.com/without-the-indexed-property-is-set-to-true-by-default-tp4060973.html Se

Questions about the performance of Solr

2013-05-06 Thread joo
Search speed at which data is loaded is more than 7 ten millon current will be reduced too. About 50 seconds it will take, but the number is often just this, it is not possible to know whether such. Will there is a problem with the Query I use it to know the Query Optimizing Solr and fall. The Quer

Indexing off of the production servers

2013-05-06 Thread David Parks
I've had trouble figuring out what options exist if I want to perform all indexing off of the production servers (I'd like to keep them only for user queries). We index data in batches roughly daily, ideally I'd index all solr cloud shards offline, then move the final index files to the solr cl

Re: Duplicated Documents Across shards

2013-05-06 Thread Iker Mtnz. Apellaniz
Thanks Erick, I think we found the problem. When defining the cores for both shards we define both of them in the same instanceDir, like this: Each shard should have its own folder, so the final configuration should be like this: Can anyone confirm this? Thanks, Iker 2013/5/4 Erick E

Re: Indexing off of the production servers

2013-05-06 Thread Furkan KAMACI
1-2) Your aim for using Hadoop is probably Map/Reduce jobs. When you use Map/Reduce jobs you split your workload, process it, and then the reduce step takes over. Let me explain the new SolrCloud architecture. You start your SolrCloud with a numShards parameter. Let's assume that you have 5 sh

Re: iterate through each document in Solr

2013-05-06 Thread Dmitry Kan
Are you doing it once? Is your index sharded? If so, can you ask each shard individually? Another way would be to do it on Lucene level, i.e. read from the binary indices (API exists). Dmitry On Mon, May 6, 2013 at 5:48 AM, Mingfeng Yang wrote: > Dear Solr Users, > > Does anyone know what is t

solr adding unique values

2013-05-06 Thread Nikhil Kumar
Hey, I have recently started using solr, I have a list of users, which are subscribed to some lists. eg. user a[ id:a liists[ list_a ] ] user b[ id:b liists[ list_a ] ] I am using {"id": a, "lists":{"add":"list_a"}} to add particular list a user. but what is happen

Re: Did something change with Payloads?

2013-05-06 Thread hariistou
Hi, I realized that there was no mistake with the way Lucene writes postings/payloads. Rather, there is no flaw there. The problem may be with the way the scorePayload() is implemented. We need to use both payload.bytes, and payload.offset to compute the score. So, please ignore my previous mess

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-06 Thread Erick Erickson
If I understand correctly, each of the nodes has a copy of the state as of the last time there was a ZK quorum and operates off that so the cluster can keep chugging along with updates disabled. Of course if the state of your cluster changes (i.e. nodes come or go), ZK is no longer available to te

Re: action=CREATE

2013-05-06 Thread Erick Erickson
http://wiki.apache.org/solr/CoreAdmin#CREATE "Core properties can be specified when creating a new core using optional property.name=value request parameters, similar to tag inside solr.xml." haven't tried it myself though... Best Erick On Mon, May 6, 2013 at 3:04 AM, Peter Kirk wrote: > Hi >
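
The property.name=value convention quoted from the wiki can be sketched as a small URL builder. This is only an illustration, not a tested recipe: the core name "core1", the instanceDir value and the "en" language value are hypothetical, and the base URL is the localhost one from the original question.

```python
from urllib.parse import urlencode


def core_create_url(base_url, name, instance_dir, properties):
    """Build a CoreAdmin CREATE URL with optional core properties.

    Each entry in `properties` becomes a property.<name>=<value>
    request parameter, as described on the CoreAdmin wiki page.
    """
    params = {"action": "CREATE", "name": name, "instanceDir": instance_dir}
    for key, value in properties.items():
        params["property." + key] = value
    return base_url + "/admin/cores?" + urlencode(params)


# Hypothetical core name and property value, for illustration only.
url = core_create_url(
    "http://localhost:8080/solr", "core1", "core1", {"language": "en"}
)
print(url)
```

Using urlencode keeps property values with spaces or punctuation safe to pass in the request.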

Scores dilemma after providing boosting with bq as same weigtage for 2 condition

2013-05-06 Thread nishi
When giving bq the same weight (^1.2) on two values of articleTopic, as below, the results always come back with all of the "Office" documents on top. What other factors would influence scoring in this scenario, given that there is no keyword search? http://localhost:8080/solr?rows=900&fq=(articleTopic:"Food" OR

Re: Duplicated Documents Across shards

2013-05-06 Thread Erick Erickson
Having multiple cores point to the same index is, except for special circumstances where one of the cores is guaranteed to be read only, a Bad Thing. So it sounds like you've found your issue... Best Erick On Mon, May 6, 2013 at 4:44 AM, Iker Mtnz. Apellaniz wrote: > Thanks Erick, > I think w

Re: Indexing off of the production servers

2013-05-06 Thread Erick Erickson
The only problem with using Hadoop (or whatever) is that you need to be sure that documents end up on the same shard, which means that you have to use the same routing mechanism that SolrCloud uses. The custom doc routing may help here My very first question, though, would be whether this is n

Re: Duplicated Documents Across shards

2013-05-06 Thread Iker Mtnz. Apellaniz
Thank you very Much Erick, That was the real problem, we had two cores sharing the same folder and core_name. Here is the definitive version of the solr.xml. Tested and correctly working Thanks everybody Iker 2013/5/6 Erick Erickson > Having multiple cores point to the same index is, exc

Re: solr adding unique values

2013-05-06 Thread Erick Erickson
Depends on your goal here. I'm guessing you're using atomic updates, in which case you need to use "set" rather than "add" as the former replaces the contents. See: http://wiki.apache.org/solr/UpdateJSON#Solr_4.0_Example If you're simply re-indexing the documents, just send the entire fresh docume
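
A minimal sketch of the "set" form of an atomic update, assuming the client already knows the full list it wants stored for the user. The field names "id" and "lists" come from the original question; the value "list_b" and the client-side de-duplication are assumptions added for illustration.

```python
import json


def atomic_set(doc_id, field, values):
    """Return an atomic-update document that replaces `field` entirely.

    "set" overwrites the stored values, so de-duplicating the list on the
    client side before sending keeps the field free of repeats.
    """
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique_values = list(dict.fromkeys(values))
    return [{"id": doc_id, field: {"set": unique_values}}]


# "list_b" is a hypothetical second list name.
payload = json.dumps(atomic_set("a", "lists", ["list_a", "list_b", "list_a"]))
print(payload)
```

The resulting JSON would be sent to the update handler; because "set" replaces the contents, the caller has to send every list the user should remain subscribed to.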

query regarding the multiple documents

2013-05-06 Thread Rohan Thakur
Hi all, I have indexed documents in Solr for search, and now for auto-suggestion I want to index new data: the popular query terms searched by users and how frequently they are searched on the website. But as it has no relation with the product data on which

Re: Scores dilemma after providing boosting with bq as same weigtage for 2 condition

2013-05-06 Thread Erick Erickson
Try adding &debugQuery=true to your query, the resulting data will show you exactly how the doc score is calculated. Warning: reading the explain can be a bit challenging, but that's the only way to really understand why docs scored as they did. Best Erick On Mon, May 6, 2013 at 7:33 AM, nishi

Re: Indexing off of the production servers

2013-05-06 Thread Furkan KAMACI
Hi Erick; I think that even if you use Map/Reduce you will not parallelize your indexing any further, because indexing parallelizes only as far as the number of leaders you have in your SolrCloud, doesn't it? 2013/5/6 Erick Erickson > The only problem with using Hadoop (or whatever) is that you > need to be sure th

Re: Log Monitor System for SolrCloud and Logging to log4j at SolrCloud?

2013-05-06 Thread Furkan KAMACI
Is there a road map for Solr? When will Solr 4.3 be tagged in svn? 2013/4/26 Mark Miller > Slf4j is meant to work with existing frameworks - you can set it up to > work with log4j, and Solr will use log4j by default in the about to be > released 4.3. > > http://wiki.apache.org/solr/SolrLogging

Re: iterate through each document in Solr

2013-05-06 Thread Andre Bois-Crettez
On 05/06/2013 06:03 AM, Michael Sokolov wrote: On 5/5/13 7:48 PM, Mingfeng Yang wrote: Dear Solr Users, Does anyone know what is the best way to iterate through each document in a Solr index with billion entries? I tried to use select?q=*:*&start=xx&rows=500 to get 500 docs each time and the

Re: Is indexing large documents still an issue?

2013-05-06 Thread Bai Shen
You can still use highlighting without returning the content. Just set content as your alternate highlight field. Then if no highlights are returned you will receive the content. Make sure you set a character limit so you don't get the whole thing. I use 300. Does that make sense? This is wh

Re: Memory problems with HttpSolrServer

2013-05-06 Thread Andre Bois-Crettez
On 05/06/2013 09:32 AM, Rogowski, Britta wrote: Hi! When I write from our database to a HttpSolrServer, (using a LinkedBlockingQueue to write just one document at a time), I run into memory problems (due to various constraints, I have to remain on a 32-bit system, so I can use at most 2 GB RA

RE: Indexing off of the production servers

2013-05-06 Thread David Parks
I'm less concerned with fully utilizing a hadoop cluster (due to having fewer shards than I have hadoop reduce slots) as I am with just off-loading the whole indexing process. We may just want to re-index the whole thing to add some index time boosts or whatever else we conjure up to make queries f

replication between solr 3.1 and 4.x

2013-05-06 Thread elrond
Is it possible to replicate Solr between different versions (in my case between 3.1 (master) and 4.x (slave))? All I get is: May 06, 2013 2:28:00 PM org.apache.solr.handler.SnapPuller fetchFileList SEVERE: No files to download for index generation: 3

Re: Indexing off of the production servers

2013-05-06 Thread Furkan KAMACI
Hi Dave; I think that when you index you can use CloudSolrServer, so that it learns from ZooKeeper where your data should go and sends it there. This will speed up indexing and give you the benefit of Map/Reduce. Your data will be indexed by shard leaders while your replicas

Re: Indexing off of the production servers

2013-05-06 Thread Upayavira
In non-SolrCloud mode, you can index to another core, and then swap cores. You could index on another box, ship the index files to your production server, create a core pointing at these files, then swap this core with the original one. If you can tell your search app to switch to using a differen

[ANNOUNCE] Apache Solr 4.3 released

2013-05-06 Thread Simon Willnauer
May 2013, Apache Solr™ 4.3 available The Lucene PMC is pleased to announce the release of Apache Solr 4.3. Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, d

Re: Indexing off of the production servers

2013-05-06 Thread Erick Erickson
bq: Your data will be indexed by shard leaders while your replicas are responsible for querying. This is not true in SolrCloud mode. When you send a document to Solr, upon return that document has been sent to every replica for the appropriate shard and entered in the transaction log. It is index

List of Solr Query Parsers

2013-05-06 Thread Jan Høydahl
Hi, I just added a Wiki page to try to gather a list of all known Solr query parsers in one place, both those which are part of Solr and those in JIRA or 3rd party. http://wiki.apache.org/solr/QueryParser If you know about other cool parsers out there, please add to the list. -- Jan Høydah

Re: Indexing off of the production servers

2013-05-06 Thread Erick Erickson
bq: I thought I had heard of master/slave hierarchy in 3.x that would allow us to designate a master to do indexing and let the slaves pull finished indexes from the master, so I thought maybe something like that followed into solr cloud. You can still do this in Solr4 if you choose, but not in c

Re: List of Solr Query Parsers

2013-05-06 Thread Jack Krupansky
Jan, I have a full 80-page chapter on query parsers in the new book on Lucene and Solr. Send me an email if you would like to be a reviewer. It integrates the descriptions of Solr query parser, dismax, and edismax so that it's not as difficult to figure out which is which and how they compare.

A Comma /aSpace in a Query argument

2013-05-06 Thread Peter Schütt
Hallo, I want to use a comma as part of a query argument. E.G. q=myfield:aa,bb and "aa,bb" is the value of the field. Do I have to mask it? And what is about a space in an argument q=myfield:aa bb and "aa bb" is the value of the field. Thanks for any hint. Ciao Peter Schütt

Re: List of Solr Query Parsers

2013-05-06 Thread Roman Chyla
Hi Jan, Please add this one http://29min.wordpress.com/category/antlrqueryparser/ - I can't edit the wiki This parser is written with ANTLR and on top of lucene modern query parser. There is a version which implements Lucene standard QP as well as a version which includes proximity operators, mult

Re: Duplicated Documents Across shards

2013-05-06 Thread Jack Krupansky
I think if we had a more comprehensible term for a "collection configuration directory", a lot of the confusion would go away. I mean, what the heck is an "instance" anyway? How does "instanceDir" relate to an "instance" of the Solr "server"? Sure, I know that it is the parent directory of the c

Re: A Comma /aSpace in a Query argument

2013-05-06 Thread Jack Krupansky
Commas, dots, hyphens, slashes, and semicolons do not need to be "escaped", but spaces do. But... be careful, because some analyzers will throw away all or most punctuation. You may have to resort to the whitespace analyzer to preserve punctuation characters, but you can add char filters to

Re: A Comma /aSpace in a Query argument

2013-05-06 Thread giovanni.bricc...@banzai.it
Try escaping it with a \ Giovanni On 06/05/13 15:34, Peter Schütt wrote: Hallo, I want to use a comma as part of a query argument. E.G. q=myfield:aa,bb and "aa,bb" is the value of the field. Do I have to mask it? And what is about a space in an argument q=myfield:aa bb and "aa b

Re: Indexing off of the production servers

2013-05-06 Thread Andre Bois-Crettez
Excellent idea ! And it is possible to use collection aliasing with the CREATEALIAS to make this transparent for the query side. ex. with 2 collections named : collection_1 collection_2 /collections?action=CREATEALIAS&name=collectionalias&collections=collection_1 "collectionalias" is now a virtu
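
The CREATEALIAS call above can be sketched as a URL builder. Treat this as an assumption-laden illustration: the base URL "http://localhost:8983/solr" and the "/admin/collections" path prefix are not given in the message, while the alias and collection names are the ones from the example.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a Collections API CREATEALIAS URL for one or more collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # Several collections are passed as one comma-separated value.
        "collections": ",".join(collections),
    }
    return base_url + "/admin/collections?" + urlencode(params)


# Assumed base URL; alias and collection names follow the message above.
base = "http://localhost:8983/solr"
point_to_first = create_alias_url(base, "collectionalias", ["collection_1"])
point_to_second = create_alias_url(base, "collectionalias", ["collection_2"])
print(point_to_first)
print(point_to_second)
```

The idea in the thread is that queries keep using "collectionalias" while the alias is moved from the old collection to the freshly indexed one; whether re-issuing the call re-points an existing alias should be checked against the Solr version in use.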

Re: List of Solr Query Parsers

2013-05-06 Thread Jan Høydahl
Hi Roman, This sounds great! Please register as a user on the WIKI and give us your username here, then we'll grant you editing karma so you can edit the page yourself! The NEAR/5 syntax is really something I think we should get into the default lucene parser. Can't wait to have a look at your

Re: A Comma /aSpace in a Query argument

2013-05-06 Thread Jack Krupansky
Oops, and I neglected to mention that you can escape a single character with a backslash, or you can enclose the entire term in double quotes: q=myfield:aa,bb q=myfield:"aa,bb" q=myfield:aa\ bb q=myfield:"aa bb" -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Monday,
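
The two options above (backslash-escape the space, or wrap the whole term in double quotes) can be sketched as small helpers. The function names are hypothetical, and only the characters discussed in this thread (space, backslash, double quote) are handled.

```python
def escape_spaces(term):
    """Backslash-escape spaces (and literal backslashes) in a query term."""
    return term.replace("\\", "\\\\").replace(" ", "\\ ")


def quote_term(term):
    """Wrap a term in double quotes, escaping embedded quotes and backslashes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


# Field name and values are the ones from the question.
print("q=myfield:" + escape_spaces("aa bb"))
print("q=myfield:" + quote_term("aa bb"))
print("q=myfield:" + quote_term("aa,bb"))
```

When the query is sent over HTTP the resulting string still needs normal URL encoding; that step is left out here to keep the focus on the query syntax itself.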

Re: Memory problems with HttpSolrServer

2013-05-06 Thread Shawn Heisey
On 5/6/2013 1:32 AM, Rogowski, Britta wrote: > Hi! > > When I write from our database to a HttpSolrServer, (using a > LinkedBlockingQueue to write just one document at a time), I run into memory > problems (due to various constraints, I have to remain on a 32-bit system, so > I can use at most

Solr on Amazon EC2

2013-05-06 Thread Rajesh Nikam
Hello, I am looking into how to do document classification for categorization of html documents. I see Solr/Lucene + MoreLikeThis that suits to find similar documents for given document. I am able to do classification using Lucene + MoreLikeThis example. Then I was looking for how to host Solr o

Re: Duplicated Documents Across shards

2013-05-06 Thread Shawn Heisey
On 5/6/2013 7:44 AM, Jack Krupansky wrote: > I think if we had a more comprehensible term for a "collection > configuration directory", a lot of the confusion would go away. I mean, > what the heck is an "instance" anyway? How does "instanceDir" relate to > an "instance" of the Solr "server"? Sure,

Re: Indexing off of the production servers

2013-05-06 Thread Shawn Heisey
On 5/6/2013 7:55 AM, Andre Bois-Crettez wrote: > Excellent idea ! > And it is possible to use collection aliasing with the CREATEALIAS to > make this transparent for the query side. > > ex. with 2 collections named : > collection_1 > collection_2 > > /collections?action=CREATEALIAS&name=collectio

Re: Duplicated Documents Across shards

2013-05-06 Thread Jack Krupansky
Oops... you're right, and before I started writing that response I had the thought that these should be "shardDir", but even that is confused. I think "replicaDir" or "collectionReplica" or "shardReplicaDir" or... "collectionShardReplicaDir" - the latter is wordy, but is explicit. I'd reserve "

Re: Indexing off of the production servers

2013-05-06 Thread Furkan KAMACI
Hi Erick; Thanks for your answer. I have read that at somewhere: I believe "redirect" from replica to leader would happen only at index time, so a doc first gets indexed to leader and from there it's replicated to non-leader shards. Is that true? I want to make clear the things in my mind otherw

Re: Atomic Update and stored copy-fields

2013-05-06 Thread raulgrande83
We have defined those copyfield destinations as stored because we have experienced some problems when highlighting in them. These fields have different Tokenizers and Analyzers. We have found that if we search in one of them but highlight in a different one some words that doesn't match the first q

update to 4.3

2013-05-06 Thread Arkadi Colson
Hi After update to 4.3 I got this error: May 06, 2013 2:30:08 PM org.apache.coyote.AbstractProtocol init INFO: Initializing ProtocolHandler ["http-bio-8983"] May 06, 2013 2:30:08 PM org.apache.coyote.AbstractProtocol init INFO: Initializing ProtocolHandler ["ajp-bio-8009"] May 06, 2013 2:30:08 P

Re: Solr on Amazon EC2

2013-05-06 Thread Stephane Gamard
Hi Rajesh, Rule of thumb when it comes to Solr and the cloud is run your own instance. There are so many differences (subtle but could be painful) between Solr releases that it is best that you know which you are using. Solr is also packaged to work directly out of the box (using the jetty starter: s

Re: Log Monitor System for SolrCloud and Logging to log4j at SolrCloud?

2013-05-06 Thread Steve Rowe
Done - see http://markmail.org/message/66vpwk42ih6uxps7 On May 6, 2013, at 5:29 AM, Furkan KAMACI wrote: > Is there any road map for Solr when will Solr 4.3 be tagged at svn? > > 2013/4/26 Mark Miller > >> Slf4j is meant to work with existing frameworks - you can set it up to >> work with log

Re: Duplicated Documents Across shards

2013-05-06 Thread Shawn Heisey
> Oops... you're right, and before I started writing that response I had the > thought that these should be "shardDir", but even that is confused. I > think > "replicaDir" or "collectionReplica" or "shardReplicaDir" or... > "collectionShardReplicaDir" - the latter is wordy, but is explicit. I'd > r

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-06 Thread Mark Miller
ClusterState is kept in memory and Solr is notified of ClusterState updates by ZooKeeper when a change happens - Solr then grabs the latest ClusterState. If ZooKeeper goes down, Solr keeps using the in memory ClusterState it has and simply stops getting any new ClusterState updates until ZooKeep

Tokenize Sentence and Set Attribute

2013-05-06 Thread Rendy Bambang Junior
Hello, I am trying to use a part-of-speech tagger for Bahasa Indonesia to filter tokens in Solr. The tagger receives the word list of a sentence as input and returns a tag array. I think the process should be like this: - tokenize sentence - tokenize word - pass it into the tagger - set attribute using tag

Re: custom tokenizer error

2013-05-06 Thread Sarita Nair
baseTokenizer is reset in the #reset method. Sarita From: Jack Krupansky To: solr-user@lucene.apache.org Sent: Sunday, May 5, 2013 1:37 PM Subject: Re: custom tokenizer error I didn't notice any call to the "reset" method for your base tokenizer. Is there

solr.LatLonType type vs solr.SpatialRecursivePrefixTreeFieldType

2013-05-06 Thread bbarani
Hi, I am currently using SOLR 4.2 to index geospatial data. I have configured my geospatial field as below. I just want to make sure that I am using the correct SOLR class for performing geospatial search since I am not sure which of the 2 class(LatLonType vs SpatialRecursivePr

RE: Indexing off of the production servers

2013-05-06 Thread David Parks
So, am I following this correctly by saying that, this proposed solution would present us a way to index a collection on an offline/dev solr cloud instance and *move* that pre-prepared index to the production server using an alias/rename trick? That seems like a reasonably doable solution. I also

Solr Cloud with large synonyms.txt

2013-05-06 Thread Son Nguyen
Hello, I'm building a Solr Cloud (version 4.1.0) with 2 shards and a ZooKeeper (the ZooKeeper is on a different machine, version 3.4.5). I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException": Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: Keep

Re: iterate through each document in Solr

2013-05-06 Thread Mingfeng Yang
Hi Dmitry, My index is not sharded, and since its size is so big, sharding won't help much on the paging issue. Do you know any API which can help read from the Lucene binary index directly? It would be nice if we could just scan through the docs directly. Thanks! Ming- On Mon, May 6, 2013 at 3:33

Re: iterate through each document in Solr

2013-05-06 Thread Mingfeng Yang
Andre, Thanks for the info! Unfortunately, my solr is on 3.6 version, and looks like those options are not available. :( Ming- On Mon, May 6, 2013 at 5:32 AM, Andre Bois-Crettez wrote: > On 05/06/2013 06:03 AM, Michael Sokolov wrote: > >> On 5/5/13 7:48 PM, Mingfeng Yang wrote: >> >>> Dear So

how to quickly export data from SolrCloud

2013-05-06 Thread Kevin Osborn
I am looking to export a large amount of data from Solr. This export will be done by a Java application and then written to file. Initially, I was thinking of using direct HTTP calls and using the CSV response writer. And then my Java application can quickly parse each line from a stream. But, wit

Re: how to quickly export data from SolrCloud

2013-05-06 Thread Shawn Heisey
On 5/6/2013 10:48 AM, Kevin Osborn wrote: I am looking to export a large amount of data from Solr. This export will be done by a Java application and then written to file. Initially, I was thinking of using direct HTTP calls and using the CSV response writer. And then my Java application can quic

Re: Tokenize Sentence and Set Attribute

2013-05-06 Thread Jack Krupansky
Sounds like a very ambitious project. I'm sure you COULD do it in Solr, but not in very short order. Check out some discussion of simply searching within sentences: http://markmail.org/message/aoiq62a4mlo25zzk?q=apache#query:apache+page:1+mid:aoiq62a4mlo25zzk+state:results First, how do you exp

Re: how to quickly export data from SolrCloud

2013-05-06 Thread Kevin Osborn
This is actually something I will do quite frequently. I basically export from Solr into a CSV file as part of a workflow sequence. CSV is nice and fast, but does not have the ZooKeeper integration that I like with SolrJ. On Mon, May 6, 2013 at 10:11 AM, Shawn Heisey wrote: > On 5/6/2013 10:48

Solr 4.3 and SLF4j

2013-05-06 Thread Jonatan Fournier
Hi, I've read from http://wiki.apache.org/solr/SolrLogging that Solr no longer ships with Logging jars bundled into the WAR file. For simplicity in package management, other than Solr, I'm trying to stay with stock packages from Ubuntu 12.04 (e.g. Tomcat7 etc.) Now I'm trying to find out what do

Re: List of Solr Query Parsers

2013-05-06 Thread Roman Chyla
Hi Jan, My login is RomanChyla Thanks, Roman On 6 May 2013 10:00, "Jan Høydahl" wrote: > Hi Roman, > > This sounds great! Please register as a user on the WIKI and give us your > username here, then we'll grant you editing karma so you can edit the page > yourself! The NEAR/5 syntax is really so

Re: Solr 4.3 and SLF4j

2013-05-06 Thread Mark Miller
You need all the same jars that are in the lib/ext folder of the default jetty distribution. Those are the logging jars, those are what you need. All you can do is swap out impls (see the SLF4j documentation). You must have all those jars as a start, and if you don't want to use log4j, you can s

Re: Query Elevation exception on shard queries

2013-05-06 Thread Ravi Solr
Varun, Since our cores were totally disjoint i.e. they pertain to two different applications which may or may not have results for a given query, we moved the elevation outside of solr into our java code. As long as both cores had some results to return for a given query elevation would wo

Re: Why is SolrCloud doing a full copy of the index?

2013-05-06 Thread Michael Della Bitta
Hi Shawn, Thanks a lot for this entry! I'm wondering, when you say "Garbage collections that happen more often than ten or so times per minute may be an indication that the heap size is too small," do you mean *any* collections, or just full collections? Michael Della Bitta ---

Re: iterate through each document in Solr

2013-05-06 Thread Dmitry Kan
Hi Ming, Quoting my answer on a different thread ( http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201210.mbox/%3ccaonbidbuzzsaqctdhtlxlgeoori_ghrjbt-84bm0zb-fsps...@mail.gmail.com%3E ): > > [code] > > Directory indexDir = FSDirectory.open(new File(pathToDir)); > > IndexReader input = Index

Re: update to 4.3

2013-05-06 Thread Jan Høydahl
Hi, The reason is that from Solr 4.3 you need to provide the SLF4J logger jars of choice when deploying Solr to an external servlet container. Simplest is to copy all jars from example/lib/ext into tomcat/lib cd solr-4.3.0/example/lib/ext cp * /usr/local/apache-tomcat-7.0.39/lib/ Please see CH

Re: Query Elevation exception on shard queries

2013-05-06 Thread varun srivastava
Thanks Ravi. So then it is a bug. On Mon, May 6, 2013 at 12:04 PM, Ravi Solr wrote: > Varun, > Since our cores were totally disjoint i.e. they pertain to two > different applications which may or may not have results for a given query, > we moved the elevation outside of solr into our

Re: Why is SolrCloud doing a full copy of the index?

2013-05-06 Thread Shawn Heisey
On 5/6/2013 1:39 PM, Michael Della Bitta wrote: Hi Shawn, Thanks a lot for this entry! I'm wondering, when you say "Garbage collections that happen more often than ten or so times per minute may be an indication that the heap size is too small," do you mean *any* collections, or just full colle

Re: Why is SolrCloud doing a full copy of the index?

2013-05-06 Thread Otis Gospodnetic
Hi, I just looked at SPM monitoring we have for Solr servers that run search-lucene.com. One of them has 1-2 collections/minute. Another one closer to 10. These are both small servers with small JVM heaps. Here is a graph of one of them: https://apps.sematext.com/spm/s/104ppwguao Just looked

ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-06 Thread cleardot
My SolrJ client uses ConcurrentUpdateSolrServer to index > 50Gs of docs to a SOLR 3.6 instance on my Linux box. When running the same client against SOLR 4.2.1 on EC2 I got the following: SolrJ client error request: http://ec2-

Open position: Senior Information Retrieval Engineer, Zurich, Switzerland

2013-05-06 Thread Toan V Luu
We are looking for an engineer who has strong background in Information retrieval and Solr /Lucene platform. Prefer native German or French speaking. Please contact us if you are interested in this position: http://local-ch.github.io/senior-ir-engineer.html. Thanks. Toan Luu.

Re: Solr Cloud with large synonyms.txt

2013-05-06 Thread Jan Høydahl
See discussion here http://lucene.472066.n3.nabble.com/gt-1MB-file-to-Zookeeper-td3958614.html One idea was compression. Perhaps if we add gzip support to SynonymFilter it can read synonyms.txt.gz which would then fit larger raw dicts? -- Jan Høydahl, search solution architect Cominvent AS - ww

Re: List of Solr Query Parsers

2013-05-06 Thread Jan Høydahl
Added. Please try editing the page now. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com 6. mai 2013 kl. 19:58 skrev Roman Chyla : > Hi Jan, > My login is RomanChyla > Thanks, > > Roman > On 6 May 2013 10:00, "Jan Høydahl" wrote: > >> Hi Roman, >> >> This sounds gre

Re: ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-06 Thread Shawn Heisey
On 5/6/2013 1:25 PM, cleardot wrote: My SolrJ client uses ConcurrentUpdateSolrServer to index > 50Gs of docs to a SOLR 3.6 instance on my Linux box. When running the same client against SOLR 4.2.1 on EC2 I got the following: SOLR 4.2.1 log error =

Re: ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-06 Thread cleardot
Shawn, I didn't sanitize the log other than the ec2 servername. The constructor is ConcurrentUpdateSolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, solrBufferCount, solrThreadCount); and I don't use setParser at all. But the SolrJ client is using

Re: ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-06 Thread Shawn Heisey
On 5/6/2013 4:06 PM, cleardot wrote: Shawn, I didn't sanitize the log other than the ec2 servername. The constructor is ConcurrentUpdateSolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, solrBufferCount, solrThreadCount); and I don't use setParse

Is there a way to remove caches in SOLR?

2013-05-06 Thread bbarani
I am trying to create performance metrics for SOLR. I don't want the searcher to warm up when I issue a query since I am trying to collect metrics for cold search. Is there a way to disable warming?

Re: Is there a way to remove caches in SOLR?

2013-05-06 Thread varun srivastava
make size 0 On Mon, May 6, 2013 at 4:38 PM, bbarani wrote: > I am trying to create performance metrics for SOLR. I don't want the > searcher > to warm up when I issue a query since I am trying to collect metrics for > cold search. Is there a way to disable warming? > > > > -- > View this messag

Re: Is there a way to remove caches in SOLR?

2013-05-06 Thread Shawn Heisey
On 5/6/2013 5:38 PM, bbarani wrote: I am trying to create performance metrics for SOLR. I don't want the searcher to warm up when I issue a query since I am trying to collect metrics for cold search. Is there a way to disable warming? Set the autowarmCount to 0 in each of the cache definitions.

Re: Indexing off of the production servers

2013-05-06 Thread Erick Erickson
Nope. There is no replication, as in replication of the indexed document in the normal flow. The _raw_ document is forwarded to all replicas and upon return from the replicas, the raw document has been written to each individual transaction log on each replica. "replication" implies the _indexed_ f

Re: Questions about the performance of Solr

2013-05-06 Thread Mikhail Khludnev
Hello, start from http://wiki.apache.org/solr/CommonQueryParameters#fq On Mon, May 6, 2013 at 11:42 AM, joo wrote: > Search speed at which data is loaded is more than 7 ten millon current will > be reduced too. > About 50 seconds it will take, but the number is often just this, it is not > p

RE: Solr Cloud with large synonyms.txt

2013-05-06 Thread David Parks
Wouldn't it make more sense to only store a pointer to a synonyms file in zookeeper? Maybe just make the synonyms file accessible via http so other boxes can copy it if needed? Zookeeper was never meant for storing significant amounts of data. -Original Message- From: Jan Høydahl [mailto:

Re: What makes an Analyzer/Tokenizer/CharFilter/etc suitable for Solr?

2013-05-06 Thread Alexandre Rafalovitch
Has this logic (default constructor or version flag) changed due to LUCENE-4877 ? I rerun my tool and suddenly huge number of Factories acquired a new constructor (e.g. MappingCharFilterFactory). Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/a

Re: ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-06 Thread Ravi Solr
I apologize for intruding, Shawn, do you know what can cause empty params (i.e. params={}) ? Ravi On Mon, May 6, 2013 at 5:47 PM, Shawn Heisey wrote: > On 5/6/2013 1:25 PM, cleardot wrote: > >> My SolrJ client uses ConcurrentUpdateSolrServer to index > 50Gs of docs >> to a >> SOLR 3.6 instance

Re: ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-06 Thread Shawn Heisey
> I apologize for intruding, Shawn, do you know what can cause empty params > (i.e. params={}) ? I've got no idea what is causing this problem on your system. All of the ideas I've had so far don't seem to apply. Can you run a packet sniffer on your client to see whether the client is sending the

Re: update to 4.3

2013-05-06 Thread Arkadi Colson
Any tips on what to do with the configuration files? Where do I have to store them and what should they look like? Any examples? May 07, 2013 6:16:27 AM org.apache.catalina.core.AprLifecycleListener init INFO: The APR based Apache Tomcat Native library which allows optimal performance in produc