Solr Filter Query is not working

2014-01-22 Thread kumar
Hi, I have some product details. When I look for several different products at a time, it is not working. I am using edismax and have configured the filter query in the following way: {!edismax v=$c} _query_:"{!field f=product v=$p}" For example, I am using the following query for filtering results: http://

Re: How to run a subsequent update query to documents indexed from a dataimport query

2014-01-22 Thread Dileepa Jayakody
Hi All, I did some research on this and found some alternatives useful to my use case. Please give your ideas. Can I update all documents indexed after a /dataimport query using the last_index_time in dataimport.properties? If so, can anyone please give me some pointers? What I currently have in
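
One possible way to approach the question above, as a minimal sketch rather than a tested recipe: read the last import time from dataimport.properties and turn it into a range query over a date field. The field name index_time and the sample file contents are assumptions made for illustration. The schema would need such a field populated at index time, and the properties file usually records local time, so a timezone conversion may be needed in practice.

```python
from datetime import datetime


def read_last_index_time(properties_text):
    """Return the last_index_time value from dataimport.properties text.

    Java .properties files escape ':' as '\\:', so the escapes are removed.
    Returns None when the key is not present.
    """
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "last_index_time":
            cleaned = value.strip().replace("\\:", ":")
            return datetime.strptime(cleaned, "%Y-%m-%d %H:%M:%S")
    return None


def docs_indexed_since(last_index_time, field="index_time"):
    """Build a Solr range query for documents stamped at or after the time.

    `field` is a hypothetical date field; the time is treated as UTC here.
    """
    stamp = last_index_time.strftime("%Y-%m-%dT%H:%M:%SZ")
    return "%s:[%s TO NOW]" % (field, stamp)


# Made-up file contents, for illustration only.
sample = "#Wed Jan 22 10:15:30 UTC 2014\nlast_index_time=2014-01-22 10\\:15\\:30\n"
query = docs_indexed_since(read_last_index_time(sample))
print(query)
```

The resulting query string could then select the documents to push through the second update chain; whether that fits depends on how the timestamp field is populated.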

Re: dataimport handler

2014-01-22 Thread Shalin Shekhar Mangar
I'm guessing that "id" in your schema.xml is also a unique key field. If so, each document must have an id field or Solr will refuse to index them. DataImportHandler will map the id field in your table to Solr schema's id field only if you have not specified a mapping. On Thu, Jan 23, 2014 at 3:0

Re: Optimizing index on Slave

2014-01-22 Thread Salman Akram
Unfortunately we can't do sharding right now. If we optimize on master and slave separately, the file names and sizes are the same. I think it's just the version number that is different. Maybe if there was a way to copy the master version to the slave, that would resolve this issue?

RE: Highlighting not working

2014-01-22 Thread Fatima Issawi
Hi, I have stored=true for my "content" field, but I get an error saying there is a mismatch of settings on that field (I think) because of the "term*=true" settings. Thanks again, Fatima > -Original Message- > From: Ahmet Arslan [mailto:iori...@yahoo.com] > Sent: Wednesday, January

Re: Solr/Lucene Faceted Search Too Many Unique Values?

2014-01-22 Thread Erick Erickson
A legitimate question that only you can answer is "what's the value of faceting on fields with so many unique values?" Consider the ridiculous case of faceting on . There's almost exactly zero value in faceting on it, since all counts will be 1. By analogy, with millions of tag values, will there

Re: Solr Cloud Bulk Indexing Questions

2014-01-22 Thread Erick Erickson
When you're doing hard commits, is it with openSearcher = true or false? It should probably be false... Here's a rundown of the soft/hard commit consequences: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ I suspect (but, of course, can't prove)
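
For readers unsure what a hard commit with openSearcher=false looks like when issued explicitly, here is a small sketch that builds the update URL. The host and collection name are hypothetical. For bulk indexing, an autoCommit block in solrconfig.xml with openSearcher set to false is probably the more common setup; this only illustrates the request parameters.

```python
from urllib.parse import urlencode


def hard_commit_url(base_url, open_searcher=False):
    """Build an update URL that asks Solr for a hard commit.

    With openSearcher=false the commit flushes index data to disk but does
    not open a new searcher, so new documents are not yet visible to queries.
    """
    params = [
        ("commit", "true"),
        ("openSearcher", "true" if open_searcher else "false"),
    ]
    return base_url.rstrip("/") + "/update?" + urlencode(params)


# Hypothetical host and collection name.
url = hard_commit_url("http://localhost:8983/solr/collection1")
print(url)
```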

Re: Solr middle-ware?

2014-01-22 Thread lianyi
I've been thinking of using nodejs as a thin layer between the client and solr servers. It seems pretty handy for adding features like throttling, load balancing and basic authentication. -lianyi On Wed, Jan 22, 2014 at 7:36 PM, Alexandre Rafalovitch wrote: > I thought about Go, but that doe

Re: Solr middle-ware?

2014-01-22 Thread Alexandre Rafalovitch
I thought about Go, but that does not give the advantages of spanning client and server like Dart and Node/Javascript. Which is why Dart felt a bit more interesting, especially with tree-shaking of unused code. But then, neither language has enough adoption to be an answer to my original question

Re: shard merged into a another shard as replica

2014-01-22 Thread Utkarsh Sengar
Thanks Mark. I tried updating clusterstate manually, things went haywire :). So to fix it, I had to take 30 secs to 1 min of downtime where I stopped solr and zk, deleted the "/zookeeper_data/version-2" directory and restarted everything again. I have automated these commands via fabric, so was easily able to re

Re: Searching and scoring with block join

2014-01-22 Thread dev
Quoting Mikhail Khludnev: On Wed, Jan 22, 2014 at 10:17 PM, wrote: I know that I can't just make a query like this: {!parent which=is_parent:true}+Term, most likely I'll get this error: child query must only match non-parent docs, but parent docID= matched childScorer=class org.apache

Re: shard merged into a another shard as replica

2014-01-22 Thread Mark Miller
Hopefully an issue that has been fixed then. We should look into that. You should be able to fix it by directly modifying the clusterstate.json in ZooKeeper. Remember to back it up first! There are a variety of tools you can use to work with ZooKeeper - I like the eclipse plug-in that you can

Re: shard merged into a another shard as replica

2014-01-22 Thread Utkarsh Sengar
solr 4.4.0 On Wed, Jan 22, 2014 at 3:12 PM, Mark Miller wrote: > What version of Solr are you running? > > - Mark > > > > On Jan 22, 2014, 5:42:30 PM, Utkarsh Sengar > wrote: I am not sure what happened, I updated merchant collection and then > restarted all the solr machines. > > This is what

Re: shard merged into a another shard as replica

2014-01-22 Thread Mark Miller
What version of Solr are you running? - Mark On Jan 22, 2014, 5:42:30 PM, Utkarsh Sengar wrote: I am not sure what happened, I updated merchant collection and then restarted all the solr machines. This is what I see right now: http://i.imgur.com/4bYuhaq.png merchant collection looks fine.

Re: Solr/Lucene Faceted Search Too Many Unique Values?

2014-01-22 Thread Yago Riveiro
You will need to use DocValues if you want to use facets with this amount of terms and not blow the heap. I have facets with ~39M of unique terms, the response time is about 10 ~ 40 seconds, in my case is not a problem. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig)

Solr/Lucene Faceted Search Too Many Unique Values?

2014-01-22 Thread Bing Hua
Hi, I am going to evaluate some Lucene/Solr capabilities on handling faceted queries, in particular, with a single facet field that contains large number (say up to 1 million) of distinct values. Does anyone have some experience on how lucene performs in this scenario? e.g. Doc1 has tags A B C D

shard merged into a another shard as replica

2014-01-22 Thread Utkarsh Sengar
I am not sure what happened, I updated merchant collection and then restarted all the solr machines. This is what I see right now: http://i.imgur.com/4bYuhaq.png merchant collection looks fine. But deals and prodinfo collections should have a total of 3 shards. But somehow shard1 has converted to

Re: Solr Cloud on HDFS

2014-01-22 Thread Lajos
Cool Mark, I'll keep an eye on this one. L On 22/01/2014 22:36, Mark Miller wrote: Whoops, hit the send keyboard shortcut. I just created a JIRA issue for the first bit I’ll be working on: SOLR-5656: When using HDFS, the Overseer should have the ability to reassign the cores from failed nod

dataimport handler

2014-01-22 Thread tom
Hi, I am trying to use dataimporthandler(Solr 4.6) from oracle database, but I have some issues in mapping the data. I have 3 columns in the test_table, column1, column2, id dataconfig.xml Is

Re: Optimizing index on Slave

2014-01-22 Thread Michael Della Bitta
Salman, To my knowledge, there's not a great way of doing this. Perhaps if your dataset were based on a time series, you could shard by date, and then only a smaller segment of your data would be updated and therefore need to be sent each week? Michael Della Bitta Applications Developer o: +1

Re: Searching and scoring with block join

2014-01-22 Thread Mikhail Khludnev
On Wed, Jan 22, 2014 at 10:17 PM, wrote: > I know that I can't just make a query like this: {!parent > which=is_parent:true}+Term, most likely I'll get this error: child query > must only match non-parent docs, but parent docID= matched > childScorer=class org.apache.lucene.search.TermScorer

Re: Solr Cloud on HDFS

2014-01-22 Thread Mark Miller
Whoops, hit the send keyboard shortcut. I just created a JIRA issue for the first bit I’ll be working on: SOLR-5656: When using HDFS, the Overseer should have the ability to reassign the cores from failed nodes to running nodes. - Mark On Jan 22, 2014, 12:57:46 PM, Lajos wrote: Thanks

Re: Solr Cloud on HDFS

2014-01-22 Thread Mark Miller
I just created a JIRA issue for the first bit I’ll be working on: - Mark On Jan 22, 2014, 12:57:46 PM, Lajos wrote: Thanks Mark ... indeed, some doc updates would help. Regarding what seems to be a popular question on sharding. It seems that it would be a Good Thing that the shards for

Re: Solr middle-ware?

2014-01-22 Thread Jorge Luis Betancourt González
I would love to see some proxy-like application implemented in go (partly for my desire of having time to check out go). - Original Message - From: "Shawn Heisey" To: solr-user@lucene.apache.org Sent: Wednesday, January 22, 2014 10:38:34 AM Subject: Re: Solr middle-ware? On 1/22/2014 12

Re: Interesting search question! How to match documents based on the least number of fields that match all query terms?

2014-01-22 Thread Mikhail Khludnev
Hello Daniel, I have an idea to try to use coord() here. Check http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html and http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/package-summary.html So, if you can override similar

Fuzzy 2 search results wrong

2014-01-22 Thread Lou Foster
I am using the fuzzy search functionality with solr 4.1 and am having problems with the fuzzy search results when fuzzy level 2 is used. Here is a description of the issue; I have an index that consists of one main core that is generated by merging many other cores together. If I fuzzy search wi

Re: Trying to config solr cloud

2014-01-22 Thread svante karlsson
Thank you very much!! Just to recap. My solrconfig.xml had the tvComponent and when I removed that it works as expected although not as fast as I had hoped. I'll do some more reading on best practices and probably ask a new question later... tvComponent - Svante 2014/

Re: Solr Cloud Bulk Indexing Questions

2014-01-22 Thread Software Dev
A suggestion would be to hard commit much less often, ie every 10 minutes, and see if there is a change. - Will try this How much system RAM ? JVM Heap ? Enough space in RAM for system disk cache ? - We have 18G of ram 12 dedicated to Solr but as of right now the total index size is only 5GB Ah

RE: Interesting search question! How to match documents based on the least number of fields that match all query terms?

2014-01-22 Thread Petersen, Robert
Hi Daniel, How about trying something like this (you'll have to play with the boosts to tune this), search all the fields with all the terms using edismax and use the minimum should match parameter, but require all terms to match in the allMetadata field. https://wiki.apache.org/solr/Extend

Searching and scoring with block join

2014-01-22 Thread dev
Hello again, I'm using the solr block-join feature to index a journal and all of its articles. Here is a short example: 527fcbf8-c140-4ae6-8f51-68cd2efc1343 Sozialmagazin 8 2008 0340-8469 .

Re: Solr Cloud on HDFS

2014-01-22 Thread Lajos
Thanks Mark ... indeed, some doc updates would help. Regarding what seems to be a popular question on sharding. It seems that it would be a Good Thing that the shards for a collection running HDFS essentially be pointers to the HDFS-replicated index. Is that what your thinking is? I've been

Re: AIOOBException on trunk since 21st or 22nd build

2014-01-22 Thread Mark Miller
Looking at the list of changes on the 21st and 22nd, I don’t see a smoking gun. - Mark On Jan 22, 2014, 11:13:26 AM, Markus Jelsma wrote: Hi - this likely belongs to an existing open issue. We're seeing the stuff below on a build of the 22nd. Until just now we used builds of the 20th and

RE: Indexing URLs from websites

2014-01-22 Thread Teague James
Markus, With some help from another user on the Nutch list I did a dump and found that the URLs I am trying to capture are in Nutch. However, when I index them with Solr I am not getting them. What I get in the dump is this: http://www.example.com/pdfs/article1.pdf Status: 2 (db_fetched) Fetch

Upgrading from SolrCloud 4.x to 4.y - as if you had used 4.y all along

2014-01-22 Thread Per Steffensen
If you are upgrading from SolrCloud 4.x to a later version 4.y, and basically want your end-system to seem as if it had been running 4.y (no legacy mode or anything) all along, you might find some inspiration here http://solrlucene.blogspot.dk/2014/01/upgrading-from-solrcloud-4x-to-4y-as-if.htm

AIOOBException on trunk since 21st or 22nd build

2014-01-22 Thread Markus Jelsma
Hi - this likely belongs to an existing open issue. We're seeing the stuff below on a build of the 22nd. Until just now we used builds of the 20th and didn't have the issue. Is this a bug, or did some data format in ZooKeeper change? Until now only two cores of the same shard through the e

Re: Solr reload trigger when a configuration file is changed

2014-01-22 Thread Mark Miller
Yonik has brought up this feature a few times as well. I’ve always felt about the same as Shawn. I’m fine with it being optional, default to off. A cluster reload can be a fairly heavy operation. - Mark On Jan 22, 2014, 4:36:19 AM, Mohit Jain wrote: Thanks Shawn. I appreciate you sharing

Re: Solr Cloud on HDFS

2014-01-22 Thread Mark Miller
Right - solr.hdfs.home is the only setting you should use with SolrCloud. The documentation should probably be improved. If you set the data dir or ulog location in solrconfig.xml explicitly, it will be the same for every collection. SolrCloud shares the solrconfig.xml across SolrCore’s, an

Re: Solr middle-ware?

2014-01-22 Thread Shawn Heisey
On 1/22/2014 12:25 AM, Raymond Wiker wrote: > Speaking for myself, I avoid using "client apis" like SolrNet, SolrJ and > FAST DSAPI for the simple reason that I feel that the abstractions they > offer are so thin that I may just as well talk directly to the HTTP > interface. Doing that also lets me

Re: Highlighting not working

2014-01-22 Thread Ahmet Arslan
Hi Fatima, To enable highlighting (both standard and fast vector) you need to set stored="true". Term vectors may speed up the standard highlighter. Plus they are mandatory for FastVectorHighlighter. https://cwiki.apache.org/confluence/display/solr/Field+Properties+by+Use+Case Ahmet On Wednes

Re: dismax request handler will give wrong result in solr 4.3

2014-01-22 Thread Ahmet Arslan
Hi Viresh, A couple of things: 1) / character is a special query parser character now. It wasn't before. It is used for regular expression searches. http://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches What happens when you
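
As a minimal sketch of the first point: if a search term contains a literal "/", one way to keep the query parser from reading it as a regular expression delimiter is to backslash-escape it before building the query. The helper name and sample term are made up for illustration, and the input is assumed to be raw user text that has not been escaped already.

```python
def escape_slashes(term):
    """Backslash-escape '/' so the query parser does not treat it as a regex delimiter.

    Assumes `term` is raw text; slashes that are already escaped would be
    escaped a second time.
    """
    return term.replace("/", "\\/")


# Hypothetical search term containing a slash.
print(escape_slashes("docs/system"))
```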

How to run a subsequent update query to documents indexed from a dataimport query

2014-01-22 Thread Dileepa Jayakody
Hi All, I have a Solr requirement to send all the documents imported from a /dataimport query through another update chain as a separate background process. Currently I have configured my custom update chain in the /dataimport handler itself. But since my custom update process needs to conne

Replication and conf files

2014-01-22 Thread Andrea Gazzarini
Hi all, Reading here http://wiki.apache.org/solr/SolrReplication#How_are_configuration_files_replicated.3F I don't understand what is the observed behaviour in case - confFiles contains schema.xml - schema doesn't change between replication cycles I mean, I read that the file is physically repl

dismax request handler will give wrong result in solr 4.3

2014-01-22 Thread Viresh Modi
When I use the dismax query type handler in Solr 1.4 and then the same in Solr 4.3, they give different numFound counts, even though both have the same index profile. That is, Solr 1.4 gives 9 records and Solr 4.3 gives 99 records. My query is: start=0&rows=10&hl=true&hl.fl=content&qt=dismax &q=syste

Re: Solr Cloud on HDFS

2014-01-22 Thread Lajos
Uugh. I just realised I should have taken out the data dir and update log definitions! Now it works fine. Cheers, L On 22/01/2014 11:47, Lajos wrote: Hi all, I've been running Solr on HDFS, and that's fine. But I have a Cloud installation I thought I'd try on HDFS. I uploaded the configs fo

Solr Cloud on HDFS

2014-01-22 Thread Lajos
Hi all, I've been running Solr on HDFS, and that's fine. But I have a Cloud installation I thought I'd try on HDFS. I uploaded the configs for the core that runs in standalone mode already on HDFS (on another cluster). I specify the HdfsDirectoryFactory, HDFS data dir, solr.hdfs.home, and HDF

Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2014-01-22 Thread Salman Akram
Apologies for the late response as this mail was lost somewhere in filters. Issue was that CommonGramsQueryFilterFactory should be used for searching and CommonGramsFilterFactory for indexing. We were using CommonGramsFilterFactory for both due to which it was not dropping single tokens for common

Re: Optimizing index on Slave

2014-01-22 Thread Salman Akram
We do. We have a lot of updates/deletes every day and a weekly optimization definitely gives a considerable improvement so don't see a downside to it except the complete replication part which is not an issue on local network.

Re: Solr reload trigger when a configuration file is changed

2014-01-22 Thread Mohit Jain
Thanks Shawn. I appreciate you sharing the philosophy behind Solr's implementation. I absolutely agree with the design principle and the fact that it helps to debug unknown issues. Moreover it definitely gives more control over the software. However there are a *small* number of applications that mi

Re: Solr middle-ware?

2014-01-22 Thread Lajos
I always go for SolrJ as the intermediate layer, usually in a Spring app. I have sometimes proxied directly to Solr itself, but since we use a lot of Ajax, I'm not comfortable with exposing the Solr URIs directly, even if controlled via a proxy. Having it go through a webapp gives me a layer

RE: Highlighting not working

2014-01-22 Thread Fatima Issawi
Also my highlighting defaults... on content documentname html 0 documentname 3 200 content 750 > -Original Message- > From: Fatima Issawi [mailto:issa...@qu.edu.qa] > Sent: Wednesda

Highlighting not working

2014-01-22 Thread Fatima Issawi
Hello, I'm trying to highlight content that is returned from a Solr query, but I can't seem to get it working. I would like to highlight the "documentname" and the "pagetext" or "content" results, but when I run the search I don't get anything returned. I thought that the "content" field is su

Re: Solr Cloud Bulk Indexing Questions

2014-01-22 Thread Andre Bois-Crettez
1 node having more load should be the leader (because of the extra work of receiving and distributing updates, but my experiences show only a bit more CPU usage, and no difference in disk IO). A suggestion would be to hard commit much less often, ie every 10 minutes, and see if there is a change.