Re: Adding a new pseudo field

2012-10-08 Thread Upayavira
If I've understood you correctly, you could also achieve this with the XSLTResponseWriter. It would be pretty trivial to write an XSLT that exposes the node position in the results, containing: Stick that in solr/conf/xslt, and reference it with wt=xslt&tr=.xsl That way you wouldn't need to

Re: Storing queries in Solr

2012-10-08 Thread Upayavira
Solr has a small query cache, but this does not hold queries for any length of time, so won't suit your purpose. The LucidWorks Search product has (I believe) a click tracking feature, but that is about boosting documents that are clicked on, not specific search terms. Parsing the Solr log, or pus

Re: Adding a new pseudo field

2012-10-08 Thread Upayavira
Good question. I know xslt could output json, but you'd have to write a stylesheet that transforms the xml into json. I'm not sure whether you can influence the content-type for the output with the xslt response writer though. There's also the velocity response writer, which sits behind the /brows

Re: add shard to index

2012-10-08 Thread Upayavira
Given that Solr does not support distributed IDF, adding a shard without balancing the number of documents could seriously skew your scoring. If you are okay with that, then the next question is what happens if you download the clusterstate.json from ZooKeeper, and add another entry, along the line

Re: Problem with relating values in two multi value fields

2012-10-08 Thread Toke Eskildsen
On Mon, 2012-10-08 at 08:42 +0200, Torben Honigbaum wrote: > sorry, my fault. This was one of my first ideas. My problem is, that > I've 1.000.000 documents, each with about 20 attributes. Additionally > each document has between 200 and 500 option-value pairs. So if I > denormalize the data, it me

Re: add shard to index

2012-10-08 Thread Rafał Kuć
Hello! Radim there is a JIRA issue - https://issues.apache.org/jira/browse/SOLR-3755. It is work in progress, but once finished Solr will enable you to add additional shards on a live collection and split the ones that were already created. -- Regards, Rafał Kuć Sematext :: http://sematext.com

Reloading ExternalFileField blocks Solr

2012-10-08 Thread Martin Koch
Hi List We're using Solr-4.0.0-Beta with a 7M document index running on a single host with 16 shards. We'd like to use an ExternalFileField to hold a value that changes often. However, we've discovered that the file is apparently re-read by every shard/core on *every commit*; the index is unrespon

Solr 4 spatial search - point intersects polygon

2012-10-08 Thread Jorge Suja
Hi everyone, I've been playing around with the new spatial search functionality included in the newer versions of Solr (Solr 4.1 and Solr trunk 5.0), and I've found something strange when I try to find a point inside a polygon (particularly inside a square). You can reproduce this problem usin

I don't understand

2012-10-08 Thread Tolga
Hi, There are two servers with the same configuration. I crawl the same URL. One of them is giving the following error: Caused by: org.apache.solr.common.SolrException: ERROR: [doc=http://bilgisayarciniz.org/] multiple values encountered for non multiValued copy field text: bilgisayarciniz w

Re: I don't understand

2012-10-08 Thread Jan Høydahl
Hi, Please describe your environment better: * How do you "crawl", using which crawler? * To which RequestHandler do you send the docs? * Which version of Solr? * Can you share your schema and other relevant config with us? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

solr1.4 code Example

2012-10-08 Thread Sujatha Arun
Hi, I am unable to unzip the 5883_Code.zip file for Solr 1.4 from the packtpub site. I get the error message: End-of-central-directory signature not found. Either this file is not a zipfile, or it constitutes one disk of a multi-part archive. In the latter case the central directory and zipfil

Re: I don't understand

2012-10-08 Thread Tolga
Hi Jan, thanks for your fast reply. Below is the information you requested: * I use nutch, using the command "nutch crawl urls -dir crawl-$(date +%FT%H-%M-%S) -solr http://localhost:8983/solr/ -depth 10 -topN 5" * What do you mean "which RequestHandler"? How can I find that out? * 3.6.1 * Both

Re: QueryElevationComponent not working in Distributed Search

2012-10-08 Thread Erick Erickson
You shouldn't try copying files around; your comment that you "tried replacing QueryElevationComponent.java" leads me to think you tried that. Instead, I notice that there's a SOLR-2949.3x patch. If you want to try that, you can apply the patch to the 3.x code line. See "working with patches" at h

Re: add shard to index

2012-10-08 Thread Erick Erickson
Right, but even if that worked, you'd then get docs being assigned to the wrong shard. The shard assignment would be something like (hash(id)/3). So a document currently on shard 0 would be indexed next time, perhaps, on shard 2, leaving two "live" docs in your system with the same ID. Bad Things w
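
To make the reassignment problem described above concrete, here is a minimal sketch. It assumes shard assignment is a plain hash of the document ID taken modulo the shard count; the post only says "something like (hash(id)/3)", so the CRC32 hash, the modulo step, and the names shard_for and moved_ids are illustrative assumptions, not Solr's actual routing code.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard by hashing the ID modulo the shard count (assumed scheme)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def moved_ids(doc_ids, old_shards: int, new_shards: int):
    """Return the IDs whose shard changes when the shard count changes."""
    return [
        doc_id
        for doc_id in doc_ids
        if shard_for(doc_id, old_shards) != shard_for(doc_id, new_shards)
    ]


# Hypothetical document IDs, for illustration only.
ids = [f"doc-{i}" for i in range(1000)]
moved = moved_ids(ids, 3, 4)
print(f"{len(moved)} of {len(ids)} IDs map to a different shard after going from 3 to 4 shards")
```

Any ID in the moved list would be re-indexed on a different shard the next time it is updated, which is how two "live" copies with the same ID could end up in the index.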

Re: I don't understand

2012-10-08 Thread Erick Erickson
Well, the schemas are different. The first schema doesn't have a copyField directive anywhere in it and the second one does. And the is in a non-standard place anyway, it's usually outside the tag. Kind of surprising it works at all there, now I've got to go figure out why . Anyway apparent

Re: solr 1.4.1 -> 3.6.1; SOLR-758

2012-10-08 Thread Jack Krupansky
The Extended Dismax query parser (edismax) mostly "obsoletes" Dismax except in the sense that some apps prefer the restricted syntax of Dismax: http://wiki.apache.org/solr/ExtendedDisMax -- Jack Krupansky -Original Message- From: Patrick Kirsch Sent: Monday, October 08, 2012 2:32 AM

Re: solr1.4 code Example

2012-10-08 Thread Toke Eskildsen
On Mon, 2012-10-08 at 13:08 +0200, Sujatha Arun wrote: > I am unable to unzip the 5883_Code.zip file for solr 1.4 from paktpub site > .I get the error message > > End-of-central-directory signature not found. [...] It is a corrupt ZIP-file. I'm guessing you got it from http://www.packtpub.com/

Re: long query response time in shards search

2012-10-08 Thread Jack Krupansky
What release of Solr are you on? Solr 4.0 has improved wildcard support (FST "automatons".) But even then, such heavy use of wildcards may be problematic. If you intend to use wildcards in that manner, you might want to create a custom stemming filter that does that stemming at index time (an

search by multiple 'LIKE' operator connected with 'AND' operator

2012-10-08 Thread gremlin
Hi. I'm having trouble with my Solr configuration. I just want a configuration that queries the index like this MySQL query: field_name LIKE '%foo%' AND field_name LIKE '%bar%'. So, for example, I have 4 indexed titles: 'Kathy Lee', 'Kathy Norris', 'Kathy Davies', 'Kathy Bird' and with

Re: Storing queries in Solr

2012-10-08 Thread Jorge Luis Betancourt Gonzalez
Thanks for the quick response. I'm trying to build a query suggester. I find it odd that, this being a very common need, Solr doesn't provide any built-in mechanism for query suggestions, but implementing the other components isn't so hard either. Greetings! On Oct 8, 2012, at 3:38 AM, Upayavira wrote:

Wildcards and fuzzy/phonetic query

2012-10-08 Thread Hågen Pihlstrøm Hasle
Hi! I'm quite new to Solr, I was recently asked to help out on a project where the previous "Solr-person" quit quite suddenly. I've noticed that some of our searches don't return the expected result, and I'm hoping you guys can help me out. We've indexed a lot of names, and would like to sear

Re: search by multiple 'LIKE' operator connected with 'AND' operator

2012-10-08 Thread Jack Krupansky
The PositionFilterFactory is probably preventing phrase queries from working. What are you expecting it to do? It basically means query if all the quoted terms occur at the same position. SQL "like" is comparable to Lucene wildcard, but change the "%" to "*" and "_" to "?". -- Jack Krupansky
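
The "%" to "*" and "_" to "?" substitution mentioned above can be sketched as a small helper. This is only an illustration: the function names are made up, no escaping of Lucene special characters is attempted, and whether leading wildcards such as *foo* are allowed or fast enough depends on the Solr version and field configuration.

```python
def like_to_wildcard(pattern: str) -> str:
    """Translate SQL LIKE wildcards into Lucene wildcard syntax."""
    return pattern.replace("%", "*").replace("_", "?")


def like_and_query(field: str, patterns) -> str:
    """AND together one wildcard clause per LIKE pattern on the same field."""
    return " AND ".join(f"{field}:{like_to_wildcard(p)}" for p in patterns)


print(like_and_query("field_name", ["%foo%", "%bar%"]))
```

For the two LIKE clauses from the original question this yields a query of the form field_name:*foo* AND field_name:*bar*.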

Re: Storing queries in Solr

2012-10-08 Thread Gérard Dupont
Hi Jorge, As far as I know, there isn't a built-in component to achieve such a function in Solr (maybe in the latest 4.1, which I haven't explored in depth yet). However, I've done it myself in the past using different approaches. The first one is similar to Upayavira's suggestion and uses an independent index

Re: Wildcards and fuzzy/phonetic query

2012-10-08 Thread Jack Krupansky
A regular expression term may provide what you want, but not exactly. Maybe something like: /(ch|k)r.*/ (No guarantee that will actually work.) See: http://lucene.apache.org/core/4_0_0-BETA/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches And probably slo
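
As a rough way to see what a pattern like (ch|k)r.* would and would not match, the sketch below tries it against a few made-up names using Python's re module. This only approximates the idea: Lucene's regexp syntax is not identical to Python's, and the sample names are hypothetical. re.fullmatch is used because a Lucene regexp query has to match the whole term.

```python
import re

# The pattern from the post, without the surrounding slashes Lucene uses.
pattern = re.compile(r"(ch|k)r.*")

# Hypothetical lowercased terms, for illustration only.
names = ["christian", "kristian", "karl", "chris"]

matches = [name for name in names if pattern.fullmatch(name)]
print(matches)
```

Terms starting with "chr" or "kr" match; "karl" does not, since the letter after "k" is not "r".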

Re: SolrJ - IOException

2012-10-08 Thread Briggs Thompson
I have also just run into this a few times over the weekend in a newly deployed system. We are running Solr 4.0 Beta (not using SolrCloud) and it is hosted via AWS. I have a RabbitMQ consumer that reads updates from a queue and posts updates to Solr via SolrJ. There is quite a bit of error handlin

Re: multivalued filed question (FieldCache error)

2012-10-08 Thread giovanni.bricc...@banzai.it
Thank you very much! I've put every fl field in my solrconfig on a single line and removed the spaces, and now the app works fine. Giovanni On 05/10/12 20:49, Chris Hostetter wrote: : So extracting the attachment you will be able to track down what happens : : this is the query that shows the error, and b

Re: search by multiple 'LIKE' operator connected with 'AND' operator

2012-10-08 Thread gremlin
Disabling PositionFilterFactory totally breaks multiword search, and I could find titles only by a single word. The default solr.TextField field with WhitespaceTokenizerFactory returns only complete-word matches; enabling NGramFilterFactory for that field doesn't do anything for me. If I use field desc

Re: Wildcards and fuzzy/phonetic query

2012-10-08 Thread Erick Erickson
whether phonetic filters can be multiterm aware: I'd be leery of this, as I basically don't quite know how that would behave. You'd have to ensure that the algorithms changed the first parts of the words uniformly, regardless of what followed. I'm pretty sure that _some_ phonetic algorithms do no

Re: SolrJ - IOException

2012-10-08 Thread Briggs Thompson
Also note there were no exceptions in the actual Solr log, only on the SolrJ side. Thanks, Briggs On Mon, Oct 8, 2012 at 10:45 AM, Briggs Thompson < w.briggs.thomp...@gmail.com> wrote: > I have also just ran into this a few times over the weekend in a newly > deployed system. We are running Solr

Re: Problem with relating values in two multi value fields

2012-10-08 Thread Mikhail Khludnev
Toke, You are absolutely right, concatenating terms is a possible solution. I found faceting quite complicated in this case, but it was a hot fix which we delivered to production. Torben, This problem arises quite often. Besides the two approaches discussed there, it is also possible to approach Sp

Re: Reloading ExternalFileField blocks Solr

2012-10-08 Thread Mikhail Khludnev
Martin, Can you tell me what's the content of that field, and how it should affect search result? On Mon, Oct 8, 2012 at 12:55 PM, Martin Koch wrote: > Hi List > > We're using Solr-4.0.0-Beta with a 7M document index running on a single > host with 16 shards. We'd like to use an ExternalFileFie

Re: Wildcards and fuzzy/phonetic query

2012-10-08 Thread Otis Gospodnetic
Hi, Consider looking into synonyms and ngrams. Otis -- Performance Monitoring - http://sematext.com/spm On Oct 8, 2012 11:21 AM, "Hågen Pihlstrøm Hasle" wrote: > Hi! > > I'm quite new to Solr, I was recently asked to help out on a project where > the previous "Solr-person" quit quite suddenly.

Re: solr 1.4.1 -> 3.6.1; SOLR-758

2012-10-08 Thread Chris Hostetter
: Regarding https://issues.apache.org/jira/browse/SOLR-758 (Enhance : DisMaxQParserPlugin to support full-Solr syntax and to support alternate : escaping strategies.) FWIW: i'm not really sure what/how that issue relates to the problem you are seeing (or how you *think* it relates to the problem

Re: solr1.4 code Example

2012-10-08 Thread Sujatha Arun
did get some files by jar unpacking ,but could not get the ones I wanted ...thanks anyway !! On Mon, Oct 8, 2012 at 5:56 PM, Toke Eskildsen wrote: > On Mon, 2012-10-08 at 13:08 +0200, Sujatha Arun wrote: > > I am unable to unzip the 5883_Code.zip file for solr 1.4 from paktpub > site > > .I get

Re: Wildcards and fuzzy/phonetic query

2012-10-08 Thread Hågen Pihlstrøm Hasle
I guess synonyms would give me a similar result as using regexes, like Jack wrote about. I've thought about that, but I don't think it would be good enough. Substituting "k" for "ch" is easy enough, but the problem is that I have to think of every possible substitution in advance. I'd like

Re: add shard to index

2012-10-08 Thread Radim Kolar
Do it the way the Cassandra database does. Adding a new node and redistributing data can be done in a live system without problems. It looks like this: every Cassandra node has a key range assigned. Instead of assigning keys to nodes like hash(key) mod nodes, every node has its portion of hash k

Re: add shard to index

2012-10-08 Thread Michael Della Bitta
AKA Consistent Hashing: http://en.wikipedia.org/wiki/Consistent_hashing Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Mon, Oct 8, 2012 at 11:33 AM, Radim K
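
The key-range idea discussed in this sub-thread can be sketched as a small consistent-hash ring. This is a generic textbook version, not Cassandra's or Solr's implementation: the HashRing class, the MD5-based token function, the shard names, and the 64 virtual nodes per shard are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Each node owns the key ranges that end at its tokens on the ring."""

    def __init__(self, nodes, vnodes: int = 64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Several virtual tokens per node spread its share around the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (token(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        # First ring entry at or after the key's token, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (token(key),))
        return self._ring[idx % len(self._ring)][1]


keys = [f"doc-{i}" for i in range(1000)]
ring = HashRing(["shard1", "shard2", "shard3"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("shard4")
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to shard4")
```

Unlike hash(key) mod nodes, adding a node here only moves keys onto the new node; keys never shuffle between the existing nodes.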

Re: Wildcards and fuzzy/phonetic query

2012-10-08 Thread Hågen Pihlstrøm Hasle
I understand that I'm quickly reaching the boundaries of my Solr-competence when I'm supposed to read about "Expert Level" concepts.. :) I had already read it once, but now I read it again. Twice. And I'm not sure if I understand it correctly.. So let me ask a follow-up question: If I define

Re: Fallout from the deprecation of setQueryType

2012-10-08 Thread Shawn Heisey
On 9/28/2012 9:09 AM, Shawn Heisey wrote: I am planning and building up a test system with Solr 4.0, for my eventual upgrade. I have not made a lot of progress so far, but I have come across a potential problem. It's been over a week with no response to this. Please see the original email f

Re: Wildcards and fuzzy/phonetic query

2012-10-08 Thread Erick Erickson
To answer your first question, yes, you've got it right. If you define a multiterm section in your fieldType, whatever you put in that section gets applied whether the underlying class is MultiTermAware or not. Which means you can shoot yourself in the foot really bad ... Well, you have 6 or so po

Re: Reloading ExternalFileField blocks Solr

2012-10-08 Thread Martin Koch
Sure: We're boosting search results based on user actions which could be e.g. the number of times a particular document has been read. In future, we'd also like to boost by e.g. impressions (the number of times a document has been displayed) and other values. /Martin On Mon, Oct 8, 2012 at 7:02 P

How to efficiently find documents that have a specific value for a field OR the field does not exist at all

2012-10-08 Thread Artem Shnayder
I'm trying to find documents using this query: field:"value" OR (*:* AND NOT field:[* TO *]) Which means, either field is set to "value" or the field does not exist in the document. I'm running this for ~20 fields in a single query strung together with ANDs. The query time is high, averaging aro
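
For reference, stringing the "value OR field missing" clause together across many fields, as described above, could look like the sketch below. It only builds the query string from the post; it does not address the slow query time, and the helper names and example fields are hypothetical.

```python
def value_or_missing(field: str, value: str) -> str:
    """Build: field:"value" OR (*:* AND NOT field:[* TO *]), wrapped in parentheses."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'({field}:"{escaped}" OR (*:* AND NOT {field}:[* TO *]))'


def build_query(constraints: dict) -> str:
    """AND together one value-or-missing clause per field."""
    return " AND ".join(value_or_missing(f, v) for f, v in constraints.items())


# Hypothetical fields and values, for illustration only.
print(build_query({"color": "red", "size": "L"}))
```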

Funny behavior in facet query on large dataset

2012-10-08 Thread kevinlieb
I am doing a facet query in Solr (3.4) and getting very bad performance. This is in a solr shard with 22 million records, but I am specifically doing a small time slice. However even if I take the time slice query out it takes the same amount of time, so it seems to be searching the entire data s

Re: Funny behavior in facet query on large dataset

2012-10-08 Thread Erik Hatcher
Faceting at that scale takes time to "warm up". If you've got your caches and such configured appropriately, then successive searches will be very fast, however you'll still need to do the cache warming (depends on the faceting implementation you're using, in this case you're probably using the

RE: Funny behavior in facet query on large dataset

2012-10-08 Thread Michael Ryan
Facets are only really useful if you want the counts for multiple values (e.g., "eldudearino", "ladudearina"). I'd suggest just leaving all the facet parameters off of that query - the numFound that is returned should give you what you want. The slowness may be due to the facet cache needing to

Re: Reloading ExternalFileField blocks Solr

2012-10-08 Thread Mikhail Khludnev
Martin, I have a kind of hack approach in mind for hiding documents from search. So, it's a little bit easier than your task. I'm going to deliver a talk about it http://www.apachecon.eu/schedule/presentation/89/ . Frankly speaking, there is no reliable out-of-the-box solution for it. I saw that

Re: Funny behavior in facet query on large dataset

2012-10-08 Thread Chris Hostetter
: a small time slice. However even if I take the time slice query out it : takes the same amount of time, so it seems to be searching the entire data : set. a) you might try using facet.method=enum - in some special cases it may be faster than the default (facet.method=fc). : I am trying to fi

Re: How to efficiently find documents that have a specific value for a field OR the field does not exist at all

2012-10-08 Thread Ahmet Arslan
> field:"value" OR (*:* AND NOT field:[* TO *]) > > Which means, either field is set to "value" or the field > does not exist in > the document. Instead of field:[* TO *], you can define a default value in schema.xml. Or DefaultValueUpdateProcessorFactory in solrconfig. With this, "the field do

Re: Funny behavior in facet query on large dataset

2012-10-08 Thread kevinlieb
Thanks for all the replies. I oversimplified the problem for the purposes of making my post small and concise. I am really trying to find the counts of documents by a list of 10 different authors that match those keywords. Of course on looking up a single author there is no reason to do a facet

Re: long query response time in shards search

2012-10-08 Thread Jason
Hi, We're using Solr 4.0 to serve patent search. Patent search tends to involve very complex queries, including wildcards. I think an Ngram or EdgeNgram filter is an alternative. But not every term in a query has a wildcard. So we can't use that filter. If I make empty core and use in main core th

Re: Funny behavior in facet query on large dataset

2012-10-08 Thread Shawn Heisey
On 10/8/2012 4:09 PM, kevinlieb wrote: Thanks for all the replies. I oversimplified the problem for the purposes of making my post small and concise. I am really trying to find the counts of documents by a list of 10 different authors that match those keywords. Of course on looking up a single

Re: Funny behavior in facet query on large dataset

2012-10-08 Thread Otis Gospodnetic
Hi Kevin, Right, it's the very frequent commits, most likely. Change commits to, say, every 60 or 120 seconds and compare the performance. I think you guys use SPM, so check the Cache graphs (hit % specifically) before and after the above change. Otis -- Search Analytics - http://sematext.com/s

Re: SolrJ 4.0 Beta maxConnectionsPerHost

2012-10-08 Thread Otis Gospodnetic
Hi, Qs: * Have you tried StreamingUpdateSolrServer? * Newer version of Solr(J)? When things hang, jstack your app that uses SolrJ and Solr a few times and you should be able to see where they are stuck. Otis -- Search Analytics - http://sematext.com/search-analytics/index.html Performance Moni

Re: long query response time in shards search

2012-10-08 Thread Otis Gospodnetic
Hi, We've explored this with a few clients a while back. If I remember correctly, this doesn't make much difference and I don't expect it will make any noticeable difference for you since all your cores are on that same 1 server. If you had 1 server with more CPU cores you would see better number

Re: Reloading ExternalFileField blocks Solr

2012-10-08 Thread Otis Gospodnetic
Hi Martin, Perhaps you could make a small change in Solr to add "don't reload EFF if it hasn't been modified since it was last opened". I assume you commit pretty often, but don't modify EFF files that often, so this could save you some needless loading. That said, I'd be surprised EFF doesn't a

Problem with dataimporter.request

2012-10-08 Thread Zakka Fauzan
I'm quite new to Solr, and I have a question regarding the request for the data importer. In my data-config.xml, I have something like this: However, every time I execute delta-import (/dataimport?command=delta-import), it always gives me an exception like this: Caused by: java.lang.RuntimeException:

Re: Help with Velocity in SolrItas

2012-10-08 Thread Paul Libbrecht
Lance, this is the kind of fun that happens with Velocity all day long... In general, when it outputs the variable name, it's that the variable is null; this can happen when a method is missing, for example. There are actually effective uses of this brain-dead-debugger-oriented-practice!

Re: SolrJ 4.0 Beta maxConnectionsPerHost

2012-10-08 Thread Sami Siren
On Tue, Oct 9, 2012 at 4:52 AM, Briggs Thompson wrote: > I am running into an issue of a multithreaded SolrJ client application used > for indexing is getting into a hung state. I responded to a separate thread > earlier today with someone that had the same error, see > http://lucene.472066.n3.nab