Re: random record from solr server

2011-07-18 Thread Ahmet Arslan
> How can I get random 100 record from last two days record > from solr server. > > I am using solr 3.1 Hello, add this random field definition to your schema.xml. Generate some seed value (e.g. 125) at query time, and issue a query something like this: q=add_date:[NOW-2DAYS TO *]&sort=random_
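A minimal sketch of the request described in this reply, assuming a dynamic field pattern `random_*` backed by `solr.RandomSortField` (the schema definition itself was lost from the archived message), a date field named `add_date`, and that the seed is appended to the field name followed by a sort direction. The helper name `random_recent_query` is hypothetical.

```python
from urllib.parse import urlencode


def random_recent_query(seed, rows=100, date_field="add_date"):
    """Build query parameters for `rows` random documents from the last two days.

    Assumption: the schema maps the dynamic field pattern random_* to a
    solr.RandomSortField type, so sorting on random_<seed> yields a
    pseudo-random order that is stable for a given seed.
    """
    params = {
        "q": f"{date_field}:[NOW-2DAYS TO *]",  # documents added in the last two days
        "sort": f"random_{seed} desc",          # seed is part of the field name
        "rows": rows,
    }
    return urlencode(params)


print(random_recent_query(125))
```

Changing the seed on each request gives a different ordering; reusing it makes paging through one random sample repeatable.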

Re: SolrJ Collapsable Query Fails

2011-07-18 Thread Kurt Sultana
Hi, Thanks for the code snippet, it was very useful. However, can you please include a very small description of certain unknown variables such as 'groupedInfo', 'ResultItem', 'searcher', 'fields' and the method 'solrDocumentToResultItem'? Thanks On Sat, Jul 16, 2011 at 3:36 PM, Kurt Sultana w

Re: Fuzzy Query Param

2011-07-18 Thread steffen_kmt
entdeveloper wrote: > > I'm using Solr trunk. > Hi! I'm using solr 3.1.0 and the feature is not implemented. When I search for a word with e.g. ~2 the "~2" is interpreted as part of the search string. Where can I get the trunk version? Is it a stable version or just for testing purposes? t

LockObtainFailedException and open finalizing IndexWriters

2011-07-18 Thread Michael Kuhlmann
Hi, we are running Solr 3.2.0 on Jetty for a web application. Since we just went online and are still in beta tests, we don't have very much load on our servers (indeed, they're currently much oversized for the current usage), and our index size on file system is just 1.1 MB. We have one dedicat

Re: Deleted docs in IndexWriter Cache (NRT related)

2011-07-18 Thread Grijesh
optimize ensures that deleted docs and terms will not be displayed. - Thanx: Grijesh www.gettinhahead.co.in

Re: Start parameter messes with rows

2011-07-18 Thread pravesh
>i just wanna be clear in the concepts of core and shard ? >a single core is an index with same schema , is this wat core really is ? >can a single core contain two separate indexes with different schema in it ? >Is a shard refers to a collection of index in a single physical machine >?can a sin

Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread steffen_kmt
Hi! I would like to implement a fuzzy search with the Dismax Request Handler. I noticed that there are some discussions about that, but all from the year 2009. (adding ~ in solrconfig) Is it still at the same state or may be already implemented? Is there another option to do fuzzy search with a d

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread Ahmet Arslan
> Hi! > I would like to implement a fuzzy search with the Dismax > Request Handler. > I noticed that there are some discussions about that, but > all from the year > 2009. (adding ~ in solrconfig) > > Is it still at the same state or may be already > implemented? It is already implemented. https:

I found a sorting bug in solr/lucene

2011-07-18 Thread Jason Toy
Hi all, I found a bug that exists in the 3.1 and in trunk, but not in 1.4.1 When I try to sort by a column with a colon in it like "scores:rails_f", solr has cutoff the column name from the colon forward so "scores:rails_f" becomes "scores" To test, I inserted this doc: In 1.4.1 I was able to

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread steffen_kmt
thanks for your answer! I tried to add the following line to the solrconfig.xml file: fieldName~0.8 Before adding the line I got 14 results for a request. After adding the line (and restarting solr) I did the same request and changed just one letter of the string. I was expecting that I have to g

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread Ahmet Arslan
> thanks for your answer! > > I tried to add the following line to the solrconfig.xml > file: > fieldName~0.8 > > Before adding the line I got 14 results for a request. > After adding the line > (and restarting solr) I did the same request and changed > just one letter of > the string. I was expe

Re: I found a sorting bug in solr/lucene

2011-07-18 Thread Nicholas Chase
Seems to me that you wouldn't want to use a colon in a field name, since the search syntax uses it (ie, to find a document with foo = bar, you use foo:bar). I don't know whether that's actually prohibited, but that could be your problem. Nick On 7/18/2011 8:10 AM, Jason Toy wrote: Hi

Re: I found a sorting bug in solr/lucene

2011-07-18 Thread Jason Toy
I am using a fairly popular library (sunspot-solr for ruby) on top of solr that introduces the use of a colon, so I will modify the library, but I think there is still a bug as this stopped working in recent version of solr. Solr should also not allow the data into the doc in the first place if it

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread steffen_kmt
iorixxx wrote: > > What is your purpose of using it in pf parameter? > I don't know. I have seen it somewhere and i thought it has to be in the pf parameter. iorixxx wrote: > > edismax enables fuzzy search but you should use that tilde sign in q > parameter. > I tried this in qf parameter (

Re: XInclude Multiple Elements

2011-07-18 Thread Stephen Duncan Jr
Does anyone use XInclude? I'd like to hear about any successful usage at all. Stephen Duncan Jr www.stephenduncanjr.com

[Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps

2011-07-18 Thread Nagendra Nagarajayya
Hi! I would like to announce the availability of Solr 3.3 with RankingAlgorithm and Near Real Time (NRT) search capability now. The NRT performance is very high, 10,000 documents/sec with the MBArtists 390k index. The NRT functionality allows you to add documents without the IndexSearchers be

Re: SOLR Shard failover Query

2011-07-18 Thread Shawn Heisey
On 7/17/2011 11:03 PM, pravesh wrote: Hi, SOLR has sharding feature, where we can distribute single search request across shards; the results are collected,scored, and, then response is generated. Wanted to know, what happens in case of failure of specific shard(s), suppose, one particular shar

NRT and commit behavior

2011-07-18 Thread Nicholas Chase
Very glad to hear that NRT is finally here! But my question is this: will things still come to a standstill during a commit? Thanks... Nick

Re: difference between shard and core in solr

2011-07-18 Thread Briggs Thompson
I think everything you said is correct for static schemas, but a single core does not necessarily have a unique schema since you can have dynamic fields. With dynamic fields, you can have multiple types of documents in the same index (core), and multiple types of indexed fields specific to indiv

Re: NRT and commit behavior

2011-07-18 Thread Yonik Seeley
On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase wrote: > Very glad to hear that NRT is finally here!  But my question is this: will > things still come to a standstill during a commit? New updates can now proceed in parallel with a commit, and searches have always been completely asynchronous w.

Re: NRT and commit behavior

2011-07-18 Thread Jonathan Rochkind
In practice, in my experience at least, a very 'expensive' commit can still slow down searches significantly, I think just due to CPU (or i/o?) starvation. Not sure anything can be done about that. That's my experience in Solr 1.4.1, but since searches have always been async with commits, it p

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread Ahmet Arslan
> I tried this in qf parameter (fieldname~0.8^2) but have > still the same > problem: no results Okay, it is neither qf nor pf. Just the plain q parameter: q=test~0.8
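A small sketch of where the tilde goes, assuming the edismax parser mentioned earlier in this thread. The fuzzy operator is attached to the search term inside `q`, while `qf` only lists the fields to search. The field name `name` and the helper `fuzzy_params` are hypothetical.

```python
from urllib.parse import urlencode


def fuzzy_params(term, similarity, fields):
    """Return request parameters for a fuzzy search on `term`.

    The ~similarity suffix belongs to the term in q, not to the
    field names listed in qf or pf.
    """
    return {
        "defType": "edismax",
        "q": f"{term}~{similarity}",  # e.g. test~0.8
        "qf": " ".join(fields),       # plain field names, optionally with ^boost
    }


params = fuzzy_params("test", 0.8, ["name"])
print("/select?" + urlencode(params))
```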

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread steffen_kmt
iorixxx wrote: > > q=test~0.8 > do you add ~0.8 in the query (http) or in the solrconfig.xml (like field~0.8)? is "test" the fieldName or a search string?

Re: Best practices for Calculationg QPS

2011-07-18 Thread Koji Sekiguchi
(11/07/18 23:35), Siddhesh Shirode wrote: Hi Everyone, I would like to know the best practices or best tools for Calculating QPS in Solr. Thanks. Just an FYI: Admin GUI > STATISTICS > QUERY gives you avgRequestsPerSecond for each request handler. koji -- http://www.rondhuit.com/en/

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread Ahmet Arslan
> > > > q=test~0.8 > > > > do you add ~0.8 in the query (http) or in the > solrconfig.xml (like name="q">field~0.8)? mostly in the http. > is "test" the fieldName or a search string? search string. Do you have another use case?

Re: NRT and commit behavior

2011-07-18 Thread Mark Miller
I've written a blog post on some of the recent improvements that explains things a bit: http://www.lucidimagination.com/blog/2011/07/11/benchmarking-the-new-solr-%E2%80%98near-realtime%E2%80%99-improvements/ On Jul 18, 2011, at 10:53 AM, Nicholas Chase wrote: > Very glad to hear that NRT is fin

Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps

2011-07-18 Thread Mark Miller
Hey Nagendra - I don't mind seeing these external project announces here (though you might keep Solr related announces off the Lucene user list), but please word these announces so that users are not confused that this is an Apache release, and that it is an external project built on top of Apac

Join performance?

2011-07-18 Thread Kanduru, Ajay (NIH/NLM/LHC) [C]
I am trying to optimize performance of solr with our collection. The collection has 208M records with index size of about 80GB. The machine has 16GB and I am allocating about 14GB to solr. I am using self join statement in filter query like this: q=(general search term) fq={!join from=join_field

Re: I found a sorting bug in solr/lucene

2011-07-18 Thread Chris Hostetter
: When I try to sort by a column with a colon in it like : "scores:rails_f", solr has cutoff the column name from the colon : forward so "scores:rails_f" becomes "scores" Yes, this bug was recently reported against the 3.x line, but no fix has yet been identified... https://issues.apache.org/j

Re: Join performance?

2011-07-18 Thread Yonik Seeley
On Mon, Jul 18, 2011 at 12:48 PM, Kanduru, Ajay (NIH/NLM/LHC) [C] wrote: > I am trying to optimize performance of solr with our collection. The > collection has 208M records with index size of about 80GB. The machine has > 16GB and I am allocating about 14GB to solr. > > I am using self join sta

Solr and External Fields

2011-07-18 Thread Jamie Johnson
I recently modified the DefaultSolrHighlighter to support external fields, but is there a way to do this for solr itself? I'm looking to store a field in an external store and give Solr access to that field. Where in Solr would I do this?

Re: Extending Solr Highlighter to pull information from external source

2011-07-18 Thread Jamie Johnson
I haven't seen any interest in this, but for anyone following, I updated the alternateField logic to support pulling from the external field if available. Would be useful to know how to get solr to use this external field provider in general so we wouldn't have to modify the highlighter at all, ju

Specify the length for returned highlighted fields

2011-07-18 Thread Jamie Johnson
Is there a way to specify the length of the text that should come back from the highlighter? For instance I have a field that is 500k, I want only the first 100 characters. I don't see anything like this now, does it exist?

Solr search starting with 1 character spin endlessly

2011-07-18 Thread Timothy Tagge
Solr version: 1.4.1 I'm having some trouble with certain queries run against my Solr index. When a query starts with a single letter followed by a space, followed by another search term, the query runs endlessly and never comes back. An example problem query string... /customer/select/?q=name%

Re: Solr search starting with 1 character spin endlessly

2011-07-18 Thread Yonik Seeley
On Mon, Jul 18, 2011 at 3:44 PM, Timothy Tagge wrote: > Solr version:  1.4.1 > > I'm having some trouble with certain queries run against my Solr > index.  When a query starts with a single letter followed by a space, > followed by another search term, the query runs endlessly and never > comes ba

Searching for strings

2011-07-18 Thread Chip Calhoun
Is there a way to search for a specific string using Solr, either by putting it in quotes or by some other means? I haven't been able to do this, but I may be missing something. Thanks, Chip

RE: Analysis page output vs. actually getting search matches, a discrepency?

2011-07-18 Thread Robert Petersen
OK I did what Hoss said, it only confirms I don't get a match when I should and that the query parser is doing the expected. Here are the details for one test sku. My analysis page output is shown in my email starting this thread and here is my query debug output. This absolutely should match bu

Re: Searching for strings

2011-07-18 Thread Rob Casson
chip, gonna need more information about your particular analysis chain, content, and example searches to give a better answer, but phrase queries (using quotes) are supported in both the standard and dismax query parsers that being said, lots of things may not match a person's idea of an exact st

Re: Payload doesn't apply to WordDelimiterFilterFactory-generated tokens

2011-07-18 Thread Chris Hostetter
: It seems that the payloads are applied only to the original word that I : index and the WordDelimiterFilter doesn't apply the payloads to the tokens : it generates. I believe you are correct. I think the general rule for most TokenFilters that you will find in Lucene/Solr is that they don't t

Re: Max Rows

2011-07-18 Thread Chris Hostetter
: Works like a charm, but there is one problem, the maxrows attribute is set : to 10, this means 10 results per page, but when you put several collections : in the attribute, what it does is that adds 10 results per page per : collections, so if a have 4 comma separated collections the max rows fo

defType argument weirdness

2011-07-18 Thread Naomi Dushay
I found a weird behavior with the Solr defType argument, perhaps with respect to default queries? defType=dismax&q=*:* no hits q={!defType=dismax}*:* hits defType=dismax hits Here is the request handler, which I explicitly indicate: lucene

Re: solr scale on trie fields

2011-07-18 Thread Chris Hostetter
There are a few things here that i think you might be misunderstanding... : function, but i read in solr book (Solr 1.4 enterprise search server by Eric : Pugh and David Smiley) that "*scale will traverse the entire document set : and evaluate the function to determine the smallest and largest va

Re: difference between shard and core in solr

2011-07-18 Thread Chris Hostetter
http://wiki.apache.org/solr/SolrTerminology : hi , : : i just wanna be clear in the concepts of core and shard ? : : a single core is an index with same schema , is this wat core really is ? : : can a single core contain two separate indexes with different schema in it ? : : Is a shard re

Re: Stored Field

2011-07-18 Thread Erick Erickson
Well, it all depends upon what you mean by "size" ... This page http://lucene.apache.org/java/3_0_2/fileformats.html#file-names explains what goes where in the files created by Lucene. The point is that the raw text (i.e. *stored* data) is put in separate files from the indexed (i.e. searched) data

Re: Data Import from a Queue

2011-07-18 Thread Erick Erickson
This is a really cryptic problem statement. you might want to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Jul 15, 2011 at 1:52 PM, Brandon Fish wrote: > Does anyone know of any existing examples of importing data from a queue > into Solr? > > Thank you. >

Re: Index rows with NULL value

2011-07-18 Thread Erick Erickson
Please provide some more context here, there's nothing really to go on. It might help to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Jul 15, 2011 at 9:58 PM, Ruixiang Zhang wrote: > Hi > > It seems that solr does not index a row when some column of this row has > NU

RE: ' invisible ' words

2011-07-18 Thread Robert Petersen
Read my thread " RE: Analysis page output vs. actually getting search matches, a discrepancy?" and see if it is not somewhat like your problem... even if not, there might be something to help as to how to figure out what is going on in your case... -Original Message- From: deniz [mailto:de

Re: ' invisible ' words

2011-07-18 Thread Erick Erickson
Deniz: Can you create a self-contained test case that illustrates the problem? In reality, I suspect that you're doing something ever-so-slightly different, didn't remove your index between tests, whatever (I know I've gotten myself completely messed up after working on something for hours!). Mak

Re: XInclude Multiple Elements

2011-07-18 Thread Chris Hostetter
: I see this post: : http://lucene.472066.n3.nabble.com/including-external-files-in-config-by-corename-td698324.html : that implies you can use #xpointer(/*/node()) to get all elements of : the root node (like if I changed my example to only have one include, : and just used multiple files, which

Re: Best practices for Calculationg QPS

2011-07-18 Thread Erick Erickson
Measure. On an index with your real data with real queries ... Being a smart-aleck aside, using something like jMeter is useful. The basic idea is that you can use such a tool to fire queries at a Solr index, configuring it with some number of threads that all run in parallel, and keep upping the

Re: Specify the length for returned highlighted fields

2011-07-18 Thread Erick Erickson
Does hl.fragsize work in your case? Best Erick On Mon, Jul 18, 2011 at 3:16 PM, Jamie Johnson wrote: > Is there a way to specify the length of the text that should come back > from the highlighter?  For instance I have a field that is 500k, I > want only the first 100 characters.  I don't see an
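A rough sketch of a highlighting request that uses `hl.fragsize`, which sets the approximate fragment size in characters. Note that fragments are taken around matching terms, so this is not the same as returning the first 100 characters of the field. The field name `content` and the helper `highlight_params` are assumptions for illustration.

```python
from urllib.parse import urlencode


def highlight_params(query, field, fragsize=100):
    """Return request parameters that ask the highlighter for short fragments."""
    return {
        "q": query,
        "hl": "true",            # turn highlighting on
        "hl.fl": field,          # field(s) to highlight
        "hl.fragsize": fragsize, # approximate fragment length in characters
    }


params = highlight_params("solr", "content")
print("/select?" + urlencode(params))
```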

Re: Question about optimization

2011-07-18 Thread Chris Hostetter
: I saw this in the Solr wiki : "An un-optimized index is going to be *at : least* 10% slower for un-cached queries." : Is this still true? I read somewhere that recent versions of Lucene were : less sensitive to un-optimized indexes than in the past... correct. I've removed that specific statem

Re: Analysis page output vs. actually getting search matches, a discrepency?

2011-07-18 Thread Erick Erickson
Hmmm, is there any chance that you're stemming one place and not the other? And I infer from your output that your default search field is "moreWords", is that true and expected? You might use luke or the TermsComponent to see what's actually in the index, I'm going to guess that you'll find "ster

Re: defType argument weirdness

2011-07-18 Thread Erick Erickson
What are qf_dismax and pf_dismax? They are meaningless to Solr. Try adding &debugQuery=on to your URL and you'll see the parsed query, which helps a lot here. If you change these to the proper dismax values (qf and pf) you'll get better results. As it is, I think you'll see output like: +() ()

Re: Best practices for Calculationg QPS

2011-07-18 Thread Lance Norskog
Easiest way to count QPS: Take one solr log file. Make it have date stamps and log entries on the same line. Grab all lines with 'qTime='. Strip these lines of all text after the timestamp. Run this Unix program to get counts in line of how many times each timestamp appears in a row: uniq -c
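The counting step described here can be sketched in Python instead of a shell pipeline. This is a rough illustration under stated assumptions: each log entry sits on one line, starts with a 19-character `YYYY-MM-DD HH:MM:SS` timestamp, and query lines contain `QTime=` (matched case-insensitively). The sample log lines are made up for the example.

```python
from collections import Counter


def queries_per_second(lines, marker="qtime=", timestamp_len=19):
    """Count query log lines per timestamp, like `uniq -c` on the timestamps."""
    counts = Counter()
    for line in lines:
        if marker in line.lower():               # keep only query lines
            counts[line[:timestamp_len]] += 1    # drop everything after the timestamp
    return counts


# Made-up sample data, only to show the shape of the input.
sample_log = [
    "2011-07-18 10:00:01 INFO path=/select params={q=foo} hits=3 status=0 QTime=4",
    "2011-07-18 10:00:01 INFO path=/select params={q=bar} hits=0 status=0 QTime=2",
    "2011-07-18 10:00:02 INFO path=/update status=0",
    "2011-07-18 10:00:02 INFO path=/select params={q=baz} hits=1 status=0 QTime=7",
]

per_second = queries_per_second(sample_log)
for timestamp, count in sorted(per_second.items()):
    print(count, timestamp)
print("peak QPS:", max(per_second.values()))
```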

Re: Deleted docs in IndexWriter Cache (NRT related)

2011-07-18 Thread Nagendra Nagarajayya
Thanks Pravesh! But this is NRT related so commit is not called to update a document. The documents added are available for searches immediately after update and commit is not needed. A commit may be scheduled once in about 15 mins or as needed. Regards, - Nagendra Nagarajayya http://solr-ra.

Re: NRT and commit behavior

2011-07-18 Thread Nagendra Nagarajayya
From one of the users of NRT, their system was freezing with commits at about 1.5 million docs due to the frequency of commits but with NRT (Solr with RankingAlgorithm) update document performance and a commit interval of about 15 mins they no longer have the freeze problem. Regards, - Nagen

Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps

2011-07-18 Thread Nagendra Nagarajayya
Thanks Mark! I made the earlier implementation of NRT with 1.4.1 available to Solr through a JIRA issue: https://issues.apache.org/jira/browse/SOLR-2568 ( I had made available the implementation details through a paper published at http://solr-ra.tgels.com/papers/NRT_Solr_RankingAlgorithm.pdf

Re: Dismax RequestHandler adn Fuzzy Search

2011-07-18 Thread 虞冰
maybe you can read http://wiki.apache.org/solr/DisMaxQParserPlugin 2011/7/19 Ahmet Arslan > > > > > > q=test~0.8 > > > > > > > do you add ~0.8 in the query (http) or in the > > solrconfig.xml (like > name="q">field~0.8)? > > mostly in the http. > > > is "test" the fieldName or a search string?

searching for google+

2011-07-18 Thread Jason Toy
How does one search for the term "google+" with solr? I noticed on twitter I can search for google+: http://search.twitter.com/search?q=google%2B (which uses lucene, not sure about solr) but searching on my copy of solr, I can't search for google+ -- - sent from my mobile 6176064373

Re:searching for google+

2011-07-18 Thread 方振鹏
"google+" or google\+ -- Best wishes James Bond Fang 方 振鹏 Dept. Software Engineering Xiamen University -- Original -- From: "Jason Toy"; Date: Tuesday, July 19, 2011, 10:28 AM To: "solr-user"; Subject: searching for google+ H

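The escaping suggested in the reply above (`google\+`) can be done programmatically. This is a minimal sketch, assuming the set of single characters the classic Lucene query parser treats as special; the helper name `escape_term` is hypothetical. Whether the escaped term then matches also depends on the field's analysis chain, since many tokenizers drop the `+` at index time.

```python
# Characters with special meaning in the classic Lucene query syntax.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_term(term):
    """Backslash-escape query syntax characters in a user-supplied term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


print(escape_term("google+"))
```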
How could I monitor solr cache

2011-07-18 Thread kun xiong
Hi, I am wondering how could I get solr cache running status. I know there is a JMX containing those information. Just want to know what tool or method do you make use of to monitor cache, in order to enhance performance or detect issue. Thanks a lot Kun

Re: SOLR Shard failover Query

2011-07-18 Thread pravesh
Thanx Shawn, >When I first set things up, I was using SOLR-1537 on Solr 1.5-dev. By >the time I went into production, I had abandoned that idea and rolled >out a stock 1.4.1 index with two complete server chains, each with 7 >shards. So, Both 2 chains were configured under cluster in load bala

Re: How could I monitor solr cache

2011-07-18 Thread pravesh
This might be of some help: http://wiki.apache.org/solr/SolrJmx Thanx Pravesh

Re: defType argument weirdness

2011-07-18 Thread William Bell
dismax does not work with q=*:* defType=dismax&q=*:* no hits You need to switch this to: defType=dismax&q.alt=*:* no hits On Mon, Jul 18, 2011 at 8:44 PM, Erick Erickson wrote: > What are qf_dismax and pf_dismax? They are meaningless to > Solr. Try adding &debugQuery=on to your U
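A short sketch of the switch suggested here. The dismax parser treats `q` as plain user input rather than Lucene syntax, so a match-all query goes into `q.alt`, which is used when `q` is absent. The helper name `match_all_dismax` is hypothetical.

```python
from urllib.parse import urlencode


def match_all_dismax(rows=10):
    """Return parameters for a match-all request through the dismax parser."""
    return {
        "defType": "dismax",
        "q.alt": "*:*",  # fallback query, used because no q is sent
        "rows": rows,
    }


print("/select?" + urlencode(match_all_dismax()))
```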

Re: How could I monitor solr cache

2011-07-18 Thread Ahmet Arslan
> I am wondering how could I get solr cache running status. I > know there is a > JMX containing those information. > > Just want to know what tool or method do you make use of to > monitor cache, > in order to enhance performance or detect issue. You might find this interesting : http://semate

solr slave's performance issue after replicate the optimized index

2011-07-18 Thread 虞冰
Hi all, I have a performance issue. I do an optimize on the Solr master every night, but starting about a month ago, every time the slaves get the new optimized index, system CPU usage rises from 0.3 - 0.5% to 7 - 10% (daily average), and the servers' load average also becomes 2 times higher than normal.t