Re: Start With and contain search

2014-12-02 Thread melb
Thanks, I think NGramFilterFactory is the right filter; I will try it today. But what if I want to search for the query *dom host* and get the result *domhost*? -- View this message in context: http://lucene.472066.n3.nabble.com/Start-With-and-contain-search-tp4171854p4172031.html Sent from th

Get list of collection

2014-12-02 Thread Ankit Jain
Hi All, I need to get the list of *collections* available in Solr. We are using the SolrJ library. I am able to fetch the list of cores but cannot find a way to fetch the list of collections. Below is the sample code that I am using to fetch the cores: CoreAdminRequest reque

Slow queries

2014-12-02 Thread melb
Hi, I have a Solr collection with 16 million documents, growing daily with 1 documents. Recently it has become slow to answer my requests (several seconds), especially when I use multi-word queries. I am running Solr on a machine with 32G RAM, but a heavily used one. What are my options to optimize

Re: Slow queries

2014-12-02 Thread Siegfried Goeschl
If your performance was fine but degraded over time, it might be easier to check / increase the memory to get better disk caching. Cheers, Siegfried Goeschl On 02.12.14 09:27, melb wrote: Hi, I have a solr collection with 16 millions documents and growing daily with 1 documents recen

Re: Slow queries

2014-12-02 Thread melb
Yes, performance degraded over time. I can raise the memory, but I can't do it every time, and the volume will keep growing. Is it better to put Solr on a dedicated machine? Is there anything else that can be done to the Solr instance, for example dividing the collection? rgds,

Re: Different update handlers for auto commit configuration

2014-12-02 Thread danny teichthal
Thanks for the clarification, I indeed mixed it with UpdateRequestHandler. On Mon, Dec 1, 2014 at 11:24 PM, Chris Hostetter wrote: > > : I thought that the auto commit is per update handler because they are > : configured within the update handler tag. > > is not the same thing as a that does

Replication of a corrupt master index

2014-12-02 Thread Charra, Johannes
Hi, If I have a master/slave setup and the master index gets corrupted, will the slaves realize they should not replicate from the master anymore, since the master does not have a newer index version? I'm using Solr version 4.2.1. Regards, Johannes

Re: SOLR not starting after restart 2 node cloud setup

2014-12-02 Thread Doss
Dear Erick, Thanks for your thoughts; they helped me a lot. In my instances no Solr logs were being appended to catalina.out. I have now put the log4j.properties file in place, so Solr logs are captured in the solr.log file, and with its help I found the reason for the issue. I am starting Tomcat with the option -Dboo

Re: Getting the position of a word via Solr API

2014-12-02 Thread adfel70
Small update: I have managed to get the Term Vector working and I am getting all the words of the text field. The problem is that it doesn't work with several words combined; I can't find the offset where the needed expression starts... Any ideas, anyone? Thanks!

Re: Start With and contain search

2014-12-02 Thread Alexandre Rafalovitch
It's not clear what you actually mean with that space. Do you mean any two words should try to match as if they were one? What's the business-level description of what you are trying to do? Also, you are not reinventing https://domainr.com/ , are you? If you are, search around, I think they had so

Re: Start With and contain search

2014-12-02 Thread melb
Yes, this is exactly what I am trying to do, but with a less extensive database. Can I do it with Solr? rgds, -- View this message in context: http://lucene.472066.n3.nabble.com/Start-With-and-contain-search-tp4171854p4172105.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Start With and contain search

2014-12-02 Thread Alexandre Rafalovitch
Well, if all you are doing is substring searches, then Solr could be overkill. But if you are doing a search and then want to do faceting or additional queries, then Solr is a good bet. And yes, it can do it; you just need to really understand your input patterns and what you want to find with them.
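
To make the n-gram idea from this thread concrete, here is a small Python sketch (not Solr code) of what an n-gram filter does conceptually: it emits every substring of a token within a length range, so a "contains" lookup becomes an exact term match. The function name `ngrams` and the minimum and maximum sizes are assumptions chosen for illustration, and stripping the space from "dom host" is only one possible way to treat that query.

```python
def ngrams(token, min_size=3, max_size=7):
    """Return all substrings of token with length in [min_size, max_size]."""
    grams = set()
    for size in range(min_size, max_size + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams

indexed = ngrams("domhost")
# Each query word is a substring of the indexed token, so each is one of its grams.
print("dom" in indexed, "host" in indexed)
# Removing the space yields the whole token, which is also a gram (length 7).
print("dom host".replace(" ", "") in indexed)
```

The trade-off is index size: every token expands into many grams, which is why the min and max sizes are worth tuning against real input patterns.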

Re: Get list of collection

2014-12-02 Thread Erick Erickson
I think you want CloudSolrServer.getCollectionList() Best, Erick On Tue, Dec 2, 2014 at 12:27 AM, Ankit Jain wrote: > Hi All, > > I have a requirement to get the list of *collection* available in Solr. we > are using solrj library. > > I am able to fetch the list of cores but not getting ways to
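
As a possible alternative to going through SolrJ, SolrCloud also exposes a Collections API over HTTP, and later 4.x releases include a LIST action there. Whether it is available depends on the Solr version in use, so treat this as something to verify. The Python sketch below only builds the request URL; the host, port, and helper name are hypothetical placeholders.

```python
from urllib.parse import urlencode

def collections_list_url(base_url="http://localhost:8983/solr"):
    """Build a Collections API LIST request URL (base_url is a placeholder)."""
    params = {"action": "LIST", "wt": "json"}
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)

print(collections_list_url())
```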

Re: Slow queries

2014-12-02 Thread Erick Erickson
bq: Is it better to put the solr on dedicated machine? Yes, absolutely. Solr _likes_ memory, and on a machine with lots of other processes you'll keep running into this problem. FWIW, I've seen between 10M and 300M docs fit into 16G for the JVM. But see Uwe's excellent blog on MMapDirectory and n

Re: Replication of a corrupt master index

2014-12-02 Thread Erick Erickson
No. The master is the master and will always stay the master unless you change it. This is one of the reasons I really like to keep the original source around in case I ever have this problem. Best, Erick On Tue, Dec 2, 2014 at 2:34 AM, Charra, Johannes wrote: > > Hi, > > If I have a master/sla

Re: SOLR not starting after restart 2 node cloud setup

2014-12-02 Thread Erick Erickson
Glad you found a solution! Best, Erick On Tue, Dec 2, 2014 at 4:30 AM, Doss wrote: > Dear Erick, > > Thanks for your thoughts, it helped me a lot. In my instances no solr logs > are appended in to catalina.out. > > Now I placed the log4j.properties file. Solr logs are captured in solr.log > file

Re: Slow queries

2014-12-02 Thread Siegfried Goeschl
It might be a good idea to
* move SOLR to a dedicated box :-)
* load your SOLR server with 20.000.000 documents (the estimated number of documents after three years) and do performance testing & tuning
Afterwards you have some hard facts about hardware sizing and expected performance for the ne

AW: Replication of a corrupt master index

2014-12-02 Thread Charra, Johannes
Thanks for your response, Erick. Do you think it is possible to corrupt an index merely with HTTP requests? I've been using the aforementioned m/s setup for years now and have never seen a master failure. I'm trying to think of scenarios where this setup (1 master, 4 slaves) might have a tota

Find duplicates

2014-12-02 Thread Peter Kirk
Hi, Is it possible to formulate a Solr query which finds all documents which have the same value in a particular field? Note, I don't know what the value is, I just want to find all documents with duplicate values.
For example, I have 5 documents:
Doc1: field Name = Peter
Doc2: field Name = Jac

Re: Find duplicates

2014-12-02 Thread Erik Hatcher
Sort of… if you indexed the full value of the field (and you’re looking for truly exact matches) as a string field type you could facet on that field with facet.mincount=2 and the facets returned would be the ones with duplicate values. You’d have to drill down on each of the facets returned to
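
The facet approach described above can be sketched as a set of request parameters. This Python snippet only assembles the query string; the helper name is hypothetical, and `rows=0` and `facet.limit=-1` are assumptions added here (to skip the document list and to lift the default cap on returned facet values), not something stated in the thread.

```python
from urllib.parse import urlencode

def duplicate_facet_params(field):
    """Query-string parameters that facet on `field`, keeping values seen 2+ times."""
    params = [
        ("q", "*:*"),            # match every document
        ("rows", 0),             # assumption: only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),  # should be a string (untokenized) field
        ("facet.mincount", 2),   # drop values that occur only once
        ("facet.limit", -1),     # assumption: return all values, not just the top ones
    ]
    return urlencode(params)

print(duplicate_facet_params("Name"))
```

Each facet value returned this way is a duplicated value; a follow-up query per value is still needed to fetch the documents themselves.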

RE: Find duplicates

2014-12-02 Thread Gonzalo Rodriguez
Have you tried using result grouping for your query? There are some very good examples in the wiki: https://wiki.apache.org/solr/FieldCollapsing Gonzalo -Original Message- From: Peter Kirk [mailto:p...@alpha-solutions.dk] Sent: Tuesday, December 02, 2014 9:58 AM To: solr-user@lucene.a
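
As a rough sketch of the result-grouping route: grouping by itself returns every group, including those with a single document, so the duplicates still have to be picked out on the client. The Python below builds the grouping parameters and filters a grouped JSON response; the helper names and the sample response values are made up for illustration.

```python
from urllib.parse import urlencode

def grouping_params(field, per_group=10):
    """Query-string parameters for grouping all documents by one field."""
    return urlencode([
        ("q", "*:*"),
        ("group", "true"),
        ("group.field", field),
        ("group.limit", per_group),  # documents returned per group
        ("wt", "json"),
    ])

def duplicate_groups(response, field):
    """Keep only groups holding more than one document."""
    groups = response["grouped"][field]["groups"]
    return {g["groupValue"]: g["doclist"]["numFound"]
            for g in groups if g["doclist"]["numFound"] > 1}

# Hand-written sample in the shape of a grouped JSON response (values are made up).
sample = {"grouped": {"Name": {"matches": 3, "groups": [
    {"groupValue": "Peter", "doclist": {"numFound": 2, "docs": []}},
    {"groupValue": "Anna", "doclist": {"numFound": 1, "docs": []}},
]}}}
print(grouping_params("Name"))
print(duplicate_groups(sample, "Name"))
```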

spellchecker returns correctlySpelled=true if one term in phrase is correctly spelled

2014-12-02 Thread Tao, Jing
Hi, It seems that when I do a phrase search, SOLR's spellchecker would return correctlySpelled=true if at least one term in the phrase was correctly spelled. For example: If I search for "soriasis treatment", SOLR returns over 8000 search results for "treatment", correctlySpelled: true, and a sp

Re: Find duplicates

2014-12-02 Thread Alexandre Rafalovitch
And if I am correct, enabling docValues will do this kind of grouping as part of the indexing with docValues data structure (per segment). So, all one has to do is to get it back (through faceting). Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newslett

Re: Contextual search

2014-12-02 Thread ASHOK SARMAH
Hi Alex, thanks. I was able to get the suggestion for "thri book" as "the book of three". But when I search for "threebook" (three and book are now combined), I am not able to get the suggestion for "a book of three". How can we solve this? On 01-Dec-2014 9:34 PM, "Alexandre Rafalovitch" wrote: > If

Re: SOLR Join Query, Use highest weight.

2014-12-02 Thread Darin Amos
Thanks! I will take a look at this. I do have an additional question, since after a bunch of digging I believe I am going to run into another dead end. I want to execute the join (or rollup) query, but I want the facets to represent the facets of all the child documents, not the resulting produ

Tika HTTP 400 Errors with DIH

2014-12-02 Thread Teague James
Hi all, I am using Solr 4.9.0 to index a DB with DIH. In the DB there is a URL field. In the DIH Tika uses that field to fetch and parse the documents. The URL from the field is valid and will download the document in the browser just fine. But Tika is getting HTTP response code 400. Any ideas why

Re: Contextual search

2014-12-02 Thread Alexandre Rafalovitch
Well, how would you expect it to solve it - in non-technical terms. What's the high level description of "book of three" matching "threebook" and not say "threeof"? Random permutation of any two words? It's a bit of a strange requirement so far. Regards, Alex. Personal: http://www.outerthoughts

Re: Tika HTTP 400 Errors with DIH

2014-12-02 Thread Alexandre Rafalovitch
On 2 December 2014 at 13:19, Teague James wrote: > clob="true" What is ClobTransformer doing on the DownloadURL field? Is it possible it is corrupting the value somehow? Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-s

Re: SOLR Join Query, Use highest weight.

2014-12-02 Thread Michael Sokolov
Have you considered using grouping? If I understand your requirements, I think it does what you want. https://cwiki.apache.org/confluence/display/solr/Result+Grouping On 12/02/2014 12:59 PM, Darin Amos wrote: Thanks! I will take a look at this. I do have an additional question, since after a

Re: Getting the position of a word via Solr API

2014-12-02 Thread Michael Sokolov
I would keep trying with the highlighters. Some of them, at least, have options to provide an external text source, although you will almost certainly have to write some java code to get this working; extend the highlighter you choose and supply its text from an external source. -Mike On 12

Re: SOLR Join Query, Use highest weight.

2014-12-02 Thread Darin Amos
Hi, Thanks for the response. I have considered grouping often, but grouping does not return the parent document, just the group ID. I would still have to add something to take the group IDs and get the parent documents. Thanks Darin > On Dec 2, 2014, at 2:11 PM, Michael Sokolov > wrote: >

Re: SOLR Join Query, Use highest weight.

2014-12-02 Thread Michael Sokolov
We simply index parent and child documents with the same field value, and group on that, querying both parent and child documents. If you boost the parent it will show up as the first result in the group. Then you get all related documents together in the same group. -Mike On 12/02/2014 02:

Solr collection alias - how rank is affected

2014-12-02 Thread SolrUser1543
Solr allows creating an alias for a few collections via its API. Suppose I have two collections, C1 & C2, and an alias C3 = C1, C2. C1 and C2 are deployed on different machines but share a ZooKeeper. How is ranking affected when searching the C3 collection? When they have the same schema? A different schema?

indexing numbers in texts for range queries

2014-12-02 Thread Mikhail Khludnev
Hello Searchers, Do you remember any examples of indexing numbers inside plain text? E.g., if I have a text "foo and 10 bars", I want to find it with a query like foo [8 TO 20] bars. Question no. 1 is whether to put trie terms into a separate field or whether they can reside in the same text one. No

Re: indexing numbers in texts for range queries

2014-12-02 Thread Michael Sokolov
Mikhail - I can imagine a filter that strips out everything but numbers and then indexes those with a (separate) numeric (trie) field. But I don't believe you can do phrase or other proximity queries across multiple fields. As long as an or-query is good enough, I think this problem is not to
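
The idea of stripping out everything but the numbers can be illustrated outside Solr. This Python sketch is not an analysis filter, only a model of the behaviour: numbers go into a separate numeric "field", words stay in the text "field", and the two are combined without any position check. All names are hypothetical.

```python
import re

NUMBER = re.compile(r"\d+")

def extract_numbers(text):
    """Model of the numeric field: every integer found in the text."""
    return [int(m) for m in NUMBER.findall(text)]

def matches(text, words, low, high):
    """True if all words occur and some number falls within [low, high]."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    has_words = all(w in tokens for w in words)
    has_number = any(low <= n <= high for n in extract_numbers(text))
    return has_words and has_number

print(matches("foo and 10 bars", ["foo", "bars"], 8, 20))
# Positions are not checked, so word order and proximity do not matter here.
print(matches("10 bars and foo", ["foo", "bars"], 8, 20))
```

Both calls match, which shows the limitation raised above: with the numbers in a separate field, phrase or proximity constraints across the two fields are lost.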

Re: indexing numbers in texts for range queries

2014-12-02 Thread Mikhail Khludnev
Hello Michael, On Tue, Dec 2, 2014 at 11:15 PM, Michael Sokolov < msoko...@safaribooksonline.com> wrote: > Mikhail - I can imagine a filter that strips out everything but numbers > and then indexes those with a (separate) numeric (trie) field. But I don't > believe you can do phrase or other pro

Re: indexing numbers in texts for range queries

2014-12-02 Thread Michael Sokolov
On 12/02/2014 03:41 PM, Mikhail Khludnev wrote: Thanks for suggestions. Do I remember correctly that you ignored last Lucene Revolution? I wouldn't say I ignored it, but it's true I wasn't there in DC: I'm excited to catch up on the presentations as the videos become available, though. -Mike

Re: indexing numbers in texts for range queries

2014-12-02 Thread Ahmet Arslan
Hi Mikhail, Range queries are allowed inside phrases with ComplexPhraseQParser, but I think string order is used. Also LUCENE-5205 / SOLR-5410 is meant to supersede complex phrase. It might have that functionality too. Ahmet On Tuesday, December 2, 2014 10:43 PM, Mikhail Khludnev wrote: Hel

Re: Replication of a corrupt master index

2014-12-02 Thread Erick Erickson
If nothing else, the disk underlying the index could have a bad spot... There have been some corrupt index bugs in the past, but they always get a super-high priority for fixing so don't hang around for long. You can always take periodic backups. Perhaps the slickest way to do that is to set up a

Re: Contextual search

2014-12-02 Thread ASHOK SARMAH
Hi Alex, I have specified the following in my solrconfig.xml :: on true 5 2 5 true true 5 3 wordbreak 5 I have written wordbreak 5 to break words with minimum length 5. Then it should break my word threebook into three and book, right? corre

Re: Contextual search

2014-12-02 Thread ASHOK SARMAH
Hi Alex, I have specified these in the solrconfig.xml as:: on true 5 2 5 true true 5 3 wordbreak 5 . The lines wordbreak 5 are for breaking the word threebook into three and book. But even then it is not searching for the string "A book of t

Re: Contextual search

2014-12-02 Thread Alexandre Rafalovitch
Sorry, beyond my area of expertise now. Hopefully somebody else will pitch in. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853