RE: solr 4.0 - pagination

2010-12-20 Thread Grijesh.singh
Then what happens when we filter out only some results and want to group? How will your index-time group count help? - Grijesh -- View this message in context: http://lucene.472066.n3.nabble.com/solr-4-0-pagination-tp1812384p2124747.html Sent from the Solr - User mailing list archive at Nabb

Recap on derived objects in Solr Index, 'schema in a can'

2010-12-20 Thread Dennis Gearon
Based on more searches and manual consolidation, I've put together some of the ideas for this already suggested in a summary below. The last item in the summary seems to be interesting, low technical cost way of doing it. Basically, it treats the index like a 'BigTable', a la "No SQL". Erick Er

Re: Lower level filtering

2010-12-20 Thread Stephen Green
On Wed, Dec 15, 2010 at 9:57 AM, Stephen Green wrote: > Otis pointed out that the patch can't be applied against the current > source, so I need to go back and make it work with the current source > (new job = no time).  I'll see if I can find the time this weekend to > do this. OK, I just submit

Re: [Nutch] and Solr integration

2010-12-20 Thread Adam Estrada
bin/nutch crawl urls -dir crawl -threads 10 -depth 100 -topN 50 -solrindex http://localhost:8983/solr I've run that command before and it worked...that's why I asked. grab nutch from trunk and run bin/nutch and see that it is in fact an option. It looks like Hadoop is the culprit now and I am at

Re: shard versus core

2010-12-20 Thread Lance Norskog
2x the index size is required for optimizing. Things that increase with index size: indexing time, query time and disk index size. My 500GB index at a previous job worked. Indexing was a little slow, queries were much slower. What finally made us split it up was that one binary blob of 500GB was t

Re: master master, repeaters

2010-12-20 Thread Lance Norskog
Ah, thanks for pointing that out. Each indexer needs its own marker for "where is new data in this stream"? This way, when either the primary or secondary starts, it can restart indexing from where it left off. The most reliable way to do this is to search the indexer Solr for its last update. A
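
A minimal sketch of the "search the indexer Solr for its last update" idea above, assuming every document carries a sortable date field. The field name `timestamp` and the base URL are assumptions for illustration, not something the thread specifies. It only builds the select URL; the newest document's timestamp would then be read from the single returned row.

```python
from urllib.parse import urlencode


def last_update_query(base_url, timestamp_field="timestamp"):
    """Build a select URL that asks Solr for the single newest document.

    `timestamp_field` is a hypothetical field name; use whatever date
    field the indexer actually writes.
    """
    params = {
        "q": "*:*",                          # match every document
        "sort": f"{timestamp_field} desc",   # newest first
        "rows": 1,                           # only the latest one is needed
        "fl": timestamp_field,               # return just the marker field
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(last_update_query("http://localhost:8983/solr"))
```

On restart, an indexer could issue this query against its own Solr and resume reading the input stream from the returned value.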

Re: A schema inside a Solr Schema (Schema in a can)

2010-12-20 Thread Dennis Gearon
Here is a thread on this subject that I did not find earlier. Sometimes discussion, thought, and 'mulling' in the subconscious gets me better Google searches. http://lucene.472066.n3.nabble.com/multi-valued-associated-fields-td811883.html Dennis Gearon Signature Warning It is

RE: solr 4.0 - pagination

2010-12-20 Thread phpcip
Well, right now, I'm using SOLR in a LOT of my projects. I'm VERY fond of it, proud of it and VERY happy that such a team exists to make it work. Of course the pagination issue is a bit frustrating on the field collapsing... But... heck... I'm currently de-normalizing my postgresql database and..

Mime type for JSON

2010-12-20 Thread Emmanuel Bégué
Hello, When using a writer type of "json", SOLR (1.4.1) sets the content type header of the response as "text/plain" although it should be "application/json". This is not a very big problem, but it writes many warnings in Chrome logs: "Resource interpreted as script but transferred with MIME type

Re: [Nutch] and Solr integration

2010-12-20 Thread Anurag
Why are you using solrindex in the argument? It is used when we need to index the crawled data in Solr. For more, read http://wiki.apache.org/nutch/NutchTutorial . Also, for nutch-solr integration this is a very useful blog: http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ I integrated nutch an

[Nutch] and Solr integration

2010-12-20 Thread Adam Estrada
All, I have a couple websites that I need to crawl and the following command line used to work I think. Solr is up and running and everything is fine there and I can go through and index the site but I really need the results added to Solr after the crawl. Does anyone have any idea on how to make

Re: A schema inside a Solr Schema (Schema in a can)

2010-12-20 Thread Dennis Gearon
Thanks James. So being accurate with fields with fields (multivalues) is probably not possible using all the currently made analyzers. - Original Message From: "Dyer, James" To: "solr-user@lucene.apache.org" Sent: Mon, December 20, 2010 7:16:43 AM Subject: RE: A schema inside a Sol

Re: Reg blank values ( ) tags in SOLR XML

2010-12-20 Thread Markus Jelsma
No. But why is it a problem? A standard XML parser won't feel the difference. > Hi, > > In SOLR XML the blank spaces are displayed with just tags > > Is there a way I can make SOLR XML to display the blank values as > > > > instead of just > > > > Also has anyone parsed the blank value ta

Re: Syncing 'delta-import' with 'select' query

2010-12-20 Thread Juan Manuel Alvarez
Oops! That seems to be the problem, since I am using 1.4. Thanks! Juan M. On Tue, Dec 14, 2010 at 8:40 PM, Alexey Serba wrote: > What Solr version do you use? > > It seems that sync flag has been added to 3.1 and 4.0 (trunk) branches > and not to 1.4 > https://issues.apache.org/jira/browse/SOLR-

Re: about groups of random results + alphabetical result

2010-12-20 Thread Paula C. Laun : Dataprisma
There's another problem, I'm not sure I was clear: I need these records randomized, with each level randomized on its own (one level cannot be shuffled together with another level). Is that possible in the same request? Um Abraço, Paula C. Laun : Dataprisma pa...@dataprisma.com.br (47) 3035.1868 www.dataprisma.com.br

Reg blank values ( ) tags in SOLR XML

2010-12-20 Thread bbarani
Hi, In SOLR XML the blank spaces are displayed with just tags Is there a way I can make SOLR XML to display the blank values as instead of just Also has anyone parsed the blank value tags using SOLRNET before? If anyone can help me with my question or provide pointers it would be of g

Re: about groups of random results + alphabetical result

2010-12-20 Thread Paula C. Laun : Dataprisma
"brasil" will return companies with this word in any part of their name. This search (randomized in 4 different levels) is only for promoted records (1 records to be searched at all). Free records (10 million) are the fifth level and will follow the common search mode. Um Abraço, Paula C. Laun

Re: about groups of random results + alphabetical result

2010-12-20 Thread Walter Underwood
The problem happens with any common word, not just short words. What happens with "Brasil"? If this was a good way to do search, Solr would already implement it. It is not that hard to build. But it is not a good way to do search. I have been working on search for almost 15 years, and I hear th

Re: about groups of random results + alphabetical result

2010-12-20 Thread Paula C. Laun : Dataprisma
thank you for your help... this search will be published in Portuguese, and in this language we can clean up the sentence from words shorter than 3 characters. Paula C. Laun : Dataprisma pa...@dataprisma.com.br (47) 3035.1868 www.dataprisma.com.br - Original Message - From: "Walter Unde

Re: about groups of random results + alphabetical result

2010-12-20 Thread Walter Underwood
You probably do not want this ranking, because any query with a common word, like "the", will match most of the corpus in step two. Instead, use Solr to weight better quality matches more heavily, maybe 4X for exact matches, 2X for stemmed matches, and 1X for phonetic matches. wunder On Dec 20
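
The weighting suggested above (4X exact, 2X stemmed, 1X phonetic) maps naturally onto per-field boosts in a dismax `qf` parameter. The sketch below is only an illustration: the field names `name_exact`, `name_stem`, and `name_phonetic` are hypothetical and assume the schema indexes the same text three ways.

```python
def build_qf(boosts):
    """Render {field: boost} as a dismax qf string, e.g. 'a^4 b^2'."""
    return " ".join(f"{field}^{weight}" for field, weight in boosts.items())


# Hypothetical fields: the same company name analyzed three different ways.
BOOSTS = {"name_exact": 4, "name_stem": 2, "name_phonetic": 1}

params = {
    "defType": "dismax",     # use the dismax query parser
    "q": "brasil",
    "qf": build_qf(BOOSTS),  # better-quality matches weigh more
}

if __name__ == "__main__":
    print(params["qf"])
```

With boosts like these, a common word still matches many documents, but documents matching on the exact field rank ahead of those matching only on the looser fields.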

Re: [Reload-Config] not working

2010-12-20 Thread Adam Estrada
This is the response I get...Does it matter that the configuration file is called something other than data-config.xml? After I get this I still have to restart the service. I wonder...do I need to commit the change? -

RE: A schema inside a Solr Schema (Schema in a can)

2010-12-20 Thread Dyer, James
Dennis, If you need to search a key/value pair, you'll have to put them both in the same field, somehow. One way is to re-index them using the key in the fieldname. For instance, suppose you have: contributor: dyer, james contributor: smith, sam role: author role: editor ...but you want
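
One way to read "re-index them using the key in the fieldname" is sketched below. It assumes the two multivalued fields line up by position (first contributor with first role, and so on) and that the schema has matching dynamic fields; both are assumptions, since the rest of the message is cut off here. The helper name and the `contributor_<role>` naming are made up for illustration.

```python
from collections import defaultdict


def fold_key_into_fieldname(values, keys, prefix):
    """Pair values with keys by position and emit one field per key.

    Assumes values[i] belongs with keys[i]; that pairing is an assumption
    of this sketch, not something Solr tracks for multivalued fields.
    """
    doc = defaultdict(list)
    for value, key in zip(values, keys):
        doc[f"{prefix}_{key}"].append(value)
    return dict(doc)


if __name__ == "__main__":
    contributors = ["dyer, james", "smith, sam"]
    roles = ["author", "editor"]
    print(fold_key_into_fieldname(contributors, roles, "contributor"))
```

A query such as `contributor_author:dyer` would then match only documents where that person holds that role, rather than any document containing both values somewhere.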

Re: shard versus core

2010-12-20 Thread Tri Nguyen
Thought about it some more and after some reading.  I suppose the answer depends on what kind of response time is expected to be good enough.   I can do some stress testing and see if disk i/o is the bottleneck as the index grows.  I can also look into optimizing/configuring solr parameters to he

Re: shard versus core

2010-12-20 Thread Tri Nguyen
Hi Erick,   Thanks for the explanation.   At which point does the index get too big where sharding is appropriate where it affects performance?   Tri --- On Sun, 12/19/10, Erick Erickson wrote: From: Erick Erickson Subject: Re: shard versus core To: solr-user@lucene.apache.org Date: Sunday, D

about groups of random results + alphabetical result

2010-12-20 Thread Paula C. Laun : Dataprisma
hi. i'm looking for a technology that could have high performance in searching a large amount of data (nearly 10 million lines in a conventional database like sql server) and i think PHP running under apache solr is a good choice. i have only a doubt about its possibilities. i need to show in fir

Re: Dismax score - maximum of any one field?

2010-12-20 Thread Ahmet Arslan
> Can anyone tell me how the dismax score is computed? Is it > the maximum score for any of the component fields that are > searched? Thank You. http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
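
As a rough illustration of the question: for a single query term, a disjunction-max query takes the best per-field score and adds the remaining field scores scaled by the `tie` parameter. The sketch below shows only that arithmetic; the per-field scores are made-up numbers, and real scores also depend on boosts and normalization.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores the way a disjunction-max query does.

    With tie=0.0 only the best field counts; with tie=1.0 the result is
    the plain sum of all field scores.
    """
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


if __name__ == "__main__":
    scores = [0.5, 0.25, 0.25]              # hypothetical scores in three fields
    print(dismax_score(scores))             # pure maximum
    print(dismax_score(scores, tie=0.5))    # maximum plus half of the rest
```

So the answer is "the maximum" only when `tie` is 0; any positive `tie` lets the other matching fields contribute as well.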

Dismax score - maximum of any one field?

2010-12-20 Thread Jason Brown
Can anyone tell me how the dismax score is computed? Is it the maximum score for any of the component fields that are searched? Thank You.

Re: DIH for sharded database?

2010-12-20 Thread Grijesh.singh
you can put table names in a different table and use like this - - - - - - Grijesh -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-for-

Re: master master, repeaters

2010-12-20 Thread Upayavira
I've successfully made extensive use of load balancers in sharded, replicated slave setups - see [1]. My question is how that might work with a master. You can have a load balancer, but you'd need to configure it into a 'fail over but please don't fail back' configuration. I'm not sure if that is