Re: How to configure solr to our java project in eclipse

2013-10-27 Thread Amit Nithian
Try this: http://hokiesuns.blogspot.com/2010/01/setting-up-apache-solr-in-eclipse.html I use this today and it still works. If anything is outdated (as it's a relatively old post) let me know. I wrote this so ping me if you have any questions. Thanks Amit On Sun, Oct 27, 2013 at 7:33 PM, Amit A

When is/should qf different from pf?

2013-10-27 Thread Amit Nithian
Hi all, I have been using Solr for years but never really stopped to wonder: When using the dismax/edismax handler, when do you have the qf different from the pf? I have always set them to be the same (maybe different weights) but I was wondering if there is a situation where you would have a fi

Re: When is/should qf different from pf?

2013-10-28 Thread Amit Nithian
"when phrases aren't important in the fields". > If you're doing a simple boolean match, adding phrase fields will add > expense, to no good purpose etc. Phrases on numeric > fields seems wrong. > > FWIW, > Erick > > > On Mon, Oct 28, 2013 at 1:03 AM, Amit

Boosting documents by categorical preferences

2013-11-12 Thread Amit Nithian
Hi all, I have a question around boosting. I wanted to use the &boost= to write a nested query that will boost a document based on categorical preferences. For a movie search for example, say that a user likes drama, comedy, and action. I could use things like qq=&q={!boost%20b=$b%20defType=edis
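
The request in this message is cut off, so the following is only a sketch of how such a nested boost request could be assembled and URL-encoded. It assumes Solr's boost query parser with the user query in a separate qq parameter; the field name genre and the boost function are hypothetical and are not taken from the original post.

```python
from urllib.parse import urlencode, parse_qs


def build_boost_params(user_query, boost_function):
    """Return URL-encoded parameters for a multiplicative boost request."""
    params = {
        # The main query delegates to the boost parser, which multiplies the
        # score of the edismax query held in $qq by the function held in $b.
        "q": "{!boost b=$b defType=edismax v=$qq}",
        "qq": user_query,
        # Hypothetical boost function; any Solr function query could go here.
        "b": boost_function,
    }
    return urlencode(params)


boost_request = build_boost_params("star wars", "sum(1,termfreq(genre,'drama'))")
print(boost_request)
```

Keeping the user text in qq and the function in b means neither has to be escaped inside the local-params braces.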

Re: Boosting documents by categorical preferences

2013-11-18 Thread Amit Nithian
Hey Chris, Sorry for the delay and thanks for your response. This was inspired by your talk on boosting and biasing that you presented way back when at a meetup. I'm glad that my general approach seems to make sense. My approach was something like: 1) Look at the categories that the user has pref

Re: Boosting documents by categorical preferences

2013-11-20 Thread Amit Nithian
I thought about that but my concern/question was how. If I used the pow function then I'm still boosting the bad categories by a small amount..alternatively I could multiply by a negative number but does that work as expected? I haven't done much with negative boosting except for the sledgehammer

Re: Boosting documents by categorical preferences

2014-01-27 Thread Amit Nithian
Hi Chris (and others interested in this), Sorry for dropping off.. I got sidetracked with other work and came back to this and finally got a V1 of this implemented. The final process is as follows: 1) Pre-compute the global categorical num_ratings/average/std-dev (so for Action the average rating
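
Step 1 above (pre-computing the global per-category count, average and standard deviation) could look roughly like this. The input shape, a dict of category name to a list of ratings, is an assumption made for illustration; the rest of the process is truncated in this snippet and is not reconstructed here.

```python
from statistics import mean, pstdev


def category_stats(ratings_by_category):
    """Compute num_ratings, average and std-dev for every category."""
    stats = {}
    for category, ratings in ratings_by_category.items():
        stats[category] = {
            "num_ratings": len(ratings),
            "average": mean(ratings),
            # Population standard deviation, since these are global figures.
            "std_dev": pstdev(ratings),
        }
    return stats


# Made-up sample data, purely to exercise the function.
global_stats = category_stats({"Action": [3, 4, 5], "Drama": [2, 2, 2, 2]})
print(global_stats["Action"])
```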

Re: Boosting documents by categorical preferences

2014-01-30 Thread Amit Nithian
Chris, Sounds good! Thanks for the tips.. I'll be glad to submit my talk to this as I have a writeup pretty much ready to go. Cheers Amit On Tue, Jan 28, 2014 at 11:24 AM, Chris Hostetter wrote: > > : The initial results seem to be kinda promising... of course there are > many > : more optimiz

Re: do SearchComponents have access to response contents

2013-04-04 Thread Amit Nithian
"We need to also track the size of the response (as the size in bytes of the whole xml response tat is streamed, with stored fields and all). I was a bit worried cause I am wondering if a searchcomponent will actually have access to the response bytes..." ==> Can't you get this from your container

Re: Solr 4.2 single server limitations

2013-04-04 Thread Amit Nithian
There's a whole heap of information that is missing like what you plan on storing vs indexing and yes QPS too. My short answer is try with one server until it falls over then start adding more. When you say multiple-server setup do you mean multiple servers where each server acts as a slave storin

Re: Sharing index amongst multiple nodes

2013-04-06 Thread Amit Nithian
I don't understand why this would be more performant.. seems like it'd be more memory and resource intensive as you'd have multiple class-loaders and multiple cache spaces for no good reason. Just have a single core with sufficiently large caches to handle your response needs. If you want to load

Re: how to skip test while building

2013-04-06 Thread Amit Nithian
If you generate the maven pom files you can do this I think by doing mvn -DskipTests=true. On Sat, Apr 6, 2013 at 7:25 AM, Erick Erickson wrote: > Don't know a good way to skip compiling the tests, but there isn't > any harm in compiling them... > > changing to the solr directory and just issui

Re: writing a custom Filter plugin?

2013-05-14 Thread Amit Nithian
At first I thought you were referring to Filters in Lucene at query time (i.e. bitset filters) but I think you are referring to token filters at indexing/text analysis time? I have had success writing my own Filter as the link presents. The key is that you should write a custom class that extends

Re: Need solr query help

2013-05-14 Thread Amit Nithian
Is it possible instead to store in your solr index a bounding box of store location + delivery radius, do a bounding box intersection between your user's point + radius (as a bounding box) and the shop's delivery bounding box. If you want further precision, the frange may work assuming it's a post-
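
The bounding box idea can be sketched outside of Solr as below. This is a simplified illustration: the radius is expressed directly in degrees, and it ignores the date line and the fact that a degree of longitude shrinks away from the equator. All names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def box_around(lat, lon, radius_degrees):
    """Square box centred on a point, extending radius_degrees each way."""
    return BoundingBox(lat - radius_degrees, lon - radius_degrees,
                       lat + radius_degrees, lon + radius_degrees)


def intersects(a, b):
    """Two boxes overlap when their ranges overlap on both axes."""
    return (a.min_lat <= b.max_lat and b.min_lat <= a.max_lat
            and a.min_lon <= b.max_lon and b.min_lon <= a.max_lon)


shop_delivery_area = box_around(10.0, 10.0, 1.0)
nearby_user = box_around(11.5, 10.0, 1.0)
distant_user = box_around(20.0, 20.0, 1.0)
print(intersects(shop_delivery_area, nearby_user))
print(intersects(shop_delivery_area, distant_user))
```

A box test over-matches in the corners, which is why a more precise distance check as a second step is suggested above.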

Re: Restaurant availability from database

2013-05-23 Thread Amit Nithian
Hossman did a presentation on something similar to this using spatial data at a Solr meetup some months ago. http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/ May be helpful to you. On Thu, May 23, 2013 at 9:40 AM, rajh wrote: > Thank you for your answer. > > Do you m

DIH, UTF8 and default DIH encoding value

2010-07-31 Thread Amit Nithian
All, I am not sure if this is overly obvious or not (it wasn't to me) but in trying to index some international characters from XML files using the DIH, I found that setting the encoding attribute on the dataSource element to "UTF-8" fixed my problem. My question is why the default isn't UTF-8

Re: DIH and multivariable fields problems

2010-08-06 Thread Amit Nithian
That's probably the most efficient way to do it... I believe the line you are referring to allows you to have sub-entities which, in the RDBMS, would execute a separate query for each parent given a primary key. The downside to this though is that for each parent you will be executing N separate quer

Re: DIH, UTF8 and default DIH encoding value

2010-08-08 Thread Amit Nithian
just create an account (I think the > link to > that is in the page footer) and edit. > > Otis > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Lucene ecosystem search :: http://search-lucene.com/ > > > > ----- Original Message > > F

Can a Solr Server be both master and slave?

2010-08-16 Thread Amit Nithian
I am not sure if this is the best approach to this problem but I was curious if a single solr server could be both a master and a slave without causing index corruption? It seems that you could setup multiple replication handlers in the SOLR config, /replication /replication2 and have one be master

Re: Can a Solr Server be both master and slave?

2010-08-16 Thread Amit Nithian
Ugh I should have checked there first! Thanks for the reply.. that helps a lot. Sincerely Amit On Mon, Aug 16, 2010 at 10:57 AM, Gora Mohanty wrote: > On Mon, 16 Aug 2010 10:43:38 -0700 > Amit Nithian wrote: > > > I am not sure if this is the best approach to this proble

Re: Is there any stress test tool for testing Solr?

2010-08-25 Thread Amit Nithian
I recommend JMeter. We use that to do load testing on a search server. Of course you have to provide a reasonable set of queries as input... if you don't have any then a reasonable estimation based on your expected traffic should suffice. JMeter can be used for other load testing too.. Be careful

Hardware Specs Question

2010-08-30 Thread Amit Nithian
Hi all, I am curious to get some opinions on at what point having more CPU cores shows diminishing returns in terms of QPS. Our index size is about 8GB and we have 16GB of RAM on a quad core 4 x 2.4 GHz AMD Opteron 2216. Currently I have the heap set to 8GB. We are looking to get more servers to

Re: Hardware Specs Question

2010-08-30 Thread Amit Nithian
index in memory- better than Solr can. > > Lance > > On Mon, Aug 30, 2010 at 4:52 PM, Amit Nithian wrote: > > Hi all, > > > > I am curious to know get some opinions on at what point having more CPU > > cores shows diminishing returns in terms of QPS. Our index size

Re: Hardware Specs Question

2010-08-30 Thread Amit Nithian
d allocate enough >> RAM to run comfortably. Linux & Windows et. al. have their own cache >> of disk blocks. They use very good algorithms for managing this cache. >> Also, they do not make long garbage collection passes. >> >> On Mon, Aug 30, 2010 at 5:48 PM, Amit Nit

Re: anybody using solr with Cassandra?

2010-08-30 Thread Amit Nithian
I am curious about this too.. are you talking about using HBase/Cassandra as an aux store of large data or using Cassandra to store the actual lucene index (as in LuCandra)? On Mon, Aug 30, 2010 at 11:06 PM, Siju George wrote: > Thanks a million Nick, > > We are currently debating whether we sho

CoreContainer Usage

2010-10-07 Thread Amit Nithian
I am trying to understand the multicore setup of Solr more and saw that SolrCore.getCore is deprecated in favor of CoreContainer.getCore(name). How can I get a reference to the CoreContainer for I assume it's been created somewhere in Solr and is it possible for one core to get access to another So

Re: Very slow queries

2010-10-07 Thread Amit Nithian
Try stopping replication and see if your query performance may improve. I think the caches get reset each time replication occurs. You can look at the cache performance using the admin console.. try and see if any of the caches are constantly being missed.. this could be due to your newSearcher/fir

CollapseComponent with MLT component

2010-10-07 Thread Amit Nithian
Few questions about the CollapseComponent: 1) From what I can tell in either SOLR-236, SOLR-1682, this component extends the QueryComponent which allows one to dedup when doing a normal search. Is there a concern about performance in the worst case that you have a bunch of docs with the same value

Re: CoreContainer Usage

2010-10-11 Thread Amit Nithian
ving it re-parse the configuration files). Any help would be appreciated. Thanks! Amit On Thu, Oct 7, 2010 at 10:07 AM, Amit Nithian wrote: > I am trying to understand the multicore setup of Solr more and saw > that SolrCore.getCore is deprecated in favor of > CoreContainer.getCore(nam

Re: Solr like for autocomplete field?

2010-11-02 Thread Amit Nithian
I implemented the edge ngrams solution and it's an awesome one compared to any other that I could think of because I can index more than just text (other metadata) that can be used to *rank* the autocomplete results eventually getting to rank by the probability of selection which is, after all, wha
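
For readers unfamiliar with the technique, this is roughly what an edge n-gram filter emits for a single token at index time. It is a plain-Python illustration of the idea, not Solr's own filter, and the gram size limits are arbitrary.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the leading substrings of a token, shortest first."""
    token = token.lower()
    longest = min(max_gram, len(token))
    return [token[:size] for size in range(min_gram, longest + 1)]


print(edge_ngrams("Solr"))
```

Because every prefix becomes an indexed term, a partially typed query such as "so" matches with an ordinary term lookup, and the usual scoring and boosting can then be used to rank the suggestions.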

Re: is there a way to prevent abusing rows parameter

2012-11-26 Thread Amit Nithian
If you're going to validate the rows parameter, may as well validate the start parameter too.. I've run into problems where start and rows with ridiculously high values crash our servers. On Thu, Nov 22, 2012 at 9:58 AM, solr-user wrote: > Thanks guys. This is a problem with the front end not v
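
A front end could guard both parameters before the request ever reaches Solr. The sketch below is one way to do it; the limits and defaults are arbitrary assumptions, not recommended values.

```python
MAX_ROWS = 100       # assumed upper bound on page size
MAX_START = 10_000   # assumed upper bound on paging depth


def to_int(value, default):
    """Parse an integer, falling back to a default on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def sanitize_paging(start, rows):
    """Clamp start and rows into a safe range."""
    safe_start = min(max(to_int(start, 0), 0), MAX_START)
    safe_rows = min(max(to_int(rows, 10), 0), MAX_ROWS)
    return safe_start, safe_rows


print(sanitize_paging("0", "2147483647"))
```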

Re: Search among multiple cores

2012-11-26 Thread Amit Nithian
You can simplify your code by searching across cores in the SearchComponent: 1) public class YourComponent implements SolrCoreAware --> Grab instance of CoreContainer and store (mCoreContainer = core.getCoreDescriptor().getCoreContainer();) 2) In the process method: * grab the core requested (SolrC

Re: Grouping by a date field

2012-11-29 Thread Amit Nithian
Why not create a new field that just contains the day component? Then you can group by this field. On Thu, Nov 29, 2012 at 12:38 PM, sdanzig wrote: > I'm trying to create a SOLR query that groups/field collapses by date. I > have a field in yyyy-MM-dd'T'HH:mm:ss'Z' format, "datetime", and I'm
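
Deriving the day-only value at index time could be as simple as the following. It assumes the timestamps arrive as UTC strings in the usual Solr date layout; the function name is illustrative.

```python
from datetime import datetime

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def day_component(solr_datetime):
    """Reduce a full timestamp to its calendar day for grouping."""
    parsed = datetime.strptime(solr_datetime, SOLR_DATE_FORMAT)
    return parsed.strftime("%Y-%m-%d")


print(day_component("2012-11-29T12:38:00Z"))
```

The result would be written to a separate field in each document, and grouping would then run on that field instead of the full timestamp.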

Re: Grouping by a date field

2012-11-29 Thread Amit Nithian
ue&group.func=rint(div(ms(date_dt),mul(24,mul(60,mul(60,1000))))) > > -- Jack Krupansky > > -Original Message- From: Amit Nithian > Sent: Thursday, November 29, 2012 10:29 PM > To: solr-user@lucene.apache.org > Subject: Re: Grouping by a date field >
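
The arithmetic behind the function query quoted here is a division of epoch milliseconds by the number of milliseconds in a day, followed by rounding. A small sketch, assuming rint rounds to the nearest integer as Python's round does:

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # mul(24, mul(60, mul(60, 1000)))


def day_bucket(epoch_ms):
    """Mirror rint(div(ms(date), MS_PER_DAY)) for one timestamp."""
    return round(epoch_ms / MS_PER_DAY)


morning = 3 * MS_PER_DAY + 6 * 60 * 60 * 1000     # day 3, 06:00 UTC
afternoon = 3 * MS_PER_DAY + 13 * 60 * 60 * 1000  # day 3, 13:00 UTC
print(day_bucket(morning), day_bucket(afternoon))
```

Because the value is rounded rather than truncated, timestamps after midday fall into the next bucket, so the groups run from noon to noon instead of midnight to midnight.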

group.ngroups behavior in response

2013-01-16 Thread Amit Nithian
Hi all, I recently discovered the group.main=true/false parameter which really has made life simple in terms of ensuring that the format coming out of Solr for my clients (RoR app) is backwards compatible with the non-grouped results which ensures no special "handle grouped results" logic. The on

Re: group.ngroups behavior in response

2013-01-17 Thread Amit Nithian
A new response attribute would be better but it also complicates the patch in that it would require a new way to serialize DocSlices I think (especially when group.main=true)? I was looking to set group.main=true so that my existing clients don't have to change to parse the grouped resultset format

Re: Solr HTTP Replication Question

2013-01-24 Thread Amit Nithian
ough to simply say a full copy is needed if the slave's index version is >= master's index version. I'll create a patch and file a bug along with a more thorough writeup of how I got in this state. Thanks! Amit On Thu, Jan 24, 2013 at 2:33 PM, Amit Nithian wrote: > Does Solr

Re: Solr HTTP Replication Question

2013-01-25 Thread Amit Nithian
Okay one last note... just for closure... looks like it was addressed in solr 4.1+ (I was looking at 4.0). On Thu, Jan 24, 2013 at 11:14 PM, Amit Nithian wrote: > Okay so after some debugging I found the problem. While the replication > piece will download the index from the master serv

Re: Boost Specific Phrase

2013-02-13 Thread Amit Nithian
Have you looked at the "pf" parameter for dismax handlers? pf does I think what you are looking for which is to boost documents with the query term exactly matching in the various fields with some phrase slop. On Wed, Feb 13, 2013 at 2:59 AM, Hemant Verma wrote: > Hi All > > I have a use case wi
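
As a rough illustration of how qf and pf travel together on a request, the sketch below builds both from one set of field weights and gives the phrase fields a higher weight. The field names, weights, multiplier and slop are all assumptions chosen for the example.

```python
from urllib.parse import urlencode, parse_qs


def dismax_params(user_query, field_weights, phrase_multiplier=2.0, phrase_slop=1):
    """Build edismax parameters with pf mirroring qf at a higher weight."""
    qf = " ".join(f"{name}^{weight}" for name, weight in field_weights.items())
    pf = " ".join(f"{name}^{weight * phrase_multiplier}"
                  for name, weight in field_weights.items())
    return urlencode({
        "defType": "edismax",
        "q": user_query,
        "qf": qf,           # fields searched term by term
        "pf": pf,           # fields boosted when the terms match as a phrase
        "ps": phrase_slop,  # slop allowed for the pf phrase match
    })


phrase_request = dismax_params("project manager", {"title": 2.0, "body": 1.0})
print(phrase_request)
```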

Re: what do you use for testing relevance?

2013-02-13 Thread Amit Nithian
Ultimately this is dependent on what your metrics for success are. For some places it may be just raw CTR (did my click through rate increase) but for other places it may be a function of money (either it may be gross revenue, profits, # items sold etc). I don't know if there is a generic answer fo

Re: replication problems with solr4.1

2013-02-13 Thread Amit Nithian
So just a hunch... but when the slave downloads the data from the master, doesn't it do a commit to force solr to recognize the changes? In so doing, wouldn't that increase the generation number? In theory it shouldn't matter because the replication looks for files that are different to determine w

Re: replication problems with solr4.1

2013-02-13 Thread Amit Nithian
Okay so then that should explain the generation difference of 1 between the master and slave On Wed, Feb 13, 2013 at 10:26 AM, Mark Miller wrote: > > On Feb 13, 2013, at 1:17 PM, Amit Nithian wrote: > > > doesn't it do a commit to force solr to recognize the changes? > > yes. > > - Mark >

Re: Boost Specific Phrase

2013-02-13 Thread Amit Nithian
Ah yes sorry mis-understood. Another option is to use n-grams so that "projectmanager" is a term so any query involving "project manager in india with 2 years experience" would match higher because the query would contain "projectmanager" as a term. On Wed, Feb 13, 2013 at 9:56 PM, Hemant Verma w
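
The effect described here, where "project manager" also yields the single term "projectmanager", can be imitated by joining each pair of adjacent words. This is only an illustration of the idea in plain Python; in Solr it would be handled by the analysis chain.

```python
def joined_word_pairs(text):
    """Concatenate each pair of adjacent words into one term."""
    words = text.lower().split()
    return [first + second for first, second in zip(words, words[1:])]


print(joined_word_pairs("project manager in india"))
```

Applying the same transformation to both documents and queries means any query containing the two words side by side shares the joined term with matching documents.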

Re: Anyone else see this error when running unit tests?

2013-02-14 Thread Amit Nithian
Okay so I think I found a solution if you are a maven user and don't mind forcing the test codec to Lucene40 then do the following: Add this to your pom.xml under the " " section org.apache.maven.plugins maven-surefire-plugin 2.13 -Dtests.codec=Lucene40 I

Re: replication problems with solr4.1

2013-02-14 Thread Amit Nithian
additional commit > - if the master is 2 or more generations ahead then do _no_ commit > OR > - if the master is 2 or more generations ahead then do a commit but don't > change generation and version of index > > Can this be true? > > I would say "not really"

Re: Slaves always replicate entire index & Index versions

2013-02-21 Thread Amit Nithian
be added to the next release of Solr as this is a fairly significant bug to me. Cheers Amit On Thu, Feb 21, 2013 at 12:56 AM, Amit Nithian wrote: > So the diff in generation numbers are due to the commits I believe that > Solr does when it has the new index files but the fact that it's

Re: Slaves always replicate entire index & Index versions

2013-02-21 Thread Amit Nithian
Thanks for the links... I have updated SOLR-4471 with a proposed solution that I hope can be incorporated or amended so we can get a clean fix into the next version so our operations and network staff will be happier with not having gigs of data flying around the network :-) On Thu, Feb 21, 2013

Re: Slaves always replicate entire index & Index versions

2013-02-21 Thread Amit Nithian
Sounds good I am trying the combination of my patch and 4413 now to see how it works and will have to see if I can put unit tests around them as some of what I thought may not be true with respect to the commit generation numbers. For your issue above in your last post, is it possible that there w

Re: numFound is not correct while using Result Grouping

2013-02-25 Thread Amit Nithian
Yeah I had a similar problem. I filed and submitted this patch: https://issues.apache.org/jira/browse/SOLR-4310 Let me know if this is what you are looking for! Amit On Mon, Feb 25, 2013 at 1:50 PM, Teun Duynstee wrote: > Ah, I see. The docs say "Although this result format does not have as mu

Re: [ANN] vifun: tool to help visually tweak Solr boosting

2013-02-25 Thread Amit Nithian
This is cool! I had done something similar except changing via JConsole/JMX: https://issues.apache.org/jira/browse/SOLR-2306 We had something not as nice at Zvents but I wanted to expose these as MBean properties so you could change them via any JMX UI like JVisualVM Cheers! Amit On Mon, Feb 25

Re: numFound is not correct while using Result Grouping

2013-02-26 Thread Amit Nithian
I need to write some tests which I hope to do tonight and then I think it'll get into 4.2 On Tue, Feb 26, 2013 at 6:24 AM, Nicholas Ding wrote: > Thanks Amit, that's cool! So it will also be fixed on Solr 4.2, right? > > On Mon, Feb 25, 2013 at 6:04 PM, Amit Nithian wrote:

Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Amit Nithian
I don't know a ton about SolrCloud, but for our setup, my limited understanding is that you start to bleed operational and non-operational aspects together, which I am not comfortable doing (i.e. software load balancing). Also, adding ZooKeeper to the mix is yet another thing to install, setu

Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Amit Nithian
ng monitoring or High > Availability/Disaster Recovery tools, then you might find the cost/benefit > analysis changing. > > Personally, I think it's ironic that the memory improvements that came > along _with_ SolrCloud make it less necessary to shard. Which means that > traditi

Re: Poll: SolrCloud vs. Master-Slave usage

2013-03-01 Thread Amit Nithian
without Solr Cloud, but > there would be no redundancy. > > Michael Della Bitta > > > Appinions > 18 East 41st Street, 2nd Floor > New York, NY 10017-6271 > > www.appinions.com > > Where Influence Isn’t a Game > >

Re: ping query frequency

2013-03-03 Thread Amit Nithian
We too run a ping every 5 seconds and I think the concurrent Mark/Sweep helps to avoid the LB from taking a box out of rotation due to long pauses. Either that or I don't see large enough pauses for my LB to take it out (it'd have to fail 3 times in a row or 15 seconds total before it's gone). The
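
The numbers in this message (a ping every 5 seconds, removal after 3 consecutive failures) work out as below. The functions are a toy model of a load balancer health check, written only to make the arithmetic explicit.

```python
def seconds_until_removed(ping_interval_seconds, failures_required):
    """Shortest outage that takes a box out of rotation."""
    return ping_interval_seconds * failures_required


def removed_from_rotation(ping_results, failures_required=3):
    """True once enough pings in a row have failed."""
    consecutive_failures = 0
    for succeeded in ping_results:
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= failures_required:
            return True
    return False


print(seconds_until_removed(5, 3))
print(removed_from_rotation([True, False, False, True, False]))
```

A single long garbage collection pause therefore has to span three pings before the server is dropped, while isolated failures are forgiven.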

Re: SOLR on hdfs

2013-03-06 Thread Amit Nithian
Why wouldn't SolrCloud help you here? You can setup shards and replicas etc to have redundancy b/c HDFS isn't designed to serve real time queries as far as I understand. If you are using HDFS as a backup mechanism to me you'd be better served having multiple slaves tethered to a master (in a non-cl

Re: SOLR on hdfs

2013-03-06 Thread Amit Nithian
ggested using the following > command > > hadoop fs -copyFromLocal URI > > Ok let me try out solrcloud as I will need to make sure it works well with > nutch too.. > > Thanks for the help.. > > > On Thu, Mar 7, 2013 at 5:47 AM, Amit Nithian wrote: > > > Why w

Editing long Solr URLs - Chrome Extension

2012-05-10 Thread Amit Nithian
Hey all, I don't know about you but most of the Solr URLs I issue are fairly lengthy, full of parameters on the query string, and browser location bars aren't long enough and lack multi-line capabilities. I tried to find something that does this but couldn't so I wrote a Chrome extension to help. Pleas

Re: Editing long Solr URLs - Chrome Extension

2012-05-15 Thread Amit Nithian
"fl", "qf" or "bf". > Would be nice if the edit box was multi-line, or perhaps adjusts to the > size of the content > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.facebook.com/Cominvent > Solr Training - www.solrtraining.com >

Re: Editing long Solr URLs - Chrome Extension

2012-05-15 Thread Amit Nithian
case I messed up github, complex > params like the fq here: > > http://localhost:8983/solr/select?q=*:*&fq={!geofilt sfield=store > pt=52.67,7.30 d=5} > > aren't properly handled. > > But I'm already using it occasionally > > Erick > > On Tue, May 1

Re: Editing long Solr URLs - Chrome Extension

2012-05-19 Thread Amit Nithian
"tab" to edit the next row but it helps a bit with that problem. Please keep submitting issues as you encounter them and I'll address them as best as possible. I hope that this helps everyone! Thanks! Amit On Tue, May 15, 2012 at 6:20 PM, Amit Nithian wrote: > Erick > > Yes th

Re: Something like 'bf' or 'bq' with MoreLikeThis

2012-07-03 Thread Amit Nithian
I had a similar problem so I submitted this patch: https://issues.apache.org/jira/browse/SOLR-2351 I haven't applied this to trunk in a while but my goal was to ensure that bf parameters were passed down and respected by the MLT handler. Let me know if this works for you or not. If there is suffic

Re: difference between stored="false" and stored="true" ?

2012-07-03 Thread Amit Nithian
So a couple of questions on this (comment first then question): 1) I guess you can't have four combinations b/c indexed=false/stored=false has no meaning? 2) If you set fewer fields stored=true does this reduce the memory footprint for the document cache? Or better yet, I can store more documents in the ca

Re: How to improve this solr query?

2012-07-04 Thread Amit Nithian
Couple questions: 1) Why are you explicitly telling solr to sort by score desc, shouldn't it do that for you? Could this be a source of performance problems since sorting requires the loading of the field caches? 2) Of the query parameters, q1 and q2, which one is actually doing "text" searching on

Use of Solr as primary store for search engine

2012-07-04 Thread Amit Nithian
Hello all, I am curious to know how people are using Solr in conjunction with other data stores when building search engines to power web sites (say an ecommerce site). The question I have for the group is given an architecture where the primary (transactional) data store is MySQL (Oracle, PostGre

Re: Something like 'bf' or 'bq' with MoreLikeThis

2012-07-04 Thread Amit Nithian
No worries! What version of Solr are you using? One that you downloaded as a tarball or one that you checked out from SVN (trunk)? I'll take a bit of time and document steps and respond. I'll review the patch to see that it fits a general case. Question for you with MLT, are your users doing a bla

Re: Use of Solr as primary store for search engine

2012-07-04 Thread Amit Nithian
m the xwiki objects which pull from the > SQL database). However, it implied that we had to rewrite anything necessary > for the rendering, hence the rendering has not re-used that much code. > > Paul > > > On 4 Jul 2012 at 09:54, Amit Nithian wrote: > >> Hello al

Nrt and caching

2012-07-06 Thread Amit Nithian
Sorry I'm a bit new to the NRT stuff in Solr but I'm trying to understand the implications of frequent commits and cache rebuilding and auto warming. What are the best practices surrounding NRT searching and caches and query performance? Thanks! Amit

Re: Nrt and caching

2012-07-07 Thread Amit Nithian
Thanks for the responses. I guess my specific question is if I had something which was dependent on the mapping between lucene document ids and some object primary key so i could pull in external data from another data source without a constant reindex, how would this get affected by soft and hard

Re: Running out of memory

2012-08-16 Thread Amit Nithian
I am debugging an out of memory error myself and a few suggestions: 1) Are you looking at your search logs around the time of the memory error? In my case, I found a few bad queries requesting a ton of rows (basically the whole index's worth which I think is an error somewhere in our app just have

Re: N-gram ranking based on term position

2012-09-07 Thread Amit Nithian
I think your thought about using the edge ngram as a field and boosting that field in the qf/pf sections of the dismax handler sounds reasonable. Why do you have qualms about it? On Fri, Sep 7, 2012 at 12:28 PM, Kiran Jayakumar wrote: > Hi, > > Is it possible to score documents with a match "earl

Re: Trouble Setting Up Development Environment

2012-09-09 Thread Amit Nithian
Sorry I'm really late to this so not sure if this is even an issue: 1) I found that there is an "ant eclipse" target that makes it easy to set up the Eclipse .project and .classpath (I think I had done this by hand in the tutorial) 2) Yes you can attach to a remote instance of Solr but your JVM has to have t

Re: XInclude Multiple Elements

2012-09-10 Thread Amit Nithian
Way back when I opened an issue about using XML entity includes in Solr as a way to break up the config. I have found problems with XInclude having multiple elements to include because the file is not well formed. From what I have read, if you make this well formed, you end up with a document that'

Re: Replication policy

2012-09-10 Thread Amit Nithian
If I understand you right, replication of data has 0 downtime, it just works and the data flows through from master to slaves. If you want, you can configure the replication to replicate configuration files across the cluster (although to me my deploy script does this). I'd recommend tweaking the

Re: solr.StrField with stored="true" useless or bad?

2012-09-11 Thread Amit Nithian
This is great, thanks for this post! I was curious about the same thing and was wondering why "fl" couldn't return the "indexed" representation of a field if that field were only indexed but not stored. My thought was that returning something is better than nothing, but I didn't pay attention to the fact that gettin

Re: Solr - Lucene Debuging help

2012-09-11 Thread Amit Nithian
The wiki should probably be updated.. maybe I'll take a stab at it. I'll also try and update my article referenced there too. When you check out the project from SVN, do "ant eclipse". Look at this bug (https://issues.apache.org/jira/browse/SOLR-3817) and either run the ruby program or download the

Re: In-memory indexing

2012-09-11 Thread Amit Nithian
I have wondered about this too but instead why not just set your cache sizes large enough to house most/all of your documents and pre-warm the caches accordingly? My bet is that a large enough document cache may suffice but that's just a guess. - Amit On Mon, Sep 10, 2012 at 10:56 AM, Kiran Jayak

Re: Is it possible to do an "if" statement in a Solr query?

2012-09-12 Thread Amit Nithian
If the fact that it's "original" vs "generic" is a field "is_original" 0/1 can you sort by is_original? Similarly, could you put a huge boost on is_original in the dismax so that document matches on is_original score higher than those that aren't original? Or is your goal to not show generics *at a

Prevent Log and other math functions from returning "Infinity" and erroring out

2012-09-20 Thread Amit Nithian
Is there any reason why the log function shouldn't be modified to always take 1+the number being requested to be log'ed? Reason I ask is I am taking the log of the value output by another function which could return 0. For testing, I modified it to return 1 which works but would rather have the log
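
The proposal amounts to shifting the input by one before taking the logarithm, so that an input of 0 maps to 0 instead of negative infinity. A small sketch of that behaviour, assuming a base-10 log and non-negative inputs:

```python
import math


def shifted_log10(value):
    """log10(1 + value): defined at 0, where a plain log is not."""
    return math.log10(1 + value)


# A plain log of 0 has no finite value; Python raises ValueError for it
# rather than returning -Infinity, so the shifted form is used instead.
print(shifted_log10(0))
print(shifted_log10(9))
```

The shift changes every result slightly, which matters little for ranking purposes but would be a behaviour change for anyone relying on exact values.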

Re: AutoIndexing

2012-09-25 Thread Amit Nithian
There's a couple ways to accomplish this from easy to hard depending on your database schema: 1) Use DB trigger -> I don't like triggers too much b/c to me they couple your database layer with your application layer which leads to untestable and sometimes unmaintainable code -> Also it gets dif

Re: Query filtering

2012-09-27 Thread Amit Nithian
I think one way to do this is issue another query and set a bunch of filter queries to restrict "interesting_facet" to just those ten values returned in the first query. fq=interesting_facet:1 OR interesting_facet:2 etc&q=context: Does that help? Amit On Thu, Sep 27, 2012 at 6:33 AM, Finotti Sim
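
Building the filter from the values returned by the first query might look like this. The helper is hypothetical, and it assumes the values are simple tokens that need no escaping.

```python
def facet_filter(field, values):
    """Join field:value clauses with OR for use as an fq parameter."""
    clauses = [f"{field}:{value}" for value in values]
    return " OR ".join(clauses)


first_query_values = [1, 2, 3]  # stand-in for the ten values from the first query
print(facet_filter("interesting_facet", first_query_values))
```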

Re: Getting the distribution information of scores from query

2012-09-27 Thread Amit Nithian
gram into result named list > http://searchhub.org/dev/2012/02/10/advanced-filter-caching-in-solr/ > > On Tue, Sep 25, 2012 at 10:03 PM, Amit Nithian wrote: > >> We have a federated search product that issues multiple parallel >> queries to solr cores and fetches the re

Solr 4.0 and Maven SNAPSHOT artifacts

2012-10-04 Thread Amit Nithian
Is there a maven repository location that contains the nightly build Maven artifacts of Solr? Are SNAPSHOT releases being generated by Jenkins or anything so that when I re-resolve the dependencies I'd get the latest snapshot jars? Thanks Amit

Re: Getting list of operators and terms for a query

2012-10-04 Thread Amit Nithian
I think you'd want to start by looking at the rb.getQuery() in the prepare (or process if you are trying to do post-results analysis). This returns a Query object that would contain everything in that and I'd then look at the Javadoc to see how to traverse it. I'm sure some runtime type-casting may

Re: Getting list of operators and terms for a query

2012-10-04 Thread Amit Nithian
ion from the class > org.apache.lucene.search.Query > I can just iterate over the terms using the method extractTerms. How can I > extract the operators? > > 2012/10/4 Amit Nithian > >> I think you'd want to start by looking at the rb.getQuery() in the >> prepare (or

Re: Auto Correction?

2012-10-09 Thread Amit Nithian
What's preventing you from using the spell checker, taking the #1 result and re-issuing the query from a sub-class of the query component? It should be reasonably fast to re-execute the query from the server side since you are already within Solr. You can modify the response to indicate that the new

PostFilters, Grouping, Sorting Oh My!

2012-10-09 Thread Amit Nithian
Hi all, I've been working with using Solr's post filters/delegate collectors to collect some statistics about the scores of all the documents and had a few questions with regards to this when combined with grouping and sorting: 1) I noticed that if I don't include the "score" field as part of the

Re: Sum of scores for documents from a query.

2012-10-14 Thread Amit Nithian
Are you looking for the sum of the scores of each document in the result? In other words, if there were 1000 documents in the numFound but you only of course show 10 (or 0 depending on rows parameter) you want the sum of all the scores of 1000 documents in a separate section of the results? If so,

With Grouping enabled, 0 results yields maxScore of -Infinity

2012-10-15 Thread Amit Nithian
I see that when there are 0 results with the grouping enabled, the max score is -Infinity which causes parsing problems on my client. Without grouping enabled the max score is 0.0. Is there any particular reason for this difference? If not, would there be any resistance to submitting a patch that w

Re: maven artifact for solr-solrj-4.0.0

2012-10-18 Thread Amit Nithian
I am not sure if this repository https://repository.apache.org/content/repositories/releases/ works but the modification dates seem reasonable given the timing of the release. I suspect it'll be on maven central soon (hopefully) On Wed, Oct 17, 2012 at 11:13 PM, Grzegorz Sobczyk wrote: > Hello >

Benchmarking/Performance Testing question

2012-10-19 Thread Amit Nithian
Hi all, I know there have been many posts about this already and I have done my best to read through them but one lingering question remains. When doing performance testing on a Solr instance (under normal production like circumstances, not the ones where commits are happening more frequently than

Re: Easy question ? docs with empty geodata field

2012-10-19 Thread Amit Nithian
What about querying on the dynamic lat/long field to see if there are documents that do not have the dynamic _latlon0 or whatever defined? On Fri, Oct 19, 2012 at 8:17 AM, darul wrote: > I have already tried but get a nice exception because of this field type : > > > > > -- > View this message in

Re: Easy question ? docs with empty geodata field

2012-10-19 Thread Amit Nithian
So here is my spec for lat/long (similar to yours except I explicitly define the sub-field names for clarity). So then the query would be location_0_latLon:[* TO *]. Looking at your schema, my guess would be: location_0_coordinate:[* TO *] location_1_coordinate:[* TO *] Let me know if that
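As a small illustration of the field-existence query in the message above, the Python sketch below builds both the "has a value" form and its negation, which matches documents where the field is empty. The helper names and the base URL (default Solr port, a core named collection1) are assumptions; the field name comes from the thread.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def has_value(field: str) -> str:
    """Query matching documents where `field` has any value."""
    return f"{field}:[* TO *]"

def missing_value(field: str) -> str:
    """Query matching documents where `field` has no value.
    A pure negative clause needs *:* in front to have something to subtract from."""
    return f"*:* -{field}:[* TO *]"

def select_url(base_url: str, q: str, rows: int = 10) -> str:
    """Build a /select URL with the query properly URL-encoded."""
    return base_url.rstrip("/") + "/select?" + urlencode({"q": q, "rows": rows, "wt": "json"})

def query_of(url: str) -> str:
    """Decode the q parameter back out of a URL (handy for checking the encoding)."""
    return parse_qs(urlsplit(url).query)["q"][0]

# Assumed local core; adjust to your own host and core name.
url = select_url("http://localhost:8983/solr/collection1",
                 missing_value("location_0_coordinate"))
```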

Understanding Filter Queries

2012-10-20 Thread Amit Nithian
Hi all, Quick question. I've been reading up on the filter query and how it's implemented and the multiple articles I see keep referring to this notion of leap frogging and filter query execution in parallel with the main query. Question: Can someone point me to the code that does this so I can be

Re: Understanding Filter Queries

2012-10-20 Thread Amit Nithian
set intersection which is > supplied into filtered search call > https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1474 > > You are welcome. > > On Sun, Oct 21, 2012 at 12:00 AM, Amit Nithian wrote: > >> Hi al

Re: Understanding Filter Queries

2012-10-20 Thread Amit Nithian
0 docs hence why it never went down this leap frog approach in my debugging. Next question though is what is the significance of this < 100? Is this supposed to be a heuristic for determining the sparseness of the filter bit set? Thanks again Amit On Sat, Oct 20, 2012 at 7:12 PM, Amit Nithi
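For anyone following the thread who has not seen "leap frogging" before, the general idea can be sketched in Python as an intersection of two sorted doc-id lists where whichever side is behind jumps ahead to the other side's current id. This is a generic illustration, not the code in SolrIndexSearcher, and it says nothing about the threshold discussed above.

```python
from bisect import bisect_left
from typing import List

def leapfrog_intersect(a: List[int], b: List[int]) -> List[int]:
    """Intersect two sorted, duplicate-free doc-id lists by leapfrogging."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # a is behind: skip ahead to the first id >= b's current id.
            i = bisect_left(a, b[j], i + 1)
        else:
            # b is behind: skip ahead to the first id >= a's current id.
            j = bisect_left(b, a[i], j + 1)
    return out
```

The benefit over checking every document is that long runs of non-matching ids are skipped in one step instead of being visited one by one.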

Re: Solr Partial word search in a sentance.

2012-10-20 Thread Amit Nithian
On the surface this looks like you could use the minimum should match feature of the dismax handler and alter that behavior depending on whether or not the search is your main search or your fallback search as you described in your (c) case. On Sat, Oct 20, 2012 at 1:13 AM, Uma Mahesh wrote: > Hi
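A sketch of that main-search/fallback-search idea in Python: try a strict mm first and relax it only if nothing comes back. The function names, the 100% and 50% values, and the toy matcher are all assumptions for illustration; only the q, defType and mm parameter names are Solr's.

```python
from typing import Callable, Dict, List, Tuple

def search_with_fallback(
    query: str,
    search: Callable[[Dict[str, str]], List[str]],
    strict_mm: str = "100%",
    relaxed_mm: str = "50%",
) -> Tuple[List[str], str]:
    """Return (docs, mm actually used)."""
    docs = search({"q": query, "defType": "dismax", "mm": strict_mm})
    if docs:
        return docs, strict_mm
    # Fallback search: same query, looser minimum-should-match.
    return search({"q": query, "defType": "dismax", "mm": relaxed_mm}), relaxed_mm

# Toy in-memory matcher standing in for Solr (hypothetical data and logic).
_DOCS = {"d1": "apache solr search", "d2": "apache lucene"}

def toy_search(params: Dict[str, str]) -> List[str]:
    terms = params["q"].split()
    pct = int(params["mm"].rstrip("%"))
    needed = max(1, int(len(terms) * pct / 100))  # rounded down, at least one term
    return sorted(
        doc_id for doc_id, text in _DOCS.items()
        if sum(t in text.split() for t in terms) >= needed
    )
```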

Re: Monitor Deleted Event

2012-10-24 Thread Amit Nithian
I'm not 100% sure about this but looks like update processors may help? http://wiki.apache.org/solr/UpdateRequestProcessor It looks like you can put in custom code to execute when certain actions happen so sounds like this is what you are looking for. Cheers Amit On Wed, Oct 24, 2012 at 8:43 AM,

Re: Monitor Deleted Event

2012-10-24 Thread Amit Nithian
Since Lucene is a library, there isn't much support for this, since in theory the client application issuing the delete could also then do something else upon delete. Solr, on the other hand, being a server layer sitting on top of Lucene, is where it makes sense for hooks to be configured.

Re: Any way to by pass the checking on QueryElevationComponent

2012-10-28 Thread Amit Nithian
Is the goal to have the elevation data read from somewhere else? In other words, why don't you want the elevate.xml to exist locally? If you want to read the data from somewhere else, could you put a dummy elevate.xml locally and subclass the QueryElevationComponent and override the loadElevationM
