Re: Boosting documents by categorical preferences

2014-01-30 Thread Amit Nithian
Chris, Sounds good! Thanks for the tips.. I'll be glad to submit my talk to this as I have a writeup pretty much ready to go. Cheers Amit On Tue, Jan 28, 2014 at 11:24 AM, Chris Hostetter wrote: > > : The initial results seem to be kinda promising... of course there are > many > : more optimiz

Re: Boosting documents by categorical preferences

2014-01-27 Thread Amit Nithian
Hi Chris (and others interested in this), Sorry for dropping off.. I got sidetracked with other work and came back to this and finally got a V1 of this implemented. The final process is as follows: 1) Pre-compute the global categorical num_ratings/average/std-dev (so for Action the average rating
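The pre-computation in step 1 could look roughly like the sketch below. It is only an illustration: the `(category, rating)` input shape, the sample data, and the choice of population standard deviation are assumptions, not details from the original message.

```python
from collections import defaultdict
from statistics import mean, pstdev


def category_stats(ratings):
    """Compute num_ratings/average/std-dev per category.

    `ratings` is an iterable of (category, rating) pairs (hypothetical shape).
    """
    by_category = defaultdict(list)
    for category, rating in ratings:
        by_category[category].append(rating)
    return {
        category: {
            "num_ratings": len(values),
            "average": mean(values),
            # Population std-dev is an assumption; the thread does not say which was used.
            "std_dev": pstdev(values),
        }
        for category, values in by_category.items()
    }


# Made-up sample data, for illustration only.
sample = [("Action", 3.0), ("Action", 5.0), ("Drama", 4.0)]
stats = category_stats(sample)
```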

Re: Boosting documents by categorical preferences

2013-11-20 Thread Amit Nithian
I thought about that but my concern/question was how. If I used the pow function then I'm still boosting the bad categories by a small amount..alternatively I could multiply by a negative number but does that work as expected? I haven't done much with negative boosting except for the sledgehammer

Re: Boosting documents by categorical preferences

2013-11-18 Thread Amit Nithian
Hey Chris, Sorry for the delay and thanks for your response. This was inspired by your talk on boosting and biasing that you presented way back when at a meetup. I'm glad that my general approach seems to make sense. My approach was something like: 1) Look at the categories that the user has pref

Boosting documents by categorical preferences

2013-11-12 Thread Amit Nithian
Hi all, I have a question around boosting. I wanted to use the &boost= to write a nested query that will boost a document based on categorical preferences. For a movie search for example, say that a user likes drama, comedy, and action. I could use things like qq=&q={!boost%20b=$b%20defType=edis

Re: When is/should qf different from pf?

2013-10-28 Thread Amit Nithian
"when phrases aren't important in the fields". > If you're doing a simple boolean match, adding phrase fields will add > expense, to no good purpose etc. Phrases on numeric > fields seems wrong. > > FWIW, > Erick > > > On Mon, Oct 28, 2013 at 1:03 AM, Amit

When is/should qf different from pf?

2013-10-27 Thread Amit Nithian
Hi all, I have been using Solr for years but never really stopped to wonder: When using the dismax/edismax handler, when do you have the qf different from the pf? I have always set them to be the same (maybe different weights) but I was wondering if there is a situation where you would have a fi

Re: How to configure solr to our java project in eclipse

2013-10-27 Thread Amit Nithian
Try this: http://hokiesuns.blogspot.com/2010/01/setting-up-apache-solr-in-eclipse.html I use this today and it still works. If anything is outdated (as it's a relatively old post) let me know. I wrote this so ping me if you have any questions. Thanks Amit On Sun, Oct 27, 2013 at 7:33 PM, Amit A

Re: Restaurant availability from database

2013-05-23 Thread Amit Nithian
Hossman did a presentation on something similar to this using spatial data at a Solr meetup some months ago. http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/ May be helpful to you. On Thu, May 23, 2013 at 9:40 AM, rajh wrote: > Thank you for your answer. > > Do you m

Re: Need solr query help

2013-05-14 Thread Amit Nithian
Is it possible instead to store in your solr index a bounding box of store location + delivery radius, do a bounding box intersection between your user's point + radius (as a bounding box) and the shop's delivery bounding box. If you want further precision, the frange may work assuming it's a post-
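The bounding-box idea can be sketched outside Solr as a plain rectangle intersection test. This is a simplified illustration that assumes boxes expressed in degrees and ignores the antimeridian and pole edge cases; `Box`, `box_around`, and `boxes_intersect` are hypothetical names, not Solr APIs.

```python
from typing import NamedTuple


class Box(NamedTuple):
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def box_around(lat, lon, half_height_deg, half_width_deg):
    """Build a box centered on a point, with the radius already converted to degrees."""
    return Box(lat - half_height_deg, lon - half_width_deg,
               lat + half_height_deg, lon + half_width_deg)


def boxes_intersect(a, b):
    """Two boxes overlap when their ranges overlap on both axes."""
    return (a.min_lat <= b.max_lat and b.min_lat <= a.max_lat
            and a.min_lon <= b.max_lon and b.min_lon <= a.max_lon)


shop = box_around(37.0, -122.0, 0.1, 0.1)        # store location + delivery radius
user_near = box_around(37.15, -122.0, 0.1, 0.1)  # user's point + radius
user_far = box_around(38.0, -122.0, 0.1, 0.1)
```

A box test like this is coarse; a follow-up distance check would be needed for exact radius matching.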

Re: writing a custom Filter plugin?

2013-05-14 Thread Amit Nithian
At first I thought you were referring to Filters in Lucene at query time (i.e. bitset filters) but I think you are referring to token filters at indexing/text analysis time? I have had success writing my own Filter as the link presents. The key is that you should write a custom class that extends

Re: how to skip test while building

2013-04-06 Thread Amit Nithian
If you generate the maven pom files you can do this I think by doing mvn -DskipTests=true. On Sat, Apr 6, 2013 at 7:25 AM, Erick Erickson wrote: > Don't know a good way to skip compiling the tests, but there isn't > any harm in compiling them... > > changing to the solr directory and just issui

Re: Sharing index amongst multiple nodes

2013-04-06 Thread Amit Nithian
I don't understand why this would be more performant.. seems like it'd be more memory and resource intensive as you'd have multiple class-loaders and multiple cache spaces for no good reason. Just have a single core with sufficiently large caches to handle your response needs. If you want to load

Re: Solr 4.2 single server limitations

2013-04-04 Thread Amit Nithian
There's a whole heap of information that is missing like what you plan on storing vs indexing and yes QPS too. My short answer is try with one server until it falls over then start adding more. When you say multiple-server setup do you mean multiple servers where each server acts as a slave storin

Re: do SearchComponents have access to response contents

2013-04-04 Thread Amit Nithian
"We need to also track the size of the response (as the size in bytes of the whole xml response tat is streamed, with stored fields and all). I was a bit worried cause I am wondering if a searchcomponent will actually have access to the response bytes..." ==> Can't you get this from your container

Re: SOLR on hdfs

2013-03-06 Thread Amit Nithian
ggested using the following > command > > hadoop fs -copyFromLocal URI > > Ok let me try out solrcloud as I will need to make sure it works well with > nutch too.. > > Thanks for the help.. > > > On Thu, Mar 7, 2013 at 5:47 AM, Amit Nithian wrote: > > > Why w

Re: SOLR on hdfs

2013-03-06 Thread Amit Nithian
Why wouldn't SolrCloud help you here? You can setup shards and replicas etc to have redundancy b/c HDFS isn't designed to serve real time queries as far as I understand. If you are using HDFS as a backup mechanism to me you'd be better served having multiple slaves tethered to a master (in a non-cl

Re: ping query frequency

2013-03-03 Thread Amit Nithian
We too run a ping every 5 seconds and I think the concurrent Mark/Sweep helps to avoid the LB from taking a box out of rotation due to long pauses. Either that or I don't see large enough pauses for my LB to take it out (it'd have to fail 3 times in a row or 15 seconds total before it's gone). The
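The load balancer rule described above (out of rotation only after 3 failed pings in a row) can be modeled with a small counter. This is a sketch of the rule only, not of any particular load balancer; `HealthTracker` is a hypothetical name.

```python
class HealthTracker:
    """Tracks consecutive ping failures for one backend."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.consecutive_failures = 0

    @property
    def in_rotation(self):
        return self.consecutive_failures < self.max_failures

    def record(self, ping_ok):
        """Record one ping result and return whether the box stays in rotation."""
        if ping_ok:
            # Any successful ping resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.in_rotation


def still_in_rotation(results, max_failures=3):
    tracker = HealthTracker(max_failures)
    for ok in results:
        tracker.record(ok)
    return tracker.in_rotation
```

With a 5 second ping interval, three consecutive failures correspond to the 15 seconds mentioned above.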

Re: Poll: SolrCloud vs. Master-Slave usage

2013-03-01 Thread Amit Nithian
without Solr Cloud, but > there would be no redundancy. > > Michael Della Bitta > > > Appinions > 18 East 41st Street, 2nd Floor > New York, NY 10017-6271 > > www.appinions.com > > Where Influence Isn’t a Game > >

Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Amit Nithian
ng monitoring or High > Availability/Disaster Recovery tools, then you might find the cost/benefit > analysis changing. > > Personally, I think it's ironic that the memory improvements that came > along _with_ SolrCloud make it less necessary to shard. Which means that > traditi

Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Amit Nithian
I don't know a ton about SolrCloud but for our setup and my limited understanding of it is that you start to bleed operational and non-operational aspects together which I am not comfortable doing (i.e. software load balancing). Also adding ZooKeeper to the mix is yet another thing to install, setu

Re: numFound is not correct while using Result Grouping

2013-02-26 Thread Amit Nithian
I need to write some tests which I hope to do tonight and then I think it'll get into 4.2 On Tue, Feb 26, 2013 at 6:24 AM, Nicholas Ding wrote: > Thanks Amit, that's cool! So it will also be fixed on Solr 4.2, right? > > On Mon, Feb 25, 2013 at 6:04 PM, Amit Nithian wrote:

Re: [ANN] vifun: tool to help visually tweak Solr boosting

2013-02-25 Thread Amit Nithian
This is cool! I had done something similar except changing via JConsole/JMX: https://issues.apache.org/jira/browse/SOLR-2306 We had something not as nice at Zvents but I wanted to expose these as MBean properties so you could change them via any JMX UI like JVisualVM Cheers! Amit On Mon, Feb 25

Re: numFound is not correct while using Result Grouping

2013-02-25 Thread Amit Nithian
Yeah I had a similar problem. I filed and submitted this patch: https://issues.apache.org/jira/browse/SOLR-4310 Let me know if this is what you are looking for! Amit On Mon, Feb 25, 2013 at 1:50 PM, Teun Duynstee wrote: > Ah, I see. The docs say "Although this result format does not have as mu

Re: Slaves always replicate entire index & Index versions

2013-02-21 Thread Amit Nithian
Sounds good I am trying the combination of my patch and 4413 now to see how it works and will have to see if I can put unit tests around them as some of what I thought may not be true with respect to the commit generation numbers. For your issue above in your last post, is it possible that there w

Re: Slaves always replicate entire index & Index versions

2013-02-21 Thread Amit Nithian
Thanks for the links... I have updated SOLR-4471 with a proposed solution that I hope can be incorporated or amended so we can get a clean fix into the next version so our operations and network staff will be happier with not having gigs of data flying around the network :-) On Thu, Feb 21, 2013

Re: Slaves always replicate entire index & Index versions

2013-02-21 Thread Amit Nithian
be added to the next release of Solr as this is a fairly significant bug to me. Cheers Amit On Thu, Feb 21, 2013 at 12:56 AM, Amit Nithian wrote: > So the diff in generation numbers are due to the commits I believe that > Solr does when it has the new index files but the fact that it's

Re: replication problems with solr4.1

2013-02-14 Thread Amit Nithian
additional commit > - if the master is 2 or more generations ahead then do _no_ commit > OR > - if the master is 2 or more generations ahead then do a commit but don't > change generation and version of index > > Can this be true? > > I would say "not really&q

Re: Anyone else see this error when running unit tests?

2013-02-14 Thread Amit Nithian
Okay so I think I found a solution if you are a maven user and don't mind forcing the test codec to Lucene40 then do the following: Add this to your pom.xml under the " " section org.apache.maven.plugins maven-surefire-plugin 2.13 -Dtests.codec=Lucene40 I

Re: Boost Specific Phrase

2013-02-13 Thread Amit Nithian
Ah yes sorry mis-understood. Another option is to use n-grams so that "projectmanager" is a term so any query involving "project manager in india with 2 years experience" would match higher because the query would contain "projectmanager" as a term. On Wed, Feb 13, 2013 at 9:56 PM, Hemant Verma w

Re: replication problems with solr4.1

2013-02-13 Thread Amit Nithian
Okay so then that should explain the generation difference of 1 between the master and slave On Wed, Feb 13, 2013 at 10:26 AM, Mark Miller wrote: > > On Feb 13, 2013, at 1:17 PM, Amit Nithian wrote: > > > doesn't it do a commit to force solr to recognize the changes? > > yes. > > - Mark >

Re: replication problems with solr4.1

2013-02-13 Thread Amit Nithian
So just a hunch... but when the slave downloads the data from the master, doesn't it do a commit to force solr to recognize the changes? In so doing, wouldn't that increase the generation number? In theory it shouldn't matter because the replication looks for files that are different to determine w

Re: what do you use for testing relevance?

2013-02-13 Thread Amit Nithian
Ultimately this is dependent on what your metrics for success are. For some places it may be just raw CTR (did my click through rate increase) but for other places it may be a function of money (either it may be gross revenue, profits, # items sold etc). I don't know if there is a generic answer fo

Re: Boost Specific Phrase

2013-02-13 Thread Amit Nithian
Have you looked at the "pf" parameter for dismax handlers? pf does I think what you are looking for which is to boost documents with the query term exactly matching in the various fields with some phrase slop. On Wed, Feb 13, 2013 at 2:59 AM, Hemant Verma wrote: > Hi All > > I have a use case wi

Re: Solr HTTP Replication Question

2013-01-25 Thread Amit Nithian
Okay one last note... just for closure... looks like it was addressed in solr 4.1+ (I was looking at 4.0). On Thu, Jan 24, 2013 at 11:14 PM, Amit Nithian wrote: > Okay so after some debugging I found the problem. While the replication > piece will download the index from the master serv

Re: Solr HTTP Replication Question

2013-01-24 Thread Amit Nithian
ough to simply say a full copy is needed if the slave's index version is >= master's index version. I'll create a patch and file a bug along with a more thorough writeup of how I got in this state. Thanks! Amit On Thu, Jan 24, 2013 at 2:33 PM, Amit Nithian wrote: > Does Solr&#

Re: group.ngroups behavior in response

2013-01-17 Thread Amit Nithian
A new response attribute would be better but it also complicates the patch in that it would require a new way to serialize DocSlices I think (especially when group.main=true)? I was looking to set group.main=true so that my existing clients don't have to change to parse the grouped resultset format

group.ngroups behavior in response

2013-01-16 Thread Amit Nithian
Hi all, I recently discovered the group.main=true/false parameter which really has made life simple in terms of ensuring that the format coming out of Solr for my clients (RoR app) is backwards compatible with the non-grouped results which ensures no special "handle grouped results" logic. The on

Re: Grouping by a date field

2012-11-29 Thread Amit Nithian
ue&group.func=**rint(div(ms(date_dt),mul(24,** > mul(60,mul(60,1000) > > -- Jack Krupansky > > -Original Message- From: Amit Nithian > Sent: Thursday, November 29, 2012 10:29 PM > To: solr-user@lucene.apache.org > Subject: Re: Grouping by a date field >

Re: Grouping by a date field

2012-11-29 Thread Amit Nithian
Why not create a new field that just contains the day component? Then you can group by this field. On Thu, Nov 29, 2012 at 12:38 PM, sdanzig wrote: > I'm trying to create a SOLR query that groups/field collapses by date. I > have a field in yyyy-MM-dd'T'HH:mm:ss'Z' format, "datetime", and I'm
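Deriving the day-only value at index time might look like the sketch below. It assumes timestamps of the form `2012-11-29T22:29:00Z`; the function name is hypothetical, and how the value is written into a separate field is left to the indexing code.

```python
from datetime import datetime


def day_component(timestamp):
    """Return just the day portion of a Solr-style UTC timestamp string."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%d")


day = day_component("2012-11-29T22:29:00Z")
```

Grouping on a precomputed field like this avoids evaluating a function per document at query time.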

Re: Search among multiple cores

2012-11-26 Thread Amit Nithian
You can simplify your code by searching across cores in the SearchComponent: 1) public class YourComponent implements SolrCoreAware --> Grab instance of CoreContainer and store (mCoreContainer = core.getCoreDescriptor().getCoreContainer();) 2) In the process method: * grab the core requested (SolrC

Re: is there a way to prevent abusing rows parameter

2012-11-26 Thread Amit Nithian
If you're going to validate the rows parameter, may as well validate the start parameter too.. I've run into problems where start and rows with ridiculously high values crash our servers. On Thu, Nov 22, 2012 at 9:58 AM, solr-user wrote: > Thanks guys. This is a problem with the front end not v
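A front-end guard for both parameters could be as small as the sketch below. The limits are made-up values for illustration; pick ones that fit your own index and traffic.

```python
# Hypothetical limits, not values from the thread.
MAX_ROWS = 100
MAX_START = 10_000


def sanitize_paging(start, rows):
    """Clamp start/rows into a safe range before forwarding them to Solr."""
    start = min(max(int(start), 0), MAX_START)
    rows = min(max(int(rows), 0), MAX_ROWS)
    return start, rows
```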

Re: 4.0 query question

2012-11-11 Thread Amit Nithian
Why not group by cid using the grouping component, within the group sort by version descending, and return 1 result per group? http://wiki.apache.org/solr/FieldCollapsing Cheers Amit On Fri, Nov 9, 2012 at 2:56 PM, dm_tim wrote: > I think I may have found my answer but I'd like additional vali
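As a rough illustration, the grouping suggestion maps onto Solr's result grouping parameters as sketched below. The field names `cid` and `version` come from the message above; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def latest_version_params(query):
    """Build a query string that returns the highest-version doc per cid."""
    return urlencode({
        "q": query,
        "group": "true",
        "group.field": "cid",
        "group.sort": "version desc",  # order docs inside each group
        "group.limit": 1,              # keep only the top doc per group
    })


params = latest_version_params("*:*")
```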

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-11 Thread Amit Nithian
e should document when that moment is today in terms of an > explicit contract. It sounds like the problem is that the contract is > either nonexistent, vague, ambiguous, non-deterministic, or whatever. > > -- Jack Krupansky > > -Original Message- From: Amit Nithian >

Re: custom request handler

2012-11-11 Thread Amit Nithian
query > request handlers defined. The above approach is to prevent abusive / in > appropriate queries by clients. A query component sounds interesting would > this be implemented through an interface so could be separate from solr or > would it be sub classing a base component ? > > che

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-10 Thread Amit Nithian
k Erickson wrote: > Hmmm, rather than hit the ping query, why not just send in a real query and > only let the queued ones through after the response? > > Just a random thought > Erick > > > On Sat, Nov 10, 2012 at 2:53 PM, Amit Nithian wrote: > > > Yes but the

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-10 Thread Amit Nithian
Yes but the problem is that if user facing queries are hitting a server that is warming up and isn't being serviced quickly, then you could potentially bring down your site if all the front end threads are blocked on Solr queries b/c those queries are waiting (presumably at the container level sinc

Re: Splitting data into an array / lookup

2012-11-09 Thread Amit Nithian
Why not just do the join in the DB via your initial query? You'll be executing 1 query per *each* ID in your list which is expensive in your sub-entity. If you just have your query do the joins up front then each row could be a complete (or nearly complete) document? On Thu, Nov 8, 2012 at 9:31 A

Re: custom request handler

2012-11-09 Thread Amit Nithian
oped and maintained every time would be overkill. > Again though I may have missed a point / over emphasised a difficulty? > > Are you saying my custom request handler is too tightly bound to solr? so > the parameters my apps talk is not de-coupled enough from solr? > > Lee C &

Re: My latest solr blog post on Solr's PostFiltering

2012-11-09 Thread Amit Nithian
link, I get to your "Random Writings", but > it > > tells me that the blog post doesn't exist... > > > > Erick > > > > > > On Thu, Nov 8, 2012 at 4:21 PM, Amit Nithian wrote: > > > > > Hey all, > > > > > >

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Amit Nithian
Hi Aaron, Check out http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/handler/PingRequestHandler.html You'll see the ?action=enable/disable. I have our load balancers remove the server out of rotation when the response code != 200 for some number of times in a row which I suspect you ar

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Amit Nithian
he > logs, solr starts responding to admin/ping requests before firstSearcher > completes, and, the LB then puts the solr instance back in the pool, and it > starts accepting connections... > > > On Thu, Nov 8, 2012 at 4:24 PM, Amit Nithian wrote: > > > I think Solr do

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Amit Nithian
I think Solr does this by default and are you executing warming queries in the firstSearcher so that these actions are done before Solr is ready to accept real queries? On Thu, Nov 8, 2012 at 11:54 AM, Aaron Daubman wrote: > Greetings, > > I have several custom QueryComponents that have high on

Re: Searching for Partial Words

2012-11-08 Thread Amit Nithian
Look at the normal ngram tokenizer. "Engine" with ngram size 3 would yield "eng" "ngi" "gin" "ine" so a search for engi should match. You can play around with the min/max values. Edge ngram is useful for prefix matching but sounds like you want intra-word matching too? ("eng" should match " Residen
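The "Engine" example can be reproduced with a few lines of plain Python. This only mimics fixed-size character n-grams with lowercasing; it is not the Solr tokenizer itself.

```python
def ngrams(term, size=3):
    """Return all character n-grams of the given size, lowercased."""
    term = term.lower()
    return [term[i:i + size] for i in range(len(term) - size + 1)]


indexed = ngrams("Engine")
query = ngrams("engi")
# Every gram of the query appears among the indexed grams, so "engi" can match.
matches = set(query) <= set(indexed)
```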

Re: is it possible to save the search query?

2012-11-08 Thread Amit Nithian
Are you trying to do this in real time or offline? Wouldn't mining your access logs help? It may help to have your front end application pass in some extra parameters that are not interpreted by Solr but are there for "stamping" purposes for log analysis. One example could be a user id or user coo

Re: custom request handler

2012-11-07 Thread Amit Nithian
Why not do this in a ServletFilter? Alternatively, I'd just write a front end application servlet to do this so that you don't firewall your internal admins off from accessing the core Solr admin pages. I guess you could solve this using some form of security but I don't know this well enough. If

Re: Dynamic core selection

2012-11-02 Thread Amit Nithian
I have done something similar in a search component for our search at Zvents.com. Our use case is where we have a core that invokes searches in other cores and merges the results together Basically we have: 1) FederatedComponent implements SolrCoreAware --> Grab instance of CoreContainer and store

Re: Urgent Help Needed: Solr Data import problem

2012-10-30 Thread Amit Nithian
his or to add the password into the connection >> string >> > >> > e.g. readonly:[yourpassword]@'10.86.29.32' >> > >> > >> > >> > >> 'readonly'@'10.86.29.32' >> > >> (using password: NO)&

Re: Urgent Help Needed: Solr Data import problem

2012-10-29 Thread Amit Nithian
This looks like a MySQL permissions problem and not a Solr problem. "Caused by: java.sql.SQLException: Access denied for user 'readonly'@'10.86.29.32' (using password: NO)" I'd advise reading your stack traces a bit more carefully. You should check your permissions or if you don't own the DB, chec

Re: Any way to by pass the checking on QueryElevationComponent

2012-10-28 Thread Amit Nithian
Is the goal to have the elevation data read from somewhere else? In other words, why don't you want the elevate.xml to exist locally? If you want to read the data from somewhere else, could you put a dummy elevate.xml locally and subclass the QueryElevationComponent and override the loadElevationM

Re: Monitor Deleted Event

2012-10-24 Thread Amit Nithian
Since Lucene is a library there isn't much of a support for this since in theory the client application issuing the delete could also then do something else upon delete. solr on the other hand being a layer (a server layer) sitting on top of lucene, it makes sense for hooks to be configured there.

Re: Monitor Deleted Event

2012-10-24 Thread Amit Nithian
I'm not 100% sure about this but looks like update processors may help? http://wiki.apache.org/solr/UpdateRequestProcessor It looks like you can put in custom code to execute when certain actions happen so sounds like this is what you are looking for. Cheers Amit On Wed, Oct 24, 2012 at 8:43 AM,

Re: Solr Partial word search in a sentance.

2012-10-20 Thread Amit Nithian
On the surface this looks like you could use the minimum should match feature of the dismax handler and alter that behavior depending on whether or not the search is your main search or your fallback search as you described in your (c) case. On Sat, Oct 20, 2012 at 1:13 AM, Uma Mahesh wrote: > Hi

Re: Understanding Filter Queries

2012-10-20 Thread Amit Nithian
0 docs hence why it never went down this leap frog approach in my debugging. Next question though is what is the significance of this < 100? Is this supposed to be a heuristic for determining the sparseness of the filter bit set? Thanks again Amit On Sat, Oct 20, 2012 at 7:12 PM, Amit Nithi

Re: Understanding Filter Queries

2012-10-20 Thread Amit Nithian
set intersection which is > supplied into filtered search call > https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1474 > > You are welcome. > > On Sun, Oct 21, 2012 at 12:00 AM, Amit Nithian wrote: > >> Hi al

Understanding Filter Queries

2012-10-20 Thread Amit Nithian
Hi all, Quick question. I've been reading up on the filter query and how it's implemented and the multiple articles I see keep referring to this notion of leap frogging and filter query execution in parallel with the main query. Question: Can someone point me to the code that does this so I can be

Re: Easy question ? docs with empty geodata field

2012-10-19 Thread Amit Nithian
So here is my spec for lat/long (similar to yours except I explicitly define the sub-field names for clarity) So then the query would be location_0_latLon:[ * TO *]. Looking at your schema, my guess would be: location_0_coordinate:[* TO *] location_1_coordinate:[* TO *] Let me know if that

Re: Easy question ? docs with empty geodata field

2012-10-19 Thread Amit Nithian
What about querying on the dynamic lat/long field to see if there are documents that do not have the dynamic _latlon0 or whatever defined? On Fri, Oct 19, 2012 at 8:17 AM, darul wrote: > I have already tried but get a nice exception because of this field type : > > > > > -- > View this message in

Benchmarking/Performance Testing question

2012-10-19 Thread Amit Nithian
Hi all, I know there have been many posts about this already and I have done my best to read through them but one lingering question remains. When doing performance testing on a Solr instance (under normal production like circumstances, not the ones where commits are happening more frequently than

Re: maven artifact for solr-solrj-4.0.0

2012-10-18 Thread Amit Nithian
I am not sure if this repository https://repository.apache.org/content/repositories/releases/ works but the modification dates seem reasonable given the timing of the release. I suspect it'll be on maven central soon (hopefully) On Wed, Oct 17, 2012 at 11:13 PM, Grzegorz Sobczyk wrote: > Hello >

With Grouping enabled, 0 results yields maxScore of -Infinity

2012-10-15 Thread Amit Nithian
I see that when there are 0 results with the grouping enabled, the max score is -Infinity which causes parsing problems on my client. Without grouping enabled the max score is 0.0. Is there any particular reason for this difference? If not, would there be any resistance to submitting a patch that w
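Until the server behavior changes, a client can guard against the non-finite value itself. The sketch below is a hypothetical client-side workaround, not the proposed Solr patch; mapping to `0.0` mirrors the non-grouped behavior described above.

```python
import math


def safe_max_score(raw):
    """Parse maxScore and map non-finite values (e.g. -Infinity) to 0.0."""
    value = float(raw)
    return value if math.isfinite(value) else 0.0
```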

Re: Sum of scores for documents from a query.

2012-10-14 Thread Amit Nithian
Are you looking for the sum of the scores of each document in the result? In other words, if there were 1000 documents in the numFound but you only of course show 10 (or 0 depending on rows parameter) you want the sum of all the scores of 1000 documents in a separate section of the results? If so,

PostFilters, Grouping, Sorting Oh My!

2012-10-09 Thread Amit Nithian
Hi all, I've been working with using Solr's post filters/delegate collectors to collect some statistics about the scores of all the documents and had a few questions with regards to this when combined with grouping and sorting: 1) I noticed that if I don't include the "score" field as part of the

Re: Auto Correction?

2012-10-09 Thread Amit Nithian
What's preventing you from using the spell checker and take the #1 result and re-issue the query from a sub-class of the query component? It should be reasonably fast to re-execute the query from the server side since you are already within Solr. You can modify the response to indicate that the new

Re: Getting list of operators and terms for a query

2012-10-04 Thread Amit Nithian
ion from the class > org.apache.lucene.search.Query > I can just iterate over the terms using the method extractTerms. How can I > extract the operators? > > 2012/10/4 Amit Nithian > >> I think you'd want to start by looking at the rb.getQuery() in the >> prepare (or

Re: Getting list of operators and terms for a query

2012-10-04 Thread Amit Nithian
I think you'd want to start by looking at the rb.getQuery() in the prepare (or process if you are trying to do post-results analysis). This returns a Query object that would contain everything in that and I'd then look at the Javadoc to see how to traverse it. I'm sure some runtime type-casting may

Solr 4.0 and Maven SNAPSHOT artifacts

2012-10-04 Thread Amit Nithian
Is there a maven repository location that contains the nightly build Maven artifacts of Solr? Are SNAPSHOT releases being generated by Jenkins or anything so that when I re-resolve the dependencies I'd get the latest snapshot jars? Thanks Amit

Re: Getting the distribution information of scores from query

2012-09-27 Thread Amit Nithian
gram into result named list > http://searchhub.org/dev/2012/02/10/advanced-filter-caching-in-solr/ > > On Tue, Sep 25, 2012 at 10:03 PM, Amit Nithian wrote: > >> We have a federated search product that issues multiple parallel >> queries to solr cores and fetches the re

Re: Query filtering

2012-09-27 Thread Amit Nithian
I think one way to do this is issue another query and set a bunch of filter queries to restrict "interesting_facet" to just those ten values returned in the first query. fq=interesting_facet:1 OR interesting_facet:2 etc&q=context: Does that help? Amit On Thu, Sep 27, 2012 at 6:33 AM, Finotti Sim
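Building that filter query from the values returned by the first request might look like this sketch. The helper name is hypothetical, and it assumes values that need no query-syntax escaping.

```python
def facet_filter(field, values):
    """Join facet values into a single OR'ed filter query string."""
    return " OR ".join(f"{field}:{value}" for value in values)


fq = facet_filter("interesting_facet", [1, 2])
```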

Re: AutoIndexing

2012-09-25 Thread Amit Nithian
There's a couple ways to accomplish this from easy to hard depending on your database schema: 1) Use DB trigger -> I don't like triggers too much b/c to me they couple your database layer with your application layer which leads to untestable and sometimes unmaintainable code -> Also it gets dif

Prevent Log and other math functions from returning "Infinity" and erroring out

2012-09-20 Thread Amit Nithian
Is there any reason why the log function shouldn't be modified to always take 1+the number being requested to be log'ed? Reason I ask is I am taking the log of the value output by another function which could return 0. For testing, I modified it to return 1 which works but would rather have the log
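The effect of shifting the input by 1 is easy to see in isolation. The sketch below assumes a base-10 logarithm; `shifted_log` is a hypothetical name, not a Solr function.

```python
import math


def shifted_log(x):
    """log10(1 + x): defined at x = 0, where plain log10 is not."""
    return math.log10(1 + x)
```

In a function query the same shift can be expressed by wrapping the inner value, along the lines of `log(sum(1, ...))`, rather than changing the log function itself.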

Re: Is it possible to do an "if" statement in a Solr query?

2012-09-12 Thread Amit Nithian
If the fact that it's "original" vs "generic" is a field "is_original" 0/1 can you sort by is_original? Similarly, could you put a huge boost on is_original in the dismax so that document matches on is_original score higher than those that aren't original? Or is your goal to not show generics *at a

Re: In-memory indexing

2012-09-11 Thread Amit Nithian
I have wondered about this too but instead why not just set your cache sizes large enough to house most/all of your documents and pre-warm the caches accordingly? My bet is that a large enough document cache may suffice but that's just a guess. - Amit On Mon, Sep 10, 2012 at 10:56 AM, Kiran Jayak

Re: Solr - Lucene Debuging help

2012-09-11 Thread Amit Nithian
The wiki should probably be updated.. maybe I'll take a stab at it. I'll also try and update my article referenced there too. When you checkout the project from SVN, do "ant eclipse" Look at this bug (https://issues.apache.org/jira/browse/SOLR-3817) and either run the ruby program or download the

Re: solr.StrField with stored="true" useless or bad?

2012-09-11 Thread Amit Nithian
This is great, thanks for this post! I was curious about the same thing and was wondering why "fl" couldn't return the "indexed" representation of a field if that field were only indexed but not stored. My thought was to return something rather than nothing, but I didn't pay attention to the fact that gettin

Re: Replication policy

2012-09-10 Thread Amit Nithian
If I understand you right, replication of data has 0 downtime, it just works and the data flows through from master to slaves. If you want, you can configure the replication to replicate configuration files across the cluster (although to me my deploy script does this). I'd recommend tweaking the

Re: XInclude Multiple Elements

2012-09-10 Thread Amit Nithian
Way back when I opened an issue about using XML entity includes in Solr as a way to break up the config. I have found problems with XInclude having multiple elements to include because the file is not well formed. From what I have read, if you make this well formed, you end up with a document that'

Re: Trouble Setting Up Development Environment

2012-09-09 Thread Amit Nithian
Sorry, I'm really late to this so not sure if this is even an issue: 1) I found that there is an ant eclipse that makes it easy to set up the eclipse .project and .classpath (I think I had done this by hand in the tutorial) 2) Yes you can attach to a remote instance of Solr but your JVM has to have t

Re: N-gram ranking based on term position

2012-09-07 Thread Amit Nithian
I think your thought about using the edge ngram as a field and boosting that field in the qf/pf sections of the dismax handler sounds reasonable. Why do you have qualms about it? On Fri, Sep 7, 2012 at 12:28 PM, Kiran Jayakumar wrote: > Hi, > > Is it possible to score documents with a match "earl
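A minimal sketch of the suggestion in this reply, assuming a schema with a plain text field and a separate edge n-gram copy of it. The field names `title` and `title_edge`, the boost values, and the helper `build_dismax_params` are all hypothetical and not from the thread; only `defType`, `q`, `qf`, and `pf` are standard dismax parameters.

```python
from urllib.parse import urlencode


def build_dismax_params(user_query, text_field="title",
                        edge_field="title_edge", edge_boost=5.0):
    """Build dismax request parameters that weight the edge n-gram field
    more heavily than the plain text field.

    All field names and boost values here are illustrative assumptions.
    """
    return {
        "defType": "dismax",
        "q": user_query,
        # qf: fields searched per term; the edge n-gram field gets the larger boost
        "qf": f"{edge_field}^{edge_boost} {text_field}",
        # pf: fields used for whole-phrase boosting
        "pf": f"{edge_field}^{edge_boost} {text_field}",
    }


params = build_dismax_params("early bird")
query_string = urlencode(params)  # append to the select handler URL
```

Whether the edge n-gram field belongs in `pf` as well as `qf` depends on how that field is analyzed, so treat the `pf` line as something to experiment with rather than a recommendation.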

Re: Running out of memory

2012-08-16 Thread Amit Nithian
I am debugging an out of memory error myself and a few suggestions: 1) Are you looking at your search logs around the time of the memory error? In my case, I found a few bad queries requesting a ton of rows (basically the whole index's worth which I think is an error somewhere in our app just have

Re: Nrt and caching

2012-07-07 Thread Amit Nithian
Thanks for the responses. I guess my specific question is: if I had something which was dependent on the mapping between Lucene document ids and some object primary key, so I could pull in external data from another data source without a constant reindex, how would this get affected by soft and hard

Nrt and caching

2012-07-06 Thread Amit Nithian
Sorry, I'm a bit new to the NRT stuff in Solr but I'm trying to understand the implications of frequent commits and cache rebuilding and auto warming. What are the best practices surrounding NRT searching and caches and query performance? Thanks! Amit

Re: Use of Solr as primary store for search engine

2012-07-04 Thread Amit Nithian
m the xwiki objects which pull from the > SQL database). However, it implied that we had to rewrite anything necessary > for the rendering, hence the rendering has not re-used that many code. > > Paul > > > Le 4 juil. 2012 à 09:54, Amit Nithian a écrit : > >> Hello al

Re: Something like 'bf' or 'bq' with MoreLikeThis

2012-07-04 Thread Amit Nithian
No worries! What version of Solr are you using? One that you downloaded as a tarball or one that you checked out from SVN (trunk)? I'll take a bit of time and document steps and respond. I'll review the patch to see that it fits a general case. Question for you with MLT, are your users doing a bla

Use of Solr as primary store for search engine

2012-07-04 Thread Amit Nithian
Hello all, I am curious to know how people are using Solr in conjunction with other data stores when building search engines to power web sites (say an ecommerce site). The question I have for the group is given an architecture where the primary (transactional) data store is MySQL (Oracle, PostGre

Re: How to improve this solr query?

2012-07-04 Thread Amit Nithian
Couple questions: 1) Why are you explicitly telling solr to sort by score desc, shouldn't it do that for you? Could this be a source of performance problems since sorting requires the loading of the field caches? 2) Of the query parameters, q1 and q2, which one is actually doing "text" searching on

Re: difference between stored="false" and stored="true" ?

2012-07-03 Thread Amit Nithian
So couple questions on this (comment first then question): 1) I guess you can't have four combinations b/c indexed=false/stored=false has no meaning? 2) If you set fewer fields stored=true does this reduce the memory footprint for the document cache? Or better yet, I can store more documents in the ca

Re: Something like 'bf' or 'bq' with MoreLikeThis

2012-07-03 Thread Amit Nithian
I had a similar problem so I submitted this patch: https://issues.apache.org/jira/browse/SOLR-2351 I haven't applied this to trunk in a while but my goal was to ensure that bf parameters were passed down and respected by the MLT handler. Let me know if this works for you or not. If there is suffic

Re: Editing long Solr URLs - Chrome Extension

2012-05-19 Thread Amit Nithian
"tab" to edit the next row but it helps a bit in that problem. Please keep submitting issues as you encounter them and I'll address them as best as possible. I hope that this helps everyone! Thanks! Amit On Tue, May 15, 2012 at 6:20 PM, Amit Nithian wrote: > Erick > > Yes th

Re: Editing long Solr URLs - Chrome Extension

2012-05-15 Thread Amit Nithian
case I messed up github, complex > params like the fq here: > > http://localhost:8983/solr/select?q=:&fq={!geofilt sfield=store > pt=52.67,7.30 d=5} > > aren't properly handled. > > But I'm already using it occasionally > > Erick > > On Tue, May 1
