Re: Facing problem with the FieldType of UniqueField

2008-02-14 Thread Rishabh Joshi
Ryan, Using the KeywordTokenizer does not help, and there are no spaces in the unique keys. The keys are alphanumeric, e.g. AA-23-E1. Regards, Rishabh On Thu, Feb 14, 2008 at 10:28 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > > I noticed this happened because the field type was "strin

Facing problem with the FieldType of UniqueField

2008-02-14 Thread Rishabh Joshi
Hi, Initially, in my schema, I had my uniqueField's field type as "string" and everything was working fine. But, then the users of my application wanted to search on the unique field and entered values which were in a different case than what was indexed. They never got proper results, at times, n

Restrict values in a multivalued field

2008-01-12 Thread Rishabh Joshi
Hi, In my schema I have a multivalued field, and the values of that field are "stored" and "indexed" in the index. I wanted to know if it is possible to restrict the number of values returned from that field on a search, and if so, how? Because, let's say, if I have thousands of values in th

How to perform a double query in one

2008-01-02 Thread Rishabh Joshi
Hi, Is there a way to perform two search queries in one search request, and then return their combined results? Currently I am performing the following: I have a document which consists of an "id" field, which is the unique identifier, the "info" field, and an "xid" field which contains the ids of oth

Re: Retrieving Tokens

2007-12-19 Thread Rishabh Joshi
ards, Rishabh On Dec 19, 2007 10:02 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On Dec 19, 2007 10:59 AM, Rishabh Joshi <[EMAIL PROTECTED]> wrote: > > I have created my own Tokenizer and I am indexing the documents using > the > > same. > > > > I

Retrieving Tokens

2007-12-19 Thread Rishabh Joshi
Hi, I have created my own Tokenizer and I am indexing the documents using the same. I wanted to know if there is a way to retrieve the tokens (created by my custom tokenizer) from the index. Do we have to modify the code to get these tokens? Regards, Rishabh

Creating user-defined field types

2007-12-11 Thread Rishabh Joshi
Hi, Can anyone guide me on how to implement user-defined field types in Solr? I could not find anything on the Solr wiki. Help of any kind would be appreciated. Regards, Rishabh

Re: How to store a HashSet in the index?

2007-12-10 Thread Rishabh Joshi
Thanks Erik! Rishabh On Dec 10, 2007 3:30 PM, Erik Hatcher <[EMAIL PROTECTED]> wrote: > On Dec 10, 2007, at 3:10 AM, Rishabh Joshi wrote: > > Can anyone help me on, as to how I can go about "efficiently" indexing > > (actually, storing in the index) and retrievi

How to store a HashSet in the index?

2007-12-10 Thread Rishabh Joshi
Hi, Can anyone help me with how to "efficiently" index (actually, store in the index) and retrieve a HashSet object which contains multiple string arrays? I just want to store the HashSet in the index, and not search on it. The HashSet should be returned with the document

Re: Strange behavior MoreLikeThis Feature

2007-11-22 Thread Rishabh Joshi
Thanks Ryan. I now know the reason why. Before I explain the reason, let me correct the mistake I made in my earlier mail. I was not using the first document mentioned in the XML. Instead it was this one: IW-02 iPod & iPod Mini USB 2.0 Cable Belkin electronics connector car power adap

Re: Near Duplicate Documents

2007-11-21 Thread Rishabh Joshi
ry interesting. I'm not sure if you can implement the > algorithm because they have patented it. That said, there are plenty > literature on near dup detection so you should be able to get one for > free! > > On Nov 21, 2007 6:57 PM, Rishabh Joshi <[EMAIL PROTECTED]> wrote:

Re: Near Duplicate Documents

2007-11-20 Thread Rishabh Joshi
Otis, Thanks for your response. I took a quick look at the Nutch forum and found that there is an implementation for de-duplicating documents/pages, but none for near-duplicate documents. Can you guide me a little further as to where exactly under Nutch I should be concentrating, regardin

rows=VERY_LARGE_VALUE throws exception, and error in some cases

2007-11-20 Thread Rishabh Joshi
Hi, We are using Solr 1.2 for our project and have come across the following exception and error: Exception: SEVERE: java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.util.PriorityQueue.initialize(PriorityQueue.java:36) Steps to reproduce: 1. Restart your Web Server. 2. Ente

Re: Performance of Solr on different Platforms

2007-11-20 Thread Rishabh Joshi
Eswar, This link would give you a fair idea of how Solr is used by some of the sites/companies - http://wiki.apache.org/solr/SolrPerformanceData Rishabh On Nov 20, 2007 10:49 AM, Eswar K <[EMAIL PROTECTED]> wrote: > In our case, the load is kind of distributed. On an average, the QPS could > be

Near Duplicate Documents

2007-11-16 Thread Rishabh Joshi
Hi, I am evaluating "Solr 1.2" for my project and wanted to know if it can return near-duplicate documents (near dups), and if so, how I should go about it. I am not sure, but is "MoreLikeThisHandler" the implementation for near dups? Rishabh
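For readers unfamiliar with the term, near-duplicate detection can be illustrated outside Solr with word shingles and Jaccard similarity. The sketch below is only an illustration of that general technique, not Solr's MoreLikeThisHandler and not anything proposed in this thread; the function names, the shingle size k=3, and the 0.5 threshold are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (word windows) in text, lowercased."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return sorted (id, id) pairs whose shingle similarity >= threshold.

    docs is a dict mapping a document id to its text. The pairwise
    comparison is quadratic, so this is only suitable for small sets.
    """
    sigs = {doc_id: shingles(text, k) for doc_id, text in docs.items()}
    ids = sorted(sigs)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            if jaccard(sigs[left], sigs[right]) >= threshold:
                pairs.append((left, right))
    return pairs


if __name__ == "__main__":
    sample = {
        "a": "the quick brown fox jumps over the lazy dog",
        "b": "the quick brown fox jumps over the lazy cat",
        "c": "solr indexes documents for full text search",
    }
    print(near_duplicates(sample))
```

Documents "a" and "b" differ in one word and are reported as a pair; "c" shares no shingles with either. A larger collection would need something cheaper than all-pairs comparison.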

RE: Best way to create multiple indexes

2007-11-12 Thread Rishabh Joshi
e documents are categorized > into many 'groups' and 'sub-groups'. I wanted to know if we can create > multiple indexes based on 'groups' and then on 'sub-groups' in Solr? If yes, > then how do we go about it? I tried going through the section on > 'Collections' in the Solr Wiki, but could not make much use of it. > > Regards, > Rishabh Joshi > > > > >