from:"\"Pranav Prakash\""

Solr 3.3. Grouping vs DeDuplication and Deduplication Use Case

2011-08-29 Thread Pranav Prakash

later (and gets added later to index). AFAIK, Deduplication targets index time. Is there a way to specify which document is the original that should be returned and which duplicates should be kept out of the results? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pran

How To Implement Sweet Spot Similarity?

2011-09-16 Thread Pranav Prakash

are the good approaches for figuring out sweet spots? Can a combination of multiple Similarity Classes be used? Any information would be so appreciated. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google

java.io.CharConversionException While Indexing in Solr 3.4

2011-09-19 Thread Pranav Prakash

around, I see issue https://issues.apache.org/jira/browse/SOLR-2381 which seems to fix the issue. I thought this patch is already applied to Solr 3.4.0. Is there something I am missing? Is there anything else I need to mention? Logs/ My document details etc.? *Pranav Prakash* "temet

Re: java.io.CharConversionException While Indexing in Solr 3.4

2011-09-19 Thread Pranav Prakash

there a setting so I can change the level of backtrace? This would be helpful in showing the complete stack instead of 26 more ... *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profi

Re: java.io.CharConversionException While Indexing in Solr 3.4

2011-09-20 Thread Pranav Prakash

file to test Solr's behavior towards UTF-8 chars. Great work Solr team, and special thanks to Erik Hatcher. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Re: Stemming and other tokenizers

2011-09-20 Thread Pranav Prakash

(which is expected in Solr 3.5) The point where I am unclear is, how do I specify at Index time, to use a certain field for a certain language? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http:

StopWords coming in Top 10 terms despite using StopFilterFactory

2011-09-22 Thread Pranav Prakash

Hi List, I included StopFilterFactory and I can see it taking action in the Analyzer Interface. However, when I go to Schema Analyzer, I see those stop words in the top 10 terms. Is this normal? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavpr

Re: StopWords coming in Top 10 terms despite using StopFilterFactory

2011-09-23 Thread Pranav Prakash

ng, then I'm not sure what's going on, and some cut/paste of > what you're actually seeing might be in order. Term frequencies: to 26164, and 25804, the 25566, of 25022, a 24918, in 24590, for 23646, n 23588, with 23055, is 22510 > Did you do delete and do a full reindex after you changed y

Can't use ms() function on non-numeric legacy date field

2011-09-27 Thread Pranav Prakash

field created_at * * *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Suggestions on how to perform infrastructure migration from 1.4 to 3.4?

2011-09-30 Thread Pranav Prakash

might come along? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

How to achieve Indexing @ 270GiB/hr

2011-10-04 Thread Pranav Prakash

lr in batches of max 500 docs. Even if using DataImportHandler what are the ways this could be optimized? If I am able to solve the problem of indexing data in our current setup, my life would become a lot easier. *Pranav Prakash* "temet nosce" Twitter <http://twitter.co

Painfully slow indexing

2011-10-19 Thread Pranav Prakash

would really appreciate kindness of community in order to get this indexing faster. false 10 10 2048 2147483647 300 1000 5 256 10 false true true 1 0 false 10 *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavpr

Re: Painfully slow indexing

2011-10-24 Thread Pranav Prakash

uld convert all of it into one XML file and then index? *are you calling commit after your batches or do an optimize by any chance?* I am not optimizing, but I am performing an autocommit every 10 docs. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com

How to programmatically check if the index is optimized or not?

2011-11-15 Thread Pranav Prakash

Hi, After the commit, my optimize usually takes 20 minutes. The thing is that I need to know programmatically if the optimization has completed or not. Is there an API call through which I can know the status of optimization? *Pranav Prakash* "temet nosce" Twitter <http:

Highlighting uses lots of memory and eventually slows down Solr

2011-12-09 Thread Pranav Prakash

n&wt=ruby&hl=true&rows=12&defType=dismax&fl=id,title,description&debugQuery=false&start=0&q=asdfghjkl&bf=recip(ms(NOW,created_at),1.88e-11,1,1)&hl.simple.post=&ps=50} Any help on this would be greatly appreciated. Thanks in advance !! *Pranav Prakash*
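Editor's note: the `bf=recip(ms(NOW,created_at),1.88e-11,1,1)` parameter in the query above is a recency boost. Solr's `recip(x,m,a,b)` function query evaluates `a / (m*x + b)`, so with these constants a brand-new document gets a boost of 1 and the boost decays as the document ages. The Python sketch below only mirrors that arithmetic; the helper names (`recip`, `recency_boost`, `MS_PER_DAY`) are illustrative assumptions and not part of Solr.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def recip(x, m, a, b):
    """Mirror of Solr's recip(x, m, a, b) function query: a / (m * x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms, m=1.88e-11, a=1.0, b=1.0):
    """Boost for a document that is age_ms milliseconds old.

    The defaults are the constants from the query in the post;
    age_ms stands in for ms(NOW, created_at).
    """
    return recip(age_ms, m, a, b)


if __name__ == "__main__":
    # Print the boost for documents of a few different ages.
    for days in (0, 30, 365, 3650):
        print(days, "days ->", round(recency_boost(days * MS_PER_DAY), 3))
```

With these constants the boost falls to roughly 0.63 after one year, which shows how gently `1.88e-11` discounts older documents.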

Re: Highlighting uses lots of memory and eventually slows down Solr

2011-12-19 Thread Pranav Prakash

No response!! Bumping it up. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> On Fri, Dec 9, 2011 at 14:11, Pranav Prakash wrote: > Hi Group, > >

Something like "featured results" in solr response?

2012-01-30 Thread Pranav Prakash

art from the results generated by Solr (which is based on relevancy, score), there is another set of documents which just comes up. It is very much similar to the "sponsored results" feature of Google. Can you guys point me to the appropriate resources for the same? *Pranav Prakash*

Re: Something like "featured results" in solr response?

2012-01-30 Thread Pranav Prakash

restarts are as infrequent as config changes. What could be a sound way to implement this? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> 2012/1/30 Rafał Ku

Re: Something like "featured results" in solr response?

2012-01-30 Thread Pranav Prakash

Wow, this looks interesting. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> On Mon, Jan 30, 2012 at 21:16, Erick Erickson wrote: > There's the t

Typical Cache Values

2012-02-07 Thread Pranav Prakash

cumulative_inserts : 1309934 cumulative_evictions : 1309245 *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Re: Typical Cache Values

2012-02-07 Thread Pranav Prakash

> > * > * > This is not unusual, but there's also not much reason to give this much > memory in your case. This is the cache that is hit when a user pages > through result set. Your numbers would seem to indicate one of two things: > 1> your window is smaller than 2 pages, see solrconfig.xml, >

Deduplication in MLT

2012-06-12 Thread Pranav Prakash

? *Pranav Prakash* "temet nosce"

Questions about Solr MLTHandler, performance, Indexes

2011-06-20 Thread Pranav Prakash

Hi folks, I am new to Solr, and using it for a web application. I have been experimenting with it and have a couple of doubts which I was unable to resolve through Google. Our portal allows users to upload content and the fields we use are - title, description, transcript, tags. Now each of the content h

Removing duplicate documents from search results

2011-06-23 Thread Pranav Prakash

functionality using Solr? Does Solr have a built-in feature or a plugin which could help me with it? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Re: Removing duplicate documents from search results

2011-06-23 Thread Pranav Prakash

% similar. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> On Thu, Jun 23, 2011 at 15:16, Omri Cohen wrote: > What you need to do, is to calculate some

Re: how to index data in solr from database automatically

2011-06-24 Thread Pranav Prakash

Cron is a time-based job scheduler in Unix-like computer operating systems. en.wikipedia.org/wiki/Cron *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> O

Custom Handler support in Solr-ruby

2011-06-28 Thread Pranav Prakash

f an overhead? Or am I missing something? Also, where can I file bugs to solr-ruby? *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Index Version and Epoch Time?

2011-06-28 Thread Pranav Prakash

ge on every commit? If not, is there a way to look into the last index time? Also, this page http://wiki.apache.org/solr/SolrReplication#Replication_Dashboard shows a Replication Dashboard. How is this dashboard invoked? Is there any URL which needs to be called? *Pranav Prakash* "temet nosce

Re: Removing duplicate documents from search results

2011-06-28 Thread Pranav Prakash

I found the deduplication thing really useful. Although I have not yet started to work on it, as there are some other low hanging fruits I've to capture. Will share my thoughts soon. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <

Re: Index Version and Epoch Time?

2011-06-28 Thread Pranav Prakash

is there a configuration that can be adjusted for this? Also, what would the index state be if after the restarting Solr, a commit is applied or a commit is not applied? I'd be happy to provide any other information that might be needed. *Pranav Prakash* "temet nosce" Twitter <http

Dealing with keyword stuffing

2011-07-27 Thread Pranav Prakash

of things like providing different boosts to different fields, but almost everything seems to fail. I'd like to know how you guys fixed this. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Re: Dealing with keyword stuffing

2011-07-28 Thread Pranav Prakash

On Thu, Jul 28, 2011 at 08:31, Chris Hostetter wrote: > > : Presumably, they are doing this by increasing tf (term frequency), > : i.e., by repeating keywords multiple times. If so, you can use a custom > : similarity class that caps term frequency, and/or ensures that the > scoring > : increases
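Editor's note: the quoted suggestion is to cap term frequency so that repeating a keyword stops paying off. Lucene's default similarity scores term frequency as the square root of the raw count; a capped variant clamps the count first. The Python sketch below only illustrates that arithmetic. It is not a Lucene `Similarity` subclass, and the function names and the cap of 5 are assumptions chosen for the example.

```python
import math


def default_tf(freq):
    """Term-frequency factor as in Lucene's default similarity: sqrt(freq)."""
    return math.sqrt(freq)


def capped_tf(freq, cap=5):
    """Same factor, but occurrences beyond `cap` no longer raise the score.

    `cap` is a hypothetical tuning knob, not a Solr or Lucene setting.
    """
    return math.sqrt(min(freq, cap))


if __name__ == "__main__":
    # Compare the two factors for a normal and a keyword-stuffed document.
    for freq in (1, 3, 5, 50, 500):
        print(freq, round(default_tf(freq), 2), round(capped_tf(freq), 2))
```

A document that repeats a keyword 500 times then scores no higher on this factor than one that uses it 5 times.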

Re: Index

2011-07-28 Thread Pranav Prakash

gain more insight about the index. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> On Fri, Jul 29, 2011 at 03:40, GAURAV PAREEK wrote: > Yes NICK you ar

Re: Dealing with keyword stuffing

2011-07-29 Thread Pranav Prakash

ld I start beginning with it? Pl. do not assume less obvious things, I am still learning !! :-) *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny> On Thu, Jul 28, 201

Re: Solr Incremental Indexing

2011-07-31 Thread Pranav Prakash

at the db level), which would fork a process to update Solr about this change by means of delayed task. If using this approach, it is suggested to use autocommit every N documents, N could be anything depending on your app. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/
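Editor's note: the post refers to Solr's server-side autocommit, which is configured in solrconfig.xml. The same "commit every N documents" idea can also be sketched on the client side, as below. This is a minimal illustration under stated assumptions: `add` and `commit` are hypothetical callables standing in for whatever Solr client is in use, and the default N of 1000 is arbitrary.

```python
def index_with_periodic_commit(docs, add, commit, commit_every=1000):
    """Send each doc through add() and call commit() after every commit_every docs.

    add and commit are caller-supplied callables (hypothetical stand-ins
    for a real Solr client). A final commit flushes any remainder.
    Returns the number of documents sent.
    """
    pending = 0
    total = 0
    for doc in docs:
        add(doc)
        pending += 1
        total += 1
        if pending >= commit_every:
            commit()
            pending = 0
    if pending:
        commit()  # flush the last partial batch
    return total


if __name__ == "__main__":
    sent, commits = [], []
    n = index_with_periodic_commit(
        ({"id": i} for i in range(25)),
        add=sent.append,
        commit=lambda: commits.append(len(sent)),
        commit_every=10,
    )
    print(n, "docs sent, commits after", commits)
```

A larger N means fewer commits and faster bulk indexing, at the cost of new documents becoming searchable later.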

Re: Solr 3.3 crashes after ~18 hours?

2011-08-02 Thread Pranav Prakash

performing commit and then optimize while the load from app server was at its peak. This caused slow response from search server, which caused requests getting stacked up at app server and causing 503s. Could you look if you have a similar syndrome? *Pranav Prakash* "temet nosce" Twi

Re: PivotFaceting in solr 3.3

2011-08-02 Thread Pranav Prakash

From what I know, this is a feature in Solr 4.0 marked as SOLR-792 in JIRA. Is this what you are looking for ? https://issues.apache.org/jira/browse/SOLR-792 *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.mybli

Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Pranav Prakash

size. Am I doing something wrong? How can I get the master to serve only delta index instead of serving whole index and the slaves merging the new and old index? *Pranav Prakash*

How come this query string starts with wildcard?

2011-08-10 Thread Pranav Prakash

bly my search index didn't have any. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Pranav Prakash

Very well explained. Thanks. Yes, we do optimize Index before replication. I am not particularly worried about disk space usage. I was more curious of that behavior. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.

OOM due to JRE Issue (LUCENE-1566)

2011-08-16 Thread Pranav Prakash

ve to manually apply the patch? What are the other workarounds of the problem? Thanks in adv. *Pranav Prakash* "temet nosce" Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> | Google <http://www.google.com/profiles/pranny>

Re: OOM due to JRE Issue (LUCENE-1566)

2011-08-16 Thread Pranav Prakash

> > > AFAIK, solr 1.4 is on Lucene 2.9.1 so this patch is already applied to > the version you are using. > maybe you can provide the stacktrace and more details about your > problem and report back? > Unfortunately, I have only this much information with me. However following is my specifications

Top 5 high freq words - UpdateProcessorChain or DIH Script?

2012-07-08 Thread Pranav Prakash

dd it to UpdateRequestProcessor Chain and insert the function after StopWordsFilterFactory and DuplicateRemoveFilterFactory, should be rather good way of doing this? -- *Pranav Prakash* "temet nosce"

DIH XML configs for multi environment

2012-07-11 Thread Pranav Prakash

dlers and so on. What is a good way to deal with this? *Pranav Prakash* "temet nosce"

Re: DIH XML configs for multi environment

2012-07-11 Thread Pranav Prakash

That's cool. Is there something similar for Jetty as well? We use Jetty! *Pranav Prakash* "temet nosce" On Wed, Jul 11, 2012 at 1:49 PM, Rahul Warawdekar < rahul.warawde...@gmail.com> wrote: > Hi Pranav, > > If you are using Tomcat to host Solr, you

How To apply transformation in DIH for multivalued numeric field?

2012-07-18 Thread Pranav Prakash

this case? *Pranav Prakash* "temet nosce"

Re: DIH XML configs for multi environment

2012-07-18 Thread Pranav Prakash

That approach would work for core dependent parameters. In my case, the params are environment dependent. I think a simpler approach would be to pass the url param as JVM options, and these XMLs get it from there. I haven't tried it yet. *Pranav Prakash* "temet nosce" On Tue,

Re: How To apply transformation in DIH for multivalued numeric field?

2012-07-18 Thread Pranav Prakash

I had tried with splitBy for numeric field, but that also did not work for me. However I got rid of group_concat and it was all good to go. Thanks a lot!! I really had a difficult time understanding this behavior. *Pranav Prakash* "temet nosce" On Thu, Jul 19, 2012 at 1:34 AM, D

Re: can solr admin tab statistics be customized... how can this be achieved.

2012-07-23 Thread Pranav Prakash

You can checkout Solr source code, do the patch work in admin JSP files and use it as your custom Solr Instance. *Pranav Prakash* "temet nosce" On Fri, Jul 20, 2012 at 12:14 PM, yayati wrote: > > > Hi, > > I want to compute my own stats in addition to solr

Re: DIH XML configs for multi environment

2012-07-25 Thread Pranav Prakash

configs. *Pranav Prakash* "temet nosce" On Tue, Jul 24, 2012 at 1:17 AM, jerry.min...@gmail.com < jerry.min...@gmail.com> wrote: > Pranav, > > Sorry, I should have checked my response a little better as I > misspelled your name and, mentioned that I tried what Marcu

Exact match on few fields, fuzzy on others

2012-08-01 Thread Pranav Prakash

the fields are below: -- *Pranav Prakash* "temet nosce"

Re: Importing of unix date format from mysql database and dates of format 'Thu, 06 Sep 2012 22:32:33 +0000' in Solr 4.0

2012-09-10 Thread Pranav Prakash

that would resolve this. *Pranav Prakash* "temet nosce" On Sat, Sep 8, 2012 at 3:16 AM, Shawn Heisey wrote: > On 9/6/2012 6:54 PM, kiran chitturi wrote: > >> The error i am getting is 'org.apache.solr.common.**SolrException: >> Invalid >> Date String

Re: Importing of unix date format from mysql database and dates of format 'Thu, 06 Sep 2012 22:32:33 +0000' in Solr 4.0

2012-09-10 Thread Pranav Prakash

The character is actually - “ and not " *Pranav Prakash* "temet nosce" On Mon, Sep 10, 2012 at 2:45 PM, Pranav Prakash wrote: > I am experiencing a similar problem related to encoding. In my case, the > char like " (double quote) > is also garbled. >

Re: DIH import from MySQL results in garbage text for special chars

2012-09-20 Thread Pranav Prakash

I am seeing the garbage text in browser, Luke Index Toolbox and everywhere it is the same. My servlet container is Jetty which is the out-of-box one. Many other special chars are getting indexed and stored properly, only a few characters cause pain. *Pranav Prakash* "temet nosce" O

Re: DIH import from MySQL results in garbage text for special chars

2012-09-26 Thread Pranav Prakash

I looked at the HEX codes of the texts. The hex code in MySQL is different from that which is stored in the index. The hex code in the index is longer than the hex code in MySQL, which leads me to believe that somewhere in between something is messing up. *Pranav Prakash* "temet nosce"
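Editor's note: a hex code that grows between the database and the index is what double encoding typically produces, where UTF-8 bytes are read as a single-byte charset and then encoded to UTF-8 again. The thread does not confirm that this is the cause here, so the Python sketch below is only a way to reproduce the symptom. The choice of cp1252 as the wrong charset and the helper names are assumptions; the test character is the left double quote “ mentioned earlier in this listing.

```python
def utf8_hex(text):
    """Hex dump of the UTF-8 encoding of text."""
    return text.encode("utf-8").hex()


def double_encode(text, wrong_charset="cp1252"):
    """Simulate a layer that misreads UTF-8 bytes as wrong_charset.

    wrong_charset is an assumption for illustration; the real culprit
    (if any) depends on the JDBC and MySQL connection settings.
    """
    return text.encode("utf-8").decode(wrong_charset)


if __name__ == "__main__":
    original = "\u201c"  # left double quotation mark
    garbled = double_encode(original)
    print("original:", utf8_hex(original))
    print("garbled: ", utf8_hex(garbled), repr(garbled))
```

If the bytes in the index match the garbled form, the connection charset between DIH and MySQL would be the first thing to check.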

Re: DIH import from MySQL results in garbage text for special chars

2012-09-26 Thread Pranav Prakash

| | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ *Pranav Prakash* "temet nosce" On Wed, Sep 26, 2012 at 6:45 PM, Gora Mohanty wrote: > On 21 September 2012 11:19, Pranav Prakash wrote: > > > I am seeing the g

56 matches
