Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Otis Gospodnetic
Mitch, If you use Nutch+Solr then you wouldn't *index* the fetched content with Nutch. Solr doesn't know anything about OPIC, but I suppose you can feed the OPIC score computed by Nutch into a Solr field and use it during scoring, if you want, say with a function query. Yes, ES has built-in sup
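
A rough sketch of the idea above, assuming the OPIC score has been copied into a numeric Solr field. The field names `opic_score`, `title` and `content` are hypothetical and not taken from the thread. It only builds dismax request parameters, using the `bf` (boost function) parameter to add the stored score to the relevance score:

```python
from urllib.parse import urlencode, parse_qs

def scored_query_params(user_query, score_field="opic_score"):
    """Build dismax parameters that add a stored per-document score.

    score_field is a hypothetical numeric field holding the OPIC score
    computed by Nutch; a bare field name is the simplest function query.
    """
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "title content",  # hypothetical searchable fields
        "bf": score_field,      # boost function: the field value itself
    }

params = scored_query_params("lucene sharding")
query_string = urlencode(params)  # append to the /select URL of your Solr instance
```

How strongly the stored score should weigh against text relevance is a tuning question that this sketch does not address.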

About the example in the wiki page of FunctionQuery

2010-06-16 Thread Chia Hao Lo
( I sent this mail two days ago, but I cannot find it in the mail archive, so I guess it was not sent successfully. Sorry for sending it twice in case it did go through. ) Hi, I'm a newbie to Solr and have a question about the example in FunctionQuery. I've read the document of

RE: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Good morning! Great feedback from you all. This really helped a lot to get an impression of what is possible and what is not. What interests me now are some detail questions. Let's assume Solr is able to handle distributed indexing on its own, so that the client does not need to know

Re: how to apply patch SOLR-1316

2010-06-16 Thread Blargy
I'm trying to apply this via the command line "patch -p0 < SOLR-1316.patch". When patching against trunk I get the following errors. ~/workspace $ patch -p0 < SOLR-1316.patch patching file dev/trunk/solr/src/java/org/apache/solr/handler/component/SpellCheckComponent.java Hunk #2 succeeded at 575

Re: SpellCheckComponent questions

2010-06-16 Thread Blargy
Follow-up question. How can I influence the "scoring" of results that come back, either through term frequency (if I build off an index) or through # of search results returned (if using a search log)? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/SpellCheckComponent-

Re: Solr DataConfig / DIH Question

2010-06-16 Thread Alexey Serba
> There is a 1-[0,1] relationship between Person and Address with address_id > being the nullable foreign key. I think you should be good with single query/entity then (no need for nested entities) On Sunday, June 13, 2010, Holmes, Charles V. wrote: > I'm putting together an entity.  A simpli

SpellCheckComponent questions

2010-06-16 Thread Blargy
Is it generally wiser to build the dictionary from the existing index? Search Log? Other? For "Did you mean" does one usually just use collate=true and then return that string? Should I be using a separate spellchecker handler or should I just always include spellcheck=true in my original searc

MailEntityProcessor class cast exception

2010-06-16 Thread Max Lynch
With last night's build of solr, I am trying to use the MailEntityProcessor to index an email account. However, when I call my dataimport url, I receive a class cast exception: INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=44 Jun 16, 2010 8:16:03 PM org.apache

how to index the words of a lecture transcript, and the timecodes for each word?

2010-06-16 Thread Peter Wilkins
I have lecture transcripts with start and stop times for each word. The time codes allow us to search the transcripts, and show the part of the lecture video that contain the search results. I want to structure the index so that I can search the transcripts for phrases, and have the search res

Re: Indexing HTML files in SOLR

2010-06-16 Thread Lance Norskog
This is the tool in Solr for indexing various kinds of content. After you learn the basics of indexing (see solr/example/exampledocs for samples), the ExtractingRequestHandler will make sense: http://wiki.apache.org/solr/ExtractingRequestHandler On Tue, Jun 15, 2010 at 12:35 AM, seesiddharth wro

Re: LocalParams, quotes, bug?

2010-06-16 Thread Yonik Seeley
On Wed, Jun 16, 2010 at 3:27 PM, Jonathan Rochkind wrote: > {!dismax qf=$some_qf}   => no problem, and debugQuery reveals it is indeed > using the qf I desire. > > {!dismax qf='$some_qf'}  => Solr throws "undefined field $some_qf". > > Is this a bug in Solr? Nope, it's by design. Parameter refere
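
To make the two cases in this exchange concrete, here is a small sketch that builds the request parameters for both forms. The field names and the user query are made up for illustration; only the `{!dismax qf=...}` syntax and the `$some_qf` reference come from the thread:

```python
from urllib.parse import urlencode, parse_qs

def dismax_params(user_query, qf_value, quote_reference=False):
    """Build request parameters that pass qf through a $-reference.

    With quote_reference=True the reference is wrapped in single quotes,
    which is the form reported above as failing with
    "undefined field $some_qf".
    """
    ref = "'$some_qf'" if quote_reference else "$some_qf"
    return {
        "q": "{!dismax qf=%s}%s" % (ref, user_query),
        "some_qf": qf_value,  # the parameter the reference points at
    }

params = dismax_params("harry potter", "title^5 body^10")
query_string = urlencode(params)  # URL-encoded, ready to append to /select?
```

The unquoted form is the one reported as working. Values that need spaces can live in the referenced parameter itself, as `some_qf` does here.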

Re: Field Collapsing SOLR-236

2010-06-16 Thread Mark Diggory
Blargy? I produced a patched version of Solr 1.4 and released it into the maven central repository under our DSpace groupid as a dependency for our applications. You're welcome to test it out and use our code for examples. Although it is not the most recent patch of Field Collapsing, it has be

Re: SolrEventListener

2010-06-16 Thread Chris Hostetter
: Can someone explain how to register a SolrEventListener? *typically* it's done using the syntax as noted on the wiki page you linked to However... : I am actually interested in using the SpellCheckerListener and it appears ...that listener is designed to be registered for you automatically d

Re: SolrCoreAware

2010-06-16 Thread Chris Hostetter
: Can someone please explain what the inform method should accomplish? Thanks whatever you want it to accomplish ... it's just a hook that (some types of) plugins can use to finish initializing themselves after init() has been called on the SolrCore and all of the other plugins. (it's a two

Re: Issue with response header in SOLR running on Linux instance

2010-06-16 Thread Chris Hostetter
: My issue now is that the response header is not at all consistent. : : Sometimes the response header is in this format, ... : sometimes it's in this format (for same query) same query different solr instance, or same query same server? please be specific .. show us URLs, show us solrco

Re: Field Collapsing SOLR-236

2010-06-16 Thread Moazzam Khan
I got the code from trunk again and now I get this error: [javac] symbol : class StringIndex [javac] location: interface org.apache.lucene.search.FieldCache [javac] private final Map fieldCaches = new HashMap(); [javac] ^ [javac] C:\

Re: SOLR search performance - Linux vs Windows servers

2010-06-16 Thread Israel Ekpo
That's a good note. I get this kind of question a lot. Most of the time, the reason is that there are database servers (MySQL), web servers (Apache) and other processes running on the Linux box. Try to verify that the load, number of processors/cores as well as other environment settings are

Re: SOLR search performance - Linux vs Windows servers

2010-06-16 Thread Otis Gospodnetic
BB, Could it be that you are comparing apples and oranges? * Is the hardware identical? * Are indices identical? * Are JVM versions the same? * Are JVM arguments identical? * Are the two boxes "equally idle" when Solr is not running? * etc. In general, no, there is no reason why Windows would au

Re: Field Collapsing SOLR-236

2010-06-16 Thread Moazzam Khan
I did the same thing. And, the code compiles without the patch but when I apply the patch I get these errors: [javac] C:\svn\solr\src\java\org\apache\solr\search\fieldcollapse\collector\FieldValueCountCollapseCollectorFactory.java:127: class, interface, or enum expected [javac] import ja

Re: Reindexing only occurs after bouncing app

2010-06-16 Thread John Ament
So just to throw the idea out there, what would happen if I shut down and created a new solrServer on reindex? We only reindex daily. Will that force a reread of all Lucene files? John On Tue, Jun 15, 2010 at 4:47 PM, John Ament wrote: > Hi all > > I wrote a small app using solrj and solr.

SOLR search performance - Linux vs Windows servers

2010-06-16 Thread bbarani
Hi, I have SOLR instances running on both Linux and Windows servers (same version / same index data). Search performance is good on the Windows box compared to the Linux box. Some queries take more than 10 seconds on the Linux box but just a second on the Windows box. Has anyone encountered this kind of i

RE: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Markus Jelsma
You're right. Currently clients need to take care of this, in this case, Nutch would be the client but it cannot be configured as such. It would, indeed, be more appropriate for Solr to take care of this. We can already query any server with a set of shard hosts specified, so it would make sense

Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Otis Gospodnetic
Well, it's not that Nutch doesn't support it. Solr itself doesn't support it. Indexing applications need to know which shard they want to send documents to. This may be a good case for a new wish issue in Solr JIRA? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene e

RE: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Markus Jelsma
Nutch does not, at this moment, support some form of consistent hashing to select an appropriate shard. It would be nice if someone could file an issue in Nutch's Jira to add sharding support to it, perhaps someone with a better understanding and more experience with Solr's distributed search tha
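
For readers unfamiliar with the term, the following is a minimal sketch of client-side shard selection with a consistent-hash ring. It is not Nutch or Solr code. The class name `ShardRing`, the shard URLs and the replica count are illustrative assumptions, and the indexing client would still have to send each document to the chosen shard itself:

```python
import bisect
import hashlib

def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

class ShardRing:
    """Hypothetical helper: picks a shard URL for a document id."""

    def __init__(self, shard_urls, replicas=100):
        # Each shard gets several points on the ring to even out the load.
        points = []
        for url in shard_urls:
            for i in range(replicas):
                points.append((_hash("%s#%d" % (url, i)), url))
        points.sort()
        self._keys = [h for h, _ in points]
        self._urls = [u for _, u in points]

    def shard_for(self, doc_id):
        # First ring point clockwise from the document's hash, wrapping around.
        idx = bisect.bisect(self._keys, _hash(doc_id)) % len(self._keys)
        return self._urls[idx]

# Placeholder shard URLs, not real hosts.
shards = ["http://solr1:8983/solr", "http://solr2:8983/solr", "http://solr3:8983/solr"]
ring = ShardRing(shards)
target = ring.shard_for("http://example.com/page.html")
```

The point of the ring, compared with a plain `hash(id) % number_of_shards`, is that removing one shard only remaps the documents that lived on it. Documents on the remaining shards keep their assignment.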

LocalParams, quotes, bug?

2010-06-16 Thread Jonathan Rochkind
So using LocalParams with dollar-sign references to other parameters. In LocalParams in general, you can use single-quotes for values that have spaces in them: {!dismax qf='field^5 field2^10'}=> no problem And even if the value does not have spaces, you can use single quotes too, why n

Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Otis Gospodnetic
Hi Mitch, Solr can do distributed search, so it can definitely handle indices that can't fit on a single server without sharding. What I think *might* be the case is that the Nutch indexer that sends docs to Solr might not be capable of sending documents to multiple Solr cores/shards. If that is

Re: access term vectors in lucene

2010-06-16 Thread Grant Ingersoll
See http://wiki.apache.org/solr/TermVectorComponent You might also be interested in the TermsComponent: http://wiki.apache.org/solr/TermsComponent On Jun 16, 2010, at 8:47 AM, sarfaraz masood wrote: > hello all, > > I wanna know that how can we access terms vectors in lucene.. actually i > ma

Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Thanks, that really helps to find the right beginning for such a journey. :-) > * Use Solr, not Nutch's search webapp > As far as I have read, Solr can't scale, if the index gets too large for one Server > The setup explained here has one significant caveat you also need to keep > in mind:

Re: Some questions about ability of solr.

2010-06-16 Thread Otis Gospodnetic
Vitaliy: Check http://blog.sematext.com/2010/06/01/hbase-digest-may-2010/ and http://twitter.com/otisg/status/16320594923 Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Vitaliy Avdee

Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Otis Gospodnetic
Mitch, I think you really have 2 distinct questions there: One question is Nutch vs. Droids. The other one is Solr vs. Nutch for search. My suggestions: * Use Nutch, not Droids, if scaling is important * Use Solr, not Nutch's search webapp Otis Sematext :: http://sematext.com/ :: Solr - L

Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Thank you for the feedback, Otis. Yes, I thought that such an approach is useful if the number of pages to crawl is relatively low. However, what about using Solr + Nutch? Does the problem still exist that this would not scale if the index becomes too large? What about extending nutch with fe

Re: Field Collapsing SOLR-236

2010-06-16 Thread Eric Caron
I've had the best luck checking out the newest Solr/Lucene (so the 1.5-line) from SVN, then just doing "patch -p0 < SOLR-236-trunk.patch" from inside the trunk directory. I just did it against the newest checkout and it works fine still. On Wed, Jun 16, 2010 at 11:35 AM, Moazzam Khan wrote: > Ac

Re: Field Collapsing SOLR-236

2010-06-16 Thread Moazzam Khan
Actually I take that back. I am just as lost as you. I wish there was a tutorial on how to do this (although I get the feeling that once I know how to do it I will go "ohh... I can't believe I couldn't figure that out") - Moazzam On Wed, Jun 16, 2010 at 8:25 AM, Moazzam Khan wrote: > Hi Rakhi, >

Re: Solr 1.4 and Nutch 1.0 Integration

2010-06-16 Thread Dean Del Ponte
Thanks! On Wed, Jun 16, 2010 at 10:24 AM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Dean, > > In general, you'll get more help about Nutch with Solr on the Nutch list > than on the Solr one. > > Here is the info: > > http://wiki.apache.org/nutch/RunningNutchAndSolr > > Otis >

Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Otis Gospodnetic
My quick feedback would be: Try using Nutch first, because it is a more complete "platform". From what I know, Droids is just the crawler with an in-memory queue + link extractor. We did use it for crawling Lucene project sites (for the index on http://search-lucene.com/ ), but that is because

Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Hello community, from several discussions about Solr and Nutch, I got some questions for a virtual web-search-engine. I know I've posted this message to the mailing list a few days ago, but the thread got injected and at least I did not get any more postings about the topic and so I try to reop

Re: Solr 1.4 and Nutch 1.0 Integration

2010-06-16 Thread Otis Gospodnetic
Dean, In general, you'll get more help about Nutch with Solr on the Nutch list than on the Solr one. Here is the info: http://wiki.apache.org/nutch/RunningNutchAndSolr Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ -

Re: Is there a way to set the default response handler version to 2.2

2010-06-16 Thread Otis Gospodnetic
BB, You can set the version param default in solrconfig.xml. Here is a snippet: explicit 10 * 2.1 <= look mom, no hands! ... Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/

Is there a way to set the default response handler version to 2.2

2010-06-16 Thread bbarani
Hi, I am facing issues in getting back the response header correctly in different OS (Windows / Linux). I could see that in windows OS, the version of response handler is set to 2.2 by default and even without specifying the version in the query I am getting the proper response header. In Linux

Re: Issue with response header in SOLR running on Linux instance

2010-06-16 Thread bbarani
Hi, Thanks a lot for your response. My issue now is that the response header is not at all consistent. Sometimes the response header is in this format, - 0 - credit sometimes it's in this format (for same query) 0 Do you think that this might be due to version issu

Re: Solr: query in admin and where is my data?

2010-06-16 Thread Otis Gospodnetic
Hello, A1: you can search and commit concurrently. Sounds like you are using 1 box with Solr to both index and search. Solr is typically deployed as a multi-node cluster where 1 of those nodes is a master that does all indexing, and 1 or more slaves perform searches. A2: *:* is correct A3: maybe

Re: Solr: query in admin and where is my data?

2010-06-16 Thread cstc
Hello, Thank you for your help. Q1: Do I have to wait until all the data is fully committed before querying? Q2: I put '*:*' (without quotes) in the admin query box. Is that the correct syntax for a search? Q3: Why did it come back with no results if the data is being committed? Again, any help

Re: Some basics

2010-06-16 Thread Otis Gospodnetic
Frank, Is the following what you are after: Here is a query for my last name, but misspelled: http://search-lucene.com/?q=gospodneticc But if you look above the results, you will see this text: Search results for "gospodnetic" : ... and the search results are indeed for the auto-corrected q

Re: Field Collapsing SOLR-236

2010-06-16 Thread Moazzam Khan
Hi Rakhi, You are supposed to get the code for solr 1.4 from SVN here: http://svn.apache.org/repos/asf/lucene/solr/tags/ Then apply the patch to it and compile. It should work. However, you will probably get an error at run time saying some java class is missing. I haven't been able to figure ou

Re: Master master?

2010-06-16 Thread Otis Gospodnetic
Hello, The closest thing to this with Solr is a Repeater: http://wiki.apache.org/solr/SolrReplication#Setting_up_a_Repeater A Repeater is an instance that acts as both the master and slave at the same time. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem

Re: Spellchecker index cannot be optimized

2010-06-16 Thread Otis Gospodnetic
Lutz, Look at this: http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL You should be able to do this with your spellchecker index, too. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.c

Re: SolrEventListener

2010-06-16 Thread Otis Gospodnetic
Hi, Look at https://issues.apache.org/jira/browse/SOLR-795?focusedCommentId=12642870&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12642870 Look for "buildOn" string in various example solrconfigs. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - N

Re: HOWTO get a working copy of SOLR?

2010-06-16 Thread Otis Gospodnetic
Bernd, Not everything has to be bundled in one package. :) Luke is a separate project. It also depends on some software that makes it unsuitable for bundling with Lucene/Solr for licensing reasons. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search ::

Re: Solr: query in admin and where is my data?

2010-06-16 Thread Otis Gospodnetic
Hello, Yes, this looks like it's working correctly - it looks like the docs are getting committed. You should see some logging messages about the searcher being reloaded after the commit. When that happens you will see your changes in the index. Otis Sematext :: http://sematext.com/ ::

access term vectors in lucene

2010-06-16 Thread sarfaraz masood
Hello all, I want to know how we can access term vectors in Lucene. Actually, I am making a project where I need the tf-idf values of all the terms in the documents, but I am unable to find any reference, e.g. one that shows how to use these term vectors to get the tf-idf values of ALL the terms in my

TermsComponent Reverse !?

2010-06-16 Thread stockii
Hello again Nabble :D TermsComponent works fine so far, but how can I get the same result when typing: "harry pot" -> "harry potter" AND "potter harr" -> "harry potter" I tried ReversedWildcardFilterFactory, but I don't want the reversed word. I want the reversed sentence. ^^ thx -- Vi

Solr: query in admin and where is my data?

2010-06-16 Thread cstc
Dear Solr gurus, I am still currently running a script which says that the Solr software is still committing the data: == INFO: [] Registered new searcher searc...@3b48a17a main Jun 16, 2010 12:56:58 PM org.apache.solr.search.SolrIndexSearcher close INFO: Closing searc...@5

Re: Field Collapsing SOLR-236

2010-06-16 Thread Rakhi Khatwani
Hi, I wanted to try out field collapsing for a requirement. I went through the wiki and SOLR-236, but there are a lot of patch files, and the comments below left me confused. I tried applying the patch file on the 1.4.0 release but ended up with many compile errors. I even downloaded the latest code