Re: matching response and request

2011-09-27 Thread Roland Tollenaar
Hi Otis, thanks again for your help. This does sound right but I am not familiar enough with solr to venture into applying this. I'll need something eventually if I want to use solr but maybe 1.5 will have it built in for me just when I need it. :) Thanks, RR Otis Gospodnetic wrote: Hi R

Re: SOLR Index Speed

2011-09-27 Thread Lord Khan Han
Sorry :) It is not 500 docs per second (that is what I wish, I think). It is 500 docs per MINUTE. On Tue, Sep 27, 2011 at 7:14 AM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Hello, > > > PS: solr streamindex is not option because we need to submit javabin... > > > If you are referrin

RE: how to implement a query like " like '%pattern%' "

2011-09-27 Thread libnova
Hello Tomás. Yes, using q = "word1 word2" over a tokenized field seems to work. I will do some additional testing. Thanks a lot, rode. > -Original Message- > From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com] > Sent: Monday, 26 September 2011 22:12 >

Re: SOLR Index Speed

2011-09-27 Thread Lord Khan Han
Our producer (a Hadoop mapper prepares the docs for submission and the reducer submits them directly via SolrJ HTTP) now runs 32 reducers, but the indexing speed is still 500 - 700 docs per minute. The submissions come from a Hadoop cluster, so submit speed is not a problem. I couldn't use the full solr inde

Re: SOLR Index Speed

2011-09-27 Thread Lord Khan Han
1- each document around 50 kb - 150 kb (web document) 2-final index is 40 gig 3-jre memory carefully given. On Mon, Sep 26, 2011 at 9:57 PM, Jaeger, Jay - DOT wrote: > 500 / second would be 1,800,000 per hour (much more than 500K documents). > > 1) how big is each document? > 2) how big are
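The rates quoted in this thread are easy to mix up, so here is a minimal sketch of the arithmetic. The function names are made up for illustration, and treating the 500K figure from the quoted reply as the total corpus size is an assumption.

```python
def docs_per_hour_from_per_second(rate_per_second):
    """Convert a docs/second rate to docs/hour."""
    return rate_per_second * 3600


def docs_per_hour_from_per_minute(rate_per_minute):
    """Convert a docs/minute rate to docs/hour."""
    return rate_per_minute * 60


def hours_to_index(total_docs, docs_per_hour):
    """Hours needed to index total_docs at a steady hourly rate."""
    return total_docs / docs_per_hour


# 500 docs/second, the rate first reported by mistake.
per_second_case = docs_per_hour_from_per_second(500)   # 1,800,000 docs/hour

# 500 docs/minute, the corrected rate.
per_minute_case = docs_per_hour_from_per_minute(500)   # 30,000 docs/hour

# Assumed corpus of 500K documents at the corrected rate.
hours_needed = hours_to_index(500_000, per_minute_case)
```

At the corrected rate the assumed 500K documents would take well over half a day, which is why the minute-versus-second distinction matters here.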

Re: SOLR Index Speed

2011-09-27 Thread Lord Khan Han
For SUSS https://issues.apache.org/jira/browse/SOLR-1565 it says no binary support... When we tried to use binary from SolrJ through SUSS, adding a document took from thousands of milliseconds up to a million per doc!! So we went back to normal submit. On Tue, Sep 27, 2011 at 7:14 AM, Otis Gospodnetic < ot

Re: Solr and External Fields

2011-09-27 Thread Jamie Johnson
Thanks Hoss. This looks very interesting but does not yet have support for highlighting. I'll watch this though and look at transitioning to this once highlighting support is available. On Tue, Aug 9, 2011 at 3:56 PM, Chris Hostetter wrote: > > : I recently modified the DefaultSolrHighlighter t

Re: Solr and External Fields

2011-09-27 Thread Jamie Johnson
hmm...perhaps I spoke too soon. I looked at the patch and there are some changes in highlighter, can anyone confirm that highlighting is supported on this? Also is there any status on batch retrieval of these extra fields to improve performance? On Tue, Sep 27, 2011 at 8:47 AM, Jamie Johnson wr

Can't use ms() function on non-numeric legacy date field

2011-09-27 Thread Pranav Prakash
Hi, I had been trying to boost my recent documents, using what is described here http://wiki.apache.org/solr/FunctionQuery#Date_Boosting My date field looks like However, upon trying to do ms(NOW, created_at) it shows the error Can't use ms() function on non-numeric legacy date field created_a

Re: Can't use ms() function on non-numeric legacy date field

2011-09-27 Thread Markus Jelsma
Try solr.TrieDateField instead On Tuesday 27 September 2011 15:53:30 Pranav Prakash wrote: > Hi, I had been trying to boost my recent documents, using what is described > here http://wiki.apache.org/solr/FunctionQuery#Date_Boosting > > My date field looks like > > omitNorms="true"/> > ="true"

Re: indexing a xml file

2011-09-27 Thread ahmad ajiloo
find the attachments. thanks On Sun, Sep 25, 2011 at 7:41 AM, Bill Bell wrote: > Send us the example "solr.xml" and "schema.xml'". You are missing fields > in the schema.xml that you are referencing. > > On 9/24/11 8:15 AM, "ahmad ajiloo" wrote: > > >hello > >Solr Tutorial page explains about i

Re: Solr stopword problem in Query

2011-09-27 Thread Rahul Warawdekar
Hi Isan, The schema.xml seems OK to me. Is "textForQuery" the only field you are searching in ? Are you also searching on any other non text based fields ? If yes, please provide schema description for those fields also. Also, provide your solrconfig.xml file. On Tue, Sep 27, 2011 at 1:12 AM, I

Re: Searching multiple fields

2011-09-27 Thread Mark
I thought that a similarity class will only affect the scoring of a single field.. not across multiple fields? Can anyone else chime in with some input? Thanks. On 9/26/11 9:02 PM, Otis Gospodnetic wrote: Hi Mark, Eh, I don't have Lucene/Solr source code handy, but I *think* for that you'd n

Re: Searching multiple fields

2011-09-27 Thread lee carroll
see http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html On 27 September 2011 16:04, Mark wrote: > I thought that a similarity class will only affect the scoring of a single > field.. not across multiple fields? Can anyone else chime in with some > input? Thanks. >

SOLR architecture recommendation

2011-09-27 Thread Robert Stewart
I need some recommendations for a new SOLR project. We currently have a large (200M docs) production system using Lucene.Net and what I would call our own .NET implementation of SOLR (built early on when SOLR was less mature and did not run as well on Windows). Our current architecture works

Re: Can't use ms() function on non-numeric legacy date field

2011-09-27 Thread Chris Hostetter
: Try solr.TrieDateField instead and note the docs on the "ms" function, from the same wiki page you linked to... >> Arguments may be numerically indexed date fields such as TrieDate >> (recommended field type for dates since Solr 1.4), or date math >> (examples in SolrQuerySyntax) based on a
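To see what a date boost built on ms() computes once the field is a numerically indexed date, here is a rough sketch in plain Python. It mirrors the shape recip(ms(NOW,created_at),3.16e-11,1,1), where recip(x,m,a,b) = a/(m*x+b). The constants are the ones I recall from the linked wiki section, so treat them as an assumption. The Python helpers only imitate the Solr functions of the same name and are not Solr code.

```python
from datetime import datetime, timedelta, timezone


def recip(x, m, a, b):
    """Reciprocal function of the form a / (m*x + b)."""
    return a / (m * x + b)


def ms(later, earlier):
    """Milliseconds between two datetimes, like ms(NOW, field)."""
    return (later - earlier).total_seconds() * 1000.0


def recency_boost(now, created_at):
    """Boost that is 1.0 for a brand-new doc and decays with age.

    3.16e-11 is roughly 1 / (milliseconds in a year), so a
    one-year-old document gets a boost of about 0.5.
    """
    return recip(ms(now, created_at), 3.16e-11, 1, 1)


now = datetime(2011, 9, 27, tzinfo=timezone.utc)
fresh = recency_boost(now, now)                              # exactly 1.0
year_old = recency_boost(now, now - timedelta(days=365))     # close to 0.5
```

The boost is multiplicative and never reaches zero, so old documents are demoted rather than excluded.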

payloads - Inconsistency between the document score and the explain score

2011-09-27 Thread Jean-Claude Dauphin
Hello, I have implemented payloads at the index and query levels using specific PayloadSimilarity and PayloadQparserPlugin classes. Now I wish to check that the payloads processing is correct and thus I inserted the following code to check the document scores of a Solr request: // Di

Re: payloads - Inconsistency between the document score and the explain score

2011-09-27 Thread Robert Muir
https://issues.apache.org/jira/browse/LUCENE-3421 Note: if you are using this 'includeSpanScore=false' (which I think you are, as that's where the bug applies), be aware this means the score is *only* the result of your payload; boosts, tf, length normalization, idf, none of this is incorporated in

help understanding match

2011-09-27 Thread Vijay Ramachandran
Hi. I recently started using solr in a project, and experienced what I think is strange matching behaviour, and would like some help in understanding what happened. I'm using solr 3.1 with java 1.6 on linux. My index consists of a set of phrases, which I'd like to match against incoming text such

Re: help understanding match

2011-09-27 Thread tamanjit.bin...@yahoo.co.in
Hi, 1. Just curious - you have your defaultsearchfield - defaultquery as not stored, how do you know that it contains what you think it contains? 2. The fieldType of defaultquery is query_text; I am not sure which analyzers you are using on this field type at both indexing time and querying time

Re: SOLR Index Speed

2011-09-27 Thread Otis Gospodnetic
Hi, No need to use reply-all and CC me directly, I'm on the list :) It sounds like Solr is not the problem, but the Hadoop side.  For example, what if you change your reducer not to call Solr but do some no-op.  Does it go beyond 500-700 docs/minute? Otis Sematext :: http://sematext.com/

Re: How to reserve ids?

2011-09-27 Thread Otis Gospodnetic
Gabriele, Using "msn.com" as a stopword would simply mean that msn.com would not be indexed and therefore a search for "msn.com" would not yield results.  You could still search for "hotmail" and it may match documents that have "msn.com" token stored in them, even though "msn.com" is a stopwor

Re: indexing a xml file

2011-09-27 Thread Gora Mohanty
On Tue, Sep 27, 2011 at 7:46 PM, ahmad ajiloo wrote: > find the attachments. [...] So, it is pretty clear then. As people have mentioned earlier, your solr.xml has fields that are not defined in schema.xml. E.g., you need to have a field with name="name" defined for the particular field referred

apply filter to spell field

2011-09-27 Thread alxsss
Hello, I have implemented spellchecker in two ways. 1. Adding a textspell type to schema.xml and making a copy field from the original content field, which is type text. 2. Without adding a new type and copy field, simply adding the name of the spell field, content, to solrconfig.xml. I have an issue in

Re: how would I use the new join feature given my schema.

2011-09-27 Thread Chris Hostetter
: I've been reading the information on the new join feature and am not quite : sure how I would use it given my schema structure. I have "User" docs and : "BlogPost" docs and I want to return all BlogPosts that match the fulltext : title "cool" that belong to Users that match the description "solr"

Re: How to write core's name in log

2011-09-27 Thread Chris Hostetter
: I'm thinking to add MDC variable, this will be name of core. Finally I'll : use it in log4j configuration like this in ConversionPattern %X{core} : : The idea is that when Solr received a request I'll add this new variable : "name of core". : : But I don't know if it's a good idea or not. : :

basic solr cloud questions

2011-09-27 Thread Sam Jiang
Hi all I'm a relatively new solr user, and recently I discovered the interesting solr cloud feature. I have some basic questions: (please excuse me if I get the terminologies wrong) - from my understanding, this is still a work in progress. How mature is it? Is there any estimate on the official

DIH when using XML Files questions

2011-09-27 Thread Gabriel Cooper
I'm researching using DataImportHandler to import my data files utilizing FileDataSource with FileListEntityProcessor and have a couple questions before I get started that I'm hoping you guys can assist with. 1) I would like to put a file on the local filesystem in the configured location and have

Re: basic solr cloud questions

2011-09-27 Thread Darren Govoni
On 09/27/2011 05:05 PM, Yury Kats wrote: You need to either submit the docs to both nodes, or have a replication setup between the two. Otherwise they are not in sync. I hope that's not the case. :/ My understanding (or hope maybe) is that the new Solr Cloud implementation will support auto-shar

Re: basic solr cloud questions

2011-09-27 Thread Yury Kats
On 9/27/2011 5:16 PM, Darren Govoni wrote: > On 09/27/2011 05:05 PM, Yury Kats wrote: >> You need to either submit the docs to both nodes, or have a replication >> setup between the two. Otherwise they are not in sync. > I hope that's not the case. :/ My understanding (or hope maybe) is that > the

Re: hi. allowLeadingWildcard is it possible or not yet?

2011-09-27 Thread Chris Hostetter
: Subject: Re: hi. allowLeadingWildcard is it possible or not yet? : : i wonder the same thing... so wanna "re-animate" the topic : : is it possible? Leading wildcard style queries can work, and can work very efficiently, thanks to SOLR-1321. The key is to use ReversedWildcardFilterFactory i
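The idea behind indexing reversed tokens can be sketched in a few lines: a leading-wildcard query such as *olr becomes a cheap prefix lookup on the reversed terms. This is only an illustration of the principle. It is not how ReversedWildcardFilterFactory is implemented, and the function names are hypothetical.

```python
import bisect


def build_reversed_index(terms):
    """Store each term reversed, in sorted order, so suffix
    matches become prefix matches."""
    return sorted(term[::-1] for term in terms)


def leading_wildcard_search(reversed_index, suffix):
    """Find all terms matching '*' + suffix via a prefix scan
    over the reversed terms."""
    prefix = suffix[::-1]
    start = bisect.bisect_left(reversed_index, prefix)
    matches = []
    for reversed_term in reversed_index[start:]:
        if not reversed_term.startswith(prefix):
            break  # sorted order: no further matches possible
        matches.append(reversed_term[::-1])
    return sorted(matches)


index = build_reversed_index(["solr", "lucene", "resolr", "search"])
hits = leading_wildcard_search(index, "olr")   # the query *olr
```

Because the reversed terms are sorted, the scan touches only the matching range instead of every term in the dictionary, which is where the efficiency comes from.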

RE: aggregate functions in Solr?

2011-09-27 Thread Steve McKay
> -Original Message- > From: Esteban Donato [mailto:esteban.don...@gmail.com] > Sent: Monday, September 26, 2011 2:08 PM > To: solr-user@lucene.apache.org > Subject: aggregate functions in Solr? > > Hello guys, > >   I need to implement a functionality which requires something similar > t

Re: getting answers starting with a requested string first

2011-09-27 Thread Chris Hostetter
: 1) giving NAME_ANALYZED a type where omitNorms=false: I thought this would : give answers with shorter NAME_ANALYZED field a higher score. I've tested : that solution, but it's not working. I guess this is because there is no : score for fq parameter (all my answers have same score) both of tho

Re: How to sort results based on matching term position

2011-09-27 Thread Chris Hostetter
: We have a requirement to sort/boost documents returned for phrase : matches depending on where the match was within the field, the nearer : the beginning the better Deja-Vu, see the reply i just sent to a similar thread... http://www.lucidimagination.com/search/document/dfa18d52e7e8197c/gett

Re: How to reserve ids?

2011-09-27 Thread Gabriele Kahlout
Otis, I'm following up on this as solving my problem through the stopwords mechanism would be great. *Do stopwords apply also to the url/id field?* Continuing on the msn.com example, with "msn.com" as a stopword the msn.com webpage may still actually be indexed if neither the title nor the body contain

Re: Solr Cloud Number of Shard Limitation?

2011-09-27 Thread Erick Erickson
No, not really. The administration becomes "interesting", especially if the slaves are replicated. One thing to be aware of is the "laggard shard" issue. Essentially, your aggregated response is limited by the slowest shard to respond. As you have more and more shards, the odds that at least one o
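The "laggard shard" effect can be put into numbers with a small sketch. It assumes each shard is slow independently with the same probability, which is a simplification, and the 1% figure is purely hypothetical.

```python
def prob_any_shard_slow(num_shards, p_slow):
    """Probability that at least one of num_shards responds slowly,
    assuming each shard is slow independently with probability p_slow."""
    return 1.0 - (1.0 - p_slow) ** num_shards


# Hypothetical: each shard is slow on 1% of requests.
one_shard = prob_any_shard_slow(1, 0.01)
ten_shards = prob_any_shard_slow(10, 0.01)
hundred_shards = prob_any_shard_slow(100, 0.01)   # roughly 0.63
```

Since the aggregated response waits for the slowest shard, a per-shard hiccup that is rare in isolation ends up affecting a large share of requests once there are many shards.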

Re: Solr Cloud Number of Shard Limitation?

2011-09-27 Thread Mark Miller
On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote: > Is there any limitation, be it technical or for sanity reasons, on the > number of shards that can be part of a solr cloud implementation? The loggly guys ended up hitting a limit somewhere. Essentially, whenever the cloud state is updated,

Re: Search for empty string in 1.4.1 vs 3.4

2011-09-27 Thread Chris Hostetter
: I am using SOLR 1.4.1. When I search for empty string in a string field, : q=tag_facet:"", it return documents with values in tag_facet. I can't reproduce the behavior you are describing. when i query the Solr 1.4.1 example with the following URL... http://localhost:8983/solr/select/?q=id:%

Re: How to reserve ids?

2011-09-27 Thread Otis Gospodnetic
Hi Gabriele, If you have a copy of Lucene in Action 2, that may be the easiest place to read up on stopwords.  In short, when something is a stopword, it is just that stopword that gets removed and thus not indexed and thus when you search for it, it will not find a document that originally had
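A toy inverted index makes the stopword behaviour described in this thread concrete. The sketch below uses whitespace tokenization and made-up documents. It is not Solr's analysis chain, and all names are hypothetical.

```python
def analyze(text, stopwords):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


def build_index(docs, stopwords):
    """Map each surviving token to the set of doc ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for tok in analyze(text, stopwords):
            index.setdefault(tok, set()).add(doc_id)
    return index


def search(index, term, stopwords):
    """Look up a single term; a stopword query analyzes to nothing."""
    tokens = analyze(term, stopwords)
    if not tokens:
        return set()
    return index.get(tokens[0], set())


stopwords = {"msn.com"}
docs = {1: "hotmail moved to msn.com", 2: "msn.com news"}
index = build_index(docs, stopwords)

by_stopword = search(index, "msn.com", stopwords)   # empty set
by_hotmail = search(index, "hotmail", stopwords)    # {1}
```

Both documents are still in the index. Only the "msn.com" token is missing, so they remain findable through their other terms.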

Re: field value getting null with special char

2011-09-27 Thread Ranveer
Hi Erick, I am using SolrQuery.setFields and the following is my code: query.setParam("fq", "type:Livescore"); query.addSortField("last_updated", ORDER.desc); query.setRows(5); I think Solr is connecting to the server, because with the same query I am getting field values other than the special char field val