Hi Otis,
thanks again for your help. This does sound right, but I am not familiar
enough with Solr to venture into applying this.
I'll need something eventually if I want to use Solr, but maybe 1.5 will
have it built in for me just when I need it. :)
Thanks,
RR
Otis Gospodnetic wrote:
Hi R
Sorry :) it is not 500 docs per second (that is what I wish it were, I think). It is
500 docs per MINUTE..
On Tue, Sep 27, 2011 at 7:14 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Hello,
>
> > PS: solr streamindex is not an option because we need to submit javabin...
>
>
> If you are referrin
Hello Tomás,
It seems that yes, using q = "word1 word2" over a tokenized field works. I will
do some additional testing.
Thanks a lot,
rode.
> -----Original Message-----
> From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com]
> Sent: Monday, September 26, 2011 22:12
>
Our producer (Hadoop mappers prepare the docs for submitting and the reducers
submit directly via SolrJ HTTP) now uses 32 reducers, but the indexing speed is
still 500 - 700 docs per minute. The submissions come from a Hadoop cluster, so
submit speed is not a problem. I couldn't use the full Solr inde
1. Each document is around 50 KB - 150 KB (web documents).
2. The final index is 40 GB.
3. JRE memory is carefully allocated.
On Mon, Sep 26, 2011 at 9:57 PM, Jaeger, Jay - DOT wrote:
> 500 / second would be 1,800,000 per hour (much more than 500K documents).
>
> 1) how big is each document?
> 2) how big are
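Not something suggested in the thread, but a common first step when SolrJ indexing throughput is this low is to send documents in batches and commit only once at the end, rather than issuing one HTTP request per document. A minimal SolrJ 3.x sketch; the URL, batch size, and field names are placeholders:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 10000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-" + i);              // hypothetical fields
            doc.addField("content", "web page body " + i);
            batch.add(doc);

            if (batch.size() == 500) {                   // one request per 500 docs
                server.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);
        }
        server.commit();                                 // commit once at the end
    }
}
```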
For SUSS (https://issues.apache.org/jira/browse/SOLR-1565) it says there is no
binary support...
When we tried to use the SolrJ binary format through SUSS, adding a document
took from thousands of milliseconds up to a million per doc!! So we went back
to the normal submit.
On Tue, Sep 27, 2011 at 7:14 AM, Otis Gospodnetic <
ot
Thanks Hoss. This looks very interesting but does not yet have
support for highlighting. I'll watch this though and look at
transitioning to this once highlighting support is available.
On Tue, Aug 9, 2011 at 3:56 PM, Chris Hostetter
wrote:
>
> : I recently modified the DefaultSolrHighlighter t
Hmm... perhaps I spoke too soon. I looked at the patch and there are
some changes in the highlighter; can anyone confirm that highlighting is
supported by this?
Also, is there any status on batch retrieval of these extra fields to
improve performance?
On Tue, Sep 27, 2011 at 8:47 AM, Jamie Johnson wr
Hi, I had been trying to boost my recent documents, using what is described
here http://wiki.apache.org/solr/FunctionQuery#Date_Boosting
My date field looks like
However, upon trying to do ms(NOW, created_at) it shows the error
Can't use ms() function on non-numeric legacy date field created_a
Try solr.TrieDateField instead
On Tuesday 27 September 2011 15:53:30 Pranav Prakash wrote:
> Hi, I had been trying to boost my recent documents, using what is described
> here http://wiki.apache.org/solr/FunctionQuery#Date_Boosting
>
> My date field looks like
>
> omitNorms="true"/>
> ="true"
find the attachments.
thanks
On Sun, Sep 25, 2011 at 7:41 AM, Bill Bell wrote:
> Send us the example "solr.xml" and "schema.xml". You are missing fields
> in the schema.xml that you are referencing.
>
> On 9/24/11 8:15 AM, "ahmad ajiloo" wrote:
>
> >hello
> >Solr Tutorial page explains about i
Hi Isan,
The schema.xml seems OK to me.
Is "textForQuery" the only field you are searching in ?
Are you also searching on any other non text based fields ? If yes, please
provide schema description for those fields also.
Also, provide your solrconfig.xml file.
On Tue, Sep 27, 2011 at 1:12 AM, I
I thought that a similarity class would only affect the scoring of a
single field, not across multiple fields? Can anyone else chime in with
some input? Thanks.
On 9/26/11 9:02 PM, Otis Gospodnetic wrote:
Hi Mark,
Eh, I don't have Lucene/Solr source code handy, but I *think* for that you'd
n
see
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
On 27 September 2011 16:04, Mark wrote:
> I thought that a similarity class will only affect the scoring of a single
> field.. not across multiple fields? Can anyone else chime in with some
> input? Thanks.
>
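For context (not part of the truncated replies): in the Lucene/Solr versions of this era, the Similarity set in schema.xml applies index-wide, but several of its methods receive the field name, so an implementation can still special-case individual fields. A minimal sketch against the Similarity API version linked above (the 2.4-era lengthNorm hook; later releases move this logic to computeNorm), using a hypothetical field name:

```java
import org.apache.lucene.search.DefaultSimilarity;

// Sketch only: a globally registered Similarity can still treat one field
// differently by inspecting the field name it is given.
public class PerFieldLengthSimilarity extends DefaultSimilarity {
    @Override
    public float lengthNorm(String fieldName, int numTokens) {
        if ("title".equals(fieldName)) {   // hypothetical field name
            return 1.0f;                   // ignore field length for titles
        }
        return super.lengthNorm(fieldName, numTokens);
    }
}
```

Such a class would then be registered once, globally, via the top-level <similarity> element in schema.xml.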
I need some recommendations for a new SOLR project.
We currently have a large (200M docs) production system using Lucene.Net and
what I would call our own .NET implementation of SOLR (built early on when SOLR
was less mature and did not run as well on Windows).
Our current architecture works
: Try solr.TrieDateField instead
and note the docs on the "ms" function, from the same wiki page you linked
to...
>> Arguments may be numerically indexed date fields such as TrieDate
>> (recommended field type for dates since Solr 1.4), or date math
>> (examples in SolrQuerySyntax) based on a
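To make the suggested fix concrete (a sketch, not something from the thread): once created_at is reindexed as a solr.TrieDateField, the recency boost from the Date Boosting wiki page can be applied as a function query. Using the edismax boost parameter via SolrJ is one way to do that; the URL and the example user query below are assumptions:

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class RecencyBoostExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        SolrQuery query = new SolrQuery("ipod");   // hypothetical user query
        query.set("defType", "edismax");
        // Recency boost from the Date Boosting wiki page; requires created_at
        // to be a Trie-based date field so ms() can operate on it.
        query.set("boost", "recip(ms(NOW,created_at),3.16e-11,1,1)");

        QueryResponse rsp = server.query(query);
        System.out.println("Found " + rsp.getResults().getNumFound() + " docs");
    }
}
```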
Hello,
I have implemented payloads at the index and query levels using specific
PayloadSimilarity and PayloadQparserPlugin classes.
Now I wish to check that the payloads processing is correct and thus I
inserted the following code to check the document scores of a Solr request:
// Di
https://issues.apache.org/jira/browse/LUCENE-3421
Note: if you are using 'includeSpanScore=false' (which I think
you are, as that's where the bug applies), be aware this means the
score is *only* the result of your payload; boosts, tf, length
normalization, idf, none of this is incorporated in
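For reference, includeSpanScore is the third argument of Lucene's PayloadTermQuery constructor. A minimal sketch showing where the flag lives (the field name and term are made up):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.payloads.AveragePayloadFunction;
import org.apache.lucene.search.payloads.PayloadTermQuery;

public class PayloadQueryExample {
    public static void main(String[] args) {
        // includeSpanScore = false: the document score is only the payload
        // function's result; the usual tf/idf/length-norm span score is not
        // multiplied in (the case the LUCENE-3421 bug applies to).
        PayloadTermQuery q = new PayloadTermQuery(
                new Term("body", "solr"),        // hypothetical field and term
                new AveragePayloadFunction(),
                false);
        System.out.println(q);
    }
}
```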
Hi. I recently started using solr in a project, and experienced what I think
is strange matching behaviour, and would like some help in understanding
what happened. I'm using solr 3.1 with java 1.6 on linux.
My index consists of a set of phrases, which I'd like to match against
incoming text such
Hi,
1. Just curious: you have your defaultSearchField, defaultquery, as not
stored; how do you know that it contains what you think it contains?
2. The fieldType of defaultquery is query_text; I am not sure which
analyzers you are using on this field's type, both at indexing time and
at querying time.
Hi,
No need to use reply-all and CC me directly, I'm on the list :)
It sounds like Solr is not the problem, but the Hadoop side. For example, what
if you change your reducer to do a no-op instead of calling Solr? Does it go
beyond 500-700 docs/minute?
Otis
Sematext :: http://sematext.com/
Gabriele,
Using "msn.com" as a stopword would simply mean that msn.com would not be
indexed and therefore a search for "msn.com" would not yield results. You
could still search for "hotmail" and it may match documents that have "msn.com"
token stored in them, even though "msn.com" is a stopwor
On Tue, Sep 27, 2011 at 7:46 PM, ahmad ajiloo wrote:
> find the attachments.
[...]
So, it is pretty clear then. As people have mentioned earlier,
your solr.xml has fields that are not defined in schema.xml.
E.g., you need to have a field with name="name" defined
for the particular field referred
Hello,
I have implemented the spellchecker in two ways:
1. Adding a textspell type to schema.xml and making a copy field from the original
content field, which is of type text.
2. Without adding a new type and copy field: simply adding the name of the spell
field, content, to solrconfig.xml.
I have an issue in
: I've been reading the information on the new join feature and am not quite
: sure how I would use it given my schema structure. I have "User" docs and
: "BlogPost" docs and I want to return all BlogPosts that match the fulltext
: title "cool" that belong to Users that match the description "solr"
: I'm thinking of adding an MDC variable; this will be the name of the core. Finally I'll
: use it in the log4j configuration, like this in the ConversionPattern: %X{core}
:
: The idea is that when Solr receives a request, I'll add this new variable,
: the "name of core".
:
: But I don't know if it's a good idea or not.
:
:
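Not from the thread, but a minimal sketch of the idea being described: put the core name into the SLF4J MDC when a request arrives and clear it afterwards, so a log4j ConversionPattern containing %X{core} can print it. Where exactly to hook this into Solr is left open here; the sketch uses a plain servlet filter and a hypothetical way of deriving the core name from the request path:

```java
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

import org.slf4j.MDC;

public class CoreNameMdcFilter implements Filter {
    public void init(FilterConfig config) {}

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        // Hypothetical: derive the core name from a path like /<core>/select.
        String core = "unknown";
        String path = ((HttpServletRequest) req).getPathInfo();
        if (path != null) {
            String[] parts = path.split("/");
            if (parts.length > 1) {
                core = parts[1];
            }
        }

        MDC.put("core", core);          // visible to log4j as %X{core}
        try {
            chain.doFilter(req, resp);
        } finally {
            MDC.remove("core");         // don't leak into other requests on this thread
        }
    }

    public void destroy() {}
}
```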
Hi all
I'm a relatively new solr user, and recently I discovered the interesting
solr cloud feature. I have some basic questions:
(please excuse me if I get the terminologies wrong)
- from my understanding, this is still a work in progress. How mature is it?
Is there any estimate on the official
I'm researching using DataImportHandler to import my data files utilizing
FileDataSource with FileListEntityProcessor and have a couple questions
before I get started that I'm hoping you guys can assist with.
1) I would like to put a file on the local filesystem in the configured
location and have
On 09/27/2011 05:05 PM, Yury Kats wrote:
You need to either submit the docs to both nodes, or have a replication
setup between the two. Otherwise they are not in sync.
I hope that's not the case. :/ My understanding (or hope maybe) is that
the new Solr Cloud implementation will support auto-shar
On 9/27/2011 5:16 PM, Darren Govoni wrote:
> On 09/27/2011 05:05 PM, Yury Kats wrote:
>> You need to either submit the docs to both nodes, or have a replication
>> setup between the two. Otherwise they are not in sync.
> I hope that's not the case. :/ My understanding (or hope maybe) is that
> the
: Subject: Re: hi. allowLeadingWildcard is it possible or not yet?
:
: i wonder the same thing... so wanna "re-animate" the topic
:
: is it possible?
Leading wildcard style queries can work, and can work very
efficiently, thanks to SOLR-1321.
The key is to use ReversedWildcardFilterFactory i
> -----Original Message-----
> From: Esteban Donato [mailto:esteban.don...@gmail.com]
> Sent: Monday, September 26, 2011 2:08 PM
> To: solr-user@lucene.apache.org
> Subject: aggregate functions in Solr?
>
> Hello guys,
>
> I need to implement a functionality which requires something similar
> t
: 1) giving NAME_ANALYZED a type where omitNorms=false: I thought this would
: give answers with shorter NAME_ANALYZED field a higher score. I've tested
: that solution, but it's not working. I guess this is because there is no
: score for fq parameter (all my answers have same score)
both of tho
: We have a requirement to sort/boost documents returned for phrase
: matches depending on where the match was within the field, the nearer
: the beginning the better
Deja-Vu, see the reply i just sent to a similar thread...
http://www.lucidimagination.com/search/document/dfa18d52e7e8197c/gett
Otis,
I'm following up on this, as solving my problem through the stopwords
mechanism would be great. *Do stopwords apply also to the url/id field?*
Continuing on the msn.com example, with "msn.com" as a stopword,
the msn.com webpage may still actually be indexed if neither the title nor
the body
contain
No, not really. The administration becomes "interesting",
especially if the slaves are replicated.
One thing to be aware of is the "laggard shard" issue.
Essentially, your aggregated response is limited by the
slowest shard to respond. As you have more and more
shards, the odds that at least one o
On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
> Is there any limitation, be it technical or for sanity reasons, on the
> number of shards that can be part of a solr cloud implementation?
The loggly guys ended up hitting a limit somewhere. Essentially, whenever the
cloud state is updated,
: I am using SOLR 1.4.1. When I search for empty string in a string field,
: q=tag_facet:"", it return documents with values in tag_facet.
I can't reproduce the behavior you are describing.
when i query the Solr 1.4.1 example with the following URL...
http://localhost:8983/solr/select/?q=id:%
Hi Gabriele,
If you have a copy of Lucene in Action 2, that may be the easiest place to read
up on stopwords. In short, when something is a stopword, it is just that
stopword that gets removed and thus not indexed, so when you search for
it, it will not find a document that originally had
Hi Erick,
I am using SolrQuery.setFields, and the following is my code:
query.setParam("fq", "type:Livescore");
query.addSortField("last_updated", ORDER.desc);
query.setRows(5);
I think Solr is connecting to the server, because with the same query I am
getting field values other than the special char field val