The problem is that I don't control the field names. They can be anything (I
want to leave the developers free on this).
Where and how can I change the field names on the fly (to add "_i", for example)
before indexing?
Do I have to use a transformer? An UpdateRequestProcessor? ...
Which API suits this?
> There are a lot of company names that
> people are uncertain as to the correct spelling. A few
> examples are:
> 1. best buy, bestbuy
> 2. walmart, wal mart, wal-mart
> 3. Holiday Inn, HolidayInn
>
> What Tokenizer Factory and/or TokenFilterFactory should I
> use so that somebody typing "wal
Thanks! I'll test it ASAP!
Noble Paul wrote :
>
>https://issues.apache.org/jira/browse/SOLR-1421
>
On Fri, Sep 11, 2009 at 1:22 AM, Daniel Cohen <
daniel.michael.co...@gmail.com> wrote:
> *HI there-**
> *
> *I'm trying to get the dataimporthandler working to recursively parse the
> content of a root directory, which contain several other directories
> beneath
> it... The indexing seems to encou
Thanks. Maybe I'm misusing the dismax request handler, but the ability to
search all fields is just too good a feature.
I found the following description of how to do faceted queries with
dismax. I have not tried it yet, but I will.
http://fisk.stjernesludd.net/archives/16-Solr-Using-the-dism
Hi Team,
Can anyone please answer this post?
I have an issue while working with Solr.
I am working on a blog module, where a user can create blogs, post to them,
and have several comments per post. To implement this module I am using
Solr 1.4.
When I get blog
Great! It works!
Thanks, Paul. I appreciate your responsiveness.
nourredine khadri wrote:
>
> Thanks! I'll test it ASAP!
>
> Noble Paul wrote:
>>
>> https://issues.apache.org/jira/browse/SOLR-1421
>>
Thanks for reporting the issue.
On Fri, Sep 11, 2009 at 2:54 PM, nourredine khadri
wrote:
> Great! It works!
>
> Thanks, Paul. I appreciate your responsiveness.
>
> nourredine khadri wrote:
>>
>> Thanks! I'll test it ASAP!
>>
>> Noble Paul wrote:
>>>
>>> https://issues.apache.org/jira/brow
Hey, XPathEntityProcessor does not work with wildcard xpaths like '//a...@class'.
If you just wish to index HTML, use a PlainTextEntityProcessor with
HTMLStripTransformer.
On Fri, Sep 11, 2009 at 1:22 AM, Daniel Cohen
wrote:
> *HI there-**
> *
> *I'm trying to get the dataimporthandler working to rec
"Not having any facets" and "not using a filter cache" are two different
things. If you're not using query filters, you can still have facets
calculated and returned as part of the search result. The facet
component uses Lucene's field cache to retrieve values for the facet field.
Jonathan Ariel
1.5 GB already seems like quite a bit, but adding more just might solve it
... try something like 3 GB (if your machine supports it) and see if that
helps?
If 3 GB still doesn't cut it then the problem is most likely somewhere else,
and I'd suggest looking at the application with a memory profiler to
If you use DIH for indexing, writing a transformer is the simplest
thing. You can even write it in JavaScript.
On Fri, Sep 11, 2009 at 1:13 PM, nourredine khadri
wrote:
>
> The problem is that I don't control the field names. They can be anything
> (I want to leave the developers free on this).
> Where and how
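For concreteness, a minimal data-config.xml sketch of such a script transformer (the entity name and SQL query are made up, and it assumes Solr 1.4's DIH ScriptTransformer, which needs the JavaScript engine bundled with Java 6). It renames every column by appending "_i":

```xml
<dataConfig>
  <script><![CDATA[
    /* Rename every column in the row by appending "_i" before indexing.
       row is a java.util.Map, so we snapshot the key set first. */
    function addSuffix(row) {
      var names = row.keySet().toArray();
      for (var i = 0; i < names.length; i++) {
        var name = names[i];
        row.put(name + '_i', row.get(name));
        row.remove(name);
      }
      return row;
    }
  ]]></script>
  <document>
    <entity name="item" query="select * from item"
            transformer="script:addSuffix"/>
  </document>
</dataConfig>
```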
OK, I'll try the transformer (JavaScript needs JDK 1.6, I think).
Thanks again.
Noble Paul wrote:
>
> If you use DIH for indexing, writing a transformer is the simplest
> thing. You can even write it in JavaScript.
>
Cool, thanks a lot for sharing your experience and thoughts!
I will run a test like you suggested.
However, I've got some questions. The facet list I would retrieve for step 1
- it would 'only' contain the field values for what I faceted on, such as
Hotel-ID, right?
How do I receive hotelname, desc
Oh... another question regarding this. If I disable the query cache and
document cache and execute a query without the filter cache (no facets, no
filter query, etc.), why does the first execution take around 400ms and the
second 10ms? It seems like it should always take the same
amo
Yes, of course. But in my case I'm not using filter queries or facets.
It is a really simple query. Actually the query params are like this:
?q=location_country:1 AND category:377 AND location_state:"CA" and
location_city:"Sacramento"
location_country is an integer
category is an integer
location_
I haven't experienced any such problems; it's just a query-parser plugin
that adds some behavior on top of the normal query parsing. In any case,
even if I use a custom request handler with my custom parser, can I get
facet-queries to use this custom parser by default as well?
-Stephen
On Thu, S
I would like to automatically calculate the boost factor of a document
based on the values of other fields. For example;
1.2
1.5
0.8
Document boost = 1.2*1.5*0.8
Is it possible to get Solr to calculate the boost automatically upon
submission based on field values?
Cheers,
Gert.
Please hel
Hi all,
I've just got my geographic clustering component working (somewhat).
I've attached a sample resultset to this mail. It seems to work pretty
well and it's pretty fast. I have one issue I need help with concerning
the API though. At the moment my Hilbert field is a Sortable Integer,
and
What I want is the whole text of that field with every instance of the
search term high lighted, even if the search term only occurs in the
first line of a 300 page field. I'm not sure if mergeContinuous will
do that, or if it will miss everything after the last line that
contains the search term.
It's really just a matter of what your intentions are. There are an awful
lot of highlighting params, so highlighting is very flexible and
customizable. Regarding snippets, as an example Google presents two snippets
in results, which is fairly common. I'd recommend doing a lot of
experimenting
Thank you, this worked perfectly.
Kevin Miller
Web Services
-Original Message-
From: caman [mailto:aboxfortheotherst...@gmail.com]
Sent: Thursday, September 10, 2009 9:48 PM
To: solr-user@lucene.apache.org
Subject: Re: An issue with using Solr Cell and multiple files
You are right.
Ahmet,
Thanks a lot. Your suggestion was really helpful. I had tried synonyms
before and for some reason they didn't work, but this time around it worked.
On 09/11/2009 02:55 AM, AHMET ARSLAN wrote:
There are a lot of company names that
people are uncertain as to the correct spelling. A few of
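For reference, the kind of configuration the synonym suggestion implies - a minimal sketch, assuming solr.SynonymFilterFactory in the relevant field type's analyzer chain; the synonyms.txt entries below are just the examples from the question:

```xml
<!-- In the field type's analyzer chain (schema.xml): -->
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
<!-- And in synonyms.txt:
       best buy, bestbuy
       wal mart, wal-mart, walmart
       holiday inn, holidayinn
-->
```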
On Thursday 10 September 2009 08:13:33 am Dan A. Dickey wrote:
> I'm posting documents to Solr using http (curl) from
> C++/C code and am seeing approximately 3.3 - 3.4
> documents per second being posted. Is this to be expected?
> Granted - I understand that this depends somewhat on the
> machine
At the Lucene level there is the term index and the norms too:
http://search.lucidimagination.com/search/document/b5eee1fc75cc454c/caching_in_lucene
But 50s? That would seem to indicate it's the OS disk cache and you're
waiting for IO. You should be able to confirm if you're IO bound by
simply lo
On Fri, Sep 11, 2009 at 8:50 AM, Jonathan Ariel wrote:
> Oh... another question regarding this. If I disabled the query cache and
> document cache and I execute a query without filter cache (no facets, no
> filter query, etc.). Why the first time I execute the query it takes around
> 400ms and the se
The link to download kstem is not working.
Is there another link, please?
Yonik Seeley-2 wrote:
>
> On Mon, Sep 7, 2009 at 2:49 AM, darniz wrote:
>> Does solr provide any implementation for dictionary stemmer, please let
>> me
>> know
>
> The Krovetz stemmer is dictionary based (english only):
> http
Try adding this param: &hl.fragsize=3
(obviously set the fragsize to whatever very high number you need for your
largest doc.)
-Jay
On Fri, Sep 11, 2009 at 7:54 AM, Paul Tomblin wrote:
> What I want is the whole text of that field with every instance of the
> search term high lighted, even
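Putting the pieces together, the request might look something like the following (parameter values are illustrative; hl.fragsize here is just "a very high number", not a tested value):

```
q=foo&hl=true&hl.fl=content&hl.snippets=1&hl.fragsize=2147483647&hl.mergeContiguous=true
```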
: datefield:[X TO* Y] for X to Y-0....1
:
: This would be backwards-compatible. {} are used for other things and lexing
You lost me there ... {} aren't used for "other things" in the query
parser -- they're used for range queries that are exclusive of their end
points. datefield:{X TO
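To spell out the bracket syntax under discussion (standard Lucene/Solr range queries; X and Y stand for arbitrary endpoint values):

```
datefield:[X TO Y]   inclusive of both X and Y
datefield:{X TO Y}   exclusive of both X and Y
```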
I'd actually like to see a detailed wiki page on how all the parts of
a score are actually calculated and inter-related, but I'm not
knowledgeable enough to write it =\
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833
On Sep 9, 2009, at 3:00 P
OK, thanks. If it's the OS disk cache / IO, what would be my options?
Changing the disk to a faster one?
On Fri, Sep 11, 2009 at 1:32 PM, Yonik Seeley <
yonik.see...@lucidimagination.com> wrote:
> At the Lucene level there is the term index and the norms too:
>
> http://search.lucidimagination.com/
What are the differences between specification version and implementation
version?
I downloaded the nightly build for September 05 2009 and it has a spec
version of 1.3 and the implementation version states 1.4-dev
What does that mean?
--
"Good Enough" is not good enough.
To give anything less
: I haven't experienced any such problems; it's just a query-parser plugin
: that adds some behavior on top of the normal query parsing. In any case,
: even if I use a custom request handler with my custom parser, can I get
: facet-queries to use this custom parser by default as well?
if you cha
For the record: even if you're only going to have one SolrCore, using the
multicore support (ie: having a solr.xml file) might prove handy from a
maintenance standpoint ... the ability to configure new "on deck cores" with
new configs, populate them with data, and then swap them in place for your
Factor 1: idf
If you do a search on "blue whales" you are probably much more
interested in whales than you are in things that are blue. The idf
factor takes this term rarity into account. In your case, color:blue
appears in over 9000 documents, but productNameSearch:blue only
appears in 120 doc
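The effect can be sketched with Lucene's classic idf formula, 1 + ln(numDocs / (docFreq + 1)). The total index size below is a made-up number; only the two document frequencies come from the thread:

```java
public class IdfSketch {
    // Lucene's classic (DefaultSimilarity) idf: 1 + ln(numDocs / (docFreq + 1))
    static double idf(int numDocs, int docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        int numDocs = 1000000; // hypothetical total index size
        // color:blue matches over 9000 docs; productNameSearch:blue only 120
        System.out.printf("idf(color:blue)             = %.3f%n", idf(numDocs, 9000));
        System.out.printf("idf(productNameSearch:blue) = %.3f%n", idf(numDocs, 120));
    }
}
```

With these numbers the rarer productNameSearch term ends up weighted nearly twice as heavily as the common color term, which is why the name match dominates the score.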
At a high level, there's this:
http://wiki.apache.org/solr/SolrRelevancyFAQ#head-343e33b6472ca53afb94e1544ae3fcf7d474e5fc
-Yonik
http://www.lucidimagination.com
On Fri, Sep 11, 2009 at 1:05 PM, Matthew Runo wrote:
> I'd actually like to see a detailed wiki page on how all the parts of a
> scor
I think you and Shalin are having a vocabulary problem.
You used the term "function", which has a specific meaning in Solr. If you
want to write a new function, one that works with the existing function
syntax Solr provides and can be nested inside of other functions, then
what you want to write
On Fri, Sep 11, 2009 at 2:36 PM, Chris Hostetter
wrote:
>
> : I haven't experienced any such problems; it's just a query-parser plugin
> : that adds some behavior on top of the normal query parsing. In any case,
> : even if I use a custom request handler with my custom parser, can I get
> : facet
: I'm trying to delete using SolrJ's "deleteByQuery", but it doesn't like
: it that I've added an "fq" parameter. Here's what I see in the logs:
The error you are getting is because deleteByQuery takes in a Solr query
string ... if you include "&fq=" in that string, then you aren't passing a
q
: I am building Solr from source. During building it from source I am getting
: following error.
1) What ant targets are you running? ... there's no reason (I can think
of) for someone building from SVN to need to generate the maven artifacts ...
try
"ant dist"
2) I opened a bug for this: SOL
I'd like to propose a change to the facet response structure. Currently, it
looks like:
{'facet_fields':{'field1':[('value1',count1),('value2',count2),(null,missingCount)]}}
My immediate problem with this structure is that null is not of the same
type as the 'value's. Also, the meaning of the (
Thanks! I had to find this in the Lucene query parser syntax - it is
not mentioned anywhere in the Solr wiki. You are right: [a TO z} and {a
TO z] are obvious improvements and solve the bucket-search problem the
right way. But this collides with wild-card range searches.
What does this mean?
This sounds like a memory-handling problem. The JVM could be too
small, forcing a lot of garbage collections during the first search.
It could be too big and choke off the OS disk cache. It could be too
big and cause paging.
Does this search query include a sort command? Sorting creates a large
da
Do you mean that it's been renamed, so this should work?
...
optimize
...
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> before that backupAfter was called "snapshot"
>
--
View this message in context:
http://www.nabble.com/Backups-using-Replication-tp25350083p25407695.
I've verified that renaming backupAfter to snapshot works (I should've
checked before asking). Thanks, Noble!
wojtekpia wrote:
>
>
>
>
> ...
> optimize
> ...
>
>
>
>
>
--
View this message in context:
http://www.nabble.com/Backups-using-Replication-tp25350083p2540
On Fri, Sep 11, 2009 at 3:59 PM, Lance Norskog wrote:
> Thanks! I had to find this in the Lucene query parser syntax- it is
> not mentioned anywhere in the Solr wiki. You are right [a TO z} and {a
> TO z] are obvious improvements and solve the bucket-search problem the
> right way. But this collid
: 1. dismax query handler and filter query (fq)
:
: if query= coffee , fq= yiw_bus_city: san jose,
:
: I get 0 results for this query, but it works fine if I mention
: qt=standard (the standard request handler)
with qt=standard this is matching whatever your defaultSearchField is
configured to be ..
The index is 8 GB and I'm giving it 1.5 GB of RAM.
On Fri, Sep 11, 2009 at 5:09 PM, Lance Norskog wrote:
> This sounds like a memory-handling problem. The JVM could be too
> small, forcing a lot of garbage collections during the first search.
> It could be too big and choke off the OS disk cache.
: This has to be done by an UpdateRequestProcessor
I think the SignatureUpdateProcessor does exactly what you want ... you
just need a Signature implementation that does a simple concat (instead of
an MD5), so you have a simple identity signature ... it seems like it would
be trivial.
-Hoss
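To make the idea concrete - this is not Solr's Signature API, just the "identity signature" notion in plain Java: where the MD5-based implementation would hash the accumulated field values, this simply joins them, so the signature *is* the concatenation:

```java
public class ConcatSignature {
    // Accumulates field values and returns their concatenation as the
    // "signature", instead of a hash of them.
    private final StringBuilder sb = new StringBuilder();

    void add(String content) {
        if (sb.length() > 0) sb.append('|');
        sb.append(content);
    }

    String getSignature() {
        return sb.toString();
    }
}
```

Each configured field value would be fed through add(), and getSignature() becomes the de-duplication key.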
: What are wild-card range searches?
I'm pretty sure he was just referring to open-ended range searches, like
the example he asked about...
: > What does this mean?
: >
: > {* TO *}
:
: Same thing as [* TO *] - not worth trying to make it different IMO.
...right, that's something the Sol
: Subject: where can i find solr1.4
: In-Reply-To: <13bed5c20909090154v4507e091k4fefeb073ff69...@mail.gmail.com>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instea
Ah that makes more sense. It does seem that the coord would be a good
option especially in cases like this.
--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562
> From: Yonik Seeley
> Reply-To:
> Date: Fri, 11 Sep 2009 14:44:50 -0400
> To:
> Subject: Re: Nonsensic
Is it possible to concatenate two fields and copy the result to a new field
in the schema.xml file?
I am importing from two tables, and both have a numeric value as the primary
key. If I copy just the primary key, which is a number, from both tables to
one field and make it the primary key, records may get ov
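One common workaround (sketched here with made-up names) is to build the uniqueKey by prefixing each table's numeric primary key with a tag for its table, so ids from the two tables can no longer collide:

```java
public class CompositeKey {
    // Prefix the numeric primary key with a per-table tag so that id 42 from
    // table A and id 42 from table B map to different Solr uniqueKey values.
    static String uniqueKey(String table, long pk) {
        return table + "-" + pk;
    }

    public static void main(String[] args) {
        System.out.println(uniqueKey("tableA", 42)); // tableA-42
        System.out.println(uniqueKey("tableB", 42)); // tableB-42
    }
}
```

If DIH is doing the import, its TemplateTransformer can produce the same composite value declaratively instead of in code.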
: What I dont understand is whether a requesthandler and a queryparser is
: the same thing, i.e. The configuration contains a REQUESTHANDLER with
: the name 'dismax', but does not contain a QUERYPARSER with the name
: 'dismax'. Where does the 'dismax' queryparser come from? Do I have to
: conf
Hi,
I have a newbie question about the 'standard' requestHandler in
solrconfig.xml. What I'd like to know is where the config information for
this requestHandler is kept. When I go to http://localhost:8983/solr/admin, I
see the following info, but am curious where are the supposedly 'chained'
co
Sounds like you want to use a FunctionQuery: See http://wiki.apache.org/solr/FunctionQuery
. Either that or roll it up into the document boost, but that loses
some precision.
On Sep 11, 2009, at 10:05 AM, Villemos, Gert wrote:
I would like to automatically calculate the boost factor of a d
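To sketch how the FunctionQuery suggestion applies here (field names are hypothetical, and this assumes the boost query parser is available in your Solr version, so verify before relying on it), a multiplicative boost over the three factor fields would look something like:

```
q={!boost b=product(factorA,factorB,factorC)}the user query
```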
Hi
I want to get answers to some of my questions.
Going by the Solr wiki, there are three approaches to stemming.
Porter or Reduction Algorithm:
As far as I know there is "solr.EnglishPorterFilterFactory" and there is
"solr.SnowballPorterFilterFactory". Both use the same stemming algorithm.
RequestHandlers are configured in solrconfig.xml. If no components are
explicitly declared in the request handler config then the defaults are used.
They are:
- QueryComponent
- FacetComponent
- MoreLikeThisComponent
- HighlightComponent
- StatsComponent
- DebugComponent
If you wanted to have a cus
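For illustration, wiring a custom component into a handler looks something like this in solrconfig.xml (the component name and class are placeholders):

```xml
<searchComponent name="myComponent" class="com.example.MyComponent"/>

<requestHandler name="/mysearch" class="solr.SearchHandler">
  <arr name="last-components">
    <str>myComponent</str>
  </arr>
</requestHandler>
```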
: What are the differences between specification version and implementation
: version
those are concepts from the Java specification for jars and wars (more
info than you could ever possibly want in the URLs below)
: I downloaded the nightly build for September 05 2009 and it has a spec
: versi
On Sat, Sep 12, 2009 at 12:12 AM, Chris Hostetter
wrote:
>
> For the record: even if you're only going to have one SolrCore, using the
> multicore support (ie: having a solr.xml file) might prove handy from a
> maintenance standpoint ... the ability to configure new "on deck cores" with
> new config
On Fri, Sep 11, 2009 at 6:21 AM, darniz wrote:
>
> Hello,
> I have a task where my user is giving me 20 words of the English dictionary
> and I have to run a program and generate a report with all stemmed words.
>
> I have to use EnglishPorterFilterFactory and SnowballPorterFilterFactory to
> check wh
On Thu, Sep 10, 2009 at 7:08 AM, Silent Surfer wrote:
> Hi ,
>
> Currently we are using Solr 1.3 and we have the following requirement.
>
> As we need to process very high volumes of documents (of the order of 400
> GB per day), we are planning to separate indexer(s) and searcher(s), so that
> the
On Fri, Sep 11, 2009 at 2:35 AM, Paul Rosen wrote:
> Hi again,
>
> I've mostly gotten the multicore working except for one detail.
>
> (I'm using solr 1.3 and solr-ruby 0.0.6 in a rails project.)
>
> I've done a few queries and I appear to be able to get hits from either
> core. (yeah!)
>
> I'm fo
On Fri, Sep 11, 2009 at 12:02 PM, dharhsana wrote:
>
> I am working on a blog module, where a user can create blogs, post to
> them, and have several comments per post. To implement this module I am
> using Solr 1.4.
>
> When I get blog details of a particular user, it brings th
On Fri, Sep 11, 2009 at 11:23 AM, Christian Zambrano wrote:
> There are a lot of company names that people are uncertain as to the
> correct spelling. A few examples are:
> 1. best buy, bestbuy
> 2. walmart, wal mart, wal-mart
> 3. Holiday Inn, HolidayInn
>
> What Tokenizer Factory and/or Token
: > For the record: even if you're only going to have one SolrCore, using the
: > multicore support (ie: having a solr.xml file) might prove handy from a
: > maintenance standpoint ... the ability to configure new "on deck cores" with
...
: Yeah, it is a shame that single-core deployments (n
On Sat, Sep 12, 2009 at 9:43 AM, Chris Hostetter
wrote:
>
> : > For the record: even if you're only going to have one SolrCore, using
> the
> : > multicore support (ie: having a solr.xml file) might prove handy from a
> : > maintenance standpoint ... the ability to configure new "on deck cores"
> wi
Jay, it would be great if you can add this example to the Solrj wiki:
http://wiki.apache.org/solr/Solrj
On Fri, Sep 11, 2009 at 5:15 AM, Jay Hill wrote:
> Set up the query like this to highlight a field named "content":
>
>SolrQuery query = new SolrQuery();
>query.setQuery("foo");
>
>
On Sat, Sep 12, 2009 at 12:18 AM, Stephen Duncan Jr <
stephen.dun...@gmail.com> wrote:
> >
> My experience (which is on a trunk build from a few weeks back of Solr
> 2.4),
> is that changing the default parser for the handler does NOT change it for
> facet.query. I had expected it would, but was
On Sat, Sep 12, 2009 at 1:20 AM, smock wrote:
>
> I'd like to propose a change to the facet response structure. Currently,
> it
> looks like:
>
> {'facet_fields':{'field1':[('value1',count1),('value2',count2),(null,missingCount)]}}
>
> My immediate problem with this structure is that null is not
On Sat, Sep 12, 2009 at 3:25 AM, Mohamed Parvez wrote:
> Is its possible to concatenate two fields and copy it to a new field, in
> the
> schema.xml file
>
> I am importing from two tables and both have numeric value as primary key.
>
> If i copy just the primary key, which is a number, from both
On Thu, Sep 10, 2009 at 1:02 PM, wrote:
>
> I tried MoreLikeThis (StandardRequestHandler with mlt arguments)
> with a single solr server and it works fine. However, when I tried
> the same query with sharded servers, I don't get the moreLikeThis
> key in the results.
>
> So my question is, Is Mor
>> So my question is, Is MoreLikeThis with StandardRequestHandler
>> supported on shards? If not, is MoreLikeThisHandler supported?
> No, MoreLikeThis does not work with distributed search currently.
> There is an issue open with a couple of patches though.
> See https://issues.apache.org/jira/bro
On Sat, Sep 12, 2009 at 11:03 AM, wrote:
> >> So my question is, Is MoreLikeThis with StandardRequestHandler
> >> supported on shards? If not, is MoreLikeThisHandler supported?
>
> > No, MoreLikeThis does not work with distributed search currently.
> > There is an issue open with a couple of patc
>On Fri, Sep 11, 2009 at 6:48 AM, venn hardy wrote:
>>
>> Hi Fergus,
>>
>> When I debugged in the development console
>> http://localhost:9080/solr/admin/dataimport.jsp?handler=/dataimport
>>
>> I had no problems. Each category/item seems to be only indexed once, and no
>> parent fields are avai