> > 1) hyphens - if user types "ema" or "e-ma" I want to
> > suggest "email"
> >
> > 2) accents - if user types "herme" want to suggest
> > "Hermès"
>
> Accents can be removed by using MappingCharFilterFactory
> before the tokenizer. (both index and query time)
>
> mapping="mapping-ISOLatin1
I think that is just sending 2 fq facet queries through. In Solr PHP I would do
that with, e.g.
$params['facet'] = true;
$params['facet.fields'] = array('Size');
$params['fq'] = array('sex' => array('Men', 'Women'));
but yes i think you'd have to send through what the current facet query is and
Hi Hasnain,
You'll need to apply POS (Part of Speech) on the input at/before indexing, then
store a payload with your adjective terms, and finally use those payload
values to change the scoring at query time.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosyste
The closest you can get to debugging (without actually debugging...) is
to look at the logs and use
http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mode
Ephraim Ofir
-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk]
Sent: Monday, October 04, 2
Hi Otis,
Thank you for replying. Unfortunately I'm unable to fully grasp what
you are trying to say. Can you please elaborate on what a payload with
adjective terms is?
Also, I'm using stopwords.txt to stop adjectives, adverbs and verbs. Now when
I search for "Blue hammers", solr searches for "
On Mon, 2010-10-04 at 00:25 +0530, Arunkumar Ayyavu wrote:
> I'm looking at setting up multiple masters for redundancy (for index
> updates). I found the thread in this link
> (http://www.lucidimagination.com/search/document/68ac303ce8425506/multiple_masters_solr_replication_1_4)
> discussed this
Hey,
Yes that tool doesn't work too well for me. I can load it up and get the forms
on the left, but when I run a debug the right hand side tells me that the page
is not found. I *think* this is because I use a custom query string parameter
in my DIH XML for use with delta querying and this bei
Make sure you're not running into a case sensitivity problem, some stuff
in DIH is case sensitive (and some stuff gets capitalized by the jdbc).
Try using listing.ID instead of listing.id.
On a side note, if you're using mysql, you might want to look at the
CONCAT_WS function.
You might also want t
Hi,
I was wondering if there's a way to display facet options based on
previous facet values. For example, I've seen many shopping sites where
a user can facet by "Mens" or "Womens" apparel, then be shown "sizes" to
facet by (for Men or Women only - whichever they chose).
Is this somethi
Allistair,
> Indeed, I have a column that already works using a lower-case id. I wish
> I could debug it somehow - see the SQL? Something particular about this
> config it is not liking.
>
You may want to try the MySQL query log to check which queries are
performed.
http://dev.mysql.com/doc/refman
> I got your point. You want to retrieve "electric吉他"
> with the query 吉他. That's why you don't want EdgeNGram.
> If this is the only reason for NGram, I think you can
> transform "electric吉他" into two tokens "electric"
> "吉他" in TokenFilter(s) and apply EdgeNGram approach.
>
What TokenFilters
Thanks Ephraim. I tried your suggestion with the ID but capitalising it did not
work.
Indeed, I have a column that already works using a lower-case id. I wish I
could debug it somehow - see the SQL? Something particular about this config it
is not liking.
I read the post you linked to. This i
Hello list,
I've been successful with DIH to a large extent but a seemingly simple extra
column I need is posing problems. In a nutshell I have 2 entities let's say -
Listing habtm Contact. I have copied the relevant parts of the configs below.
I have run my SQL for the sub-entity Contact and t
On 10/1/2010 11:33 PM, Lance Norskog wrote:
Yes. stream.file and stream.url are independent of the request handler.
They do their magic at the very top level of the request.
However, there are no unit tests for these features, but they are widely
used.
Sorry Lance, are you agreeing that I can
Very clever thinking indeed. Well, that's certainly revealed the problem ...
${listing.id} is empty on my sub-entity query ...
And this is because I prefix the indexed ID with a letter.
This appears to modify the internal value of $listing.id for subsequent uses.
Well, I can work around this now
> What TokenFilters would split "electric吉他" into
> "electric" & "吉他"?
Is it possible to write a regex to capture Chinese text? (Unicode range?)
If yes, you can use PatternReplaceFilter to transform electric吉他 into
electric_吉他.
After that WordDelimiterFilterFactory can produce two adjacent to
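A rough sketch of the regex idea in plain Python rather than in a Solr filter. The Unicode range covers only the basic CJK Unified Ideographs block, which is an assumption and may need widening, and `mark_boundaries` is a hypothetical helper name:

```python
import re

# Basic CJK Unified Ideographs block only (assumption; extend for other ranges).
CJK = r"\u4e00-\u9fff"

# Zero-width match at every Latin/digit <-> CJK transition.
BOUNDARY = re.compile(
    r"(?<=[A-Za-z0-9])(?=[%s])|(?<=[%s])(?=[A-Za-z0-9])" % (CJK, CJK)
)

def mark_boundaries(tag):
    """Insert an underscore wherever Latin text meets Chinese text."""
    return BOUNDARY.sub("_", tag)

print(mark_boundaries("electric吉他"))
print(mark_boundaries("电吉他"))
```

A tag that is entirely Chinese is left untouched, so only the mixed-language tags gain a delimiter for the word delimiter filter to split on.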
I suspect you're not actually including the path to those jars.
SolrException should be in your solrj jar file. You can test this
by executing "jar -tf apacheBLAHBLAH.jar" which will dump
all the class names in the jar file. I'm assuming that you're
really including the version for the * in the sol
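The same check as "jar -tf" can be scripted, since a jar is a zip archive. A minimal sketch, assuming the class lives at org.apache.solr.common.SolrException; `jar_has_class` is a hypothetical helper, and the jar path is whatever file you are actually putting on the classpath:

```python
import zipfile

def jar_has_class(jar, class_name):
    """Return True if the jar (a path or file-like object) contains the class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar) as zf:
        return entry in zf.namelist()
```

Calling it once per jar on the classpath shows quickly which file, if any, actually carries the missing class.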
> Does anyone know how to deal with these 2 issues when using
> NGramFilterFactory for autocomplete?
>
> 1) hyphens - if user types "ema" or "e-ma" I want to
> suggest "email"
>
> 2) accents - if user types "herme" want to suggest
> "Hermès"
Accents can be removed by using MappingCharFilterFa
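To show what the normalisation buys you, here is a small illustration in plain Python, not Solr. It folds accents and drops hyphens at both index and query time, then matches on prefixes. Note that it uses edge n-grams (prefixes only) rather than full n-grams, and that `fold`, `edge_ngrams` and `suggest` are hypothetical names:

```python
import unicodedata

def fold(text):
    """Lowercase, strip combining accents, and drop hyphens."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_accents.replace("-", "")

def edge_ngrams(term, min_len=2):
    """All prefixes of the folded term, shortest first."""
    folded = fold(term)
    return [folded[:n] for n in range(min_len, len(folded) + 1)]

# "Index time": map every folded prefix back to the original term.
index = {}
for word in ["email", "Hermès"]:
    for gram in edge_ngrams(word):
        index.setdefault(gram, set()).add(word)

def suggest(user_input):
    """"Query time": fold the input the same way, then look it up."""
    return sorted(index.get(fold(user_input), set()))
```

Because the same folding runs on both sides, "e-ma" and "ema" hit the same prefix, and the suggestion shown to the user keeps its original accents.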
Hi,
does the MoreLikeThis search use term position information in the
score formula?
I updated the SolrJ JAR requirements to be clearer on the wiki page given how
many of these SolrJ emails I saw coming through since joining the list. I just
created a test java class and imported the removed JARs until I found out the
minimal set required.
On Oct 4, 2010, at 8:27 AM, Erick Eric
> I agree with the issues with NGramFilterFactory you pointed
> out and I really want to avoid using it. But the problem is
> that I have Chinese tags like "电吉他" and multi-lingual
> tags like "electric吉他".
I got your point. You want to retrieve "electric吉他" with the query 吉他. That's
why you don't
I have tried a more elaborate join also following the features example of the
DIH example but same result - SQL works fine directly but Solr is not indexing
the array of full_names per Listing, e.g.
Am I missin
Or this: http://www.sematext.com/products/autocomplete/index.html ,
which happens to use the same "bass" examples as the original poster. :)
You can see this Autosuggest in action on http://search-lucene.com/ .
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
- Origi
Hi,
It's a matter of the config. Have a look at the highlighter section of
solrconfig.xml.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: "efr...@gmail.com"
> To: solr-user@lucene.
On Mon, Oct 4, 2010 at 10:16 AM, Xavier Schepler <
xavier.schep...@sciences-po.fr> wrote:
> Hi,
>
> does the MoreLikeThis search use term position information in the
> score formula?
>
no, it would be nice if it did use them though (based upon query terms),
seems like it would yield improve
Hi,
Would this help you:
http://wiki.apache.org/solr/SolrReplication#Setting_up_a_Repeater
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Arunkumar Ayyavu
> To: solr-user@lucene.apa
On 04/10/2010 16:40, Robert Muir wrote:
On Mon, Oct 4, 2010 at 10:16 AM, Xavier Schepler<
xavier.schep...@sciences-po.fr> wrote:
Hi,
does the MoreLikeThis search use term position information in the
score formula?
no, it would be nice if it did use them though (based upon qu
I asked the exact same question the day before. If you or anyone else has a
pointer to the solution, please share it on the mailing list. For now, I am
using a Perl script instead to query the Solr server.
Thanks,
Xin
-Original Message-
From: ankita shinde [mailto:ankitashinde...@gmail.com]
Sent: Saturday,
i rewrote the top jar section at
http://wiki.apache.org/solr/Solrj
and the following code then runs fine.
import java.net.MalformedURLException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
Ok. Thanks for the quick response.
Vincent Vu Nguyen
Division of Science Quality and Translation
Office of the Associate Director for Science
Centers for Disease Control and Prevention (CDC)
404-498-6154
Century Bldg 2400
Atlanta, GA 30329
-Original Message-
From: Allistair Crossley [m
Thanks, Allistair. I will give it a try later today.
-Original Message-
From: Allistair Crossley [mailto:a...@roxxor.co.uk]
Sent: Monday, October 04, 2010 11:31 AM
To: solr-user@lucene.apache.org
Subject: Re: solrj
i rewrote the top jar section at
http://wiki.apache.org/solr/Solrj
an
Yes, by adding fq back into the main query you will get results increasingly
filtered each time.
You may run into an issue if you are displaying facet counts, as the facet part
of the query will also obey the increasingly filtered fq, and so not display
counts for other categories anymore from
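One way around this, in Solr 1.4 and later, is to tag the filter and exclude it when computing a given facet, using the {!tag=...} and {!ex=...} local params. A minimal sketch of building such a request; the endpoint, the field names `sex` and `Size`, and the tag name `sx` are assumptions for illustration:

```python
from urllib.parse import urlencode, parse_qs

SOLR_SELECT = "http://localhost:8983/solr/select"  # hypothetical endpoint

def build_facet_url(query, selected_sex):
    """Build a select URL whose 'sex' facet ignores the 'sex' filter."""
    params = [
        ("q", query),
        ("facet", "true"),
        # Tag the filter so that individual facets can opt out of it.
        ("fq", "{!tag=sx}sex:%s" % selected_sex),
        # Counts for "sex" exclude the tagged filter; "Size" still obeys it.
        ("facet.field", "{!ex=sx}sex"),
        ("facet.field", "Size"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)

url = build_facet_url("*:*", "Men")
```

With this, the results and the Size counts are narrowed to the chosen value, while the counts for the other values of the tagged field stay visible.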
Wow, this is probably the most annoying Solr issue I've *ever* dealt
with. First question: How do I debug Dismax, and its query handling?
Issue: When I query against this StrField, I am attempting to do an
*exact* match... Albeit one that is case-insensitive :). So, 90%
exact. It works in a maj
Hi all,
I have a problem with the Solr date field.
The problem is described below.
SolrBean foo = new SolrBean();
// The type of the pubDate property is "java.util.Date" and rs is a
// "java.sql.ResultSet", so rs.getDate("pub_date") returns a java.sql.Date
// object.
foo.pubDate = rs.getDate("pub_date");
the value of p
>
> <fieldType name="idstr" class="solr.StrField">
>   <analyzer>
>     <tokenizer class="solr.PatternTokenizerFactory" pattern="(.*)"
>                group="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
This definition is invalid. You cannot use charfilter/tokenizer/tokenfilter
with solr.StrField.
But it
On Mon, Oct 4, 2010 at 10:24 PM, Kouta Osabe wrote:
> Hi,All
>
> I have a problem about Solr Date Field.
[...]
> the value of pub_date column comes from MySQL and actually value is
> "2010-10-05 00:00:00".
>
> I register the "foo" bean with Solr through SolrJ like "new
> CommonsHttpSolrServer().addBean(f
> I expected "2010-10-05 00:00:00" to be displayed in Solr Admin,
> but
> "2010-10-04T15:00:00Z" was displayed instead.
>
> Is this a timezone problem? (I live in Tokyo, Japan.)
Probably. Solr stores dates in, and converts them to, the UTC timezone.
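The nine-hour shift can be reproduced outside Solr. A small sketch in Python, assuming the MySQL value is local Tokyo time (UTC+9, no daylight saving); `to_solr_utc` is a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))  # Tokyo, fixed offset (assumption)

def to_solr_utc(local_str, tz=JST):
    """Interpret a 'YYYY-MM-DD HH:MM:SS' string in tz and format it as UTC."""
    local = datetime.strptime(local_str, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(to_solr_utc("2010-10-05 00:00:00"))  # prints 2010-10-04T15:00:00Z
```

So the stored instant is the same; only the displayed timezone differs.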
Wow, that's pretty infuriating. Thank you for the suggestion. I
added it to the Wiki, with the hope that if it contains misinformation
then someone will correct it and, consequently, save me from another
one of these experiences :) (...and to also document that, hey, there
is a tokenizer which t
Makes sense. However, one of the reasons I was asking was that we've
configured Solr to use RAMDirectory and it appears that it loads the index
into memory twice. I suspect the first time is for warming firstSearcher and
the second time is for warming newSearcher. It makes our jvm memory
requiremen
Hi,
We are maintaining multiple Solr indexes, one for each source (the source data
is too large to be stored in a single index), and we are using shards to do a
distributed search across all of them.
We also update Solr documents that were already indexed using an XML push
http://server:8
Hi,
I asked this question a month ago on lucene-user and was referred here.
I have content being analyzed in Solr using these tokenizers and filters:
Basically I want to be able to search against
I have made progress on this by writing my own Analyzer. I basically added
the TokenFilters that are under each of the solr factory classes. I had to
copy and paste the WordDelimiterFilter because, of course, it was package
protected.
On Mon, Oct 4, 2010 at 3:05 PM, Max Lynch wrote:
> Hi,
>
I'm fiddling around with DisMax again, my love/hate relationship continues.
Basically I'm hunting for custom parser examples. But I've also got some
links that might be helpful to others.
Alternate XML Syntax:
Project: XML Query Syntax is being worked on, in SOLR-839
Link: https://issues.apache.
Hi Chris
Thanks. But do you have any suggestion or workaround to deal with it?
Floyd
2010/10/2 Chris Hostetter
>
> : But when I issue the query with shards (two instances), the response XML
> : will be like the following.
> : As you can see, the score has been transferred to an element of
>..
Hi there,
What is the difference between the Lucid distribution of Solr and the Apache
distribution?
And can I use the Lucid distribution for free in my commercial project?
Hi,
Even with the sharded index you typically go to the master(s) that has/have all
your shards and update the appropriate shard (in your case, based on the doc
type).
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
-
Hi,
If you want "blue" to be used in search, then you should not treat it as a
stopword.
Re payloads: http://search-lucene.com/?q=payload+score
and http://search-lucene.com/?q=payload+score&fc_type=wiki (even better, look
at hit #1)
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene
I think this is a bad idea. The tf.idf algorithm will already put a higher
weight on "hammers" than on "blue", because "hammers" will be more rare than
"blue". Plus, you are making huge assumptions about the queries. In a search
for "Canon camera", "Canon" is an adjective, but it is the importan
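To make the idf point concrete, here is a toy calculation. The four product titles are made up for illustration, and the formula is the classic Lucene-style idf, 1 + ln(N / (df + 1)), which is an assumption about the similarity in use:

```python
import math

# Hypothetical mini-corpus: "blue" is common, "hammers" is rare.
docs = [
    "blue hammers",
    "blue paint",
    "blue jeans",
    "blue canon camera bag",
]

def idf(term, corpus):
    """Classic idf: rarer terms get a larger weight."""
    n_docs = len(corpus)
    doc_freq = sum(1 for d in corpus if term in d.split())
    return 1.0 + math.log(n_docs / (doc_freq + 1))

for term in ("blue", "hammers"):
    print(term, round(idf(term, docs), 3))
```

The rare term gets the larger weight without any part-of-speech tagging, which is the behaviour described above.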
Hi,
I *think* this is not what Vincent was after. If I read the suggestions
correctly, you are saying to use &fq=x&fq=y -- multiple fqs.
But I think Vincent is wondering how to end up with something that will let him
create a UI with multi-level facets (with a single request), e.g.
Footwear (1
There are EdgeNGramFilterFactory & EdgeNGramTokenizerFactory.
Likewise there are StandardFilterFactory & StandardTokenizerFactory.
LowerCaseFilterFactory & LowerCaseTokenizerFactory.
Seems like they always come in pairs.
What are the differences between FilterFactory and TokenizerFactory? When
Hi all,
I updated my Solr trunk to revision 1004527. When I compile
the trunk with ant I get many warnings, but the build is successful. The
warnings are here:
common.compile-core:
[mkdir] Created dir:
/home/satya/temporary/trunk/lucene/build/classes/java
[javac] Compi
Hi,
We configured Solr search for a large data set (~6 million records). To improve
search performance, are there any admin-level configurations we need to
apply? Please suggest some admin-level settings.
Regards,
Siva
Hello,
I have a string "Marsh 1" (no quotes while searching). If I put "Marsh 1" in
the search box with no quotes I get expected results back but when I search
for just "1" (again no quotes) I don't get any results back. I use
WordDelimiterFilterFactory as follows. Any idea?
--
Hi Tommaso,
I have registered on both sites and got the API keys.
But I am getting a new error.
Oct 4, 2010 6:15:04 PM
org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl
callAnalysisComponentProcess(405)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEnginePr
Take a look at "Mastering the Power of Faceted Search with Chris
Hostetter"
(http://www.lucidimagination.com/solutions/webcasts/faceting). I think
there's an example of what you're looking for there.
Ephraim Ofir
-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.co
Hi there,
If I don't need MoreLikeThis, spellcheck, or highlighting,
can I remove these configuration sections from solrconfig.xml?
In other words, does Solr load and use these SearchComponents on startup and
during runtime?
Will removing this configuration speed up queries or not?
Thanks
Hi,
I have 2 documents with following values.
Doc1
Subject: Weekly transport
Doc2
Subject: Week report on transportation
I need to search documents in 4 formats
1. Begins with “week”
It should return documents which have "week" as the first word, i.e. doc1
2. Begins with “week*”
It shoul