Hey,
Thanks so much for your outstanding response. I have been busy for
a few days so have not had a chance to try it out. I have now tried to
install trunk of Solr and when I run 'ant test' I encounter the following:
[junit] Testsuite:
org.apache.lucene.facet.taxonomy.directory.Tes
Shyam,
The thing is that in order to use a leading wildcard, you don't
necessarily need to use ReversedWildcardFilterFactory; there is another way
to do this, which was turned off by default due to its inefficiency for the
case of big term dictionaries. Not sure if this has changed in Solr 4.0
t
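For reference, a minimal sketch of the ReversedWildcardFilterFactory approach the message above contrasts with; the field type name and analyzer choices here are illustrative, not from the original thread:

```xml
<!-- Hypothetical field type: terms are also indexed reversed so that a
     leading wildcard can be rewritten into a fast trailing wildcard. -->
<fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```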
Hi, Ted Dunning,
Thank you for your reply. I can understand your point about adding a "language_s"
field and then keeping all the files together, which speeds up searching.
But then there is the problem of which analyzer to use at indexing time. I assume files
encoded in different languages should be ha
Hi.
I found the solutions for that.
You can apply a new filter for that field. It's possible to define a text
field type with a new filter
**
That means you will generate the reverse of the phone number.
For instance
08774589 and after that the reverse is 98547780.
Because * only works at the
I am using the surround parser to perform span queries and getting the required
results, but I want to highlight the term in the result set, and I guess the
highlighter does not support the surround query parser. Are there any plugins or
patches available to do the same?
I guess highlighting should use surround que
Hi all,
I want to search a string the same way MySQL's LIKE does, i.e.
if the word is, say, Sunrise: if I search rise then sunrise should
come up, if I search sun then sunrise should come up, and if I search
sunrise then sunrise should come up.
The search should not be cas
Hi,
Note that there is yet another option, the XSLT UpdateRequestHandler
http://wiki.apache.org/solr/XsltUpdateRequestHandler (which obviously needs
better documentation).
It can take arbitrary XML in, along with a stylesheet for transformation, and
voila :)
I made a stylesheet to import sear
Hi,
I'm using an edismax handler
All fields and queries are lower case (LowerCaseFilterFactory in schema.xml)
Queries for television, Television and televisio* lead to results.
But Televisio* has no result.
Is this a bug, a feature or a misconfiguration?
Kind Regards
Matthias
So I erased my Solr folder and started from scratch.
From the example folder I ran "java -jar start.jar" but there was a
solrconfig.xml missing. I copied this file from Solr-3.4.0 to my Solr-3.5.0
folder.
Now http://localhost:8983/solr/admin works but
http://localhost:8983/solr/browse gives me this res
You'll get this same behavior with edismax or lucene QP. Wildcard queries
are not analyzed (not the lowercase filter nor any other).
2012/1/20 Matthias Müller
> Hi,
>
> I'm using an edismax handler
> All fields and queries are lower case (LowerCaseFilterFactory in
> schema.xml)
>
> Queries for
Hi,
I want to be able to tell when the document was indexed, so I could
re-index it if it has changed in the meantime. Is there an easy way to do
this? Or do I have to manually put the date in the document and add a new field
in schema?
Thanks,
Alex
Hi Alex,
you can create a field in the schema.xml of type date or tdate called
(something like) idx_timestamp and set its default option to NOW then you
won't have to add any extra fields to the documents because it will be
automatically created when documents are indexed.
Hope it helps.
Tommaso
2
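The field Tommaso describes might look like this in schema.xml; the field name here is just an example (the stock 3.x example schema ships a similar commented-out "timestamp" field):

```xml
<!-- Example only: a date field that defaults to the time of indexing. -->
<field name="idx_timestamp" type="date" indexed="true" stored="true"
       default="NOW" multiValued="false"/>
```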
Otis,
Isn't the DataImportHandler only for importing data from a database? I don't
want to import data from a database. I just want to persist the object in
my database and then send this saved object to Solr. When the user finds
some document using the Solr search, I need to return this persistent
ob
Hi,
Could you please help me with a quick question: is there a way to restrict
Lucene/Solr fuzzy search to only analyze words that have more than 5 characters
and to ignore words with fewer than that (i.e. words shorter than 6 characters)?
Thanks
-
Lance
Erick,
yes, currently I have 6 shards, which accept writes and reads. Sometimes I
delete data from all 6 and try to rebalance them, filling them up so that
they have approximately the same amount of data. So all 6 are 'in
motion' somehow. I would like the writing to take place more oft
As Tommaso said, adding a field to the schema.xml gives you an automatic
timestamp set at index time. The default schema.xml with Solr 3.5.0 has a
commented example:
--
Hector
On Jan 20, 2012, at 8:15 AM, Tommaso Teofili wrote:
> Hi Alex,
> you can create a field in the schema.xml o
Hello,
You can accomplish this by using n-grams or edge n-grams, which you'll use as
the field types for the fields where you want such matching to occur, specified
in schema.xml. I hope this helps.
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performa
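A rough sketch of what Otis suggests, assuming an NGramFilterFactory-based type; the type name and gram sizes are made up for illustration:

```xml
<!-- Hypothetical n-gram type for substring ("LIKE"-style) matching.
     Grams are produced at index time only; query tokens match them directly. -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a type like this, indexing "Sunrise" produces grams such as sun, ris, and rise, so a query for rise or sun can match the document.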
That's valuable info there. :)
So then I wonder which of the two, RAM or SSD, has a more favorable price/size
trajectory...
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performance-monitoring/index.html
- Original Message -
> From: Ted Dunning
> To:
Ted, Otis,
Thanks for the info. I’ll take a stab at answering your question.
RAM:
Both of you are correct that if you were able to keep your index in RAM, that
would give you the fastest results. This works if you have a small enough
index. At ZoomInfo, the index was 600 GB (they have mu
Hi,
If you save all fields you want to display in search results, then you don't
need to go to the database at search time.
If you do not save all fields you want to display in search results, then you
will need to first query Solr, get IDs of all matches you want to display, and
then from your
Ni, Bing
I believe you will need to pre-define fields for all languages you want to
handle and specify an appropriate language-specific analyzer for each of those
fields.
This also means that if you encounter a new language, you will need to adjust
your schema to support it. Of cou
Ok. I thought there was an easier way to do this using hibernate search. I
will make this manually.
Thanks for help
2012/1/20 Otis Gospodnetic
> Hi,
>
> If you save all fields you want to display in search results, then you
> don't need to go to the database at search time.
> If you do not sa
It sounds bad with a 600GB index, but the techniques in the UMass work achieve a
substantial compression of the in-memory size (remember that only part of
the index needs to be memory resident).
If you assume that you get 2x compression from compression and elision, then
you only need 3-5 fat-memory mac
Write a tokenizer that does language ID and then picks which tokenizer to
use. Then record the language in the language id field.
What is there to elaborate?
On Fri, Jan 20, 2012 at 1:58 AM, nibing wrote:
> But then there occurs a problem of using analyzer in indexing. I assume
> files encoded
Otis,
Can you say why there needs to be a field per language? Why not have a
polyglot analyzer?
On Fri, Jan 20, 2012 at 7:29 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Ni, Bing
>
> I believe you will need to pre-define fields for all languages you want to
> handle and specify a
Hi,
I'm trying to define an EdgeNGram field but for some reason it doesn't work.
My fieldType definition is:
Hello!
Do you use the 'text' field for searching or the 'name' field?
Remember that, when you use copyField the data that is copied is the
original data, not the analyzed one.
--
Regards,
Rafał Kuć
> Hi,
> I'm trying to define an EdgeNGram field but for some reason it doesn't work.
> My fiel
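Since the original fieldType definition did not survive the archive, here is a hedged sketch of a typical 3.x EdgeNGram setup; note the asymmetric analyzers, a common source of "doesn't work" surprises:

```xml
<!-- Illustrative only: edge n-grams at index time, plain tokens at query time. -->
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.EdgeNGramTokenizerFactory"
               minGramSize="2" maxGramSize="15" side="front"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```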
This is fixed for many cases in 3.6 (i.e. current but unreleased 3.x
code line) and trunk, see:
https://issues.apache.org/jira/browse/SOLR-2438
Best
Erick
2012/1/20 Tomás Fernández Löbbe :
> You'll get this same behavior with edismax or lucene QP. Wildcard queries
> are not analyzed (not the lowe
Thank you for your reply. I think that's probably the problem then. Is there
any way I can do this:
I have a list of programs. Each program has a name, keywords, description
and username. When I perform a search, I need to search all of those fields
at once. This is why I used a copyField to copy e
There will be some increased pressure on your resources when replication
to the slaves happens. That said, you can also allocate resources differently
between the two. For instance, you do not need any memory for the RAM buffer
on the slaves since you're not indexing. On the master, you don't need an
Hi
Are the phonetic filters (DoubleMetaphone, Metaphone, Soundex, RefinedSoundex,
Caverphone) only for the English language, or do they work for other languages too?
Is there a phonetic filter for Portuguese? If not, how can I implement one?
Thanks
Hello!
Look at the dismax (http://wiki.apache.org/solr/DisMaxQParserPlugin)
query parser and the qf parameter. With dismax (or edismax) you can
make a query like:
q=user query&qf=name keywords description username
and Solr will make the query to all the fields specified by the qf
parameter.
--
Peter:
I admit I've just scanned the thread, but it sounds like what you're really
doing under the covers is configuring your system to use the SSDs
as the place your pages go when they're swapped out of RAM, is this correct?
Which would certainly speed things up substantially if swapping was
happen
bq: Why not have a polyglot analyzer
That could work, but it makes some compromises and assumes that your
languages are "close enough"; I have absolutely no clue how that would
work for, say, English and Chinese.
But it also introduces inconsistencies. Take stemming. Even though you
could easily st
That looks like a good solution. I'm pretty new with Solr, so I'm not sure
how I should implement it. I looked at the documentation and I *think* I
need to modify the search requestHandler in the solrconfig.xml file, is this
correct? If I define it like this:
explicit
10
edism
Hi Erick,
This is correct. An additional benefit to configuring the SSD as cache vs
primary storage is that you don't have to change anything to your existing
indexes (the cache will just give a performance boost).
In addition to configuring the system to utilize SSDs as the location where
Hello!
I think it should work with SolrJ; it shouldn't be a problem.
You don't have to modify the handler, you can specify those
parameters at query time. But if you won't change them and they
will be constant, you can modify the solrconfig.xml file.
And you don't have to remove the defaultSearc
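If you do decide to bake the parameters into the config, the handler defaults might look like this; the field names are copied from the earlier message, the rest is a sketch:

```xml
<!-- Illustrative handler defaults; defType selects the edismax parser. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="defType">edismax</str>
    <str name="qf">name keywords description username</str>
  </lst>
</requestHandler>
```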
I think you misunderstood what I am suggesting.
I am suggesting an analyzer that detects the language and then "does the
right thing" according to the language it finds. As such, it would
tokenize and stem English according to English rules, German by German
rules and would probably do a sliding
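The dispatch step Ted describes can be caricatured in a few lines of Java; the script check below is a toy stand-in for a real language identifier, and the chain names are invented:

```java
// Toy sketch of language-based dispatch: detect the script of the input,
// then name the analysis chain that should handle it. A real implementation
// would wrap actual Lucene analyzers and a proper language-ID library.
public class LangDispatch {
    static String pickChain(String text) {
        for (int i = 0; i < text.length(); i++) {
            // Crude check: any CJK ideograph routes the text to the "chinese" chain.
            if (Character.UnicodeBlock.of(text.charAt(i))
                    == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS) {
                return "chinese";
            }
        }
        return "english"; // default chain
    }

    public static void main(String[] args) {
        System.out.println(pickChain("stemming rules"));
        System.out.println(pickChain("中文文本"));
    }
}
```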
Thanks a lot for your help. I'll try it out. I didn't see a defType on the
SolrQuery object, which is why I thought it had to be set in the config. Or is
queryType the same as defType?
--
View this message in context:
http://lucene.472066.n3.nabble.com/EdgeNGramTokenizer-not-working-tp3675926p36760
The tutorial works with Solr-3.4.0!
Should the tutorial be updated with newer versions?
Remi
On Friday, January 20, 2012, remi tassing wrote:
> So I erased my Solr folder and started from scratch.
> From the example folder I ran "java -jar start.jar" but there was a
solrconfig.xml missing. I copied th
Dear all,
I have a question about sorting retrieved data from Solr. As far as I know, Lucene
retrieves data according to the degree of keyword matching on a text field
(partial matching).
If I search data by string field (complete matching), how does Lucene sort
the retrieved data?
If I add some filters,
I thought of a way you could do this with one query, if using edismax. If you
use "spellcheck.q" and insert "AND" between each keyword you'll make all the
terms required regardless of the "mm" parameter. I quickly tried this out and
it seems to work if you use "AND" but not if you prefix all t
Solr reports term occurrences over all the documents. I am
having trouble writing a query that returns the term occurrence within a
specific page field called documentPageId.
I don't know how to issue a proper Solr query that returns a word count for
a paragraph of text such as the term "
On Jan 20, 2012, at 13:23 , remi tassing wrote:
> The tutorial works with Solr-3.4.0!
It works for 3.5 too... via Jetty as prescribed by the tutorial. No?
> Should the tutorial be updated with newer versions?
Have you tried the instructions here?
http://www.lucidimagination.com/search/doc
Erik,
I've already backported SOLR-2718 - is that what you were referring to when you
said you would fix 3.6?
Steve
> -Original Message-
> From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
> Sent: Friday, January 20, 2012 4:23 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Just ca
Steve - sorry... yeah, that one. I missed your backport as of yesterday. I'll
give it a whirl, but I'm confident all is well. Thanks!
Erik
On Jan 20, 2012, at 16:34 , Steven A Rowe wrote:
> Erik,
>
> I've already backported SOLR-2718 - is that what you were referring to when
> you
Yes, that works! I had to boost the firstDestination field to have it sorted
correctly. Any ideas why the score might be equal for all the documents
returned?
Thanks a lot!
Federico
--
View this message in context:
http://lucene.472066.n3.nabble.com/Question-about-sorting-by-a-field-tp3673491p36768
Is there a way to parameterize the JDBC URL in the data import handler? I
tried this, but it did not insert the value of the property. I'm running Solr
3.3.0.
Hi All,
I am using HTTP/JSON to search my documents in Solr. The client provides
the query on which the search is based.
What is a good way to validate the query string provided by the user?
On the other hand, if I want the user to build this query using some Solr API
instead of preparing a
On 1/20/2012 3:48 PM, Walter Underwood wrote:
Is there a way to parameterize the JDBC URL in the data import handler? I
tried this, but it did not insert the value of the property. I'm running Solr
3.3.0.
Here's what I've got in mine. I pass in dbHost and dbSchema parameters
(along wit
On Jan 20, 2012, at 3:34 PM, Shawn Heisey wrote:
> On 1/20/2012 3:48 PM, Walter Underwood wrote:
>> Is there a way to parameterize the JDBC URL in the data import handler? I
>> tried this, but it did not insert the value of the property. I'm running
>> Solr 3.3.0.
>>
>> >url="jdb
Weird. I can make it work with a request parameter and
$dataimporter.request.dbhost:
http://localhost:8983/solr/textbooks/dataimport?command=full-import&dbhost=mydbhost
Or I can make it work with a Java system property with no dots.
But when I use a Java system property with internal dots, it d
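For reference, the request-parameter form that does work looks like this in data-config.xml; the driver, credentials, and parameter names here are illustrative:

```xml
<!-- Sketch: ${dataimporter.request.*} pulls values from the request URL,
     e.g. .../dataimport?command=full-import&dbhost=mydbhost -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://${dataimporter.request.dbhost}/mydb"
              user="solr" password="secret"/>
</dataConfig>
```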
Has anyone had experience using frange with multi-valued fields? In Solr 3.5
doing so results in the error: "can not use FieldCache on multivalued field".
Here's the use case. We have multiple years attached to each document and want
to be able to refine by a year range. We're currently usin
Another benefit of a separate field per language is that the TF/IDF stats come
out correct for each individual language.
Also, if you KNOW the query language, you can target THAT field alone; if
you don't know, you can throw the query at multiple fields, which will each get
proper analysis (at the risk o
The TF-IDF argument is a reasonable one.
On Fri, Jan 20, 2012 at 5:33 PM, Jan Høydahl wrote:
> Another benefit with separate field per lang is that TF/IDF stats gets
> correct for each individual language.
> Also if you KNOW the query language, you can target THAT field alone, but
> if you don't
Dimitry,
I did not find the field "boolean allowLeadingWildcard" in the
org.apache.lucene.queryParser.QueryParser class file or anywhere else in the
source code.
But setAllowLeadingWildcard() is set to true in the
org.apache.solr.search.SolrQueryParser class file, as shown below:
public
On 1/20/2012 5:01 PM, Walter Underwood wrote:
Weird. I can make it work with a request parameter and
$dataimporter.request.dbhost:
http://localhost:8983/solr/textbooks/dataimport?command=full-import&dbhost=mydbhost
Or I can make it work with a Java system property with no dots.
But when I use