Hi :)
Thank you all for your answers. I'll try these solutions :)
Kind regards,
Gary
On 27/04/2012 16:31, G.Long wrote:
Hi there :)
I'm looking for a way to save XML files into some sort of database and
I'm wondering if Solr would fit my needs.
The XML files I want to save have a lot of
You need to add more memory to the JVM that is running Solr:
http://wiki.apache.org/solr/SolrPerformanceFactors#OutOfMemoryErrors
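For example (the 2g value is only illustrative; the right size depends on your
index and what you facet and sort on), the example Jetty instance can be started
with a larger heap like this:

  java -Xmx2g -jar start.jar

If Solr runs inside a servlet container instead, the same -Xmx setting goes into
that container's JVM options.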
Dan
On Mon, Apr 30, 2012 at 9:43 AM, Yuval Dotan wrote:
> Hi Guys
> I have a problem and I need your assistance
> I get an exception when doing field cache faceting
Hello,
I am using Solr 1.3 with JDK 1.5.0_14 and WebLogic 10MP1 application server
on Solaris. I use the embedded Solr server. More details:
Number of docs in solr index : 1.4 million
Physical size of index : 640MB
Total number of fields in the index : 700 (99% of these are dynamic fields)
Total numbe
Hi Jan,
thanks for your response!
My "qf" parameter for edismax is "title", and my
"defaultSearchField" is "text" in schema.xml.
In my app I generate a query with "qf=title,text", so I think the
default parameters in config/schema should be overridden, right?
I eventually found 2 reasons for this behavi
Thanks, kuli, for your response. We tried to implement it as per the
instructions. But the problem again is how to create an index for every thirty
customers separately. Is there any programmatic way to do this, or do we need
to create a query in the configuration file?
Thanks
Prabakarab.P
It seems to be working fine.
But I have a few questions about indexing:
1) I want to index each customer as well as each partner.
2) How do I create an index for each partner (30 customers)?
As of now I am indexing all customers using data-config.xml.
I will be out of the office starting 30/04/2012 and will not return until
01/05/2012.
Please email to itsta...@actionimages.com for any urgent issues.
Action Images is a division of Reuters Limited and your data will therefore be
protected
in accordance with the Reuters Group Privacy / Data P
Thanks for the fast answer
One more question:
Is there a way to know (some formula) how much memory I need for
these actions?
Thanks
Yuval
On Mon, Apr 30, 2012 at 11:50, Dan Tuffery wrote:
> You need to add more memory to the JVM that is running Solr:
>
> http://wiki.apache.org/solr/
I continue to receive posts from the solr group even after submitting an
unsubscribe per the instructions from the ezmlm app. Is there perhaps a delay
after I confirm the unsubscribe request? 14 posts received so far today. At
this point I have a delete rule to auto trash any received but unnece
BTW,
The first request to unsubscribe was sent in February if that helps track
this down
Thx
From: Kevin Bootz
Sent: Friday, February 24, 2012 7:55 AM
To:
'solr-user-uc.1330079879.acnmkgjcnnlfgdhmmlkn-kbootz=caci@lucene.apache.org'
Subject: unsubscribe
There's a Lucene/Solr memory size estimator spreadsheet in the SVN:
http://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls
Dan
On Mon, Apr 30, 2012 at 11:39 AM, Yuval Dotan wrote:
> Thanks for the fast answer
> One more question:
> Is there a way to know (som
I tested it.
With default "qf=title text" in solrconfig and "mm=100%"
I get the same result (1) for "nascar AND author:serg*" and "+nascar
+author:serg*", great.
With "nascar +author:serg*" I get 3500 matches; in this case the
mm parameter does not seem to work.
Here are my debug params for "nascar AND
In the 3.6 world, LukeRequestHandler does some...er...really expensive
things when you click into the admin/schema browser. This is _much_
better in trunk BTW.
So, as Yonik says, LukeRequestHandler probably accounts for
one of the threads.
Does this occur when nobody is playing around with the ad
Your idea of using a Transformer will work just fine; you have a lot more
flexibility in a custom Transformer, see:
http://wiki.apache.org/solr/DIHCustomTransformer
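A bare-bones sketch of such a transformer (the class, column and field names
below are made up for illustration, not taken from your config):

  import java.util.Map;
  import org.apache.solr.handler.dataimport.Context;
  import org.apache.solr.handler.dataimport.Transformer;

  // referenced from data-config.xml via transformer="com.example.MyTransformer"
  public class MyTransformer extends Transformer {
    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
      Object raw = row.get("someColumn");            // value produced by the entity
      if (raw != null) {
        row.put("someField", raw.toString().trim()); // add/overwrite the derived value
      }
      return row;                                    // hand the row on for indexing
    }
  }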
You could also write a custom update handler that examined the
document on the server side and implemented your logic, or even
just a
I'd get to the root of why indexes are corrupt! This should
be very unusual. If you're seeing this at all frequently,
it indicates something is very wrong and starting bunches
of JVMs up is a band-aid over a much more serious
problem.
Are you, by chance, doing a kill -9? or other hard-abort?
Best
Consider writing a custom sort method or a custom function
that you use for sorting. Be _very_ careful that anything you
do here is very efficient, it'll be called a _lot_.
Best
Erick
On Mon, Apr 30, 2012 at 2:10 AM, solr user wrote:
> Hi,
>
> Any suggestions,
>
> Am I trying to do too much with
Try attaching &debugQuery=on to your query and seeing if that helps
you understand what's going on. If that doesn't help, also look at
admin/analysis. If all that doesn't help, post your schema definition
for the field type and the results of &debugQuery=on (you might
look at: http://wiki.apache.or
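For example (the core URL and field name are just placeholders):

  http://localhost:8983/solr/select?q=title:blackberry&debugQuery=on&wt=xml

The parsedquery and explain sections of the debug output show how the query was
parsed and how each returned document was scored.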
Hello,
Over the weekend I experimented with extracting HTML content via cURL and
was just wondering why the extraction/indexing process does not include the HTML
tags.
It seems as though the HTML tags are either being ignored or stripped somewhere
in the pipeline.
If this is the case, is it possible to in
When WDF filters blackberry9810 it will treat it as a sequence of tokens but
as if it were a phrase, like "blackberry 9810", with the two terms adjacent,
at least with the edismax query parser. I'm not sure what the other query
parsers do.
If you are using edismax, you can set the QS (query sl
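As a sketch (the field names here are only illustrative, and qs works as
described above), the request would look something like:

  http://localhost:8983/solr/select?defType=edismax&q=blackberry9810&qf=sku&qs=1&debugQuery=on

The slop lets the sub-terms WDF produces from blackberry9810 match even when
they are not strictly adjacent in the indexed text.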
Hi,
Tell us more about:
* what you facet on
* how many facet values are in each facet
* how much RAM you have
* 32 or 64 bit
* -Xmx you are using
* faceting method you are using
* ...
Otis
Performance Monitoring for Solr -
http://sematext.com/spm/solr-performance-monitoring
>
hi Erick,
autoGeneratePhraseQueries="false" is set for the field type, and it works fine
for the standard query parser.
The problem seems to be when I start using dismax. As you suggested, I checked
the analysis tool, and even after the word delimiter is applied I see the search
term as "blackberry 9801", so I don't think it s
See Jack's comments about phrases, all your parsed
queries are phrases, and your indexed terms aren't
next to each other.
Best
Erick
On Mon, Apr 30, 2012 at 10:54 AM, abhayd wrote:
> hi Erick,
> autoGeneratePhraseQueries="false" is set for field type. And it works fine
> for standard query parse
Hello all,
I'm facing a simple problem, yet it is impossible for me to resolve (I'm a
newbie in Solr).
I need to sort the results by score (it is simple, of course), but then
what I need is to take top 10 results, and re-order it (only those top 10
results) by a date field.
It's not the same as sort=
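One client-side way to get this (a minimal SolrJ 3.x sketch; the core URL, the
query string and the "date" field name are assumptions, not taken from your
schema) is to ask Solr for only the 10 best-scoring documents and re-order just
those hits yourself:

  import java.util.Collections;
  import java.util.Comparator;
  import java.util.Date;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrDocument;
  import org.apache.solr.common.SolrDocumentList;

  public class TopTenByDate {
    public static void main(String[] args) throws Exception {
      SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
      SolrQuery q = new SolrQuery("your query here"); // relevance (score) order by default
      q.setRows(10);                                  // keep only the 10 best-scoring docs
      SolrDocumentList top = server.query(q).getResults();
      // re-order just these 10 hits by the date field, newest first
      Collections.sort(top, new Comparator<SolrDocument>() {
        public int compare(SolrDocument a, SolrDocument b) {
          Date da = (Date) a.getFieldValue("date");
          Date db = (Date) b.getFieldValue("date");
          return db.compareTo(da);
        }
      });
      for (SolrDocument d : top) {
        System.out.println(d.getFieldValue("id") + "  " + d.getFieldValue("date"));
      }
    }
  }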
The &qs=1 request parameter should work for the dismax query parser as well
as edismax.
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent: Monday, April 30, 2012 10:58 AM
To: solr-user@lucene.apache.org
Subject: Re: solr.WordDelimiterFilterFactory query time
See Jack's c
hi jack & erick,
Thanks
I do have qs set to 10 in solrconfig for the dismax query handler settings.
It still does not work.
abhay
If by "extracting HTML content via cURL" you mean using SolrCell to parse
html files, this seems to make sense. The sequence is that regardless of the
file type, each file extraction "parser" will strip off all formatting and
produce a raw text stream. Office, PDF, and HTML files are all treated
hi jack,
tried &qs=10 but unfortunately it does not seem to help.
Not sure what else could be wrong
abhay
Just to be clear, I used the Solr example schema and indexed two test
documents, one with "Blackberry 9810" and one with "Blackberry torch 9810"
in the sku field (which uses field type text_en_splitting_tight which uses
WDF) and the following query returns both documents:
http://localhost:8983
Thanks Erick.
I'm not concerned about the logic; all I want to achieve is sometimes
storing/indexing a multi-valued field and sometimes not (same field with
same name) based on some logic. In a transformer I cannot change the
schema dynamically to do that, not that I know of at least.
So if I defin
I was copying the indexes from the webapp to the cores when this happened. It
could have been an error on my end, but I am just worried that an issue with
one core would reflect on the webapp.
Regards
Sujatha
On Mon, Apr 30, 2012 at 7:20 PM, Erick Erickson wrote:
> I'd get to the root of why indexes are cor
Hi,
I'm trying to find a Solr logo in a vector or some other format suitable for
print. I found Lucene logo
at http://svn.apache.org/repos/asf/lucene/site/publish/images/logo.eps , but
can't find one for Solr. Does anyone know where to find it?
At the bottom of http://wiki.apache.org/solr/P
OK, I took another look at what you were trying to
accomplish and I find the use-case kind of hard to
figure out, but that's my problem.
But it is true that there's really no good way to _change_ the
way the field is analyzed in Solr. Of course, since Solr is
built on Lucene, you could do a lot o
One idea was to wrap the field with CDATA. Or base64 encode it.
On Fri, Apr 27, 2012 at 7:50 PM, Bill Bell wrote:
> We are indexing a simple XML field from SQL Server into Solr as a stored
> field. We have noticed that the & is output as &amp; when using
> wt=XML. When using wt=JSON we get
Vazquez,
Sorry I don't have an answer but I'd love to know what you need this for :-)
I think the logic is going to have to bleed into your search app. In
short, use a copy field, and your app knows which one to search in.
lee c
On 30 April 2012 20:41, Erick Erickson wrote:
> OK, I took another look at w
Try this one:
http://www.lucidimagination.com/sites/default/files/image/solr_logo_rgb.png
Dan
On Mon, Apr 30, 2012 at 8:38 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> I'm trying to find a Solr logo in a vector or some other format suitable
> for print. I found Lucene lo
I am getting a post.jar failure when trying to post the following
CDATA field... It used to work on older versions. This is in Solr 3.6.
SP2514N
Samsung SpinPoint P120 SP2514N - hard drive - 250
GB - ATA-133
Samsung Electronics Co. Ltd.
electronics
hard drive
7200RPM, 8MB cache, IDE
Otis,
I think there was some JIRA ticket (Logo contest or something like that)
which might have all the logo proposals, including the winning one,
attached.
Regards,
Lukas
On Mon, Apr 30, 2012 at 9:38 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> I'm trying to find a Solr
Thanks wunder and Lance,
In the discussions I've seen of Japanese IR in the English language IR
literature, Hiragana is either removed or strings are segmented first by
character class. I'm interested in finding out more about why bigramming
across classes is desirable.
Based on my limited und
Is this possible in DataImportHandler?
I want the following XML to all collapse into one Author field:
<AuthorList>
  <Author>
    <LastName>Sørlie</LastName>
    <ForeName>T</ForeName>
    <Initials>T</Initials>
  </Author>
  <Author>
    <LastName>Perou</LastName>
    <ForeName>C M</ForeName>
    <Initials>CM</Initials>
  </Author>
  <Author>
    <LastName>Tibshirani</LastName>
    <ForeName>R</ForeName>
    <Initials>R</Initials>
  </Author>
  ...
</AuthorList>
So my XPATH is like
hi jack,
Thanks, I figured out the issue. It was the settings at query time and index time.
Sorry, I hit send too soon. Continued in the email below.
On 4/30/12 4:46 PM, "Twomey, David" wrote:
>
>Is this possible in DataImportHandler
>
>I want the following XML to all collapse into one multi-valued Author field
>
><AuthorList>
>  <Author>
>    <LastName>Sørlie</LastName>
>    <ForeName>T</ForeName>
>    <Initials>T</Initials>
>  </Author>
>  <Author>
>    <LastName>Perou</LastName>
>    <ForeName>C M</ForeName>
>    <Initials>CM</Initials>
>  </Author>
>  <Author>
>    <LastName>Tibshirani</LastName>
>    <ForeName>R</ForeName>
I've read some things in JIRA on the new functionality that was put into
caching in the DIH, but I wouldn't think it should break the old behavior. It
doesn't look as though any errors are being thrown, it's just ignoring the
caching part and opening a ton of connections. Also I cannot find any
Thanks Lukas. Yeah, I looked there, but as far as I can tell, all attachments
are PNGs/GIFs/JPGs :(
Otis
Performance Monitoring for Solr - http://sematext.com/spm/index.html
>
> From: Lukáš Vlček
>To: solr-user@lucene.apache.org; Otis Gospodnetic
>Sen
You'll see katakana used with kanji in noun compounds where one of the words is
foreign.
In Japanese, "Rice University" is not written with the kanji word for "rice".
They use katakana for "rice" and kanji for "university", like this: ライス大学.
This is very common. I expect that "President Obama"
Great. But could you tell us all what settings you had wrong and how you
changed them so that somebody else with the problem searching the email
archive will be able to see your solution? Thanks.
-- Jack Krupansky
-Original Message-
From: abhayd
Sent: Monday, April 30, 2012 4:51 PM
T
Hi,
Can I set the constructor parameter "margin" of SimpleFragListBuilder
from within solrconfig.xml?
I would suspect that something has to be added to this configuration
element in solrconfig.xml:
But what and how exactly?
(I'm using solr 3.5 at the moment)
Thanks,
Tobi
If I have 40 writers all feeding the same index, do they all have to commit, or
just one of them?
Am I going to kill performance if they're all issuing individual commits, or
would it be better to not have the individual writers commit at all and just
have one process that does nothing but comm
Great, thank you for the input. My understanding of HTMLStripCharFilter is
that it strips HTML tags, which is not what I want ~ is this correct? I
want to keep the HTML tags intact.
On Mon, Apr 30, 2012 at 11:55 AM, Jack Krupansky wrote:
> If by "extracting HTML content via cURL" you mean using
Adam,
This is where autocommit (see solrconfig.xml) comes in handy. Don't have them
all commit, no. :)
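For reference, a typical updateHandler section in solrconfig.xml looks something
like this (the thresholds are only examples; tune them to your load):

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>10000</maxDocs> <!-- commit after this many added docs -->
      <maxTime>60000</maxTime> <!-- ...or after this many milliseconds -->
    </autoCommit>
  </updateHandler>

With that in place the individual writers just add documents and never call
commit themselves.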
Otis
Performance Monitoring for Solr - http://sematext.com/spm/index.html
>
> From: Adam Fields
>To: solr-user@lucene.apache.org
>Sent: Monday, Apr
Answering my own question: I think I can do this by writing a script that
concatenates the LastName, ForeName and Initials and adds that to xpath =
/AuthorList/Author
Yes?
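If the surrounding tags really are <AuthorList>/<Author> with <LastName>,
<ForeName> and <Initials> children (which is what that xpath suggests), another
option that might work is a single flattened field in data-config.xml, e.g.

  <field column="author" xpath="/AuthorList/Author" flatten="true"/>

which, with XPathEntityProcessor, concatenates the text of all child elements of
each Author into one value.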
On 4/30/12 4:49 PM, "Twomey, David" wrote:
>Sorry hit send too soon. Continued the email below
>
>On 4/30/12 4:46 PM, "Twome
I was thinking that you wanted to index the actual text from the HTML page,
but have the stored field value still have the raw HTML with tags. If you
just want to store only the raw HTML, a simple string field is sufficient,
but then you can't easily do a text search on it.
Or, you can have tw
I have a multicore Solr setup with a lot of cores that contain a lot of data (~50M
documents) but are rarely used.
Can I load a core from configuration but keep it in a sleep mode, where
it has all the configuration available but hardly consumes resources,
and, based on a query or an update, it
Hello Christopher
I ran into the same problem. When I disabled dedupe in the update handler,
things worked fine. The problem is that when I enable dedupe I run into the
multivalued error. I'm also using SolrJ to add documents.
Were you able to resolve this?
If so, would you kindly post your solut
Thanks wunder,
I really appreciate the help.
Tom
http://svn.apache.org/viewvc?rev=1332444&view=rev
: At the bottom of http://wiki.apache.org/solr/PublicServers I found a
: link
:
to https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/site/src/documentation/content/xdocs/images/ ,
: but that leads to 404.
fixed.
-Hoss
Hi,
I see that you have already commented on SOLR-2649 "MM ignored in edismax
queries with operators". So let's continue the way towards resolution there...
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
On 30. apr. 2012, at 14:28
: Is there a tokenizer that tokenizes the string as one token?
Using KeywordTokenizer at query time should do what you want.
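A minimal field type sketch (the name is made up) would be:

  <fieldType name="as_is" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

KeywordTokenizerFactory emits the whole input as a single token; drop the
LowerCaseFilterFactory if matching must be case sensitive, or give separate
<analyzer type="index"> and <analyzer type="query"> sections if only the query
side should change.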
-Hoss