Hi Li,
Yes, you can issue a delete-all by:
curl http://your_solr_server:your_solr_port/solr/update -H
"Content-Type: text/xml" --data-binary
'<delete><query>*:*</query></delete>'
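If the documents are still visible after that, you'll also need to send a
commit to the same endpoint:
curl http://your_solr_server:your_solr_port/solr/update -H
"Content-Type: text/xml" --data-binary '<commit/>'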
Hope it helps.
Cheers,
Daniel
-Original Message-
From: Li Li [mailto:fancye...@gmail.com]
Sent: 28 June 2010 03:41
To: solr-user@lucene.apache.org
Hi
I reported this issue a long time ago, and if I remember correctly someone
told me this issue no longer happens from 1.3 onwards. But as the Jira
issue hasn't been commented on or changed state, I'm writing to confirm.
Regards,
Daniel
Hi John,
Have you considered buying an existing commercial product that delivers
what you want (searching over log files / maybe monitoring)? It may be
cheaper than developing it... http://www.splunk.com/product
Just a disclaimer: I'm not related to the company or product, so if you
need any information
I've changed my JVM startup params and it's working extremely stable
since then:
-Xmx2048m -Xms2048m -XX:MinHeapFreeRatio=50 -XX:NewSize=1024m
-XX:NewRatio=2 -Dsun.rmi.dgc.client.gcInterval=360
-Dsun.rmi.dgc.server.gcInterval=360
I hope it helps.
Regards,
Daniel Alheiros
-Original Message-
From: Fuad Efendi [mailto:[EMAIL PROTECTED]
>
> I've changed my JVM startup params and it's working extremely stable
> since then:
>
> -Xmx2048m -Xms2048m -XX:MinHeapFreeRatio=50 -XX:NewSize=1024m
> -XX:NewRatio=2 -Dsun.rmi.dgc.client.gcInterval=360
> -Dsun.rmi.dgc.server.gcInterval=360
>
>
Hi Sujatha.
I've developed a search system for 6 different languages and, as it was
implemented on Solr 1.2, all those languages are part of the same index,
using different fields for each one so I can have different analyzers per
language.
Like:
content_chinese
content_english
content_russian
content_
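As a rough sketch of that layout in schema.xml (field and type names are
illustrative; text_en and text_ru stand for types with language-specific
analyzers):
<field name="content_english" type="text_en" indexed="true" stored="true"/>
<field name="content_russian" type="text_ru" indexed="true" stored="true"/>
<field name="content_all" type="text" indexed="true" stored="false"/>
<copyField source="content_english" dest="content_all"/>
<copyField source="content_russian" dest="content_all"/>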
Hi Leonardo,
I've been using the synonym filter at index time (expand = true) and it works
just fine. I also use OR as the default operator. Once you do it at index time
there is no point doing it at query time (which in fact is likely to be the
cause of your problems).
Have a look at the Wiki
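For reference, the index-time analyzer would look roughly like this (file
and attribute values illustrative):
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>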
Hi Walter,
Has it always been there? Which version of Lucene are we talking about?
Regards,
Daniel
-Original Message-
From: Walter Underwood [mailto:wunderw...@netflix.com]
Sent: 16 July 2009 15:04
To: solr-user@lucene.apache.org
Subject: Re: Word frequency count in the index
Lucene us
Hi
Are you ever going to search for earlier revisions, or only the latest?
If your use cases only need the latest, just replace earlier revisions
with the latest in your index.
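Assuming your schema declares a uniqueKey (say "id"), posting a document
with the same key overwrites the earlier revision, roughly:
<add>
  <doc>
    <field name="id">doc-42</field>
    <!-- same id as the previous revision, so this replaces it -->
    <field name="content">latest revision text</field>
  </doc>
</add>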
Regards,
Daniel
-Original Message-
From: Reza Safari [mailto:r.saf...@lukkien.com]
Sent: 15 July 2009 12
Come on, it's time to cut this release, folks! I've been waiting for it
since it was forecast for early summer. :)
Cheers
-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: 15 July 2009 02:18
To: solr-user@lucene.apache.org
Subject: Re: Solr 1.4 Release
for "season" and "2".
In my control theory course, my professor told me to only use
proportional control when on/off didn't work. Well, stop words don't
work and idf does.
For a longer list of movie titles entirely made of stop words, go here:
http://wunderwood.org/mo
I think in this case you can use a "bq" (Boost Query), so you can apply this
boost to the range you want:
your_date_field:[NOW/DAY-24HOURS TO NOW]^10.0
This example will boost documents with a date within the last 24 hours.
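In solrconfig.xml that can sit in the dismax handler's defaults, something
like (handler and field names illustrative):
<requestHandler name="dismax" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="bq">your_date_field:[NOW/DAY-24HOURS TO NOW]^10.0</str>
  </lst>
</requestHandler>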
Regards,
Daniel
On 19/7/07 14:45, "climbingrose" <[EMAIL PROTECTED]> wrote
Sorry just correcting myself:
your_date_field:[NOW-24HOURS TO NOW]^10.0
Regards,
Daniel
On 19/7/07 15:25, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote:
> I think in this case you can use a "bq" (Boost Query) so you can apply this
> boost to the range you wan
I'm using both the bq and the function, so the function gradually boosts
fresher documents and the bq acts as an extra boost for the most recent ones.
Good to know that it's worth avoiding such precision; in fact I'm rounding
my times and avoiding NOW, so it's fine for me.
Regards,
Daniel
Hi
I've started using highlighting and there is something that I consider a bit
odd... I'm sure it's caused by the way I'm indexing or querying, but just to
avoid doing a huge number of tests...
I'm querying for "butter" and only exact matches of butter are returned
highlighted; when I chan
I'm using the PorterStemmerFilterFactory when indexing but not when
querying.
Regards,
Daniel
On 31/7/07 18:50, "Mike Klaas" <[EMAIL PROTECTED]> wrote:
>
> On 31-Jul-07, at 9:41 AM, Daniel Alheiros wrote:
>
>> Hi
>>
>> I've started using highlighting and there is something t
face so it will
have to query for its specific fields (and I will need to make those
stored=true)?
Thanks again,
Daniel
On 1/8/07 10:43, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote:
> Hi Mike.
>
> Thanks for your reply, but seems that I haven't expressed myself
Thanks Yonik.
Noted and fixed. I'll take extra care with these scenarios.
Regards,
Daniel
On 1/8/07 20:08, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> On 8/1/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote:
>> I'm using the PorterStemmerFilterFactory
Hi
I'm using the released version 1.2.
I was using HTMLStripWhitespaceTokenizerFactory to remove some rubbish from
my index (HTML tags that are not relevant in my structure), but it is making
the highlighting fail in some conditions. It seems to me that it's not
keeping track of the proper position
Hi Yonik.
Do you have any performance statistics about those changes?
Is it possible to upgrade to this new Lucene version using the Solr 1.2
stable version?
Regards,
Daniel
On 17/9/07 17:37, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> If you want to see what performance will be like on the ne
Hi
I'm having problems trying to set up my scheduled tasks. Sorry if it's
something Linux-related, as I'm not a Linux expert...
I created a scripts.conf file (for my slave server) containing:
user=solr
solr_hostname=10.133.132.159
solr_port=8080
rsyncd_port=20280
data_dir=/var/solr2-v1.2.0/home/d
Sorry, I forgot to say that I'm running Solr 1.2.
On 21/9/07 13:02, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote:
> Hi
>
> I'm having problems trying to set up my scheduled tasks. Sorry if it's
> something Linux-related, as I'm not a Linux expert...
I've changed the permissions to 775 (so any user in the
root group should be able to do anything on any files).
Any other suggestion?
Regards,
Daniel
On 21/9/07 13:12, "Thorsten Scherler"
<[EMAIL PROTECTED]> wrote:
> On Fri, 2007-09-21 at 13:02 +0100, Daniel Alheiros wrote:
>> Hi
>>
&
Should this kind of information be present in the SOLR documentation? I'm
going to write it up in my installation procedures, so I can contribute it
back to the SOLR wiki if you think it's appropriate.
Regards,
Daniel
On 21/9/07 14:31, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote:
> Hi Tho
and
it is very flexible and appropriate for each language.
I've also created, for management simplicity, a dismax handler that allows
me to query all documents no matter which language they are in. It may be
useful for you too.
Regards,
Daniel Alheiros
On 29/9/07 03:29, "Lance Norskog"
Hi
I'm about to deploy SOLR in a production environment and so far I'm a bit
concerned about availability.
I have a system that is responsible for fetching data from a database and
then pushing it to SOLR using its XML/HTTP interface.
So I'm going to deploy N instances of my application so it's
More importantly, will it run the
postOptimize/postCommit scripts, generating snapshots and then possibly
propagating the bad index?
Thanks again,
Daniel
On 8/10/07 16:12, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> On 10/8/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote:
>>
ting it?
Thanks again,
Daniel
On 8/10/07 17:30, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> On 10/8/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote:
>> Well I believe I can live with some staleness at certain moments, but it's
>> not good as users are s
OK, I'll define it as a procedure in my disaster recovery plan.
That would be great. I'm looking forward to it.
Thanks,
Daniel
On 8/10/07 18:07, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> On 10/8/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote:
>> Hmm,
Hi Hoss,
Yes I know that, but I want to have a proper dummy backup (something that
could be kept in a very controlled environment). I thought about using this
approach (a slave just for this purpose), but if I'm using it just as a
backup node there is no reason not to use a proper backup structure
If you want more stopword sources, there is this one too:
http://snowball.tartarus.org/algorithms/
And I would go for language identification and then apply the proper set.
Cheers,
Daniel
On 18/10/07 16:18, "Maria Mosolova" <[EMAIL PROTECTED]> wrote:
> Thanks a lot Peter!
> Mar
Hi
I experienced a very unpleasant problem recently, when my search indexing
adaptor was changed to add some new fields. The problem is my schema didn't
follow those changes (the new fields weren't added to it), and after that
SOLR was silently ignoring all documents I sent.
Neither the SOLR Java client nor the SOLR server
g on the
server side...
Regards,
Daniel
On 28/11/07 15:40, "Erik Hatcher" <[EMAIL PROTECTED]> wrote:
>
> On Nov 28, 2007, at 8:41 AM, Daniel Alheiros wrote:
>> I experienced a very unpleasant problem recently, when my search
>> indexing
>> adaptor was change
Hi Hoss.
Well, I'll enable the ignore option for fields that aren't declared in my
schema. Thanks.
Exactly, you can try it really easily: just remove one of your fields from
the example schema config and try to add content using the Java client API...
Well, I'm using SOLRJ and it returns no error code
Hi Hoss.
I'm using Solr 1.2 and a SolrJ client built from the trunk some time ago
(21st of June 2007).
When a document is indexed I can see that INFO message in my logs showing
exactly what you said, but nothing is logged in the situation I've
described initially.
I'm using this logging conf:
Regards,
Daniel Alheiros
On 16/1/08 15:23, "Evgeniy Strokin" <[EMAIL PROTECTED]> wrote:
> Hello,..
> I have relatively large RAM (10Gb) on my server which is running Solr. I
> increased Cache settings and start to see OutOfMemory exceptions, specially on
> facet s
Hi,
I'm just starting to use Solr and so far it has been a very interesting
learning process. I wasn't a Lucene user, so I'm learning a lot about both.
My problem is:
I have to index and search content in several languages.
My scenario is a bit different from others that I've already read in th
uage search seems to be with
> n-gram indexing. That usually produces a larger index and somewhat
> slower performance (because of the number of terms), but at least
> it works.
>
> wunder
>
> On 6/7/07 10:47 AM, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote:
>
>
Hi
For my search use, the document freshness is a relevant aspect that should
be considered to boost results.
I have a field in my index like this:
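(roughly like this, with an illustrative name and type:)
<field name="publish_date" type="date" indexed="true" stored="true"/>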
How can I make good use of this to boost my results?
I'm using the DisMaxRequestHandler to boost other textual fields based on
the query, but i
> You could either deploy that solution through multiple web-apps (one per
> lang) (or try the patch for issue Solr-215).
> Regards,
> Henri
>
>
> Daniel Alheiros wrote:
>>
>> Hi,
>>
>> I'm just starting to use Solr and so far, it has been a very int
This sounds OK.
I can create a field-name mapping structure to change the requests and
responses so that my client doesn't need to be aware of the different
fields.
Thanks for these directions,
Daniel
On 8/6/07 21:32, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
>
> : Can't I have the same index, u
Hi Henri,
Thanks again; your considerations will surely help my decision.
Now I'll do my homework to check document volume and growth, expected index
sizes and query load.
Regards,
Daniel Alheiros
On 9/6/07 10:53, "Henrib" <[EMAIL PROTECTED]> wrote:
>
> Hi Daniel,
> recip(rord(created),1,1000,1000)
>
>
> Obviously you will need to modify the values a bit, more info here:
> http://wiki.apache.org/solr/FunctionQuery
>
> -Nick
>
> On 6/9/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote:
>> Hi
>>
>> For my search use, the
Hi
And about the fields: if they are or aren't going to be present in the
responses based on the user group, you can handle it in many different ways
(using an XML transformation to remove the undesirable fields, implementing
your own RequestHandler able to process your group information, filtering
the data
Hi Yonik.
About how to handle the index at query time:
I think that if the user doesn't specify a language, you can return any
document matching the term, regardless of language (if that's possible),
or, if it suits your solution, you can define a default
language to be used
Hi Hoss
One bad thing about having language-specific fields (in my point of view)
is that you will have to re-index your content when you add a new language
(some systems will need to start with one language and have others added in
future). But OK, let's say the indexing is done.
So using dyn
Which version of Weblogic are you trying?
Some old versions have a wrong javax.servlet.Filter interface definition...
Regards,
Daniel
On 13/6/07 15:59, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> Hi guys,
>
> I've tried deploying Solr on Weblogic and am gettting the following error
> in m
Sorry, probably it's not the case then...
On 13/6/07 16:07, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> sorry - I'm using weblogic 9.2
>
>
>
>
>
>
>
>
>
>
> Daniel Alheiros <[EMAIL PROTECTED]>
> 13/06/2007
Hi
I've been using a Java client I got from a colleague but I don't know
exactly its version or where to get any updates for it. The base package is
org.apache.solr.client (where there are some common packages) and the
client's main package is org.apache.solr.client.solrj.
Is it available via Maven2 c
Thanks Martin.
I'm using one of them, in which the optimize command doesn't work properly.
Have you seen the same problem?
Regards,
Daniel
On 14/6/07 13:07, "Martin Grotzke" <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-06-14 at 11:32 +0100, Daniel Alheiros wrote:
>
tures for both searching and indexing and will be moving
>>> into the main distribution soon as the standard java client library.
>>>
>>> - will
>>>
>>>
>>>
>>>
>>>
>>> -Original Message-
>>> Fr
at client, but I didn't get any good results while
>>>>>> searching
>>>>>> for worst with special characters. I have also searched for
>>>>> documentation
>>>>>> for that client, but didn't find any.
>>>>>>
Hi Hoss.
Yes, the idea is to index each document independently (in my scenario they
are not translations, they are just documents with the same structure but in
different languages). So the considerations you made about queries over a
range wouldn't be a problem in this case. The real issue I can see i
Hi Hoss
Thanks again for your attention.
Looks like after your last instructions I thought the same way as you :)
What I did yesterday:
1. Created the schema with the fields with language variations (created as
concrete fields anyway because in this case, using dynamic fields wouldn't
be better for
You can define a boost function:
recip(rord(numberField),1,1000,1000)
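Passed as a dismax parameter it would look roughly like this (host and
field name illustrative):
http://localhost:8983/solr/select?qt=dismax&q=foo&bf=recip(rord(numberField),1,1000,1000)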
I hope it helps.
Regards,
Daniel Alheiros
On 20/6/07 16:47, "David Xiao" <[EMAIL PROTECTED]> wrote:
> Hello folks,
>
>
>
> I am using solr to index web contents. I want to know is that possible to
Hi Hoss.
I tried that yesterday using the same approach you just described (I
created the base fields for each language with basic analyzers) and it
worked alright.
Thanks again for your time.
Regards,
Daniel
On 20/6/07 21:00, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
>
> : So far it sound
"\" is correct and is available in the classpath.",
e);}//return new CJKTokenizer(input);} }
Regards,
Daniel Alheiros
On 19/6/07 18:57, "Mike Klaas" <[EMAIL PROTECTED]> wrote:
>
> On 18-Jun-07, at 10:28 PM, Toru Matsuzawa wrote:
>
&
Hi
I'm now considering how to improve query results for a set of languages and
would like to hear considerations based on your experience with that.
I'm using the HTMLStripWhitespaceTokenizerFactory tokenizer with the
WordDelimiterFilterFactory, LowerCaseFilterFactory and
RemoveDuplicatesTokenFilterFactory.
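As a fieldType sketch, that chain is (type name and attribute values
illustrative):
<fieldType name="text_html" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>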
Hi Hoss.
I've done a few tests using reflection to instantiate a simple object, and
the results vary a lot depending on the JVM. As the JVM optimizes code as it
is executed, results will vary with usage, but I think we have something to
consider:
I've done 1,000 samples (5 clean X loop of
Sorry, I've confused things a bit... The thread-safety question applies
only to the Tokenizers, not to the factories. So are the Tokenizers
thread-safe?
Regards,
Daniel
On 22/6/07 11:36, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote:
> Hi Hoss.
>
> I'v
Or
I think this way, the config terms are a bit clearer... What do you think?
Regards,
Daniel
On 22/6/07 20:45, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
>
> : What would be the best way to not hide their use?
> :
> :
>
> How about just...
>
>
>
>
>
> -Hoss
>
Hi
I've configured my Solr instance using autocommit in the following way:
1000
6
But it's only considering the maxTime now. I've used the maxDocs before and
it worked, but after I defined both, only the maxTime is being considered.
Is it a known bug?
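For reference, the autoCommit block in solrconfig.xml has this shape (the
values here are illustrative, not the ones above; maxTime is in
milliseconds):
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>1000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>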
Thanks Mike.
Regards,
Daniel Alheiros
On 25/6/07 20:15, "Mike Klaas" <[EMAIL PROTECTED]> wrote:
> On 25-Jun-07, at 8:02 AM, Daniel Alheiros wrote:
>> I've configured my Solr instance using autocommit in the following
>> way:
>>
>>
>&
Hi Hoss.
Yes, it's the tricky part when restructuring configs...
One possible solution is, when you create a new schema, to offer a
conversion tool... Another is to define a "version" on the config; depending
on the version, the expected structure will be different.
I'm sure you know this all
OK Hoss.
I agree with you.
Regards,
Daniel
On 26/6/07 19:14, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
>
> : conversion tool... Other is to define a "version" on the config and
> : depending on the version, the expected structure will be different.
>
> FYI: schema.xml does have this ... i
Hi
I'm in trouble now about how to issue queries against Solr with Russian
content in my "q" parameter (this applies to Chinese and Arabic as well).
The problem is I can't send any Russian special characters in URLs because
they don't fit in the ASCII range, so I'm doing a POST to accomplish that.
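A sketch of such a POST with curl (host and field illustrative);
--data-urlencode percent-encodes the UTF-8 bytes and sends the request as a
POST:
curl 'http://localhost:8080/solr/select' --data-urlencode 'q=content_russian:списки'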
Thanks a lot!
Now it is working. It was the Tomcat connector setup.
Regards,
Daniel
On 28.06.2007 17:19, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
>
> : You can also ensure the browser sends an utf8 encoded post by
> : : It works even if the page the form is in is not an UTF-8 page.
>
some large Internet shops to the crawler, from Russia...
>
> Quoting Daniel Alheiros:
>
>> Hi
>>
>> I'm in trouble now about how to issue queries against Solr using in my "q"
>> parameter content in Russian (it applies to Chinese and Arabic as well).
>
Hi
I'm developing a search application using SOLR/Lucene and I think I found a
bug.
I was trying to index more documents and the total document number wasn't
changing, but for each document batch I sent to update the index, the
numbers shown by the console in the update handler section were
Hi Ard
After I removed it manually it worked correctly, and I've restarted a few
times since the "lost lock" was there... Isn't that lock removal on
start-up optional?
Regards,
Daniel
On 9/7/07 13:50, "Ard Schrijvers" <[EMAIL PROTECTED]> wrote:
> Hello Daniel,
>
> it sounds strange to me because
Hi Andrew.
I'm using the RussianAnalyzer (part of the Lucene analyzers) and it reduces
списки to списк.
Do you want to try this other Analyzer?
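If you want to try it, the schema.xml hook is roughly (type name
illustrative):
<fieldType name="text_ru" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.ru.RussianAnalyzer"/>
</fieldType>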
Regards,
Daniel
On 9/7/07 16:06, "Andrew Stromnov" <[EMAIL PROTECTED]> wrote:
> списки arrondissement turvallisuuden
t in Solr config?
>
> Thank you.
>
>
> Daniel Alheiros wrote:
>>
>> Hi Andrew.
>>
>> I'm using the RussianAnalyzer (part of the Lucene analyzers) and it
>> reduces
>> списки to списк.
>>
>> Do you want to try this other Ana
Hi Hoss
Yes, no error, and that strange behaviour in the numbers shown by the admin
console. I'll try and see how to make my SOLR logging better, because so far
it's not that good.
Regards,
Daniel
On 9/7/07 19:16, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
>
> : After I removed manually it wo
> standard filter factory
> word delimiter filter factory
> lowercase filter factory
> stop filter factory (with hardcoded stopwords)
> russian stem filter
>
>
> Regards,
> Andrew
>
>
> Daniel Alheiros wrote:
>>
>> Hi Andrew
>>
>>
Hi Thierry.
I'm not sure this is the best approach. What I've adopted, and so far it's
working really well, is to have one field per language (like text_french and
text_dutch); in your schema you declare both, plus one that just receives a
copy of them.
Your index/query analysis has to be compatible
Hi Matthew.
It's probably caused by the way you are processing this field, as you have
defined it as "text", which has a whitespace tokenizer and a set of filters
attached to it. You could create a new field type or just use a numeric type
(like sfloat) for that.
Anyway you can always see how your
/7/07 11:34, "Andrew Stromnov" <[EMAIL PROTECTED]> wrote:
>
> Hi Daniel
>
> How to implement custom Russian factory with various Tokenizers and Filters?
>
> Can you provide some code examples?
>
> Regards,
> Andrew
>
>
> Daniel Alheiros wro