Hello everyone,
I have an index that contains text (several fields) that can be in English or
in Greek. I have found the corresponding filters
solr.GreekLowerCaseFilterFactory
solr.GreekStemFilterFactory
for the Greek language, along with the special type text_greek included in
the default schema.
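For reference, the stock text_greek type looks roughly like this (check your own schema.xml for the exact definition shipped with your version):

```xml
<fieldType name="text_greek" class="solr.TextField">
  <analyzer>
    <!-- Greek-aware lowercasing, then Greek stemming -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.GreekLowerCaseFilterFactory"/>
    <filter class="solr.GreekStemFilterFactory"/>
  </analyzer>
</fieldType>
```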
Thanks for the response. I have finally decided to build access intelligence
into Solr to pre-filter the results, by storing the required attributes in
the index to determine access.
Here is an example of doing this in DIH.
Say you have a field foobar that is a string type and has | between the
strings that you want to put into a multiValued list. This is fairly easy
to do with the Regex feature of DIH. But say you also want to take the field
and grab the lowest value and store i
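A sketch of the data-config.xml for the split case (the entity and field names here are made up):

```xml
<entity name="item" transformer="RegexTransformer"
        query="select id, foobar from item">
  <!-- split the pipe-delimited string into a multiValued field -->
  <field column="foobar_list" sourceColName="foobar" splitBy="\|"/>
</entity>
```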
Our tests showed, in our situation, the "compressed oops" flag caused our minor
(ParNew) generation time to decrease significantly. We're using a larger heap
(22gb) and our index size is somewhere in the 40's gb total. I guess with any
of these jvm parameters, it all depends on your situation
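For reference, the flags we ended up with look something like this (the heap size is ours; `-jar start.jar` assumes the Jetty example setup):

```
java -Xms22g -Xmx22g -XX:+UseCompressedOops -jar start.jar
```

Note that compressed oops only applies to heaps smaller than roughly 32GB, so it fits our 22GB heap.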
Will UseCompressedOops be useful? For an application using less than 4GB of
memory, it will be better than 64-bit references. But for an application using
more memory, it will not be cache friendly.
"JRockit: The Definitive Guide" says: "Naturally, 64 GB isn't a
theoretical limit but just an example. It was me
: Hi all, I am trying to use a custom search filter
: (org.apache.lucene.search.Filter) but I am unsure of where I should configure
: this.
:
: Would I have to create my own SearchHandler that would wrap this logic in? Any
: example/suggestions out there?
the easiest way to plugin a custom Filte
: In order to paint "Next" links app would have to know total number of
: records that user is eligible for read. getNumFound() will tell me that
: there are total 4K records that Solr returned. If there wasn't any
: entitlement rules then it could have been easier to determine how many
: "Next"
hi, Arslan!
By object, I was saying an instance of [org.apache.lucene.search.Query].
For performance purposes, I want to rewrite a fuzzy query in one field and
then query in another.
Thank you!
On Thu, Mar 17, 2011 at 18:43, Ahmet Arslan wrote:
> > Given a Query object "(name:firefox
> > nam
> I thought I read that you had to have Solr 4.0 for the
> LatLon field
> type, but isn't 1.4 = 4.0? Do I need some type of patch or
> different
> version of Solr to use that field type?
No, 1.4 and 4.0 are different. You can checkout trunk
http://wiki.apache.org/solr/HowToContribute#Getting_the_
I am using Solr 1.4.1 (Solr Implementation Version: 1.4.1 955763M - mark -
2010-06-17 18:06:42) to be exact.
I'm trying to implement that geospatial field type by adding to the schema:
but I get the following errors:
org.apache.solr.common.SolrException: Unknown fieldtype 'location'
spec
On Thu, Mar 17, 2011 at 5:50 PM, Geeta Subramanian
wrote:
> Here is the attached xml.
> In our xml, maxBufferedDocs is commented out. I hope that's not causing any issue.
> The ramBufferSizeMB is 32MB; will changing this be of any use to me?
Nope... your index settings are fine.
Perhaps something in
> Given a Query object "(name:firefox
> name:opera)", is it possible 'rename'
> the fields names to, for example, "(content:firefox
> content:opera)"?
By saying object you mean solrJ?
Anyway, if that helps, with the &df parameter you can change fields.
&q=firefox opera&df=name will be parsed into
On Thu, Mar 17, 2011 at 3:55 PM, Geeta Subramanian
wrote:
> Hi Yonik,
>
> I am not setting the ramBufferSizeMB or maxBufferedDocs params...
> DO I need to for Indexing?
No, the default settings that come with Solr should be fine.
You should verify that they have not been changed however.
An olde
Hi,
I am looking for a way to retrieve the ranking (or position) of a matched
document in the result set.
I can get the data and then parse it to find the position of the matched
document, but I am wondering whether there is a built-in feature for this.
Thanks,
Jae
I am a newbie to Solr. I have an issue with DIH but am unable to pinpoint what
is causing it. I am using the demo Jetty installation of Solr and tried
to create a project with new schema.xml, solrconfig.xml and data-config.xml
files. When I run
"http://131.187.88.221:8983/solr/dataimport?command
On 3/17/2011 5:02 PM, Jonathan Rochkind wrote:
&defType=lucene
&q=*:* AND NOT _query_:"{!dismax} foo bar baz"
Oops, forgot a part, for anyone reading this and wanting to use it as a
solution.
You can transform:
$defType=dismax
&q=-foo -bar -baz
To:
&defType=lucene
&q=*:* AND NOT _query_:
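The transformation above can be sketched as a small helper (just a sketch; the function name is mine, and it assumes whitespace-separated bare terms):

```python
def wrap_pure_negative(q, parser="dismax"):
    # Strip the leading '-' from each term and wrap the whole thing
    # in a lucene-parser query that matches everything EXCEPT them.
    terms = " ".join(t.lstrip("-") for t in q.split())
    return '*:* AND NOT _query_:"{!%s} %s"' % (parser, terms)

print(wrap_pure_negative("-foo -bar -baz"))
# → *:* AND NOT _query_:"{!dismax} foo bar baz"
```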
Hi All,
Thanks for the help... I am now able to debug my solr. :-)
-Original Message-
From: pkeegan01...@gmail.com [mailto:pkeegan01...@gmail.com] On Behalf Of Peter
Keegan
Sent: 17 March, 2011 3:33 PM
To: solr-user@lucene.apache.org
Subject: Re: Info about Debugging SOLR in Eclipse
The
Yeah, it looks to me like two or more negated terms do the same thing,
not just one.
q=-foo -bar -baz
Also always returns zero hits. For the same reason. I understand why
(sort of), although at the same time there is a logical answer to this
question "-foo -bar -baz", and oddly, 1.4.1 _lucene_
Purely negative queries work with Solr's default ("lucene") query parser, but
don't with dismax. Or so it seems from my experience testing this out just
now, on trunk.
In chatting with Jonathan further off-list we discussed having the best of both
worlds
&q={!lucene}*:* AND NOT _q
Oh I see, I overlooked your first query. A query with one term that is negated
will yield zero results; it doesn't return all documents, because nothing
matches. It's, if I remember correctly, the same as when you're looking for a
field that doesn't have a value: q=-field:[* TO *].
> My fault fo
My fault for putting in the quotes in the email; I actually don't have the
quotes in my tests. Just tried again to make sure.
And I always get 0 results on a pure negative Solr 1.4.1 dismax query. I
think it does not actually work?
On 3/17/2011 3:52 PM, Markus Jelsma wrote:
Hi,
It works just as
Hi,
It works just as expected, but not in a phrase query. Get rid of your quotes
and you'll be fine.
Cheers,
> Should 1.4.1 dismax query parser be able to handle pure negative queries
> like:
>
> &q="-foo"
> &q="-foo -bar"
>
> It kind of seems to me trying it out that it can NOT. Can anyone
Hi Yonik,
I am not setting the ramBufferSizeMB or maxBufferedDocs params...
DO I need to for Indexing?
Regards,
Geeta
-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 17 March, 2011 3:45 PM
To: Geeta Subramanian
Cc: solr-user@lucene.ap
In your solrconfig.xml,
Are you specifying ramBufferSizeMB or maxBufferedDocs?
-Yonik
http://lucidimagination.com
On Thu, Mar 17, 2011 at 12:27 PM, Geeta Subramanian
wrote:
> Hi,
>
> Thanks for the reply.
> I am sorry, the logs from where I posted does have a Custom Update Handler.
>
> But I h
Hi all,
When I installed Solr, I downloaded the most recent version (1.4.1) I
believe. I wanted to implement the Suggester (
http://wiki.apache.org/solr/Suggester). I copied and pasted the information
there into my solrconfig.xml file but I'm getting the following error:
Error loading class 'org.
The instructions refer to the 'Run configuration' menu. Did you try 'Debug
configurations'?
On Thu, Mar 17, 2011 at 3:27 PM, Peter Keegan wrote:
> Can you use jetty?
>
>
> http://www.lucidimagination.com/developers/articles/setting-up-apache-solr-in-eclipse
>
> On Thu, Mar 17, 2011 at 12:17 PM,
Can you use jetty?
http://www.lucidimagination.com/developers/articles/setting-up-apache-solr-in-eclipse
On Thu, Mar 17, 2011 at 12:17 PM, Geeta Subramanian <
gsubraman...@commvault.com> wrote:
> Hi,
>
> Can someone please let me know the steps for how I can debug the Solr code
> in my Eclipse?
>
> I
Should 1.4.1 dismax query parser be able to handle pure negative queries
like:
&q="-foo"
&q="-foo -bar"
It kind of seems to me trying it out that it can NOT. Can anyone else
verify? The documentation I can find doesn't say one way or another.
Which is odd because the documentation for straight
On Thu, Mar 17, 2011 at 2:12 PM, Chris Hostetter
wrote:
> As the code stands now: we fail fast and let the person building the index
> make a decision.
Indexing two fields when one could work is unfortunate though.
I think what we should support (eventually) is a max() function that will also
work on
Awesome, very helpful. Do you maybe want to add this to the Solr wiki
somewhere? Finding some advice for JVM tuning for Solr can be
challenging, and you've explained what you did and why very well.
On 3/17/2011 2:59 PM, Dyer, James wrote:
We're on the final stretch in getting our product data
We're on the final stretch in getting our product database in Production with
Solr. We have 13m "wide-ish" records with quite a few stored fields in a
single index (no shards). We sort on at least a dozen fields and facet on
20-30. One thing that came up in QA testing is we were getting full
Hi,
I am getting OOM after posting a 100 Mb document to SOLR with trace:
Exception in thread "main" org.apache.solr.common.SolrException: Java heap
space java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.Abstr
Hi folks, I ran into a problem today where I am no longer able to execute any
queries :( due to Out of Memory issues.
I am in the process of investigating the use of different mergeFactors, or
even different merge policies all together.
My question is if I have many segments (i.e. smaller sized segm
: But if lucene now can sort a multi-valued field without crashing when there
: are 'too many' unique values, and with easily described and predictable
: semantics (use the minimal value in the multi-valued field as sort key) --
: then it probably makes more sense for Solr to let you do that if yo
Hi Markus,
Thanks, I had already followed the steps on that site.
But I am not able to debug the SOLR classes, though I am able to run Solr.
I want to see the code flow from the server side, especially the point where
Solr calls Tika and gets the content back from Tika.
Thanks for the time & h
Hi,
Thanks for the reply.
I am sorry; the logs from where I posted do have a Custom Update Handler.
But I have a local setup which does not have a custom update handler; it's as
downloaded from the SOLR site, and even that gives me heap space errors.
at java.util.Arrays.copyOf(Unknown Source)
On Thu, Mar 17, 2011 at 12:12 PM, Geeta Subramanian
wrote:
> at
> com.commvault.solr.handler.extraction.CVExtractingDocumentLoader.load(CVExtractingDocumentLoader.java:349)
Looks like you're using a custom update handler. Perhaps that's
accidentally hanging onto memory?
-Yonik
http://lu
http://www.lucidimagination.com/developers/articles/setting-up-apache-solr-in-
eclipse
On Thursday 17 March 2011 17:17:30 Geeta Subramanian wrote:
> Hi,
>
> Can someone please let me know the steps for how I can debug the Solr code
> in my Eclipse?
>
> I tried to compile the source, use the jars
Hi,
25*100MB=2.5GB will most likely fail with just 4GB of heap space. But
consecutive single `pushes` as you call it, of 25MB documents should work
fine. Heap memory will only drop after the garbage collector comes along.
Cheers,
On Thursday 17 March 2011 17:12:46 Geeta Subramanian wrote:
> Hi
Hi,
Can someone please let me know the steps for how I can debug the Solr code in
my Eclipse?
I tried to compile the source, use the jars and place in tomcat where I am
running solr. And do remote debugging, but it did not stop at any break point.
I also tried to write a sample standalone java clas
Hi,
I am very new to SOLR and facing a lot of issues when using SOLR to push large
documents.
I have Solr running in Tomcat. I have allocated about 4GB of memory (-Xmx), but
when I push about twenty-five 100MB documents it runs out of heap space and
fails. Also I tried pushing just 1 document. It went
Yeah, I'm not sure how you could do it with DIH, if that's what you're
using. (If it's XML, maybe you can XSLT it before it even goes to DIH,
and have the XSLT take the greatest/least value in the input and stick
it in an additional XML element?)
I do my indexing simply in an external procedural/
The standard answer, which is a kind of de-normalizing, is to index
tokens like this:
red_10 red_11 orange_12
in another field, you could do these things with size first:
10_red 11_red 12_orange
Now if you want to see what sizes of red you have, you can do a facet
query with facet.prefi
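For example, assuming the color-first tokens above are indexed in a field called color_size:

```
?q=*:*&facet=true&facet.field=color_size&facet.prefix=red_
```

The facet counts that come back are then the sizes you have in red.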
Hi Bill,
> You could always rsync the index dir and reload (old scripts).
I used them previously but was getting problems with them. The
application querying the Solr doesn't cause enough load on it to
trigger the issue. Yet.
> But this is still something we should investigate.
Indeed :-)
> Se
> Better to fix it up so it's predictable and reliable instead, no?
Yes, you are absolutely right. That's why I'm looking into this.
But how would I stuff, say, always author_1 from a multi-valued field
into a single-valued (string or text) field?
Ok, another solution comes to mind.
Writin
Is rewriting fuzzy queries against the spellchecker index a good practice?
When I rewrite these queries against the main index, the rewriting time is
about 3.5-4 secs. Now, the rewriting takes a few milliseconds.
On Thu, Mar 17, 2011 at 11:21 AM, Otis Gospodnetic
wrote:
> Hi,
>
>
>
> - Original Message
>> From: Yonik Seeley
>> Subject: Re: Parent-child options
>>
>> On Thu, Mar 17, 2011 at 1:49 AM, Otis Gospodnetic
>> wrote:
>> > The dreaded parent-child without denormalization question. What
Perhaps the easiest thing for you right now, which you can do in any version
of Solr, is to translate your data at indexing time so you don't have to
sort on a multi-valued field. Put the stuff in an additional field for
sorting, where at index time you only put the greatest or least value
(your choic
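As a sketch of that index-time translation (the field names here are just examples), in Python:

```python
def add_sort_field(doc, source="author", dest="author_sort", use_min=True):
    # Copy the least (or greatest) value of a multi-valued field into
    # a single-valued field that is safe to sort on.
    values = doc.get(source)
    if values:
        doc[dest] = min(values) if use_min else max(values)
    return doc

doc = {"id": "1", "author": ["Miller", "Adams", "Young"]}
add_sort_field(doc)
print(doc["author_sort"])  # Adams
```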
Rahul,
Go to your Solr Admin Analysis page, enter sci/tech, check appropriate check
boxes, and see how sci/tech gets analyzed. This will lead you in the right
direction.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
Hi,
- Original Message
> From: Yonik Seeley
> Subject: Re: Parent-child options
>
> On Thu, Mar 17, 2011 at 1:49 AM, Otis Gospodnetic
> wrote:
> > The dreaded parent-child without denormalization question. What are one's
> > options for the following example:
> >
> > parent: shoe
You could always rsync the index dir and reload (old scripts). But this is
still something we should investigate. I had this same issue on high load and
never really found a solution. Did you try another NIC? Is the NIC
configured right? Routing? Speed of transfer?
Bill Bell
Sent fr
Hi Bill,
yes DIH is in use.
Thanks,
Bernd
Am 17.03.2011 16:09, schrieb Bill Bell:
Do you use Dih handler? A script can do this easily.
Bill Bell
Sent from mobile
On Mar 17, 2011, at 9:02 AM, Bernd Fehling
wrote:
Good idea.
Was also just looking into this area.
Assuming my input record
Hi Yonik,
actually some applications "misused" sorting on a multiValued field,
like VuFind. And as a matter of fact FAST doesn't support this either,
because it doesn't make sense.
FAST distinguishes between multiValue and singleValue by just adding
the separator FieldAttribute to the field. So I m
On Mar 17, 2011, at 3:19 PM, Shawn Heisey wrote:
On 3/17/2011 3:43 AM, Vadim Kisselmann wrote:
Unfortunately, this doesn't seem to be the problem. The queries
themselves are running fine. The problem is that the replication is
crawling when there are many queries going on and that the replication
Do you use Dih handler? A script can do this easily.
Bill Bell
Sent from mobile
On Mar 17, 2011, at 9:02 AM, Bernd Fehling
wrote:
>
> Good idea.
> Was also just looking into this area.
>
> Assuming my input record looks like this:
>
>
>author_1 ; author_2 ;
> author_3
>
>
>
> Do
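A DIH script transformer along these lines could do it (a sketch only; the function and field names are assumptions, and it assumes the row carries the authors as a list):

```xml
<dataConfig>
  <script><![CDATA[
    function addSortAuthor(row) {
      // copy the first author into a single-valued sort field
      var authors = row.get('author');
      if (authors != null && authors.size() > 0) {
        row.put('author_sort', authors.get(0));
      }
      return row;
    }
  ]]></script>
  <document>
    <entity name="doc" transformer="script:addSortAuthor" query="...">
      ...
    </entity>
  </document>
</dataConfig>
```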
Aha, oh well, not quite as good/flexible as I hoped.
Still, if lucene is now behaving somewhat more predictably/rationally
when sorting on multi-valued fields, then I think, in response to your
other email on a similar thread, perhaps SOLR-2339 is now a mistake.
When lucene was returning com
By the way, this could be done automatically by Solr or Lucene behind the
scenes.
Bill Bell
Sent from mobile
On Mar 17, 2011, at 9:02 AM, Bill Bell wrote:
> Here is a work around. Stick the high value and low value into other fields.
> Use those fields for sorting.
>
> Bill Bell
> Sent fro
Here is a work around. Stick the high value and low value into other fields.
Use those fields for sorting.
Bill Bell
Sent from mobile
On Mar 17, 2011, at 8:49 AM, Yonik Seeley wrote:
> On Wed, Mar 16, 2011 at 6:08 PM, Jonathan Rochkind wrote:
>> Also... if lucene is already capable of sortin
Good idea.
Was also just looking into this area.
Assuming my input record looks like this:
author_1 ; author_2 ;
author_3
Do you know if I can use something like this:
...
...
To just double the input and make author multiValued and author_sort a string
field?
Regards
Bernd
On Thu, Mar 17, 2011 at 10:34 AM, Bernd Fehling
wrote:
>
> Is there a way to have a kind of "casting" for copyField?
>
> I have author names in multiValued string field and need a sorting on it,
> but sort on field is only for multiValued=false.
>
> I'm trying to get multiValued content from one f
On Wed, Mar 16, 2011 at 6:08 PM, Jonathan Rochkind wrote:
> Also... if lucene is already capable of sorting on a multi-valued field by
> choosing the largest value, then largest vs. smallest is presumably just
> arbitrary there; there is presumably no performance implication to choosing
> the smallest
Given a Query object "(name:firefox name:opera)", is it possible to 'rename'
the field names to, for example, "(content:firefox content:opera)"?
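For illustration only, the string form of that rewrite can be sketched like so (a naive textual replace, not a proper walk of the org.apache.lucene.search.Query tree, which is what a real solution would do):

```python
def rename_field(q, old, new):
    # Naive sketch: rewrite "old:" field prefixes in a simple query string.
    # Breaks on quoted phrases that happen to contain "old:".
    return q.replace(old + ":", new + ":")

print(rename_field("(name:firefox name:opera)", "name", "content"))
# → (content:firefox content:opera)
```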
On Thu, Mar 17, 2011 at 8:04 PM, Bernd Fehling
wrote:
>
> Is there a way to have a kind of "casting" for copyField?
>
> I have author names in multiValued string field and need a sorting on it,
> but sort on field is only for multiValued=false.
>
> I'm trying to get multiValued content from one fi
Is there a way to have a kind of "casting" for copyField?
I have author names in a multiValued string field and need sorting on it,
but sort on a field only works for multiValued=false.
I'm trying to get multiValued content from one field to a
non-multiValued text or string field for sorting.
And th
On 3/17/2011 3:43 AM, Vadim Kisselmann wrote:
Unfortunately, this doesn't seem to be the problem. The queries themselves are
running fine. The problem is that the replication is crawling when there are
many queries going on and that the replication speed stays low even after the
load is gone.
On Thu, Mar 17, 2011 at 1:49 AM, Otis Gospodnetic
wrote:
> The dreaded parent-child without denormalization question. What are one's
> options for the following example:
>
> parent: shoes
> 3 children. each with 2 attributes/fields: color and size
> * color: red black orange
> * size: 10 11 12
Pierre
That is a very good point, I have been caught in the past by poor xml (RSS
feeds) that included control characters before the ' I do have the xml preamble in my
> config file in conf/Catalina/localhost/ and solr starts ok with Tomcat 7.0.8.
> Haven't try with 7.0.11 yet.
>
> I wonder w
On Mar 16, 2011, at 14:53 , Jonathan Rochkind wrote:
> Interesting, any documentation on the PathTokenizer anywhere? Or just have to
> find and look at the source? That's something I hadn't known about, which may
> be useful to some stuff I've been working on depending on how it works.
hi,
We have found that 'EnglishPorterFilterFactory' causes that issue. I believe
it is used for stemming words. Once we commented that factory out, it works
fine.
And another thing, currently I am checking about how the word 'sci/tech'
will be indexed in solr. As mentioned in my previous email, if
Yes, pivot faceting is committed to trunk, but it is not part of the upcoming
3.1 release.
Erik
On Mar 16, 2011, at 15:00 , McGibbney, Lewis John wrote:
> Hi Erik,
>
> I have been reading about the progression of SOLR-792 into pivot faceting,
> however can you expand to comment on
> where i
Hi François,
Thank you for your reply. I had made a simple mistake of including comments
before
'', therefore I was getting a SAX error.
As you have correctly pointed out, it is not essential to include the snippet
as above in the context file (if using one), however it might be useful to know
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode)
I do have the xml preamble in my config
file in conf/Catalina/localhost/ and solr starts ok with Tomcat 7.0.8. Haven't
tried with 7.0.11 yet.
I wonder why your exception points to line 4 column 6, however. Shouldn't it
point to line 1 column 1? Do you have some blank lines at the start of your
Without even looking at the different segment files, things look odd:
You say that you optimize every day, yet I see segments up to 4 days old.
Also look at all the segments_??? files... each represents a commit
point of the index.
So it looks like you have 16 snapshots (or commit points) of the in
Lewis
My update from tomcat 7.0.8 to 7.0.11 went with no hitches. I checked my
context file and it does not have the xml preamble yours has, specifically:
'',
Here is my context file:
---
Hope this helps.
Cheers
François
On Mar 16, 2011, at 2:38 PM, McGibbney, Lewis John wrote:
Hi Yonik,
I have another question related to fieldValueCache.
When we uninvert a facet field, and if the termInstances = 0 for a
particular field, then also it gets added to the FieldValueCache.
What is the reason for caching facet fields with termInstances=0?
In our case, a lot of time is being
Hi,
I want to enquire about the patch for
namedistinct (SOLR-2242-distinctFacet.patch) available with the solr4.0 trunk
On Monday 14 March 2011 08:05 PM, Jonathan Rochkind wrote:
It's not easy if you have lots of facet values (in my case, can even
be up to a million), but there is no way built-in
What Java version do you have installed? (java -version)
Best
Erick
On Thu, Mar 17, 2011 at 6:30 AM, royr wrote:
> Hello,
>
> The apache wiki gives me this information:
>
> Skip this section if you have a binary distribution of Solr. These
> instructions are for building Solr from source, if you ha
Hi, Eric!
I suspect that the problem resides in Tomcat. I think that the connection
between server and client times out.
What happens if you submit the 9th batch first? I'm wondering if the
> 9th batch is just mal-formed and has nothing to do with the
> previous batches.
The 9th batch is ok, like the o
This page: http://lucene.apache.org/java/3_0_2/fileformats.html#file-names,
when combined with what Yonik said may help you figure it out...
And if you're still stumped, please post the and
definitions you used
Best
Erick
On Wed, Mar 16, 2011 at 5:10 PM, Robert Petersen wrote:
> OK I have
What happens if you submit the 9th batch first? I'm wondering if the
9th batch is just mal-formed and has nothing to do with the
previous batches.
As to the time, what merge factor are you using? And how are you
committing? Via autocommit parameters or explicitly or not at all?
Best
Erick
On
On Thursday 17 March 2011 03:18 AM, Ahmet Arslan wrote:
I am using the Solr 4.0 API to search an index (made using the Solr 1.4
version). I am getting the error "Invalid version (expected 2, but 1) or the
data is not in 'javabin' format". Can anyone help me to
Hello,
The apache wiki gives me this information:
Skip this section if you have a binary distribution of Solr. These
instructions are for building Solr from source, if you have a nightly tarball
or have checked out the trunk from subversion at
http://svn.apache.org/repos/asf/lucene/dev/trunk. Assume
On Wed, 2011-03-16 at 18:36 +0100, Erik Hatcher wrote:
> Sorry, I missed the original mail on this thread
>
> I put together that hierarchical faceting wiki page a couple
> of years ago when helping a customer evaluate SOLR-64 vs.
> SOLR-792 vs.other approaches. Since then, SOLR-792 morphed
>
Hello Shawn,
Primary assumption: You have a 64-bit OS and a 64-bit JVM.
> Jepp, it's running 64-bit Linux with a 64-bit JVM
It sounds to me like you're I/O bound, because your machine cannot
keep enough of your index in RAM. Relative to your 100GB index, you
only have a maximum of 14G
Okay.
When I use the map function with ..&sort=map(price, 0, 0, 0, 1) desc, Solr
outputs an error:
17.03.2011 09:42:58 org.apache.solr.common.SolrException log
SCHWERWIEGEND: org.apache.solr.common.SolrException: Missing sort order.
at
org.apache.solr.search.QueryParsing.parseSort(Quer
Hi,
One more query.
Currently in the autosuggestion Solr returns words like below:
googl
googl _
googl search
googl chrome
googl map
The last letter seems to be missing in the autosuggestions. I have sent the
query as
"?qt=/terms&terms=true&terms.fl=mydata&terms.lower=goog&terms.prefix=goog".
The f
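The usual fix is to point the TermsComponent at a copy of the field that is not stemmed, along these lines (the type and field names here are made up):

```xml
<fieldType name="text_suggest" class="solr.TextField">
  <analyzer>
    <!-- tokenize and lowercase only; no stemming -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="mydata_suggest" type="text_suggest" indexed="true" stored="false"/>
<copyField source="mydata" dest="mydata_suggest"/>
```

Then query with terms.fl=mydata_suggest instead of terms.fl=mydata, so "google" comes back whole rather than as the stem "googl".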
That's the job of your analyzer.
2011/3/17 Andy :
> Hi,
>
> For my Solr server, some of the query strings will be in Asian languages such
> as Chinese or Japanese.
>
> For such query strings, would the Standard or Dismax request handler work? My
> understanding is that both the Stand