Hi,
which version do you use? 1.4.1 is highly recommended, since previous versions
contained some bugs related to memory usage that could lead to memory leaks.
I had this GC overhead limit in my setup as well. The only workaround that
helped was a daily restart of all instances.
With 1.4.1 this iss
This is in solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="accuracy">0.7</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
</searchComponent>
Do you have any custom code, or is this stock solr (and which version,
and what is the request)?
-Yonik
http://www.lucidimagination.com
On Tue, Jul 27, 2010 at 12:30 AM, Manepalli, Kalyan
wrote:
> Hi,
> I am stuck at this weird problem during querying. While querying the solr
> index I am get
Hi,
I am stuck at this weird problem during querying. While querying the solr
index I am getting the following error.
Index: 52, Size: 16
java.lang.IndexOutOfBoundsException: Index: 52, Size: 16
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:32
I think the search log will require a lot of storage, which may make the index
size unreasonably large if stored in Solr.
And the aggregation results may not really fit the Lucene index structure.
:)
kiwi
happy hacking !
On Tue, Jul 27, 2010 at 7:47 AM, Tommy Chheng wrote:
> Alternatively, hav
Man, what types of fields is StatsComponent actually known to work with?
With an sint, it seems to have trouble if there are any documents with null
values for the field. It appears to decide that a null/empty/blank value is
-1325166535, and is thus the minimum value.
At least if I'm interpret
> Short answer: "GC overhead limit exceeded" means "out of memory".
Aha, thanks. So the answer is just "raise your Xmx/heap size, you need more
memory to do what you're doing", yeah?
Jonathan
I want a cache that caches the full result of a query (all steps, including
collapse, highlight, and facet). I read
http://wiki.apache.org/solr/SolrCaching, but can't find a global
cache. Maybe I can use an external cache to store key-value pairs. Is there
any such cache in Solr?
On Mon, Jul 26, 2010 at 7:17 PM, Jonathan Rochkind wrote:
> I am now occasionally getting a Java "GC overhead limit exceeded" error in
> my Solr. This may or may not be related to recently adding much better (and
> more) warming queries.
When memory gets tight, the JVM kicks off a garbage collect
See below:
On Mon, Jul 26, 2010 at 11:49 AM, Pramod Goyal wrote:
> Hi,
> I have a requirement where I need to keep updating certain fields in
> the schema. My requirement is to change some of the fields or add some
> values to a field (multi-valued field). I understand that I can use Solr
>
: Sorry, like the subject, I mean the total number of terms.
it's not stored anywhere, so the only way to fetch it is to actually
iterate all of the terms and count them (that's why LukeRequestHandler is
so slow to compute this particular value).
If I remember right, someone mentioned at one p
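For reference, the brute-force count looks roughly like this with the Lucene 2.9
API that Solr 1.4 ships (a sketch; the index path is just an example):

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.FSDirectory;

public class TermCounter {
  public static void main(String[] args) throws Exception {
    // Open the index read-only and walk every term once, counting as we go.
    IndexReader reader =
        IndexReader.open(FSDirectory.open(new File("/path/to/solr/data/index")), true);
    TermEnum terms = reader.terms();
    long total = 0;
    while (terms.next()) {
      total++;
    }
    terms.close();
    reader.close();
    System.out.println("total terms: " + total);
  }
}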
It's almost impossible to analyze this kind of thing without seeing your
schema and debug output. You might want to review:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Mon, Jul 26, 2010 at 9:56 AM, satya swaroop wrote:
> hi all,
>i am a new one to solr and able to implem
I need much more detailed information before I can make sense of your use
case.
Could you provide some sample?
MoreLikeThis sounds in the right neighborhood, but I'm guessing.
Best
Erick
On Mon, Jul 26, 2010 at 9:02 AM, wrote:
>
> Hi,
>
> I would like to implement a similar search feature...
I'm having trouble getting my head around what you're trying to accomplish,
so if this is off base you know why.
But what it smells like is that you're trying to do database-ish things in
a SOLR index, which is almost always the wrong approach. Is there a
way to index redundant data with each doc
We have an index around 25-30G w/ 1 master and 5 slaves. We perform
replication every 30 mins. During replication the disk I/O obviously
shoots up on the slaves to the point where all requests routed to that
slave take a really long time... sometimes to the point of timing out.
Is there any lo
Alternatively, have you considered storing (or I should say indexing)
the search logs with Solr?
This lets you text search across your search queries. You can perform
time range queries with Solr as well.
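For example, something along these lines with SolrJ (just a sketch; the core URL
and the field names query_text/timestamp are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class SearchLogExample {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr/searchlogs");

    // Index one search-log entry as a document.
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "log-1");
    doc.addField("query_text", "cheap flights to paris");
    doc.addField("timestamp", new java.util.Date());
    server.add(doc);
    server.commit();

    // Text search across logged queries, restricted to the last day.
    SolrQuery q = new SolrQuery("query_text:paris");
    q.addFilterQuery("timestamp:[NOW-1DAY TO NOW]");
    QueryResponse rsp = server.query(q);
    System.out.println("matches: " + rsp.getResults().getNumFound());
  }
}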
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on
On 7/26/10 4:43 PM, Mark wrote:
We are thinking about using Cassandra to store our search logs. Can
someone point me in the right direction/lend some guidance on design?
I am new to Cassandra and I am having trouble wrapping my head around
some of these new concepts. My brain keeps wanting to g
We are thinking about using Cassandra to store our search logs. Can
someone point me in the right direction/lend some guidance on design? I
am new to Cassandra and I am having trouble wrapping my head around some
of these new concepts. My brain keeps wanting to go back to a RDBMS design.
We wi
: However, when I'm trying this very URL with curl within my (perl) script, I
: receive a NullPointerException:
: CURL-COMMAND: curl -sL
:
http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard
it appears you aren't quoting
I am now occasionally getting a Java "GC overhead limit exceeded" error
in my Solr. This may or may not be related to recently adding much
better (and more) warming queries.
I can get it when trying a 'commit', after deleting all documents in my
index, or in other cases.
Anyone run into thi
Sorry, like the subject, I mean the total number of terms.
On Mon, Jul 26, 2010 at 4:03 PM, Jason Rutherglen
wrote:
> What's the fastest way to obtain the total number of docs from the
> index? (The Luke request handler takes a long time to load so I'm
> looking for something else).
>
What's the fastest way to obtain the total number of docs from the
index? (The Luke request handler takes a long time to load so I'm
looking for something else).
Hi *,
I'd like to see how many documents I have in my index with a certain ListId,
in this example ListId 881.
http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard
In the browser, the output looks perfect, I indeed have 3
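For what it's worth, the same count can also be read programmatically with SolrJ
from numFound (a sketch; only the ListId filter comes from the URL above, the rest
is assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class CountByListId {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr");

    // rows=0: we only care about the total hit count, not the documents.
    SolrQuery q = new SolrQuery("*:*");
    q.addFilterQuery("ListId:881");
    q.setRows(0);

    long count = server.query(q).getResults().getNumFound();
    System.out.println("docs with ListId 881: " + count);
  }
}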
Hi Savannah,
I have just answered this question over on drupal.org.
http://drupal.org/node/811062
Responses 5 and 11 will help you. On the solrconfig.xml side of things
you will only really need Drupal's version.
Although still in alpha, my Nutch module will help you out with integration
I am using Drupal ApacheSolr module to integrate solr with drupal. I already
integrated solr with nutch. I already moved nutch's solrconfig.xml and
schema.xml to solr's example directory, and it works. I tried to append
Drupal's
ApacheSolr module's own solrconfig.xml and schema.xml into the s
Hello all,
I’m working on a project with Solr. I had 1.4.1 working OK using
ExtractingRequestHandler except that it was crashing on some PDFs. I noticed
that Tika bundled with 1.4.1 was 0.4, which was kind of old. I decided to try
updating to 0.7 as per the directions here:
http://wiki.apac
ah okay thx =)
so the class "SolrInputDocument" is only for indexing a document and
"SolrDocument" is for the search?
when Solr indexes a document, the first step is to create a SolrInputDocument.
then the class "DocumentBuilder" creates the Lucene document in the function
"Document toDocument(SolrInputDoc, Schema)"
an L
: No I didn't. I thought you aren't supposed to run optimize on slaves. Well
correct, you should make all changes to the master.
: but it doesn't matter now, as I think it's fixed now. I just added a dummy
: document on master, ran a commit call and then once that executed ran an
: optimize call.
: where is a Jar, containing org.apache.solr.client.solrj.embedded?
Classes in the embedded package are useless w/o the rest of the Solr
internal "core" classes, so they are included directly in the
apache-solr-core-1.4.1.jar.
(i know .. the directory structure doesn't make a lot of sense)
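Once apache-solr-core-1.4.1.jar and its dependencies are on the classpath, usage
looks roughly like this (a sketch; the solr home path is a placeholder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedExample {
  public static void main(String[] args) throws Exception {
    // Point Solr at a home directory containing conf/solrconfig.xml and conf/schema.xml.
    System.setProperty("solr.solr.home", "/path/to/solr/home");
    CoreContainer.Initializer initializer = new CoreContainer.Initializer();
    CoreContainer cores = initializer.initialize();

    // "" selects the default core in a single-core setup.
    EmbeddedSolrServer server = new EmbeddedSolrServer(cores, "");
    System.out.println(server.query(new SolrQuery("*:*")).getResults().getNumFound());

    cores.shutdown();
  }
}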
:
: i want to learn more about the technology.
:
: exists an issue to create really an solrDoc ? Or its in the code only for a
: better understanding of the lucene and solr border ?
There is a real and actual class named "SolrDocument". It is a simpler
object than Lucene's "Document" class becua
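Roughly, the two classes show up on the two sides of SolrJ like this (a sketch;
the server URL and field names are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

public class DocRoundTrip {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // Indexing side: SolrInputDocument (turned into a Lucene Document internally).
    SolrInputDocument in = new SolrInputDocument();
    in.addField("id", "42");
    in.addField("title", "hello world");
    server.add(in);
    server.commit();

    // Search side: results come back as SolrDocuments, not Lucene Documents.
    SolrDocument out = server.query(new SolrQuery("id:42")).getResults().get(0);
    System.out.println(out.getFieldValue("title"));
  }
}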
Every so often I need to index new batches of scanned PDFs and occasionally
Adobe's OCR can't recognize the text in a couple of these documents. In these
situations I would like to type in a small amount of text onto the document and
have it be extracted by Solr CELL.
Adobe Pro 9 has a numbe
I still assume that what you mean by "search queries data" is just some
other form of document (in this case containing 1 search request per
document).
I'm not sure what you intend to do by that actually, but yes indexing stays
the same (you probably want to mark field "type" as required so you don't
Isn't it always one of these three? (from most likely to least likely,
generally)
Memory
Disk Speed
WebServer and its code
CPU.
Memory and Disk are related, as swapping occurs between them. As long as memory
is high enough, it becomes:
Disk Speed
WebServer and its code
CPU
If the WebServer
If it's not the data that's being searched, you can always encode it before
inserting it. You either have to further encode it to base64 to make it
printable before storing it, or use a binary field.
You probably could also set up an external process that cycles through every
document in
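The encode-before-indexing option could look roughly like this with commons-codec,
which ships with Solr (a sketch; the field names are made up):

import org.apache.commons.codec.binary.Base64;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class Base64StoreExample {
  public static void main(String[] args) throws Exception {
    byte[] raw = loadBinaryPayload();                       // whatever bytes you need to store
    String printable = new String(Base64.encodeBase64(raw), "US-ASCII");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "blob-1");
    doc.addField("payload_b64", printable);                 // stored, not searched

    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    server.add(doc);
    server.commit();
    // On retrieval: Base64.decodeBase64(value.getBytes("US-ASCII")) gives the bytes back.
  }

  private static byte[] loadBinaryPayload() {
    return new byte[] {0x01, 0x02, 0x03};                   // placeholder
  }
}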
Thanks for your answer! That's great.
Now, to index the search queries data, is there something special to do? Or does
it stay as usual?
-Original Message-
From: Geert-Jan Brits
To: solr-user@lucene.apache.org
Sent: Mon, Jul 26, 2010 4:57 pm
Subject: Re: 2 type of docs in same schema?
Hi,
    I have a requirement where I need to keep updating certain fields in
the schema. My requirement is to change some of the fields or add some
values to a field (multi-valued field). I understand that I can use Solr
update for this. If I am using Solr update, do I need to publish the entire
I want to learn more about the technology.
Does an issue exist to really create a SolrDoc? Or is it in the code only for a
better understanding of the Lucene and Solr border?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p99.html
Sent from the
As far as I know this is not needed; the optimized index is automatically
replicated to the
slaves. Therefore something seems to be really wrong with your setup. Maybe the
slave index
got corrupted for some reason? Did you try deleting the data dir + slave restart
for a fresh
replicated index? may
No I didn't. I thought you aren't supposed to run optimize on slaves. Well
but it doesn't matter now, as I think it's fixed now. I just added a dummy
document on master, ran a commit call and then once that executed ran an
optimize call. This triggered snapshooter to replicate the index, which
some
You can easily have different types of documents in 1 core:
1. define searchquery as a field (just as the others in your schema)
2. define type as a field (this allows you to decide which type of documents
to search for, e.g. "type_normal" or "type_search")
Now searching on regular docs becomes:
q
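Roughly the same idea in SolrJ (a sketch; the field names come from the steps
above, the URL and query terms are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class TypedDocsQuery {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // Regular documents only: filter on the "type" field.
    SolrQuery normal = new SolrQuery("title:foo");
    normal.addFilterQuery("type:type_normal");

    // Logged search queries only: search the "searchquery" field instead.
    SolrQuery logged = new SolrQuery("searchquery:foo");
    logged.addFilterQuery("type:type_search");

    System.out.println(server.query(normal).getResults().getNumFound());
    System.out.println(server.query(logged).getResults().getNumFound());
  }
}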
did you try an optimize on the slave too?
> Yes I always run an optimize whenever I index on master. In fact I just ran
> an optimize command an hour ago, but it didn't make any difference.
>
Hi Chantal,
did you try to write a custom DIH function
(http://wiki.apache.org/solr/DIHCustomFunctions)?
If not, I think this would be a solution.
Just check whether "${prog.vip}" is an empty string or null.
If so, you need to replace it with a value that can never return anything.
So the vi
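Very roughly, such a function could look like the sketch below (the class name and
the no-match placeholder are invented, and argument parsing is simplified compared
to the wiki examples):

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Evaluator;

// Registered in data-config.xml, roughly: <function name="nonEmpty" class="..."/>
// and then used as e.g. "${dataimporter.functions.nonEmpty(prog.vip)}".
public class NonEmptyEvaluator extends Evaluator {
  public String evaluate(String expression, Context context) {
    // Simplified: treat the expression as a single variable name to resolve.
    Object resolved = context.getVariableResolver().resolve(expression.trim());
    String value = resolved == null ? "" : resolved.toString().trim();
    // Return a value that can never match anything when the input is empty/null.
    return value.length() == 0 ? "NO_SUCH_VALUE" : value;
  }
}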
I need your expertise on this one...
We would like to index every search query that is passed to our solr engine
(same core).
Our doc format is like this (already in our schema):
title
content
price
category
etc...
Now how to add "search queries" as a field in our schema? Know that the sea
DataImportHandler (DIH) is an add-on to Solr. It lets you import documents
from a number of sources in a flexible way. The only connection DIH has to
Lucene is that Solr uses Lucene as the index engine.
When you work with Solr you naturally talk about Solr Documents, if you were
working with Luce
I have just fixed it.
The problem was related to an operating system value - it was different from what
Solr expected for the incoming datastream.
Regards,
Rafal Zawadzki
On Mon, Jul 26, 2010 at 3:20 PM, Chantal Ackermann <
chantal.ackerm...@btelligent.de> wrote:
> On Mon, 2010-07-26 at 14:46 +0200, Raf
I just checked my config file, and I do have the exact same values for the
deletionPolicy tag as you attached in your email, so I don't really think it
could be this.
--
View this message in context:
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996373.html
Sent fr
Yes I always run an optimize whenever I index on master. In fact I just ran
an optimize command an hour ago, but it didn't make any difference.
--
View this message in context:
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996364.html
Sent from the Solr - Us
hi all,
I am new to solr and was able to implement indexing of documents
by following the solr wiki. Now I am trying to add spellchecking. I
followed the spellcheck component in the wiki but am not getting the suggested
spellings. I first built it with spellcheck.build=true,...
here I give u
Hi,
are you calling optimize on the master to finally remove deleted documents and
merge the index files?
once a day is recommended:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
cheers
-----Original Message-----
From: Muneeb Ali [mailto:muneeba...@hotmail.com]
G
Hi,
I think that you may be using a Lucene/Solr IndexDeletionPolicy that does
not remove old commits (and you aren't propagating solr-config via
replication).
You can configure this feature in solrconfig.xml inside the
<deletionPolicy> tag:

<deletionPolicy class="solr.SolrDeletionPolicy">
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>

I hope this can be help
Hi,
I am using Solr 1.4, with a master-slave setup. We have one master
server and two slave servers. It was all working fine, but lately the solr slaves
are behaving strangely. Particularly during replication of the index, the slave
nodes die and always need a restart. Also the index size of slave no
On Mon, 2010-07-26 at 14:46 +0200, Rafal Bluszcz Zawadzki wrote:
> EEE, d MMM HH:mm:ss z
not sure but you might want to try with an uppercase 'Z' for the
timezone (surrounded by single quotes, alternatively). The rest of your
pattern looks fine. But if you still run into problems try differen
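Since DIH's dateTimeFormat is a SimpleDateFormat pattern, it can be sanity-checked
outside DIH as well (a sketch; the sample input strings are invented):

import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateFormatCheck {
  public static void main(String[] args) throws Exception {
    // Lowercase 'z' expects a general time-zone name like "GMT" or "CEST",
    // uppercase 'Z' expects an RFC 822 offset like "+0200".
    SimpleDateFormat general = new SimpleDateFormat("EEE, d MMM HH:mm:ss z", Locale.ENGLISH);
    SimpleDateFormat rfc822  = new SimpleDateFormat("EEE, d MMM HH:mm:ss Z", Locale.ENGLISH);

    System.out.println(general.parse("Mon, 26 Jul 14:46:00 GMT"));
    System.out.println(rfc822.parse("Mon, 26 Jul 14:46:00 +0200"));
  }
}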
Hi,
I would like to implement a similar search feature... but not relative to the
initial search query but relative to each result document.
The structure of each doc is:
id
title
content
price
etc...
Then we have a database of global search queries; I'm thinking of
integrating this in
btw, I want to put all the requestHandlers (more than 1) in 1 xml file and I
want to use that file in my solrConfig.xml.
I have used xinclude but it didn't work ..
please suggest anything
Thanks,
Prasad
--
View this message in context:
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requ
Thank you very much ..
--
View this message in context:
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requests-on-a-single-core-at-the-same-time-tp978649p996190.html
Sent from the Solr - User mailing list archive at Nabble.com.
I am also using other dateFormat strings in the same data handler, and they
work. But not this one.
And this data is fetched from an external source, so I don't have the
possibility to modify it (well, theoretically I can save it, edit it etc., but
this is not the way). Why is this not working with
Hi,
I think there is an open bug for it at:
https://issues.apache.org/jira/browse/SOLR-1902
Using Solr 1.4.1 and upgrading the Tika libraries to a 0.8 snapshot, I also had
to upgrade pdfbox, fontbox and jempbox to 1.2.1; I got no errors and it seems
it's able to index PDFs without any errors (I can query t
I use a format like yyyy-MM-ddThh:mm:ssZ. It works.
2010/7/26 Rafal Bluszcz Zawadzki :
> Hi,
>
> I am using Data Import Handler from Solr 1.4.
>
> Parts of my data-config.xml are:
>
>
> processor="XPathEntityProcessor"
> stream="false"
> forEach="
Hi,
I am using Data Import Handler from Solr 1.4.
Parts of my data-config.xml are:
.
During full-import I got message:
WARNING: Error creating document :
SolrInputDocument[{SearchableText=SearchableText(1.0)={phrase},
parentPaths=parentPaths(1.0)={/site
Hello experts,
where is a Jar, containing org.apache.solr.client.solrj.embedded?
I can't find this package in 'apache-solr-solrj-1.4.[01].jar'.
Also I can't find any other sources than
>http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src/org/apache/solr/client/solrj/embedded/ , which
... but the code talks about SolrDocuments. These are higher-level
docs, used to construct the Lucene doc to index ... !!?!?!?!?
And the wiki talks about "Build Solr documents by aggregating data from
multiple columns and tables according to configuration"
http://wiki.apache.or
Hi,
my use case is the following:
In a sub-entity I request rows from a database for an input list of
strings:
/* multivalued, not required */
The root entity is "prog" and it has an optional multivalued field
called "vip". When the list of "vip" val
Stockii,
Solr's index is a Lucene Index. Therefore, Solr documents are Lucene
documents.
Kind regards,
- Mitch
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p995968.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hello.
I am writing a little text about SOLR and LUCENE and their use of the DIH.
What documents does the DIH create and insert? The wiki talks about
"solr documents", but I thought that solr uses lucene to do this, and so the
DIH creates Lucene Documents, not Solr Documents!?
what are doing the D
Hi ,
There are no required fields unless you specify fields as required. You can
remove or add as many fields as you want.
That is an example schema which shows how fields are configured
--
View this message in context:
http://lucene.472066.n3.nabble.com/schema-xml-tp995696p995800.html
Sent from the S
Hi Girish,
I am not aware of such a thing.
But you could use a middleware to prevent certain fields from being
retrieved, via the 'fl' parameter:
http://wiki.apache.org/solr/CommonQueryParameters#fl
E.g. for your customers the query looks like q=hello&fl=title and for
your admin the query looks like
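In middleware built on SolrJ, for example, that restriction is one call (a sketch;
field names from the example above, everything else assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class RestrictedFieldsQuery {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // Customer-facing query: only return the title field.
    SolrQuery customer = new SolrQuery("hello");
    customer.setFields("title");

    // Admin query: no fl restriction, all stored fields come back.
    SolrQuery admin = new SolrQuery("hello");

    System.out.println(server.query(customer).getResults());
    System.out.println(server.query(admin).getResults());
  }
}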
Hi,
I haven't read everything thoroughly but have you considered creating
fields for each of your (I think what you call) "party value"?
So that you can query like "client:Pramod".
You would then be able to facet on client and supplier.
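Something like this SolrJ sketch (field names as suggested above, everything else
is assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PartyFacets {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    SolrQuery q = new SolrQuery("client:Pramod");
    q.setFacet(true);
    q.addFacetField("client", "supplier");

    QueryResponse rsp = server.query(q);
    for (FacetField ff : rsp.getFacetFields()) {
      System.out.println(ff.getName() + " -> " + ff.getValues());
    }
  }
}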
Cheers,
Chantal
On Fri, 2010-07-23 at 23:23 +0200, Geert
Hi everybody,
I have been working with solr for a while and I have integrated it with
liferay 6.0.3. So every search request from liferay is processed by solr
and its index.
But I have to integrate another system; this system offers me a
webservice. The results of this webservice should be in the resul