Nicolae,
You may be able to figure things out from the heap dump. You'll need to start
the JVM like this, for example:
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heap ...
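Once you have a dump you can take a first look with the JDK tools, e.g.:
jhat /tmp/heap/java_pid12345.hprof
(the pid in the file name will vary), or run jmap -histo <pid> against the
live process for a quick class histogram.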
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop
Robi,
Solr is indeed very stable. However, it can crash and I've seen it crash. Or
rather, I should say I've seen the JVM that runs Solr crash. For instance, if
you have a servlet container with a number of webapps, one of which is Solr,
and one of which has a memory leak, I believe all webapps in that container
can go down together.
Joe,
Maybe we can take a step back first. Would it be better if your index was
cleaner and didn't have flagged duplicates in the first place? If so, have you
tried using http://wiki.apache.org/solr/Deduplication ?
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop
Stephen,
Yes, *:* will work, or at least it did last time I tried it a few months ago.
This should quickly warm up your OS disk cache.
Yes, if searcher warming takes too long, you may need to commit less frequently
to avoid searcher overlap.
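If you want to fire that warm-up query from outside (cron, a startup script),
here is a minimal SolrJ sketch, assuming a default local install (URL and
class name are just examples):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

  public class WarmUp {
      public static void main(String[] args) throws Exception {
          // point this at your own Solr instance
          SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
          // match-all query: reads enough of the index to pull it into the OS cache
          server.query(new SolrQuery("*:*"));
      }
  }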
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop
Hi All,
Is there any plan to have Solr supported on Google App Engine? I saw a patch
for SolrJ submitted by Noble Paul. I think it would be good if we could
support Solr on App Engine.
Warm Regards,
Allahbaksh
Did you take a look at https://issues.apache.org/jira/browse/SOLR-1293,
which already handles this?
On Sat, Aug 1, 2009 at 2:40 AM, danben wrote:
>
> And, re-examining the URL, this is clearly my fault for improper use of
> SolrJ. Please ignore.
>
>
> danben wrote:
>>
>> Hi,
>>
>> I'm developing an application that requires a large number of cores, and
You might also look at Mahout, and specifically Taste
(http://lucene.apache.org/mahout/taste.html). Of course, it is a far
different approach from MLT.
-Grant
On Jul 31, 2009, at 8:08 AM, Andrew Ingram wrote:
Hi all,
I'm trying various methods of building a user-specific product
recommendation system
On Fri, Jul 31, 2009 at 5:23 PM, Yonik Seeley wrote:
> > Ok, so that was the curiosity question. More critical:
> >
> > When we first ask for facets for multi-valued fields, it can take up to
> > 25 seconds to get the response, although after that it's very fast (1.5
> > seconds or less even
On Fri, Jul 31, 2009 at 5:06 PM, Stephen Duncan Jr wrote:
> I have a couple more questions on the FieldValueCache. I see that the
> number of items in the cache is basically the number of multi-valued fields
> facets have been requested for. What does each entry in the cache actually
> contain?
And, re-examining the URL, this is clearly my fault for improper use of
SolrJ. Please ignore.
danben wrote:
>
> Hi,
>
> I'm developing an application that requires a large number of cores, and
> since lazy loading / LRU caching won't be available until 1.5, I decided
> to modify CoreContainer
I have a couple more questions on the FieldValueCache. I see that the
number of items in the cache is basically the number of multi-valued fields
facets have been requested for. What does each entry in the cache actually
contain? How does its size grow as the number of total documents
increases
Hello all, I have a collection of a few million documents, with many
duplicates in this collection. They have been clustered with a simple
algorithm; I have a field called 'duplicate' which is 0 or 1, and
fields called 'description', 'tags', and 'meta'. Documents are clustered on
different criteria and
Hi,
I'm developing an application that requires a large number of cores, and
since lazy loading / LRU caching won't be available until 1.5, I decided to
modify CoreContainer to hold me over.
Another requirement is that multiple Solr instances can access the same
cores (on NAS, for instance), so
Having a large number of fields is not the same as having a large number of
facets. Facets are something you would display to users as an aid for query
refinement or navigation. There is no way for a user to use 3700 facets at
the same time. So it is more a question of how to determine what facets t
>
> So if I search for "id:1 OR id:2 OR id:3", I want the MLT result to be a
> single list of items, rather than 3 lists.
>
I did not understand this. Isn't the "q" parameter in the MLT handler supposed
to serve the same objective? "/mlt?q=(id:1 OR id:2 OR id:3)&mlt.fl=mlt-field&mlt.mintf=1"
just works.
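And the same request through SolrJ, for what it's worth (a sketch; the
handler path and parameter names mirror the URL above, the Solr URL is an
example):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class MltDemo {
      public static void main(String[] args) throws Exception {
          SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
          SolrQuery q = new SolrQuery("id:1 OR id:2 OR id:3");
          q.setQueryType("/mlt");        // dispatch to the MoreLikeThisHandler
          q.set("mlt.fl", "mlt-field");  // field(s) to mine for interesting terms
          q.set("mlt.mintf", "1");       // minimum term frequency
          QueryResponse rsp = server.query(q);
          System.out.println(rsp);
      }
  }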
Hi Mark,
You're right - a custom request handler sounds like the right option.
I've created a handler as you suggested, but I'm having problems on Solr
startup (my class is LiveCoresHandler):
Jul 31, 2009 5:20:39 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.ClassCastException: Liv
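A minimal skeleton for such a handler under the 1.4 API would look roughly
like this (a sketch, not the actual LiveCoresHandler code; that exception
usually means the deployed class doesn't actually implement
SolrRequestHandler, e.g. a stale jar or wrong class name):

  import org.apache.solr.handler.RequestHandlerBase;
  import org.apache.solr.request.SolrQueryRequest;
  import org.apache.solr.request.SolrQueryResponse;

  public class LiveCoresHandler extends RequestHandlerBase {
      @Override
      public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
              throws Exception {
          rsp.add("cores", "...");  // placeholder: real logic would list the live cores
      }
      @Override public String getDescription() { return "Lists live cores"; }
      @Override public String getSourceId() { return ""; }
      @Override public String getSource() { return ""; }
      @Override public String getVersion() { return "1.0"; }
  }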
Hi all,
My solr project powers almost all the pages in our site and so needs to
be up, period. My question is: what can I do to ensure that happens?
Does solr ever crash, assuming reasonable load conditions and no extreme
index sizes?
I saw some comments about running solr under daemontools in ord
The CSVLoader is very fast but it doesn't support document or field boosting
at index time. If you don't need that, you can generate your input data for
Solr into CSV file(s) and load them with the CSVLoader. Just reload whenever
you change the schema. You will need to regenerate the data if you add/remove
fields.
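Loading is then just an HTTP POST of the file, something like this (assuming
the stock /update/csv handler and an example file name):

curl 'http://localhost:8983/solr/update/csv?commit=true' --data-binary @data.csv -H 'Content-type:text/plain; charset=utf-8'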
Grant said:
> I thought Apple promoted Leopard as being faster than Tiger...
I won't comment on what Apple thinks, but yes, my understanding was that
each version of the OS was getting faster, and given what they showed about
more thorough 64-bit support in Snow Leopard I'd expect the trend to
Use the MLT handler and then add the facet parameters after that:
/solr/mlt?q=title:A&mlt.fl=author&facet=true&facet.field=topic
Bill
On Fri, Jul 31, 2009 at 11:11 AM, Jérôme Etévé wrote:
> Hi all,
>
> Is there a way to enable faceting when using a more like this handler?
> I'd like to have facets from my similar documents.
Hi again!
Thanks for the answer, Grant.
> It could very well be the case that you aren't seeing any merges with
> only 20K docs. Ultimately, if you really want to, you can look in
> your data.dir and count the files. If you have indexed a lot and have
> an MF of 100 and haven't done an optimize
Hi all,
Is there a way to enable faceting when using a more like this handler?
I'd like to have facets from my similar documents.
Cheers !
J.
--
Jerome Eteve.
Chat with me live at http://www.eteve.net
jer...@eteve.net
We are using 1.3.0. Thanks for the suggestion. Will see if I can try one of
the nightly builds.
On Fri, Jul 31, 2009 at 7:49 PM, Erik Hatcher wrote:
> What version of Solr? Try a nightly build if you're at Solr 1.3 or
> earlier and you'll be amazed at the difference.
>
>Erik
>
>
> On Ju
What version of Solr? Try a nightly build if you're at Solr 1.3 or
earlier and you'll be amazed at the difference.
Erik
On Jul 31, 2009, at 10:00 AM, Rahul R wrote:
In a production environment, having the caches enabled makes a lot of sense.
And most definitely we will be enabling them.
On Jul 31, 2009, at 8:04 AM, Chantal Ackermann wrote:
Dear all,
I want to find out which settings give the best full index performance
for my setup. Therefore, I have been running a small index (less than 20k
documents) with a mergeFactor of 10 and 100.
In both cases, indexing took about 11.5 min.
All you need to do is paste the contents of your data-config.xml and hit
the button. The details show up in the RHS pane.
I'd recommend using a recent nightly so that the line numbers make sense
to us.
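(The debug page I mean is the DIH development console; on a default install
it should be at http://localhost:8983/solr/admin/dataimport.jsp.)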
On Fri, Jul 31, 2009 at 6:50 PM, ahammad wrote:
>
> I looked at the DIH
In a production environment, having the caches enabled makes a lot of sense.
And most definitely we will be enabling them. However, the primary idea of
this exercise is to verify if limiting the number of facets will actually
improve the performance.
An update on this: I did verify, and it looks like
I thought Apple promoted Leopard as being faster than Tiger, so that
would be my guess. Also, are they the same versions of 1.5? Are you
exercising them in the same way? (same queries, docs, etc.?)
On Jul 30, 2009, at 5:10 PM, Mark Bennett wrote:
As far as our NOC guys know the machines a
Simple but effective ;-)
On Fri, Jul 31, 2009 at 3:23 PM, Erik Hatcher wrote:
> There certainly could be some intermediate storage of documents prior to
> indexing, but as far as the Lucene index goes it is inherently a one-way
> process. Solr could facilitate this pretty easily... with an updat
You don't have to create a new "handler" for this... just do some
preprocessing on the result set that comes back from your first "id:1 OR id:2
OR id:3" query.
So:
- post your query
- get the relevant text-nodes from the result set (XSL processing is great
for that)
- combine the text
- send that text to the MLT handler
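For the sending part, the MLT handler can take raw text as a posted content
stream, so something like this should work (field names are examples):

curl 'http://localhost:8983/solr/mlt?mlt.fl=description&mlt.mintf=1' --data-binary 'your combined text here' -H 'Content-type:text/plain; charset=utf-8'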
There certainly could be some intermediate storage of documents prior
to indexing, but as far as the Lucene index goes it is inherently a
one-way process. Solr could facilitate this pretty easily... with an
update processor that wrote the documents coming in to some other
storage (one opti
I looked at the DIH debug page but, to be honest, I'm not sure how to use it
well or get anything out of it.
I am using a solr 1.4 nightly from March.
Cheers
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> you can try going to the DIH debug page. BTW which version of DIH are you
> using?
>
> On Fri,
Hi Edwin,
what prevents you from storing the data (possibly formatted in Solr's XML
input format) yourself on some disk?
Cheers,
Chantal
Edwin Stauthamer wrote:
That is a shame. I have much experience with Autonomy IDOL and the
possibility of quickly reindexing the content without making a call
On Fri, Jul 31, 2009 at 6:29 PM, Erik Hatcher wrote:
>
> On Jul 31, 2009, at 7:01 AM, Vannia Rajan wrote:
>
> On Fri, Jul 31, 2009 at 3:22 PM, Erik Hatcher wrote:
>>
>>> You'll have to reindex your documents from scratch. Such is the nature of
>>> changing the schema of an index. It's al
You can try going to the DIH debug page. BTW, which version of DIH are you using?
On Fri, Jul 31, 2009 at 6:31 PM, ahammad wrote:
>
> Hello,
>
> I tried it using the debug and verbose parameters in the address bar. This
> is what appears in the logs:
>
> INFO: Starting Full Import
> Jul 31, 2009 8:
That is a shame. I have much experience with Autonomy IDOL and the
possibility of quickly reindexing the content without making a call to the
original source is great. Just Export, update the config, and import
(=reindex) to see if, for instance, the performance is better, or just to
transport the in
On Jul 31, 2009, at 7:17 AM, Rahul R wrote:
Erik,
I understand that caching is going to improve performance. In fact we did a
PSR run with caches enabled and we got awesome results. But these wouldn't
be really representative because the PSR scripts will be doing the same
searches again an
Hello,
I tried it using the debug and verbose parameters in the address bar. This
is what appears in the logs:
INFO: Starting Full Import
Jul 31, 2009 8:54:40 AM org.apache.solr.handler.dataimport.SolrWriter
readIndexerProperties
INFO: Read dataimport.properties
Jul 31, 2009 8:54:40 AM org.apach
On Jul 31, 2009, at 7:01 AM, Vannia Rajan wrote:
On Fri, Jul 31, 2009 at 3:22 PM, Erik Hatcher wrote:
You'll have to reindex your documents from scratch. Such is the nature of
changing the schema of an index. It's always a great idea (in fact, I'd say
mandatory) to have a full reindex
Hi all,
I'm trying various methods of building a user-specific product
recommendation system and one idea is to use solr's MLT functionality.
For each customer I have a list of items they've bought, and I want to find
similar items that are new to the site.
The problem is that MLT operates on eac
Dear all,
I want to find out which settings give the best full index performance
for my setup.
Therefore, I have been running a small index (less than 20k documents)
with a mergeFactor of 10 and 100.
In both cases, indexing took about 11.5 min:
mergeFactor: 10
0:11:46.792
mergeFactor: 100
/ad
: If you make your EventListener implement SolrCoreAware you can get
: hold of the core on inform(). Use that to get hold of the
: SolrIndexWriter
Implementing SolrCoreAware I can get hold of the core and easily get hold of a
SolrIndexSearcher and thus a reader. But I can't see a way to get hold of
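The SolrCoreAware hook itself is simple enough (a sketch of just that part;
the class name is invented):

  import org.apache.solr.core.SolrCore;
  import org.apache.solr.util.plugin.SolrCoreAware;

  public class MyCommitListener implements SolrCoreAware {
      private SolrCore core;

      // inform() is called once the core is initialized; keep the reference
      public void inform(SolrCore core) {
          this.core = core;
      }
      // later, core.getSearcher() yields a RefCounted<SolrIndexSearcher>
  }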
Erik,
I understand that caching is going to improve performance. In fact we did a
PSR run with caches enabled and we got awesome results. But these wouldn't
be really representative because the PSR scripts will be doing the same
searches again and again. These would be cached and there would be virt
On Fri, Jul 31, 2009 at 3:22 PM, Erik Hatcher wrote:
> You'll have to reindex your documents from scratch. Such is the nature of
> changing the schema of an index. It's always a great idea (in fact, I'd say
> mandatory) to have a full reindex process handy.
>
>
Thank you for your response. Yes,
On Fri, Jul 31, 2009 at 3:17 PM, Tim Sell wrote:
> Are you using solr as a data store?
>
No, the data comes from somewhere else; solr is just for indexing and giving
back query results.
>
> It is not possible via solr to change existing documents in a solr
> index. It would be a nice feature though.
>
On Jul 31, 2009, at 2:35 AM, Rahul R wrote:
Hello,
We are trying to get Solr to work for a really huge parts database.
Details of the database:
- 55 million parts
- Totally 3700 properties (facets). But each record will not have value
for all properties.
- Most of these facets are defined
You'll have to reindex your documents from scratch. Such is the
nature of changing the schema of an index. It's always a great idea
(in fact, I'd say mandatory) to have a full reindex process handy.
Erik
On Jul 31, 2009, at 2:37 AM, Vannia Rajan wrote:
Hi,
We are using solr-se
That really is the only way; it would be far easier if you were
importing from another source.
Are you using solr as a data store?
It is not possible via solr to change existing documents in a solr
index. It would be a nice feature though.
~Tim.
2009/7/31 Vannia Rajan :
> Hi,
>
> We are using s
Thanks Noble and Shalin.
Cheers
Avlesh
On Fri, Jul 31, 2009 at 1:23 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> On Fri, Jul 31, 2009 at 11:53 AM, Avlesh Singh wrote:
>
> > Thanks for the revert Noble. A few questions are still open:
> >
> > 1. Can I pass parameters to DIH and
On Fri, Jul 31, 2009 at 11:53 AM, Avlesh Singh wrote:
> Thanks for the revert Noble. A few questions are still open:
>
> 1. Can I pass parameters to DIH and be able to use them inside the
> "query" attribute of an entity inside the data-config file?
>
Yes. Use ${dataimporter.request.X} or ${
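For example, with a request like this (the parameter name "x" is hypothetical):

http://localhost:8983/solr/dataimport?command=full-import&x=42

the entity in data-config.xml can reference it:

<entity name="item" query="select * from item where id='${dataimporter.request.x}'"/>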
On Fri, Jul 31, 2009 at 1:43 AM, ahammad wrote:
>
> Hello all,
>
> I've been having this issue for a while now. I am indexing a Sybase
> database. Everything is fantastic, except that there is 1 column that I can
> never get back. I don't have direct database access via Sybase client, but I
> was a