Right. Of course, in most cases you'd run out of hardware resources before you
run out of Integers.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
From: Norberto Meijome <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday,
On Thu, Nov 13, 2008 at 3:52 AM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
>
> : You need to modify the schema which came with Solr to suit your data. There
>
> If i'm understanding this thread correctly, DIH ran "successfully", docs
> were created, some fields were stored and indexed (because the
The fact that it got committed at the end suggests there was no error in between.
Look at the status URL and see the number of rows returned, etc.
That gives a clue as to what really happened. Or you can
paste your data-config and status XMLs and we may be able to suggest
something
On Thu, Nov
The JdbcDataSource can run any query, even updates and deletes.
On Thu, Nov 13, 2008 at 9:27 AM, Noble Paul നോബിള് नोब्ळ्
<[EMAIL PROTECTED]> wrote:
> DIH can delete rows from the index; look at the 'deletedPkQuery' option:
> http://wiki.apache.org/solr/DataImportHandler#head-70d3fdda52de9ee4fdb54
DIH can delete rows from the index; look at the 'deletedPkQuery' option:
http://wiki.apache.org/solr/DataImportHandler#head-70d3fdda52de9ee4fdb54e1c6f84199f0e1caa76
Deleting from the DB is not possible for DIH itself, but you can write a
Transformer or EntityProcessor which can do that.
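A minimal, untested sketch of such a transformer, assuming Solr 1.3's DIH
Transformer API (transformRow(Map, Context)); the JDBC URL, credentials,
table name and "id" column are made up for illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Map;
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

// Deletes each row from the source table after DIH has read it.
// Opening a connection per row is wasteful; a real version would
// cache the connection or batch the deletes.
public class PurgeAfterImportTransformer extends Transformer {
  public Object transformRow(Map<String, Object> row, Context context) {
    Object id = row.get("id");
    if (id == null) return row;
    try {
      Connection conn = DriverManager.getConnection(
          "jdbc:mysql://localhost/mydb", "user", "pass");
      try {
        PreparedStatement ps =
            conn.prepareStatement("DELETE FROM items WHERE id = ?");
        ps.setObject(1, id);
        ps.executeUpdate();
        ps.close();
      } finally {
        conn.close();
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
    return row; // pass the row through so it still gets indexed
  }
}

It would be wired in via the entity's transformer="..." attribute in
data-config.xml.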
On Wed, Nov 12
Hi Noble,
thanks for the reply; my comments are below
>>why is the id field multivalued?
I was just trying various options. Yes, this ID is unique, and I check for
duplicates; when I ran a distinct(id) query against the MySQL database, it
returned almost 2 million rows.
>> look at the status host:port/dataim
On Thu, Nov 13, 2008 at 3:52 AM, Chris Hostetter
<[EMAIL PROTECTED]>wrote:
>
> : You need to modify the schema which came with Solr to suit your data.
> There
>
> If i'm understanding this thread correctly, DIH ran "successfully", docs
> were created, some fields were stored and indexed (because t
It is implemented. We used this feature to ingest data from a REST API quite
similar to Solr's own.
Our use case was that the first call to the API returned a token in the XML
response. To get the next set of results, the value of the token in the
last response needed to be passed as a request p
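A rough sketch of what such a transformer might look like, assuming the
special $hasMore/$nextUrl fields described on the DIH wiki page; the
response field name "token" and the URL template are made up:

import java.util.Map;
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

// If the row carries a continuation token, tell XPathEntityProcessor
// to fetch another page by setting the special $hasMore/$nextUrl fields.
public class NextPageTransformer extends Transformer {
  public Object transformRow(Map<String, Object> row, Context context) {
    Object token = row.get("token");
    if (token != null) {
      row.put("$hasMore", "true");
      row.put("$nextUrl", "http://host/api/search?continue=" + token);
    }
    return row;
  }
}

XPathEntityProcessor keeps issuing requests as long as $hasMore is "true",
using $nextUrl for the next call.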
On Tue, 11 Nov 2008 20:39:32 -0800 (PST)
Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> With Distributed Search you are limited to # of shards * Integer.MAX_VALUE.
Yeah, makes sense. And I would suspect that since this is PER INDEX, it applies to
each core only (so you could have n cores in m shards
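(Integer.MAX_VALUE is 2^31 - 1, roughly 2.1 billion documents per index, so
m shards would top out around m * 2.1 billion docs.)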
On Tue, 11 Nov 2008 10:25:07 -0800 (PST)
Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Doc ID gaps are zapped during segment merges and index optimization.
>
thanks Otis :)
b
_
{Beto|Norberto|Numard} Meijome
"I didn't attend the funeral, but I sent a nice letter saying
Hi!
I have a similar problem but no solution for now. I will post
my progress.
Marc Sturlese wrote:
>
> Hey there,
> For a few weeks now I have been trying to migrate my Lucene core app to Solr, and
> many questions are coming to my mind...
> Before being at ApacheCon I thought that my L
: The reason I brought the question back up is that hossman said:
...
: I tried it and it didn't work, so I was curious if I was still doing
: something wrong.
no ... i'm just a foolish foolish man who says things with a lot of
authority even though i clearly don't know what i'm talking
: I effectively need to use a multiplication in the sorting of the items.
: Something like score*popularity.
: It seems the only way to do this is to use a bf parameter.
: However how do you use bf in combination with the standard requestHandler?
functions are understood by the standard query par
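As an illustration (field name "popularity" assumed): with the standard
request handler you can embed a function via the _val_ hook, e.g.

q=name:ipod _val_:"popularity"

Note that _val_, like bf in dismax, folds the function value into the score
additively, so it boosts by popularity rather than computing a true
score*popularity product.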
: How about create a new core, index data, then swap the core? Old core
: is still available to handle queries till new core replaces it.
a new SolrCore shouldn't help in a situation like this ... with
snapshots and commits on a single SolrCore you at least get the benefits
of autowarming and
In http://wiki.apache.org/solr/DataImportHandler there is this
paragraph:
If an API supports chunking (when the dataset is too large), multiple
calls need to be made to complete the process. XPathEntityProcessor
supports this with a transformer. If the transformer returns a row which
contains a fi
: You need to modify the schema which came with Solr to suit your data. There
If i'm understanding this thread correctly, DIH ran "successfully", docs
were created, some fields were stored and indexed (because they did exist
in the schema) but other fields the user was attempting to create didn
: I get the exception when accessing http://localhost:7001/solr/admin but
: http://localhost:7001/solr/admin/luke works fine.
i don't have time to really dig into the code right now, but out of
curiosity what happens when you hit http://localhost:7001/solr/admin/
and/or http://localhost:7001/so
On Wed, Nov 12, 2008 at 3:53 PM, Feak, Todd <[EMAIL PROTECTED]> wrote:
> Is support for setting the FSDirectory this way built into 1.3.0
> release? Or is it necessary to grab a trunk build.
It's not in 1.3, you need a very recent trunk build.
-Yonik
Another way to handle this is not to run the commit script at peak
time (still pull snapshots periodically). Keep track of the number of
requests, resource utilization, etc.; if the number of requests exceeds
the threshold, don't commit.
Also, how many segments do you see under the index dir? High numb
Well we never had 1.2 deployed, so I don't know if it's a new issue or not...
Yonik Seeley wrote:
>
> Warming only uses one CPU, so it shouldn't have that much of an impact
> on a multi-CPU box.
>
> Did this issue begin with Solr 1.3? Perhaps it has something to do
> with our use of reopen()
Is support for setting the FSDirectory this way built into 1.3.0
release? Or is it necessary to grab a trunk build.
-Todd Feak
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Wednesday, November 12, 2008 11:59 AM
To: solr-user@lucene.ap
NIO support in the latest Solr development versions does not work yet
(I previously advised that some people with possible lock contention
problems try it out). We'll let you know when it's fixed, but in the
meantime you can always set the system property
"org.apache.lucene.FSDirectory.class" to
"
Warming only uses one CPU, so it shouldn't have that much of an impact
on a multi-CPU box.
Did this issue begin with Solr 1.3? Perhaps it has something to do
with our use of reopen() (to share parts of the index that are not in
use). This can lead to greater lock contention while reading from th
And you have searcher warming set up?
Does it use sort and do your queries use sort?
What do your cache settings look like?
How big is your index, how much RAM does your machine have, how much heap does
the JVM have, what does vmstat output look like during warm-up?
...
Otis
--
Sematext -- http:
How about create a new core, index data, then swap the core? Old core
is still available to handle queries till new core replaces it.
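As a sketch (multicore setup and core names assumed): rebuild into a staging
core, then swap it with the live one via the CoreAdmin handler, e.g.

http://localhost:8983/solr/admin/cores?action=SWAP&core=live&other=staging

Queries keep hitting the same core name while the freshly built index takes
over.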
-Original Message-
From: Lance Norskog [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 12, 2008 11:16
To: solr-user@lucene.apache.org
Subject: RE
two general comments on this thread as a whole...
1) it's hard to compare the timing of a query with no synonyms and a query
with a lot of synonyms since the number of terms increases and (most
likely) the number of documents matched in increases as well.
the more clauses in the query, the mor
Yonik Seeley wrote:
>
> On Wed, Nov 12, 2008 at 2:06 PM, oleg_gnatovskiy
> <[EMAIL PROTECTED]> wrote:
>> The rsync seems to have nothing to do with slowness, because while the
>> rsync
>> is going on, there isn't any reload occurring, once the files are on the
>> system, it tries a curl request
Yes, this is the cache autowarming.
We turned this off and staged separate queries that pre-warm our standard
queries. We are looking at pulling the query server out of the load balancer
during this process; it is the most effective way to give fixed response
time.
Lance
-Original Message-
On Wed, Nov 12, 2008 at 2:06 PM, oleg_gnatovskiy
<[EMAIL PROTECTED]> wrote:
> The rsync seems to have nothing to do with slowness, because while the rsync
> is going on, there isn't any reload occurring, once the files are on the
> system, it tries a curl request to reload the searcher, which at th
The rsync seems to have nothing to do with slowness, because while the rsync
is going on, there isn't any reload occurring, once the files are on the
system, it tries a curl request to reload the searcher, which at that point
causes the delays. The file transfer probably has nothing to do with thi
On Tue, Nov 11, 2008 at 9:31 PM, oleg_gnatovskiy
<[EMAIL PROTECTED]> wrote:
> Hello. We have an index with 15 million documents working on a distributed
> environment, with an index distribution setup. While an index on a slave
> server is being updated, query response times become extremely slow (
: I am using Solr Lucene - 2.0
Hmmm that doesn't exist.
what do you see when you view the /admin/registry.jsp page in your
browser, and you look at these values...
Solr Specification Version
Solr Implementation Version
Lucene Specification Version
Lucene Imp
You could use a function query with the StandardRequestHandler to influence
the final score and sort the results by score. If you want to control how
much the function query affects the original score, you could use
the linear function.
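For example (field name and constants made up), with the standard handler
you could fold in a scaled value via the _val_ hook:

q=ipod _val_:"linear(popularity,0.5,0.0)"

linear(x,m,c) computes m*x + c, so the multiplier m controls how strongly
popularity shifts the final score.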
-Original Message-
From: lajkonik86 [mailto:[EMAIL PROTECT
: Cannot parse ' +i_subjects:"Film': Lexical error at line 1, column 19.
: Encountered: after : "\"Film"
: i do not want it splitting commas and replacing them with fq, but completely
: matching on i_subjects:"film,media,mass communication"
i'm having trouble interpreting the formatting of you
Hey there,
For a few weeks now I have been trying to migrate my Lucene core app to Solr, and
many questions are coming to my mind...
Before being at ApacheCon I thought that my Lucene index worked fine with my
Solr search engine, but after my conversation with Erik in the Solr BootCamp
I understood that the
Could you elaborate further? 20 synonyms would translate to 20
BooleanQueries. Are you saying each BooleanQuery requires a disk
access?
-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 12, 2008 7:46
To: solr-user@lucene.apache.org
Sub
hmmm -- if it does not come out with debugQuery, I don't think there
is a way to get it easily.
Can you create a JIRA issue for this? Adding the 'explain' info for
each MLT result should be relatively easy.
ryan
On Nov 12, 2008, at 11:43 AM, Jeff Newburn wrote:
I have also tried de
Yes, there is a QueryComponent which checks whether there are any results
for a query and, if the results are not present, modifies the
Boolean query.
So this QueryComponent does call process().
Thanks,
Kalyan Manepalli
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTE
Ahh shoot!
Ok, I copied the original thread at the bottom for context.
Basically what I need is the bq functionality with the
StandardRequestHandler. I can't use dismax because that requires using qf
and doesn't offer as much flexibility as we need. I have used Erik's
technique of appending an
I have also tried debugQuery=true. It outputs a large amount of data but
none of it appears related to moreLikeThis information. Continuing to work
on it but not sure how it is going to be possible to debug the
functionality. Does anybody have any other suggestions on how to extract
information
Thanks for the reply Koji.
The reason I asked is that I have a user who wants to post their
own updates.
When the postCommit is active, after he posts his documents, it appears
that the job has stalled because there is a long period with no output.
After speaking with me, he now realizes
Jerry,
> I would like to see the output from snapshooter
snapshooter writes its output to snapshooter.log. But,
> Is there a way to send snapshooter's output to stdout of the terminal
> which I executed the commit command?
I don't think it's possible.
(You can modify RunExecutableListener to redirect stdou
If there are twenty synonyms, then a one-term query becomes a
twenty-term query, and that means 20X more disk accesses.
wunder
On 11/12/08 7:08 AM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote:
>
> On Nov 12, 2008, at 9:41 AM, Manepalli, Kalyan wrote:
>> I did the index time synonyms and results do
On Nov 12, 2008, at 9:41 AM, Manepalli, Kalyan wrote:
I did the index-time synonyms and the results do look much better
than the query-time expansion.
But is there a reason for the searches to be that slow? I understand
that we have a pretty long list of synonyms (one word contains at least
20
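To illustrate the expansion (the synonym list here is made up): with
query-time synonyms, q=tv becomes something like
q=(tv OR television OR "television set" OR telly OR ...),
i.e. one clause, and one set of term lookups, per synonym.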
See https://issues.apache.org/jira/browse/LUCENE-1417 and
http://lucene.markmail.org/message/sktohlgqxcpmpf7z?q=list:org%2Eapache%2Elucene%2Esolr-user+spellchecker+Rennie
In short, frequency is the second-order sort level. I think it should
be made pluggable. A patch would be most welcome.
Hi Erik,
I did the index-time synonyms and the results do look much better
than the query-time expansion.
But is there a reason for the searches to be that slow? I understand
that we have a pretty long list of synonyms (one word contains at least
20 words as synonyms). Does this have such an adv
On Nov 12, 2008, at 9:12 AM, Kashyap, Raghu wrote:
{quote}It's hard to tell where exactly the bottleneck is without
looking
at the server and a few other things. {quote}
Can you suggest some areas where we can start looking into this issue?
Using &debugQuery=true will output the timings of
Hi Otis,
{quote}It's hard to tell where exactly the bottleneck is without looking
at the server and a few other things. {quote}
Can you suggest some areas where we can start looking into this issue?
-Raghu
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tue
I'm experiencing the same java.lang.StackOverflowError problem with Solr
1.3.0 on WebLogic 10.3 when accessing the admin page.
I'm using the distributed war but have added a weblogic.xml file to the
WEB-INF directory.
I get the exception when accessing http://localhost:7001/solr/admin but
http:/
I effectively need to use a multiplication in the sorting of the items.
Something like score*popularity.
It seems the only way to do this is to use a bf parameter.
However how do you use bf in combination with the standard requestHandler?
hossman wrote:
>
>
> : Now I need to know whether the
bq only works with dismax (&defType=dismax). To get the same effect
with the lucene/solr query parser, append a clause to the original
query (OR'ing it in).
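A sketch of that technique (field and boost made up): where dismax would
take

q=ipod&bq=inStock:true^5

the lucene query parser equivalent is

q=+(ipod) (inStock:true^5)

i.e. the original query as a required clause with the boost query OR'd in as
an optional clause.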
Erik
On Nov 11, 2008, at 11:52 PM, Otis Gospodnetic wrote:
Hi,
It's hard to tell what you are replying to since you remove
Tried that and managed to get no results. Cheers for the help.
&fq=i_subjects:Anesthesia&fq=i_subjects:Intensive+Care&fq=i_subjects:Pain+Management
ryantxu wrote:
>
>>
>> tried removing the plusses I am inserting but now it shows too many
>> results
>>
>> &fq=+i_subjects:Film+i_subjects:+media+
Hi,
I want to use a SolrIndexSearcher for some special searches in my app...
I start up my Solr with two cores in it (core_de & core_uk).
But when I try this, my Solr server generates a completely new core
instead of
using the existing one...
After 5-6 searches I run out of memory :-(
Examp