Hi,
The easiest solution would be to have the timestamp field indexed. Is there
any issue in doing the re-indexing?
If you want to process records in batches, you need an ordered list and a
bookmark: a field to sort on, and a counter / last id maintained as the
bookmark. This is mandatory to solve your problem.
Hi,
I have a machine with slow storage and not enough RAM to hold the whole
index. This causes the first queries (~5000) to be very slow (they are read
from disk and my CPU is in iowait most of the time); after that, reads from
the index become very fast and are served mainly from memory...
The Lucene PMC is pleased to announce the release of the Apache Solr
Reference Guide for Solr 4.4.
This 431-page PDF serves as the definitive user's manual for Solr 4.4. As
the first document of its kind released by the Lucene project, this
release marks a major milestone in the growth...
Basically, I was thinking about running a range query on the tstamp field,
as Shawn suggested, but unfortunately it was not indexed. Range queries
only work on indexed fields, right?
On Sun, Jul 28, 2013 at 9:49 PM, Joe Zhang wrote:
> I've been thinking about the tstamp solution in the past few days...
I've been thinking about the tstamp solution in the past few days. But too
bad, the field is available but not indexed...
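Yes, range queries run against the index, so tstamp would have to be
re-indexed with indexed="true" first. Once that is done, a range query could
be issued from SolrJ roughly as in this sketch; the core URL, the date bounds,
and the Trie date field type are assumptions for illustration:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class TstampRangeQuery {
    public static void main(String[] args) throws SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery q = new SolrQuery("*:*");
        // Only works once tstamp is indexed (e.g. a TrieDateField with indexed="true").
        q.addFilterQuery("tstamp:[2013-07-01T00:00:00Z TO NOW]");
        q.setSort("tstamp", SolrQuery.ORDER.asc);
        q.setRows(100);
        System.out.println("matches: " + solr.query(q).getResults().getNumFound());
    }
}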
I'm not familiar with SolrJ. Again, it sounds like SolrJ is providing the
counter value. If so, that would be equivalent to an autoincrement id. I'm
indexing from Nutch though; don't...
unsubscribe
On Sun, Jul 28, 2013 at 5:59 PM, Jan Høydahl wrote:
> Hi,
>
> Looking at the code, you are right. Whitelist processing is only done on
> detected languages, not on the fallback or fallbackFields languages, since
> these are assumed to be correct. Thus you should not pass in a fallback...
You should be able to attach a patch; I wonder if there was some
temporary glitch in JIRA. Is this persisting?
Let us know if this continues...
Erick
On Sun, Jul 28, 2013 at 12:11 PM, Elran Dvir wrote:
> Hi,
>
> I have created an issue: https://issues.apache.org/jira/browse/SOLR-5084
> I tried to attach my patch, but it failed...
Hi,
Looking at the code, you are right. Whitelist processing is only done on
detected languages, not on the fallback or fallbackFields languages, since
these are assumed to be correct. Thus you should not pass in a fallback
language, either in the input document or with langid.fallback, which can...
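For context, a hedged sketch of how these pieces usually fit together in
solrconfig.xml; the field names and language codes below are placeholders,
not the original configuration:

<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <!-- Fields used for detection and the field that receives the result -->
    <str name="langid.fl">title,content</str>
    <str name="langid.langField">language</str>
    <!-- Only detected languages are checked against the whitelist -->
    <str name="langid.whitelist">en,no</str>
    <!-- Fallback values bypass the whitelist check (as discussed above),
         so leave this out if you rely on the whitelist. -->
    <!-- <str name="langid.fallback">en</str> -->
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

Per the advice above, the fallback line is commented out when the whitelist is
meant to be authoritative.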
start is a window into the sorted, matched documents.
So, whether the second query matches far fewer documents, and hence has
less to sort, depends once again on where X lies in the distribution of
documents. If X is the first term in the field, the second query would match
all documents (except...
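To make the two forms concrete, here is a small SolrJ sketch of the queries
being compared; the offset value, the placeholder last id, and the use of the
id field for the range are illustrative assumptions:

import org.apache.solr.client.solrj.SolrQuery;

public class DeepPagingComparison {
    public static void main(String[] args) {
        int rows = 1000;
        String x = "doc_0123456";     // placeholder for "X", the last id already seen

        // Form 1: match everything, sort everything, then skip to the offset.
        // Internally Solr must still collect and rank start+rows entries,
        // so large start values get progressively more expensive.
        SolrQuery byOffset = new SolrQuery("*:*");
        byOffset.setSort("id", SolrQuery.ORDER.asc);
        byOffset.setStart(500000);
        byOffset.setRows(rows);

        // Form 2: only match ids after X, sort that (often smaller) subset,
        // and always read from the top of the result window.
        SolrQuery byRange = new SolrQuery("id:{" + x + " TO *]");
        byRange.setSort("id", SolrQuery.ORDER.asc);
        byRange.setStart(0);
        byRange.setRows(rows);

        System.out.println(byOffset);
        System.out.println(byRange);
    }
}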
Actually I have to rewrite my question:
Query 1:
q=*:*&rows=row_count&sort=id asc&start=X
and
Query 2:
q=id:{X TO *}&rows=row_count&sort=id asc&start=0
2013/7/29 Jack Krupansky
> The second query excludes documents matched by [* TO X], while the first
> query matches all documents.
>
> Relative...
The second query excludes documents matched by [* TO X], while the first
query matches all documents.
Relative performance will depend on the relative match count and the sort time
on the matched documents. Sorting will likely be the dominant factor, for an
equal number of documents. So, it depends...
What is the difference between:
q=*:*&rows=row_count&sort=id asc
and
q=id:{X TO *}&rows=row_count&sort=id asc
Does the first one try to get all the documents and then cut down the result,
or are they the same, or...? What happens in Solr's underlying processing for
these two queries?
Hi,
Yes, it can be done. If you search the mailing list for 'two solr instances
same datadir', you will find a post where I describe our setup; it works
well, even with automated deployments.
How do you measure performance? I am asking because one reason for us having
the same setup is sharing the OS...
Maybe you're right.
The problem is that, with the different types of queries, it is hard to
properly size the documentCache and queryResultCache (one query requests 10
results per page, others up to 12000).
We tried different approaches and cache sizes, and spent hours on JVM
configuration (OutOfMemory problems...
Hi,
I have created an issue: https://issues.apache.org/jira/browse/SOLR-5084
I tried to attach my patch, but it failed: " Cannot attach file
Solr-5084.patch: Unable to communicate with JIRA."
What am I doing wrong?
Thanks.
-Original Message-
From: Erick Erickson [mailto:erickerick...@g
Shawn had an interesting idea on another thread. It depends
on having basically an identity field (which I see how to do
manually, but don't see how to make it work as a new field type
in a distributed environment). And it's brilliantly simple, just
a range query identity:{last_id TO *]&sort=identity asc
Why wouldn't a simple timestamp work for the ordering? Although
I guess "simple timestamp" isn't really simple if the time settings
change.
So how about a simple counter field in your documents? Assuming
you're indexing from SolrJ, your setup is to query q=*:*&sort=counter desc.
Take the counter field value from the last document returned...
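A minimal sketch of the indexing side of that idea, assuming a single SolrJ
client assigns the counter; the field names, the counter seed, and the core
URL are assumptions:

import java.util.concurrent.atomic.AtomicLong;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CounterFieldIndexer {
    // Monotonically increasing counter; with several indexers (or restarts)
    // you would need a shared/persisted source for this value instead.
    private static final AtomicLong COUNTER = new AtomicLong(0);

    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("title", "example");
        // The counter is only used for ordering/bookmarking, so it just needs
        // to be an indexed, single-valued numeric field in the schema.
        doc.addField("counter", COUNTER.incrementAndGet());

        solr.add(doc);
        solr.commit();
    }
}

Queries then sort on counter and remember the last value seen as the bookmark,
as with the id-based approach earlier in this digest.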
You'd probably start with CloudSolrServer in the SolrJ code;
as far as I know, that's where the request is sent out.
I'd think that would be better than changing Solr itself,
since if you found that this was useful you wouldn't
be patching your Solr release, just keeping your client
up to date.
Best
Eric
My first guess is that you have old jars in your classpath. Try a
fresh install outside of your current setup as a first test. If that
works, then you'll need to track down where your old jars are
Best
Erick
On Fri, Jul 26, 2013 at 7:26 PM, Mingfeng Yang wrote:
> I am trying to upgrade