Is it possible to join documents from different collections through Parallel
SQL?
In addition to the LIMIT feature in Parallel SQL, can we use OFFSET to
implement paging?
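For context, the LIMIT form in question looks roughly like this when sent as the stmt parameter to the /sql handler (the collection and field names here are illustrative, not from a real setup):

```sql
-- sent as the stmt parameter to /solr/collection1/sql (names are examples)
SELECT id, field_a FROM collection1 ORDER BY id ASC LIMIT 10
```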
Thanks,
Imran
Thanks for your help, Joel and Susheel.
Regards,
Edwin
On 6 July 2017 at 05:49, Susheel Kumar wrote:
> Hello Joel,
>
> Opened the ticket
>
> https://issues.apache.org/jira/browse/SOLR-11017
>
> Thanks,
> Susheel
>
> On Wed, Jul 5, 2017 at 2:46 PM, Joel Bernstein wrote:
>
> > There are a number
Hi,
I would like to check: how can we place the indexed files of different
collections on different hard disks/folders while they are on the same node?
For example, I want collection1 to be placed on the C: drive, collection2 on
the D: drive, and collection3 on the E: drive.
I am using
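One way to sketch this, assuming a per-core dataDir override is acceptable, is to point each core's core.properties at a directory on the desired drive (the names and paths below are examples only):

```properties
# core.properties of a collection2 core; dataDir redirects its index to D:
name=collection2_shard1_replica1
dataDir=D:/solrdata/collection2_shard1_replica1
```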
Hi Erick.
What I wanted to say is that we have enough memory to store the shards,
and furthermore the JVMs' heap spaces.
The machine has 400 GB of RAM. I think we have enough.
We have 10 JVMs running on the machine, each of them using 16 GB.
The shard size is about 8 GB.
When we have query or indexing peaks, our proble
Hello Joel,
Opened the ticket
https://issues.apache.org/jira/browse/SOLR-11017
Thanks,
Susheel
On Wed, Jul 5, 2017 at 2:46 PM, Joel Bernstein wrote:
> There are a number of functions that are currently being held up because of
> conflicting duplicate function names. We haven't come to an agre
How are you planning to do manual routing? What key(s) are you thinking of using?
Second, the link I shared was about collection aliasing, and if you use that, you
will end up with multiple collections. I just want to clarify, since you said
above "...manual routing and creating alias".
Again, until the join feature is
This should be fixed in Solr 6.4:
https://issues.apache.org/jira/browse/SOLR-9077
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Jul 5, 2017 at 2:40 PM, Lewin Joy (TMS)
wrote:
> ** PROTECTED (Confidential: for internal parties only)
>
> Has anyone faced a similar issue?
>
> I have a collection named “solr_test”. I created
There are a number of functions that are currently being held up because of
conflicting duplicate function names. We haven't come to an agreement yet
on the best way forward for this yet. I think we should open a separate
ticket to discuss how best to handle this issue.
Joel Bernstein
http://joel
** PROTECTED (Confidential: for internal parties only)
Has anyone faced a similar issue?
I have a collection named “solr_test”. I created an alias to it as “solr_alias”.
This alias works well when I do a simple search:
http://localhost:8983/solr/solr_alias/select?indent=on&q=*:*&wt=json
But, this will not work when used in a stre
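For illustration, this is the kind of streaming request that hits the problem (a sketch; the expression and field names are assumed, not taken from the actual setup):

```text
http://localhost:8983/solr/solr_alias/stream?expr=search(solr_alias,q="*:*",fl="id",sort="id asc")
```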
We are working on a search application for large PDFs (~10-100 MB), which
are being indexed correctly.
However, we want to do some training in the pipeline, so we are
implementing some Spark MLlib algorithms.
But now, one requirement is to split the documents into either paragraphs or
pages. So
bq: We have enough physical RAM to store full collection and 16Gb for each JVM.
That's not quite what I was asking for. Lucene uses MMapDirectory to
map part of the index into the OS memory space. If you've
over-allocated the JVM space relative to your physical memory, that
space can start swapping
Hi Erick! Thanks for your response!
Our soft commit is 5 seconds. Why does a soft commit generate I/O? That's the first I've heard of it.
We have enough physical RAM to store full collection and 16Gb for each
JVM. The collection is relatively small.
I've tried (for testing purposes) disabling the transaction log (commenting
Use the _route_ field and put in "day_1" or "day_2". You've presumably
named the shards (the "shard" parameter) when you added them with the
CREATESHARD command so use the value you specified there.
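As a sketch (the collection name is an example; the shard names follow the day_1/day_2 example above), the _route_ value can be carried as a pseudo-field in each document posted to /update:

```json
[ { "id": "doc1", "_route_": "day_1" },
  { "id": "doc2", "_route_": "day_2" } ]
```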
Best,
Erick
On Wed, Jul 5, 2017 at 6:15 PM, wrote:
> I am trying out the document routing featur
Bad Things Can Happen. Solr (well, Lucene in this case) tries very
hard to keep disk-full operations from having repercussions, but it's
kind of like OOMs. What happens next?
It's not so much the merge/optimize, but what happens in the future
when the _next_ segment is written...
The merge or op
Some aggregations are supported by combining stats with pivot facets. See:
https://lucidworks.com/2015/01/29/you-got-stats-in-my-facets/
I don't quite think that works for your use case, though.
The other thing that _might_ help is all the Streaming
Expression/Streaming Aggregation work.
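For reference, the tagging pattern that article describes looks like this as request parameters (the field names are placeholders):

```text
stats=true&stats.field={!tag=piv1}price
facet=true&facet.pivot={!stats=piv1}category
```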
Best,
Eri
Hi all, I am curious to know what happens when Solr begins a merge/optimize
operation but then runs out of physical disk space. I haven't had the
chance to try this out yet, but I was wondering if anyone knows what the
underlying code's response to the situation would be if it happened. Thanks
-Dav
What is your soft commit interval? That'll cause I/O as well.
How much physical RAM and how much is dedicated to _all_ the JVMs on a
machine? One cause here is that Lucene uses MMapDirectory, which can be
starved for OS memory if you use too much JVM; my rule of thumb is
that _at least_ half of the
I really have no idea what "to ignore the prefix and check of the type" means.
When? How? Can you give an example of inputs and outputs? You might
want to review:
https://wiki.apache.org/solr/UsingMailingLists
And to add to what Furkan mentioned, in addition to schemaless you can
use "managed sch
Hi Furkan,
No. In the schema we also defined some static fields, such as the uri and
geo fields.
On 5 July 2017 at 17:07, Furkan KAMACI wrote:
> Hi Thaer,
>
> Do you use schemaless mode [1]?
>
> Kind Regards,
> Furkan KAMACI
>
> [1] https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode
>
Thanks Markus!
We already have SSDs.
About changing the topology: we tried yesterday with 10 shards, but the system
became more inconsistent than with the current topology (5x10). I don't know
why... too much traffic perhaps?
About the merge factor: we set the default configuration for some days... but when
a
Hi Thaer,
Do you use schemaless mode [1]?
Kind Regards,
Furkan KAMACI
[1] https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode
On Wed, Jul 5, 2017 at 4:23 PM, Thaer Sammar wrote:
> Hi,
> We are trying to index documents of different types. Documents have
> different fields. The fields
Try a mergeFactor of 10 (the default), which should be fine in most cases. If you have
an extreme case, either create more shards or consider better hardware (SSDs).
-Original message-
> From:Antonio De Miguel
> Sent: Wednesday 5th July 2017 16:48
> To: solr-user@lucene.apache.org
> Subject: R
Thanks a lot, Alessandro!
Yes, we have very big dedicated physical machines, with a topology of 5
shards and 10 replicas per shard.
1. The transaction log files are increasing, but not at this rate.
2. We've tried values between 300 and 2000 MB... without any
visible results.
3. We don't us
On 6/30/2017 1:30 AM, Jacques du Rand wrote:
> I'm not quite sure I understand the deep paging / cursorMark internals
>
> We have implemented it on our search pages like so:
>
> http://mysite.com/search?foobar&page=1
> http://mysite.com/search?foobar&page=2&cmark=djkldskljsdsa
> http://mysite.com/s
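For comparison, the raw cursorMark contract underneath those pages works like this (a sketch; the uniqueKey is assumed to be id): the sort must include the uniqueKey as a tiebreaker, the first request passes cursorMark=*, and each response returns a nextCursorMark to pass into the following request:

```text
/select?q=foobar&rows=10&sort=score desc,id asc&cursorMark=*
/select?q=foobar&rows=10&sort=score desc,id asc&cursorMark=<nextCursorMark from previous response>
```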
Does the "uniq" expression sound good to use for the UniqueMetric class?
Thanks,
Susheel
On Tue, Jul 4, 2017 at 5:45 PM, Susheel Kumar wrote:
> Hello Joel,
>
> I tried to create a patch to add UniqueMetric and it works, but soon
> realized, we have UniqueStream as well and can't load both of them (lik
Point 2 was the RAM buffer size:
    *ramBufferSizeMB* sets the amount of RAM that may be used by Lucene
    indexing for buffering added documents and deletions before they are
    flushed to the Directory.
    maxBufferedDocs sets a limit on the number of documents buffered
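For reference, these two knobs live in the indexConfig section of solrconfig.xml; a flush happens when either limit is reached (the values below are only illustrative, not a recommendation):

```xml
<indexConfig>
  <!-- flush to the Directory when either threshold is hit; example values -->
  <ramBufferSizeMB>100</ramBufferSizeMB>
  <maxBufferedDocs>1000</maxBufferedDocs>
</indexConfig>
```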
Is the physical machine dedicated? Is it a dedicated VM on shared metal?
Apart from these operational checks, I will assume the machine is dedicated.
In Solr, a write to disk does not happen only on commit; I can think of
other scenarios:
1) *Transaction log* [1]
2)
3) Spellcheck and
Thanks Erick for the answer. Function queries are great, but for my use
case what I really do is make aggregations (using JSON Facet, for example)
with these functions.
I have tried using function queries with JSON Facet, but it does not support
them.
Any other idea you can imagine?
2017-07-03
Hi,
We are trying to index documents of different types. Documents have different
fields. The fields are known at indexing time. We run a query on a database and
we index what comes back, using the query variables as field names in Solr. Our
current solution: we use dynamic fields with a prefix, for example feat
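Roughly, that pattern is a prefix-based dynamic field rule in the schema (the feat_ prefix is from the text above; the field type and attributes here are assumptions):

```xml
<!-- any incoming field named feat_<query variable> matches this rule -->
<dynamicField name="feat_*" type="string" indexed="true" stored="true"/>
```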
I am trying out the document routing feature in Solr 6.4.1. I am unable to
comprehend the documentation where it states that
“The 'implicit' router does not automatically route documents to different
shards. Whichever shard you indicate on the indexing request (or within
each document) will be u
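In practice that means something like the following (a sketch; the collection and shard names are examples): you name the shards yourself when creating the collection, then indicate the target shard on each request:

```text
/admin/collections?action=CREATE&name=logs&router.name=implicit&shards=day_1,day_2
/solr/logs/update?_route_=day_1   (or carry the shard name in a router.field)
```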
From the fact that someone has tried to access the /etc/passwd file via
your Solr (see all those WARN messages), it seems you have it exposed to
the world, unless of course it's a security scanner you use internally.
The Internet is a hostile place, and the very first thing I would do is
shield Solr fr
Hi
I'm not sure if any of you have had a chance to see this email yet.
We had a recurrence of the issue today, and I'm attaching the logs from today
as well, inline below.
Please let me know if any of you have seen this issue before, as this would
really help me get to the root of the probl
Hi,
We are implementing a SolrCloud cluster (version 6.6) with NRT requisites.
We are indexing 600 docs/sec with 1500 docs/sec peaks, and we are serving
about 1500 qps.
Our documents have 300 fields, some with docValues, and are about 4 KB each;
we have 3 million documents.
Hard commit is set to 15 minutes
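For reference, that policy maps onto solrconfig.xml roughly as follows (the soft commit interval is taken from the 5-second figure mentioned elsewhere in this thread; treat the snippet as a sketch):

```xml
<autoCommit>
  <maxTime>900000</maxTime>      <!-- 15 minutes; openSearcher=false keeps it cheap -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>        <!-- 5 seconds; drives NRT visibility -->
</autoSoftCommit>
```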
Thanks, Joel. I just wanted to confirm, as I was having trouble tracking down
when the change occurred.
-R
On 04/07/2017, 23:51, "Joel Bernstein" wrote:
In the very early releases (5x) the /export handler had a different format
than the /search handler. Later the /export handler was ch
On 04/07/17 18:10, Erick Erickson wrote:
> I think you'll get what you expect by something like:
> (*:* -someField:Foo) AND (otherField: (Bar OR Baz))
Yeah, that's what I figured. It's not a big deal, since we generate Solr
syntax using a parser/generator on top of our own query syntax. Still a
litt
Thanks for your reply,
but it only works when I get no response. As I said, I'm working with
arrays. As soon as I get an array, it doesn't matter if the array's length is
1 or 105; it returns what I got earlier.
#1 json response
"detailComment",
[
"100.01",
null,
"102.01",
null
]
return as