Hi,
Is it possible to do nested faceting on both parent and child records in a
single query?
For example, I want to facet both author_s and book_s. Author is indexed as
a parent, whereas Book is indexed as a child.
I tried the following JSON Facet query, which does a facet of all the
list
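For reference, a sketch of the shape such a query could take with the JSON
Facet API, assuming the parent (Author) documents are marked with something
like type_s:author -- the filter name here is a guess, not from the original
post. The idea is to facet on the parent field first, then switch to the
child domain with blockChildren for the nested book facet:

json.facet={
  authors: {
    type: terms,
    field: author_s,
    facet: {
      books: {
        type: terms,
        field: book_s,
        domain: { blockChildren: "type_s:author" }
      }
    }
  }
}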
Hi Walter, unfortunately I use pagination, so that would not be possible.
Thanks
2016-10-04 0:51 GMT-03:00 Walter Underwood:
> How about sorting them after you get them back from Solr?
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
> > On
Dropping ngrams also makes the index 5X smaller on disk.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 3, 2016, at 9:02 PM, Walter Underwood wrote:
>
> I did not believe the benchmark results the first time, but it seems to hold
> up.
> Nobo
I did not believe the benchmark results the first time, but it seems to hold up.
Nobody gets a speedup of over a thousand (unless you are going from that
Oracle search thing to Solr).
It probably won’t help for most people. We have one service with very, very long
queries, up to 1000 words of free
How about sorting them after you get them back from Solr?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 3, 2016, at 6:45 PM, Lucas Cotta wrote:
>
> I actually could also use a custom similarity class that always returns 1.0
> then I could use
Thanks Erick.
firstSearcher and newSearcher events open two separate searchers. For
the external file field case at least, the cache created with the
firstSearcher is not being used after the newSearcher creates another cache
(with the same values).
I believe the warming is also per searcher. So,
You can have as many clients indexing to Solr (either Cloud or
stand-alone) as you want, limited only by the load you put
on Solr. I.e., if your indexing throughput is so great that it makes
querying too slow, then you have to scale back...
I know of setups with 100+ separate clients all indexing to
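As an illustrative sketch only (collection name, ZooKeeper hosts, and client
count are made up), many independent SolrJ clients indexing concurrently
look roughly like this:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ParallelIndexers {
  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(8); // 8 independent "clients"
    for (int i = 0; i < 8; i++) {
      final int clientId = i;
      pool.submit(() -> {
        // each client gets its own connection to the cluster
        try (CloudSolrClient client = new CloudSolrClient.Builder()
            .withZkHost("zk1:2181,zk2:2181,zk3:2181").build()) {
          client.setDefaultCollection("mycollection");
          for (int d = 0; d < 10_000; d++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", clientId + "-" + d);
            client.add(doc);
          }
          client.commit();
        } catch (Exception e) {
          e.printStackTrace();
        }
      });
    }
    pool.shutdown();
  }
}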
Walter:
What did you change? I might like to put that in my bag of tricks ;)
Erick
On Mon, Oct 3, 2016 at 6:30 PM, Walter Underwood wrote:
> That approach doesn’t work very well for estimates.
>
> Some parts of the index size and speed scale with the vocabulary instead of
> the number of docum
The very easiest way is to re-index. 10M documents shouldn't take
very long unless they're no longer available...
When you say you tried to use the index upgrader, which one? You'd
have to use the one distributed with 5.x to upgrade from 4.x->5.x, then
use the one distributed with 6.x to go from 5.
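As a sketch of the two-hop upgrade with Lucene's IndexUpgrader (jar versions
and the index path are placeholders; back up the index first, and keep
lucene-backward-codecs on the classpath for each hop):

java -cp lucene-core-5.5.3.jar:lucene-backward-codecs-5.5.3.jar \
  org.apache.lucene.index.IndexUpgrader -delete-prior-commits /path/to/shard/data/index
java -cp lucene-core-6.2.1.jar:lucene-backward-codecs-6.2.1.jar \
  org.apache.lucene.index.IndexUpgrader -delete-prior-commits /path/to/shard/data/index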
firstSearcher and newSearcher are definitely per core; they have to be, since they
are intended to warm searchers and searchers are per core.
I don't particularly see the benefit of firing them both either. Not
sure which one makes
the most sense though.
Best,
Erick
On Mon, Oct 3, 2016 at 7:10 PM,
Hello All,
We are trying to upgrade our production Solr, with 10 million documents,
from SolrCloud 4.1 (5 shards, 5 nodes, one collection, 3 replicas) to 6.2.
How do we upgrade the Lucene index created by Solr? Should I go into the
indexes created by each shard and upgrade and replicate them manually? Also
I am using external file fields with large external files, and I noticed
Solr Core Reload loads the external files twice: on the firstSearcher and
newSearcher events.
Does this mean the Core Reload triggers both events? What is the
benefit/reason for triggering both events at the same time? I see this on
V. 4
Hi,
I would like to check: how can we list all the fields that are available
in the index?
I'm using dynamic fields, so the Schema API is not working for me, as it
will only list patterns like *_s and *_f, not the full field names.
Also, as I'm using the Block Join Parent Query Parser, it wil
I actually could also use a custom similarity class that always returns 1.0;
then I could use small boost factors such as ^1, ^2, ^3, etc.
But I want to do this only in some specific queries (that may contain other
fields besides studentId).
How could I do this, i.e. use the custom similarity class only
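A sketch of such a similarity against the Lucene 4.x API (the class name is
made up; with every factor forced to 1.0, scores reduce to the query boosts):

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

public class ConstantSimilarity extends DefaultSimilarity {
  @Override public float tf(float freq) { return 1.0f; }                     // ignore term frequency
  @Override public float idf(long docFreq, long numDocs) { return 1.0f; }    // ignore rarity
  @Override public float lengthNorm(FieldInvertState state) { return 1.0f; } // ignore field length
  @Override public float queryNorm(float sumOfSquaredWeights) { return 1.0f; }
  @Override public float coord(int overlap, int maxOverlap) { return 1.0f; }
}

To restrict it to a field rather than a query, one option is a per-field-type
<similarity> in schema.xml (with SchemaSimilarityFactory) on the field type
that studentId uses; switching similarity per query isn't directly supported.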
That approach doesn’t work very well for estimates.
Some parts of the index size and speed scale with the vocabulary instead of the
number of documents.
Vocabulary usually grows at about the square root of the total amount of text
in the index. OCR’ed text
breaks that estimate badly, with huge v
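To make the square-root relationship concrete, assuming vocabulary V ~ k *
sqrt(total text T): a 1% sample already contains about sqrt(1/100) = 1/10 of
the final vocabulary, so linearly extrapolating the term dictionary from that
sample would overshoot by roughly 10x, even though the postings themselves
scale much closer to linearly with document count.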
In short, if you want your estimate to be closer, then run some actual
ingestion for, say, 1-5% of your total docs and extrapolate, since every
search product may have a different schema, different set of fields, different
indexed vs. stored fields, copy fields, different analysis chain, etc.
If you want t
After some more debugging, I think putting the dataDir in the
Map of properties is actually working, but still running
into a couple of issues with the setup...
I created an example project that demonstrates the scenario:
https://github.com/bbende/solrcore-datdir-test/blob/master/src/test/java/org
Yea I'll try to put something together and report back.
On Mon, Oct 3, 2016 at 6:54 PM, Alan Woodward wrote:
> Ah, I see what you mean. Putting the dataDir property into the Map
> certainly ought to work - can you write a test case that shows what’s
> happening?
>
> Alan Woodward
> www.flax.co.
Ah, I see what you mean. Putting the dataDir property into the Map certainly
ought to work - can you write a test case that shows what’s happening?
Alan Woodward
www.flax.co.uk
> On 3 Oct 2016, at 23:50, Bryan Bende wrote:
>
> Alan,
>
> Thanks for the response. I will double-check, but I be
Alan,
Thanks for the response. I will double-check, but I believe that is going
to put the data directory for the core under coreHome/coreName.
What I am trying to setup (and did a poor job of explaining) is something
like the following...
- Solr home in src/test/resources/solr
- Core home in sr
Hello,
I'm new to Solr (4.7.2) and I was given the following requirement:
Given a query such as:
studentId:(875141 OR 873071 OR 875198 OR 108142 OR 918841 OR 870688 OR
107920 OR 870637 OR 870636 OR 870635 OR 918792 OR 107721 OR 875078 OR
875166 OR 875151 OR 918829 OR 918808)
I want the results
We store some events data such as *accountId, startTime, endTime, timeSpent*
and some other searchable fields. We want to get all accountIds that spent
more than x hours between startTime and endTime, plus some other criteria
which are not important here. We can use a facet and stats query like
below:*stat
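A sketch of the JSON facet form of this, with field names taken from the
post. Note that, as far as I know, buckets can be sorted by the aggregate
but not filtered by it server-side, so the "more than x hours" cut may still
have to happen client-side:

q=startTime:[... TO ...] AND endTime:[... TO ...]
json.facet={
  accounts: {
    type: terms,
    field: accountId,
    limit: -1,
    sort: "totalTime desc",
    facet: { totalTime: "sum(timeSpent)" }
  }
}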
Hi everyone,
I'm up to speed on how Solr can be set up to provide high
availability (if one Solr server goes down, the backup one takes over). My
question is how do I make my custom crawler play "nice" with Solr in
this environment.
Let us say I set up Solr with 3 servers so that if on
Thanks Kevin, this worked for me.
On Mon, Oct 3, 2016 at 11:48 AM, Kevin Risden
wrote:
> You need to have the hadoop pieces on the classpath. Like core-site.xml and
> hdfs-site.xml. There is an hdfs classpath command that would help but it
> may have too many pieces. You may just need core-site
Below is some further testing. This was done in an environment that had no
other queries or updates during testing. We ran through several scenarios
so I pasted this with HTML formatting below so you may view this as a
table. Sorry if you have to pull this out into a different file for
viewing,
Hi Andy,
WordDelimiterFilter has a "types" option. There is an example file named
wdftypes.txt in the source tree that preserves #hashtags and @mentions. If you
follow this path, please use the Whitespace tokenizer.
Ahmet
On Monday, October 3, 2016 9:52 PM, "Whelan, Andy" wrote:
Hello,
I am guess
This doesn't answer your question, but Erick Erickson's blog on this topic is
invaluable:
https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
-----Original Message-----
From: Vasu Y [mailto:vya...@gmail.com]
Sent: Monday, October 3, 2016
Hello,
I am guessing that what I am looking for is probably going to require extending
StandardTokenizerFactory or ClassicTokenizerFactory. But I thought I would ask
the group here before attempting this. We are indexing documents from an
eclectic set of sources. There is, however, a heavy inter
You need to have the hadoop pieces on the classpath. Like core-site.xml and
hdfs-site.xml. There is an hdfs classpath command that would help but it
may have too many pieces. You may just need core-site and hdfs-site so you
don't get conflicting jars.
Something like this may work for you:
java -c
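Something along these lines (a sketch: the conf directory is a placeholder
for wherever your core-site.xml and hdfs-site.xml live, and I'm assuming the
usual org.apache.solr.index.hdfs.CheckHdfsIndex entry point):

java -cp "./server/solr-webapp/webapp/WEB-INF/lib/*:./server/lib/ext/*:/etc/hadoop/conf" \
  org.apache.solr.index.hdfs.CheckHdfsIndex hdfs://namenode:8020/solr/mycollection/core_node1/data/index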
Hello,
My SolrCloud 5.5 installation has Kerberos enabled. The CheckHdfsIndex test
fails to run. However, without Kerberos, I am able to run the test with no
issues.
I ran the following command:
java -cp
"./server/solr-webapp/webapp/WEB-INF/lib/*:./server/lib/ext/*:/hadoop/hadoop-client/lib/serv
Hi,
I am trying to estimate disk space requirements for the documents indexed
to Solr.
I went through the LucidWorks blog (
https://lucidworks.com/blog/2011/09/14/estimating-memory-and-storage-for-lucenesolr/)
and am using it as the template. I have a question regarding estimating
"Avg. Document Si
This should work:

SolrCore solrCore = coreContainer.create(coreName,
    Paths.get(coreHome).resolve(coreName), Collections.emptyMap());
Alan Woodward
www.flax.co.uk
> On 3 Oct 2016, at 18:41, Bryan Bende wrote:
>
> Curious if anyone knows how to create an EmbeddedSolrServer in Solr 6.
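A fuller sketch of the same approach, with the dataDir passed through the
properties map as discussed elsewhere in this thread (coreHome/coreName are
the thread's placeholders, the external path is made up, and a loaded
CoreContainer is assumed):

import java.nio.file.Paths;
import java.util.Collections;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.SolrCore;

// assuming an already-loaded CoreContainer, as in the snippet above
SolrCore solrCore = coreContainer.create(coreName,
    Paths.get(coreHome).resolve(coreName),
    Collections.singletonMap("dataDir", "/some/external/data/dir"));
EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, coreName);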
Hi, thank you.
1. So why do I get those back? They are not even 'legitimate' grandchildren.
2. If I do
localhost:8983/solr/nested_object_testing/query?debug=query&q=otype:pf&fl=*,[docid],[child parentFilter=otype:pf limit=500 childFilter='otype:(a p ap)']
--> I get other children except name doc
Curious if anyone knows how to create an EmbeddedSolrServer in Solr 6.x,
with a core where the dataDir is located somewhere outside of where the
config is located.
I'd like to do this without system properties, and all through Java code.
In Solr 5.x I was able to do this with the following code:
Hello,
you can strip grandchildren with
[child parentFilter=otype:pf limit=500 childFilter='otype:(a p ap)']
If you need to get three-level nesting you might check [subquery],
but I suppose it's easier to recover the hierarchy from what you have right
now.
On Mon, Oct 3, 2016 at 7:38 PM, Juan
I am fairly new to Solr, so it is possible I am writing the query wrong (I
have Solr 4.10).
On this data:
[{
  "id": -1666,
  "otype": "ao",
  "parent_id": -1,
  "parent_type": "root",
  "name": "JOSHUA N AARON MD PA",
  "account_number": "002812300",
  "tax_id": "50042772325",
  "group_npi": 134630688333,
  "
So if I cannot use allBuckets, since it's not filtering, how can I achieve
this?
On Fri, Sep 30, 2016 at 7:19 PM, Yonik Seeley wrote:
> On Tue, Sep 27, 2016 at 12:20 PM, Karthik Ramachandran
> wrote:
> > While performing json faceting with "allBuckets" and "mincount", I'm not
> > sure if I am expecti
Ok, I'll test this out.
Joel Bernstein
http://joelsolr.blogspot.com/
On Mon, Oct 3, 2016 at 4:40 AM, Markko Legonkov wrote:
> here is the stacktrace
>
> java.io.IOException: Unable to construct instance of
> org.apache.solr.client.solrj.io.stream.ComplementStream
> at org.apache.solr.cl
Here is the stack trace:

java.io.IOException: Unable to construct instance of org.apache.solr.client.solrj.io.stream.ComplementStream
    at org.apache.solr.client.solrj.io.stream.expr.StreamFactory.createInstance(StreamFactory.java:323)
    at org.apache.solr.client.solrj.io.stream.expr.S
Thanks for the quick response.
Here is what I tried:
complement(
  search(
    products,
    qt="/export",
    q="*:*",
    fq="product_id_i:15940162",
    fl="id, product_id_i, product_name_s,sale_price_d",
    sort="product_id_i asc"
  ),
  select(
    search(
      products,
      qt="/export",
      q="*:*",
      fq="product_id_i:15940162",
      fl="id, produ
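One thing worth checking (an assumption based on the ComplementStream
construction error above): complement() requires an on= clause naming the
field(s) to join on, added as the final argument after the two streams,
e.g.:

complement(
  search(...),
  select(search(...), ...),
  on="product_id_i"
)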