Thanks, Shawn!
We are indexing on the same HTTP endpoint. But since we have shardnum=1 and
replicafactor=1, each collection only has one core, so there should be no
distributed update/query, as we are using SolrJ's CloudSolrClient, which will
get the target URL of the solrnode when requesting to ea
Hi,
I uploaded (upconfig) config (schema and solrconfig XMLs) to Zookeeper and then
linked (linkconfig) the confname to a collection name.
When I attempt to create a collection using the API like this
.../solr/admin/collections?action=CREATE&name=abc&numShards=1&collection.configName=abc
... it
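As a sketch of the request being assembled (the base URL and names are illustrative, matching the truncated example above):

```python
from urllib.parse import urlencode

def build_create_url(base, name, num_shards, config_name):
    """Build a Collections API CREATE request URL.

    `base` is the Solr root, e.g. http://localhost:8983/solr
    (hypothetical host); the parameters mirror the example above.
    """
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "collection.configName": config_name,
    })
    return f"{base}/admin/collections?{params}"

url = build_create_url("http://localhost:8983/solr", "abc", 1, "abc")
```

Sending this URL to any live node then creates the collection against the linked config.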
On Tue, 2016-12-13 at 16:07 -0700, Chris Hostetter wrote:
> ** "warming" happens in a single-threaded executor -- so if there
> are multiple ondeck searchers, only one of them at a time is ever a
> "warming" searcher
> ** multiple ondeck searchers can be a sign of a potential performance
> problem
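Since only one searcher warms at a time, a pile-up of on-deck searchers usually means commits are arriving faster than warming completes. The ceiling on queued warming searchers is set in solrconfig.xml (value here is illustrative, not a recommendation):

```xml
<!-- solrconfig.xml: cap on concurrent on-deck/warming searchers -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```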
Hello folks,
I'm about to set up a Web service I created with PHP/Apache <--> Solr Cloud
I'm hoping to index a bazillion documents.
I'm thinking about using Linode.com because the pricing looks great. Any
opinions??
I envision using an Apache/PHP round robin in front of a solr cloud
My thought
See replies inline:
On Wed, Dec 14, 2016 at 11:16 AM, GW wrote:
> Hello folks,
>
> I'm about to set up a Web service I created with PHP/Apache <--> Solr Cloud
>
> I'm hoping to index a bazillion documents.
>
OK, how many inserts/second?
>
> I'm thinking about using Linode.com because the pric
Hi
We have solr 6.2.1.
One of the collections is causing lots of updates.
We see the next logs:
INFO org.apache.solr.core.SolrDeletionPolicy :
SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=/opt/solr-6.2.1/server/solr/collection_shard1_replica2/data/index,segFN=segments_qbmv,generation
On 12/14/2016 5:55 AM, moscovig wrote:
> We have solr 6.2.1.
> One of the collections is causing lots of updates.
> We see the next logs:
>
> INFO org.apache.solr.core.SolrDeletionPolicy :
> SolrDeletionPolicy.onCommit: commits: num=2
>
> commit{dir=/opt/solr-6.2.1/server/solr/collection_shard1
On 12/13/2016 10:55 PM, vasanth vijayaraj wrote:
> We are building an e-commerce mobile app. I have implemented Solr search and
> autocomplete.
> But we like the Amazon search and are trying to implement something like
> that. Attached a screenshot
> of what has been implemented so far
>
> The
On 12/14/2016 1:36 AM, Sandeep Khanzode wrote:
> I uploaded (upconfig) config (schema and solrconfig XMLs) to Zookeeper
> and then linked (linkconfig) the confname to a collection name. When I
> attempt to create a collection using the API like this
> .../solr/admin/collections?action=CREATE&name=a
On 12/14/2016 1:28 AM, forest_soup wrote:
> We are indexing on the same HTTP endpoint. But since we have shardnum=1 and
> replicafactor=1, each collection only has one core, so there should be no
> distributed update/query, as we are using SolrJ's CloudSolrClient which will
> get the target URL of
Shawn, thanks for the reply
Please take a look at that post. It's describing the same issue with ES
They describe the issue as "dentry cache is bloating memory"
https://discuss.elastic.co/t/memory-usage-of-the-machine-with-es-is-continuously-increasing/23537/5
Thanks
Gilad
Thanks,
I understand accessing solr directly. I'm doing REST calls to a single
machine.
If I have a cluster of five servers and say three Apache servers, I can
round robin the REST calls to all five in the cluster?
I guess I'm going to find out. :-) If so I might be better off just
running Apac
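Yes — with SolrCloud any node can route a request to the right shard, so a simple client-side rotation across the cluster works. A minimal sketch (hostnames and collection name are hypothetical):

```python
from itertools import cycle

# Hypothetical node list; any SolrCloud node can serve/route a request,
# so rotating REST calls across all of them spreads the load.
solr_nodes = cycle([
    "http://solr1:8983/solr",
    "http://solr2:8983/solr",
    "http://solr3:8983/solr",
])

def next_endpoint(collection):
    """Return the next node's select endpoint for `collection`."""
    return f"{next(solr_nodes)}/{collection}/select"

urls = [next_endpoint("products") for _ in range(4)]
```

In practice SolrJ's CloudSolrClient (or a load balancer in front of the nodes) does this for you; this only illustrates the round-robin idea from the PHP side.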
In the meantime I am removing all the explicit commits we have in the code.
Will update if it gets better.
--
View this message in context:
http://lucene.472066.n3.nabble.com/High-increasing-slab-memory-solr-6-tp4309708p4309718.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
In my project I have a one-leader, one-replica architecture.
I am using custom code (using DocumentUpdateProcessorFactory) for merging
old documents with incoming new documents.
e.g. 1. if the 1st document has 10 fields, all 10 fields will be indexed.
2. if the 2nd document has 8 fields, 5 of
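The merge described above — new fields overwrite, old-only fields survive — can be sketched like this (field names are hypothetical; this is plain-Python logic, not the Solr update-processor API):

```python
def merge_docs(old_doc, new_doc):
    """Merge an incoming partial document over the stored one:
    fields present in new_doc win; fields only in old_doc are kept.
    A sketch of what a custom update processor's merge step might do."""
    merged = dict(old_doc)
    merged.update(new_doc)
    return merged

old = {"id": "1", "title": "t", "color": "red"}
new = {"id": "1", "color": "blue", "size": "L"}
merged = merge_docs(old, new)
```

Note that Solr's built-in atomic updates already cover this pattern when all fields are stored.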
First off I'm a bit confused. You say you're working with an
UpdateProcessorFactory but then want to use SolrJ to get
a leader. Why do this? Why not just work entirely locally and
reach into the _local_ index (note, you have to do this
after the doc has been routed to the correct shard)? Once there
: In a situation where searchers A-E are queued in the states
: A: Current
: B: Warming
: C: Ondeck
: D: Ondeck
: E: Being created with newSearcher
:
: wouldn't it make sense to discard C before it gets promoted to Warming,
: as the immediate action after warming C would be to start warming D?
:
Hi all,
this is about using a function in nested facets, specifically the "sum()"
function inside a "terms" facet using the json.facet API.
My json.facet parameter looks like this:
json.facet={shop_cat: {type:terms, field:shop_cat, facet:
{cat_pop:"sum(popularity)"}}}
A snippet of the res
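The parameter from the message can be built as structured data and serialized, which avoids quoting mistakes (a sketch; field names come from the message above):

```python
import json

# Nested json.facet parameter: a terms facet on shop_cat with a
# sum(popularity) sub-facet, as in the message above.
facet = {
    "shop_cat": {
        "type": "terms",
        "field": "shop_cat",
        "facet": {"cat_pop": "sum(popularity)"},
    }
}
param = json.dumps(facet)  # value to send as json.facet=...
```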
We would like to enable queries for a specific term that doesn't appear as
part of a given expression. Negating the expression will not help, as we
still want to return items that contain the term independently, even if they
contain the full expression as well.
For example, we would like to search fo
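The condition itself — the term occurring at least once outside the phrase — can be checked per document, e.g. as the filtering step of a post-filter. A token-based client-side sketch (hypothetical helper, not a Solr API; whitespace tokenization only):

```python
def term_outside_phrase(text, term, phrase):
    """True if `term` occurs in `text` at least once outside `phrase`.

    Compares total term occurrences with occurrences accounted for
    by matches of the full phrase.
    """
    tokens = text.lower().split()
    phrase_tokens = phrase.lower().split()
    term = term.lower()
    total = tokens.count(term)
    in_phrase = 0
    for i in range(len(tokens) - len(phrase_tokens) + 1):
        if tokens[i:i + len(phrase_tokens)] == phrase_tokens:
            in_phrase += phrase_tokens.count(term)
    return total - in_phrase > 0
```

Doing this inside Solr (PostFilter or SearchComponent) follows the same counting idea against term positions.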
Hi everyone,
I am running Solr 5.5.0 on HDFS. It is a SolrCloud of 50 nodes and I have
the following config:
maxShardsPerNode: 1
replicationFactor: 1
I have been ingesting data into Solr for the last 3 months. With increase
in data, I am observing increase in the query time. Currently the size of
Hi,
Do you have a common list of phrases for which you want to prohibit partial
matches?
You can index those phrases in a special way, for example:
This is a new world hello_world hot_dog tap_water etc.
ahmet
On Wednesday, December 14, 2016 9:20 PM, deansg wrote:
We would like to enable queries for
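The suggestion above amounts to a preprocessing step that joins known phrases into single tokens before indexing. A minimal sketch (the phrase list is hypothetical, and matching is simplified to exact lowercase substrings):

```python
def join_phrases(text, phrases):
    """Replace each known phrase with an underscore-joined token
    before indexing (hello world -> hello_world)."""
    out = text
    for phrase in phrases:
        out = out.replace(phrase, phrase.replace(" ", "_"))
    return out

doc = join_phrases("this is a new world hello world hot dog",
                   ["hello world", "hot dog", "tap water"])
```

In Solr itself the same effect can come from analysis-chain filters (e.g. a synonym-style mapping), which avoids reindexing raw text by hand — though as noted, a dynamic phrase list makes any index-time approach awkward.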
Thanks!
Running the same code in cloud mode worked nicely almost right away. Getting it
to work in non-cloud mode is still non-trivial. I can get the DocList in
process(), but AFAIK it just provides Lucene docIds, not a nice DocumentList we
could work with.
The use-case is straightforward, the
We are using Solr Cloud 6.2
We have been noticing an issue where the index in a core shows as current =
false
We have autocommit set for 15 seconds, and soft commit at 2 seconds
This seems to cause two replicas to return different hits depending upon
which one is queried.
What would lead to the
The commit points on different replicas will trip at different wall
clock times so the leader and replica may return slightly different
results depending on whether doc X was included in the commit on one
replica but not on the second. After the _next_ commit interval (2
seconds in your case), doc
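The intervals described (15 s hard commit, 2 s soft commit) correspond to a solrconfig.xml block like this (illustrative):

```xml
<!-- solrconfig.xml: 15 s hard commit (no new searcher), 2 s soft commit -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>2000</maxTime>
</autoSoftCommit>
```

Because each replica's 2-second soft-commit timer fires independently, brief visibility differences between replicas are expected.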
Hello - I just spotted an oddity with two custom DocTransformers we
sometimes use on Solr 6.3.0. This particular transformer in the example just
transforms a long (or int) into a sequence of bits. I just use it as a
convenience to compare minhashes with my eyeballs. First example is very
s
On 12/14/2016 7:12 AM, moscovig wrote:
> Shawn, thanks for the reply
>
> Please take a look at that post. It's describing the same issue with ES
>
> They describe the issue as "dentry cache is bloating memory"
>
> https://discuss.elastic.co/t/memory-usage-of-the-machine-with-es-is-continuously-incr
Fairly certain you aren't overriding getExtraRequestFields, so when your
DocTransformer is evaluated it can't find the field you want it to
transform.
By default, the ResponseWriters don't provide any fields that aren't
explicitly requested by the user, or specified as "extra" by the
DocTran
Hello - I just looked up the DocTransformer Javadoc and spotted the
getExtraRequestFields method.
What you mention makes sense, so I immediately tried:
solr/search/select?omitHeader=true&wt=json&indent=true&rows=1&sort=id
asc&q=*:*&fl=minhash,minhash:[binstr]
{
"response":{"numFound":97895,"s
Thanks for the quick feedback.
We are not doing continuous indexing, we do a complete load once a week and
then have a daily partial load for any documents that have changed since
the load. These partial loads take only a few minutes every morning.
The problem is we see this discrepancy long afte
That should work... what version of Solr are you using? Did you
change the type of the popularity field w/o completely reindexing?
You can try to verify the number of documents in each bucket that have
the popularity field by adding another sub-facet next to cat_pop:
num_pop:{query:"popularity:[*
Let's back up a bit. You say "This seems to cause two replicas to
return different hits depending upon which one is queried."
OK, _how_ are they different? I've been assuming different numbers of
hits. If you're getting the same number of hits but different document
ordering, that's a completely d
Hi,
The list of phrases will be relatively dynamic, so changing the indexing
process isn't a very good solution for us.
We also considered using a PostFilter or adding a SearchComponent to filter
out the "bad" results, but obviously a true query-time support would be a
lot better.
On Wed, Dec 14,