I have an existing SolrCloud 4.4 cluster configured with ZooKeeper.
The current setup is 3 shards; each shard has a leader and a replica. All
are mapped to the same collection1.
{"collection1":{
"shards":{
"shard1":{
"range":"8000-d554",
"state":"active",
"replicas":{
"core_n
Hi Edwin,
you are limiting the portion of the document analyzed for highlighting in your
solrconfig.xml to 100 characters.
Thus, snippets are only produced correctly if the query was found in the first
100 characters of the document.
If you set this parameter to -1,
the original highlighter us
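For reference, a minimal sketch of how that parameter might look in solrconfig.xml (the handler name and surrounding layout here are illustrative, not taken from the poster's config):

```xml
<!-- Sketch: raising hl.maxAnalyzedChars so the whole field is scanned
     for snippets; -1 removes the limit entirely. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <int name="hl.maxAnalyzedChars">-1</int>
  </lst>
</requestHandler>
```

The same parameter can also be passed per-request as `hl.maxAnalyzedChars=-1`.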
Exactly. In Solr there is no concept of "nested fields".
But there is the concept of nested documents (via query-time join and
index-time (block) join).
You can have a "flat" schema which will actually be used to model nested
documents at index and query time.
There is plenty of documentation abo
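To make that concrete, a hypothetical example of how nested documents are sent for index-time block join (the field names here are made up for illustration; children must be posted inside their parent in one add command):

```xml
<add>
  <doc>
    <field name="id">1</field>
    <field name="doctype">200</field>
    <field name="title">parent document</field>
    <doc>
      <field name="id">1.1</field>
      <field name="comment">child document text</field>
    </doc>
  </doc>
</add>
```

At query time a parent query such as `{!parent which="doctype:200"}comment:text` would then return the matching parent documents.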
Is there any separate API available in SolrJ 5.2.1 for setting version=true
while adding or updating a solr doc?
On Dec 13, 2015 8:03 AM, "Debraj Manna" wrote:
> Thanks Alex. This is what I was looking for. One more query: how do I set
> this from SolrJ while calling add()? Do I have to make a curl
> 1) "read" should cover all the paths
This is very fragile. If all paths were closed by default, forgetting to
configure a path would not result in a security breach, as it does today.
/Jan
What about UpdateRequest().getParam().add("versions","true") ?
On Mon, Dec 14, 2015 at 1:15 PM, Debraj Manna
wrote:
> Is there any separate API available in SolrJ 5.2.1 for setting version=true
> while adding or updating a solr doc?
> On Dec 13, 2015 8:03 AM, "Debraj Manna" wrote:
>
> > Thanks
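For what it's worth, a sketch of how the flag could be passed from SolrJ (untested here; there is no dedicated setter, so this relies on the generic setParam inherited by UpdateRequest — "solrClient" and "doc" are assumed to exist in your code):

```java
// Sketch: pass versions=true as a plain request parameter so Solr
// echoes the assigned _version_ values back in the response.
UpdateRequest req = new UpdateRequest();
req.add(doc);
req.setParam("versions", "true");
UpdateResponse rsp = req.process(solrClient);
// The response should then carry id/version pairs under "adds".
```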
Hi Mikhail,
I'm having a bit of a problem constructing the query for Solr when I have
been trying to use the block join query. As you said, I can't use + or -
in front of a block join query, so I have to put {!parent
which="doctype:200"} in front. And after this, all fields are child
document, so
In addition to the link in the previous response,
http://blog.griddynamics.com/2013/09/solr-block-join-support.html provides
an example of such a combination. From my experience, fq doesn't participate
in highlighting or scoring.
On Mon, Dec 14, 2015 at 2:45 PM, Novin Novin wrote:
> Hi Mikhail,
>
>
". If all paths were closed by default, forgetting to configure a path
would not result in a security breach, as it does today."
But it will still mean that unauthorized users are able to access things,
like a guest being able to post to "/update". Just authenticating is not
enough without proper authorization.
O
You don't need to submit a sha256; Solr will do it itself. Just use the
provided commands.
Please refer to this:
https://cwiki.apache.org/confluence/display/solr/Basic+Authentication+Plugin
On Mon, Dec 14, 2015 at 6:56 AM, soledede_w...@ehsy.com <
soledede_w...@ehsy.com> wrote:
> I want to restrict Admi
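For reference, the security.json from that wiki page looks roughly like this (the credentials value is the documented example hash for the solr/SolrRocks user from the reference guide, not something you compute by hand):

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": { "solr": "admin" },
    "permissions": [{ "name": "security-edit", "role": "admin" }]
  }
}
```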
Hello,
I am using Solr 4.10.1. I have a field with stopwords,
and I use pf2/pf3 on that field with a slop of 0.
If the request is "Gare Saint Lazare", and I have a document "Gare de Saint
Lazare", "de" being a stopword, this document doesn't get the pf3 boost,
because of "de".
I was wondering
14 December 2015, Apache Solr™ 5.4 available
Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., W
Thanks Man.
On Mon, 14 Dec 2015 at 12:19 Mikhail Khludnev
wrote:
> In addition to the link in the previous response,
> http://blog.griddynamics.com/2013/09/solr-block-join-support.html provides
> an example of such a combination. From my experience, fq doesn't participate
> in highlighting nor scori
This isn't a bug. During pf3 matching, since your query has only three
tokens, the entire query will be treated as a single phrase, and with slop
= 0, any word that comes in the middle of your query ('de' in this case)
will cause the phrase to not be matched. If you want to get around this,
try se
Moreover, the stopword 'de' will be applied to your queries and not to your
documents, meaning if you query 'Gare de Saint Lazare', the terms actually
searched for will be Gare, Saint and Lazare; 'de' will be filtered out.
On Mon, Dec 14, 2015 at 8:49 PM Binoy Dalal wrote:
> This isn't a bug. During pf3
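One workaround (an assumption on my part, since the suggestion above is cut off) is to allow a little slop on the phrase fields via the edismax ps2/ps3 parameters, so the removed stopword no longer breaks the phrase match. A sketch, with hypothetical field names and boosts:

```text
# Allowing a slop of 1 on the bigram/trigram phrase fields lets a
# document like "Gare de Saint Lazare" still earn the pf3 boost for
# the query "Gare Saint Lazare" despite the stopword gap.
defType=edismax
pf2=title^5
pf3=title^10
ps2=1
ps3=1
```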
Wait a second. There are other ways to secure Solr that don't work
with any sort of role-based security control. What I do is place a reverse
proxy in front of Apache Solr on port 80, and have that reverse proxy use CAS
authentication. I also have a list of "valid-users" who may use
On Sun, Dec 13, 2015 at 8:26 PM,
wrote:
>
> I want to define nested fields in SOLR using schema.xml.
Us too (using Solr 5.3.1), and the documentation is not jumping out at me. My
approach is (please suggest a better way):
1/ create a blank core
2/ add a few nested docs using bin/post
3/ use the schema browse
On Sun, Dec 13, 2015 at 6:40 PM, santosh sidnal
wrote:
> Hi All,
>
> I want to define nested fields in SOLR using schema.xml. We are using Apache
> Solr 4.7.0.
>
> I see some links which say how to do it, but I'm not sure how I can do it in
> schema.xml:
> https://cwiki.apache.org/confluence/display/solr
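For what it's worth, no special "nested" field type exists in schema.xml; block join relies on a few stock fields, especially _root_, to keep parent/child blocks together. A sketch (the doctype field is illustrative, used here only to mark parents):

```xml
<!-- Sketch: fields that support parent/child (block join) documents.
     _root_ is what ties a child back to its parent block. -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="_root_" type="string" indexed="true" stored="false"/>
<field name="doctype" type="string" indexed="true" stored="true"/>
```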
Hello,
I am trying to index a very large file in Solr (around 5GB). However, I
get out of memory errors using Curl. I tried using the post script and I
had some success with it. After indexing several hundred thousand records
though, I got the following error message:
*SimplePostTool: FATAL: I
On a quick glance those look OK; what commands did you use _exactly_
to create your new collection? The names are a bit odd and it's not
clear how they could have gotten that way. How many documents have you
tried to index to your new collection? Any errors in the logs?
And how many documents are
Well, this usually means the maximum packet size has been exceeded,
there are several possibilities here that I'm going to skip over
because I have to ask the purpose of indexing a 5G file.
Indexing such a huge file has several problems from a user's perspective:
1> assuming the bulk of it is text
We have a use case in which there are multiple clients writing concurrently
to Solr. Each doc has a 'timestamp' field which indicates
when these docs were generated.
We also have to ensure that an old doc doesn't overwrite a newer doc in
Solr. So to achieve this we were thinking if
Hi Debraj,
I think this nice article [1] from Yonik could be helpful.
Andrea
[1] http://yonik.com/solr/optimistic-concurrency/
2015-12-14 18:17 GMT+01:00 Debraj Manna :
> We have a use case in which there are multiple clients writing concurrently
> to Solr. Each doc has a 'timesta
At first glance, this sounds like a perfect match for
https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents#UpdatingPartsofDocuments-DocumentCentricVersioningConstraints
Just make sure your "timestamps" are truly atomic and not local
clock-based. The drift could cause int
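The referenced page configures an update processor chain along these lines (the chain name and version field name here are illustrative, not prescribed):

```xml
<!-- Sketch: DocBasedVersionConstraintsProcessorFactory compares the
     incoming doc's my_version_l against the stored value and, with
     ignoreOldUpdates=true, silently drops stale updates. -->
<updateRequestProcessorChain name="external-version">
  <processor class="solr.DocBasedVersionConstraintsProcessorFactory">
    <bool name="ignoreOldUpdates">true</bool>
    <str name="versionField">my_version_l</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The version field must also be declared in the schema as an indexed long field.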
The _version_ field used for optimistic concurrency can't be user supplied
-- it's not just a record of the *document's* version, but actually a
record of the *update command* version -- so even deleteByQuery commands
have one -- and the order must (internally) increase across all types of
upda
Hi,
I just want some feedback on best practice for running incremental DIH. In
recent years I have always preferred a dedicated application that pushes data
into ElasticSearch / Solr, but now I have a situation where we are forced
to use DIH.
I have several SQL Server databases with a column
Antelmo Aguilar wrote:
> I am trying to index a very large file in Solr (around 5GB). However, I
>get out of memory errors using Curl. I tried using the post script and I
> had some success with it. After indexing several hundred thousand records
> though, I got the following error message:
Th
Hi all
We're currently in the process of migrating our distributed search
running on 5.0 to SolrCloud running on 5.4, and setting up a test
cluster for performance testing etc.
We have several cores/collections, and in each core's solrconfig.xml,
we were specifying an empty , and specifying the s
Has anyone looked at this issue? I'd be willing to take a stab at it if
someone could provide some high level design guidance. This would be a
critical piece preventing us from moving to version 5.
Jamie
On 12/14/2015 10:49 AM, Tom Evans wrote:
> When I tried this in SolrCloud mode, specifying
> "-Dsolr.data.dir=/mnt/solr/" when starting each node, it worked fine
> for the first collection, but then the second collection tried to use
> the same directory to store its index, which obviously failed.
On Mon, Dec 14, 2015, at 06:20 PM, Jamie Johnson wrote:
> Has anyone looked at this issue? I'd be willing to take a stab at it if
> someone could provide some high level design guidance. This would be a
> critical piece preventing us from moving to version 5.
Just start working on it, Jamie.
Ma
In my use case, I have a number of shards where a query would run as a
distributed search. I am not using SolrCloud; I have just a Solr server. Now,
when the search runs, I see one entry for each shard query as well as the
final collective search query response. As a result, I am ending u
What is the nature of the file? Is it Solr XML, CSV, PDF (via Solr Cell),
or... what? If a PDF, maybe it has lots of hi-resolution images. If so, you
may need to strip out the images and just send the text, which would be a
lot smaller. For example, you could run Tika locally to extract the text
an
Anshum and Nobel,
I've downloaded 5.4, and this seems to be working so far
Thanks again
-----Original Message-----
From: Anshum Gupta [mailto:ans...@anshumgupta.net]
Sent: Tuesday, December 01, 2015 12:52 AM
To: solr-user@lucene.apache.org
Subject: Re: Re:Re: Implementing security.json is break
On Mon, Dec 14, 2015 at 1:22 PM, Shawn Heisey wrote:
> On 12/14/2015 10:49 AM, Tom Evans wrote:
>> When I tried this in SolrCloud mode, specifying
>> "-Dsolr.data.dir=/mnt/solr/" when starting each node, it worked fine
>> for the first collection, but then the second collection tried to use
>> the
Currently, it'll be a little tedious but here's what you can do (going
partly from memory)...
When you create the collection, specify the special value EMPTY for
createNodeSet (Solr 5.3+).
Use ADDREPLICA to add each individual replica. When you do this, you
can add a dataDir for
each individual re
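Sketched as Collections API calls (collection, shard, node and path names are placeholders; check the reference guide for the exact parameters on your version):

```text
# 1. Create the collection with no replicas placed anywhere:
/admin/collections?action=CREATE&name=coll1&numShards=2&createNodeSet=EMPTY

# 2. Place each replica explicitly, each with its own dataDir:
/admin/collections?action=ADDREPLICA&collection=coll1&shard=shard1
    &node=host1:8983_solr&dataDir=/mnt/solr/coll1_shard1
```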
Hi Daniel,
That sounds good. It is a custom solution, which is a way to secure just
about any server. I think Noble's point was about out of the box, community
supported, way of securing Solr.
Regards,
Ishan
On Mon, Dec 14, 2015 at 9:26 PM, Davis, Daniel (NIH/NLM) [C] <
daniel.da...@nih.gov> wrote
Don’t set solr.data.dir. Instead, set the install dir. Something like:
-Dsolr.solr.home=/data/solr
-Dsolr.install.dir=/opt/solr
I have many solrcloud collections, and separate data/install dirs, and
I’ve never had to do anything with manual per-collection or per-replica
data dirs.
That said, it’
We recently moved data from a magnetic drive to SSD. We run Solr in cloud
mode. Only data is stored on the drive; configuration is stored in ZK. We
start Solr using the -s option, specifying the data dir.
Command to start solr
./bin/solr start -c -h -p -z -s
We followed the following steps to migr
Hi Philippa,
Try taking a heap dump (when heap usage is high) and then, using a profiler,
look at which objects are taking up most of the memory. I have seen that if
you are using faceting/sorting on a large number of documents then fieldCache
grows very big and dominates most of the heap. Enabling
Can I somehow get "documentVersion" for each doc back in the Update
Response like the way we get _version back in Optimistic Concurrency when
we set "version=true" in the update request?
On Dec 14, 2015 10:58 PM, "Chris Hostetter"
wrote:
>
> The _version_ field used for optimistic concurrency can
I am running a SolrCloud 4.6 cluster with three solr nodes and three
external zookeeper nodes. Each Solr node has 12GB RAM. 8GB RAM dedicated to
the JVM.
When solr is started it consumes barely 1GB but over the course of 36 to 48
hours physical memory will be consumed and swap will be used. The i/
Hi, thanks for the answer.
We installed Solr with the solr.cmd -e cloud utility that comes with the
installation.
The names of the shards are odd because, in this case, after the installation
we migrated an old index from our other environment (which is a Solr single
node) and split it with Collection A
Dear Team,
I use DIH extensively and have even written my own custom transformers in
some situations.
Recently, during an architecture discussion, one of my team members said that
Solr is going to remove DIH in its future versions.
Is that true?
Also, is using DIH for, say, 2 or 3 million docs a good o
Hello
I've been using 5.3.1. I would like to enable this feature: when a user
enters a query, the results should include documents that only partially
match the query. For example, the document is Apple Company
and the user query is "apple computer company". Though the document is missing
the term "comp