Hello,
I've upgraded SolrCloud from 7.6 to 8.8 and unfortunately I get the
following exception on atomic updates of some of the documents.
In some cases, fields are returned as an array of multiple values
even though the field is defined as single-valued.
Is there a bug in this version?
, Feb 17, 2021 at 4:51 PM xiefengchang
wrote:
> Hi:
> I think you are just trying to avoid a complete re-index, right?
> Why don't you take a look at this:
> https://lucene.apache.org/solr/guide/8_0/updating-parts-of-documents.html
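The atomic-update request body that guide describes looks roughly like this, sketched as a JSON payload (the field names "price", "tags", and "popularity" are made up for illustration): "set" replaces a value, "add" appends to a multi-valued field, and "inc" increments a numeric field.

```python
import json

# Hedged sketch of an atomic-update payload; field names are hypothetical.
update = [{
    "id": "doc-1",
    "price": {"set": 99},          # replace the stored value
    "tags": {"add": "sale"},       # append to a multi-valued field
    "popularity": {"inc": 1},      # increment a numeric field
}]
payload = json.dumps(update)
# POST this to /solr/<collection>/update with Content-Type: application/json
```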
Hello,
I have an integer field in an index with billions of documents and need to
facet on this field. Unfortunately, the field doesn't have the docValues
property, so the FieldCache will be used and consume a lot of memory.
What is the best way to change the field to support docValues?
Regards,
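For reference, enabling docValues is a schema change along these lines (the field and type names below are assumptions, not from the original message), and existing documents must be fully reindexed before faceting can take advantage of it:

```xml
<!-- hypothetical field definition: add docValues="true", then reindex all documents -->
<field name="my_int_field" type="pint" indexed="true" stored="true" docValues="true"/>
```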
range = slice.getRange()
>
> for (each doc) {
> int hash = Hash.murmurhash3_x86_32(whatever_your_unique_key_is, 0,
> id.length(), 0);
> if (range.includes(hash)) {
> index it to Solr
> }
> }
>
> "Hash" is in org.apache.solr.common.util, in
>
>
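The quoted loop can be made concrete. Below is a self-contained Python sketch: a pure-Python MurmurHash3 x86 32-bit plus the shard-range check. Note that this hashes the UTF-8 bytes of the id, while Solr's Hash.murmurhash3_x86_32 in org.apache.solr.common.util operates on the id's characters, so treat it as an illustration of the technique (hash the key, compare against the shard's hash range), not a byte-exact reimplementation of CompositeIdRouter.

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 x86 32-bit (illustrative stand-in)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    length = len(data)
    n_blocks = length // 4
    # body: process 4-byte little-endian blocks
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    # tail: up to 3 trailing bytes
    k = 0
    tail = data[n_blocks * 4:]
    for i in range(len(tail) - 1, -1, -1):
        k = (k << 8) | tail[i]
    if tail:
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    # finalization (avalanche)
    h ^= length
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

def belongs_to_shard(doc_id: str, range_min: int, range_max: int) -> bool:
    """Compare the (signed 32-bit) hash against a shard's hash range,
    as in the loop quoted above."""
    h = murmur3_x86_32(doc_id.encode("utf-8"))
    signed = h - 0x100000000 if h >= 0x80000000 else h
    return range_min <= signed <= range_max
```

The shard ranges themselves come from the cluster state (each slice's `range`), as the quoted snippet's `slice.getRange()` suggests.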
e you have to make sure you use the exact same hashing
>> algorithm on the .
>>
>> See CompositeIdRouter.sliceHash
>>
>> Best,
>> Erick
>> On Fri, Dec 14, 2018 at 3:36 AM Mahmoud Almokadem
>> wrote:
>> >
>> > Hello,
>> >
> > I've a corruption on some of the shards on my collection and I've a full
> > dataset on my
onstructing the segments_N file from the available segments on disk
> will result
> duplicate documents in your index.
>
> FWIW,
> Erick
> On Fri, Dec 14, 2018 at 3:27 AM Mahmoud Almokadem
> wrote:
> >
> > Hello,
> >
> > I'm facing an issue that some sha
Hello,
I have corruption on some of the shards of my collection, I have a full
dataset in my database, and I'm using CompositeId for routing documents.
Can I traverse the whole dataset and do something like hashing the
document_id to identify that a document belongs to a specific shard to
se
Hello,
I'm facing an issue where some shards of my SolrCloud collection are
corrupted because they don't have a segments_N file, but I think all the
segment files are still available. Can I create a segments_N file from the
available files?
This is the stack trace:
org.apache.solr.core.SolrCoreInitializat
running now. Because it's related to a core,
not a collection.
I think the dataimport feature should be moved to the core level instead of
the collection level.
Thanks,
Mahmoud
On Tue, Dec 5, 2017 at 6:57 AM, Shawn Heisey wrote:
> On 12/3/2017 9:27 AM, Mahmoud Almokadem wrote:
>
>> We
Hi all,
I'm running Solr 7.0.1. When I tried to run TopicStream with the following
expression
String expression = "topic(checkpointCollection," +
"myCollection" + "," +
"q=\"*:*\"," +
"fl=\"document_id,title,full_text\"," +
We're facing an issue related to the dataimporter status on the new Admin UI
(7.0.1).
Calling the API
http://solrip/solr/collection/dataimport?_=1512314812090&command=status&indent=on&wt=json
returns a different status even though the importer is running.
The messages are swapped between the following w
Hello,
I have an issue with the Log page on the new Admin UI (7.0.1): when I
expand an item, it collapses again after a short time.
This behavior differs from the old Admin UI.
Thanks,
Mahmoud
Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 16 Oct 2017, at 13:35, Mahmoud Almokadem
> wrote:
> >
> > The transition of the load happened after I restarted the bulk
> > On 16 Oct 2017, at 12:39, Mahmoud Almokadem
> wrote:
> >
> > Yes, it has been constant since I started this bulk indexing process.
> > As you can see, the write operations on the loaded server are 3x the normal
> > serv
> > On 16 Oct 2017, at 11:58, Mahmoud Almokadem
> wrote:
> >
> > Here are the screen shots for the two server metrics on Amazon
> >
> > https://ibb.co/kxBQam
Here are the screen shots for the two server metrics on Amazon
https://ibb.co/kxBQam
https://ibb.co/fn0Jvm
https://ibb.co/kUpYT6
On Mon, Oct 16, 2017 at 11:37 AM, Mahmoud Almokadem
wrote:
> Hi Emir,
>
> We don't use routing.
>
> The servers are already balanced and the nu
> Are documents of similar size?
>
> Thanks,
> Emir
> > On 16 Oct 2017, at 10:46, Mahmoud Almokadem
> wrot
We've installed SolrCloud 7.0.1 with two nodes and 8 shards per node.
The configurations and specs of the two servers are identical.
When running bulk indexing using SolrJ, we see that one of the servers is
fully loaded, as you can see in the images, while the other is normal.
Images URLs:
https://ibb.co
Thanks all for your comments.
I followed Shawn's steps (rsync), since everything (ZooKeeper,
Solr home, and data) is on that volume, and everything went great.
Thanks again,
Mahmoud
On Sun, Aug 6, 2017 at 12:47 AM, Erick Erickson
wrote:
> bq: I was envisioning a scenario where the entire solr home is
anged the path?
And what do you mean by "Using multiple passes with rsync"?
Thanks,
Mahmoud
On Tuesday, August 1, 2017, Shawn Heisey wrote:
> On 7/31/2017 12:28 PM, Mahmoud Almokadem wrote:
> > I've a SolrCloud of four instances on Amazon and the EBS volumes that
>
Hello,
I have a SolrCloud of four instances on Amazon, and the EBS volumes that
contain the data on every node are going to be full. Unfortunately, Amazon
doesn't support expanding the EBS volumes, so I'll attach larger EBS volumes
and move the index to them.
I can stop the updates on the index, but I'm afraid to u
Thanks Shawn,
We already use the admin UI for testing and bulk uploads. We are using curl
scripts for the automation process.
I'll report the issues regarding the new UI on JIRA.
Thanks,
Mahmoud
On Tuesday, May 2, 2017, Shawn Heisey wrote:
> On 5/2/2017 6:53 AM, Mahmoud Almokadem wrote:
the new UI.
Thanks,
Mahmoud
On Mon, May 1, 2017 at 4:30 PM, Shawn Heisey wrote:
> On 4/28/2017 9:01 AM, Mahmoud Almokadem wrote:
> > We are already using shell scripts to do our import and using the
> > full-import command to do our delta import, and everything has been working
> > well for several
without any
notification.
Thanks,
Mahmoud
On Fri, Apr 28, 2017 at 2:51 PM, Shawn Heisey wrote:
> On 4/28/2017 5:11 AM, Mahmoud Almokadem wrote:
> > I'd like to request to uncheck the "Clean" checkbox by default on DIH
> page,
> > cause it cleaned the whole ind
Hello,
I'd like to request that the "Clean" checkbox be unchecked by default on the
DIH page, because it wiped my whole index (about 2 TB) when I clicked the
Execute button by mistake. Or show a confirmation message that the whole index
will be cleaned!
Sincerely,
Mahmoud
Hello,
When I try to update a document that exists on SolrCloud I get this message:
TransactionLog doesn't know how to serialize class java.util.UUID; try
implementing ObjectResolver?
With the stack trace:
{"data":{"responseHeader":{"status":500,"QTime":3},"error":{"metadata":["error-class","org.a
ure, so you might not want to
> gzip them. But maybe your setup is different?
>
> Older versions of Solr used Tomcat, which supported gzip. Newer versions use
> Zookeeper and Jetty, and you will probably find a way.
> Cheers -- Rick
>
>> On April 12, 2017 8:48:45 AM EDT, Ma
Hello,
How can I enable Gzip compression for Solr 6.0 to save bandwidth between
the server and clients?
Thanks,
Mahmoud
:
> On Tue, 2017-03-14 at 11:51 +0200, Mahmoud Almokadem wrote:
> > Here is the profiler screenshot from VisualVM after upgrading
> >
> > https://drive.google.com/open?id=0BwLcshoSCVcddldVRTExaDR2dzg
> >
> > the jetty is taking the most time on CPU. Does this mean, t
t;
> https://drive.google.com/open?id=0BwLcshoSCVcdc0dQZGJtMWxDOFk
>
> https://drive.google.com/open?id=0BwLcshoSCVcdR3hJSHRZTjdSZm8
>
> https://drive.google.com/open?id=0BwLcshoSCVcdUzRETDlFeFIxU2M
>
> Thanks,
> Mahmoud
>
> On Tue, Mar 14, 2017 at 10:20 AM, Mahmoud Almokadem <
Here is the profiler screenshot from VisualVM after upgrading:
https://drive.google.com/open?id=0BwLcshoSCVcddldVRTExaDR2dzg
Jetty is taking the most CPU time. Does this mean Jetty is the
bottleneck for indexing?
Thanks,
Mahmoud
On Tue, Mar 14, 2017 at 11:41 AM, Mahmoud Almokadem
lrJ.
> SolrJ uses the javabin binary format to send documents to Solr and it
> never ever uses JSON so there is definitely some other indexing
> process that you have not accounted for.
>
> On Tue, Mar 14, 2017 at 12:31 AM, Mahmoud Almokadem
> wrote:
> > Thanks E
rsion 6.4.1 I'll upgrade my cluster and
test again.
Thanks for help
Mahmoud
On Tue, Mar 14, 2017 at 1:20 AM, Shawn Heisey wrote:
> On 3/13/2017 7:58 AM, Mahmoud Almokadem wrote:
> > When I start my bulk indexer program the CPU utilization is 100% on each
> > server but t
https://drive.google.com/open?id=0BwLcshoSCVcdR3hJSHRZTjdSZm8
https://drive.google.com/open?id=0BwLcshoSCVcdUzRETDlFeFIxU2M
Thanks,
Mahmoud
On Tue, Mar 14, 2017 at 10:20 AM, Mahmoud Almokadem
wrote:
> Thanks Erick,
>
> I think there is something missing; the rate I'm talking about is fo
a very short
> retention.
>
> See:
> https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> Best,
> Erick
>
> On Mon, Mar 13, 2017 at 12:01 PM, Mahmoud Almokadem
> wrote:
>> Thanks Erick,
>>
>&g
t; without profiling. But 300+ fields per doc probably just means you're
> doing a lot of processing, I'm not particularly hopeful you'll be able
> to speed things up without either more shards or simplifying your
> schema.
>
> Best,
> Erick
>
> On Mon, Mar 13,
Hi great community,
I have a SolrCloud with the following configuration:
- 2 nodes (r3.2xlarge 61GB RAM)
- 4 shards.
- The producer can produce 13,000+ docs per second
- The schema contains about 300+ fields and the document size is about
3KB.
- Using SolrJ and SolrCloudClient,
Thanks Alessandro,
I used the DIH as-is, and no atomic updates were called with this DIH.
I added this script to my script transformer section and everything worked
properly:
var now = java.time.LocalDateTime.now();
var dtf =
java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'
rds,
>Alex.
>
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 6 February 2017 at 15:32, Mahmoud Almokadem
> wrote:
> > Hello,
> >
> > I'm using dih on solr 6 for indexing data from sql server. The document
> can
Hello,
I'm using DIH on Solr 6 to index data from SQL Server. A document can
be indexed many times as it is updated. Is there a way to get the time a
document was first inserted into Solr?
And how can I get the dates the document was updated?
Thanks for help,
Mahmoud
Hello,
Is there a way to get SolrCloud to pull data from a topic in Kafka
periodically using the DataImport Handler?
Thanks
Mahmoud
v
> > Sent: Wednesday 21st September 2016 9:24
> > To: solr-user
> > Subject: Re: Search with the start of field
> >
> > You can experiment with {!xmlparser}.. see
> > https://cwiki.apache.org/confluence/display/solr/Other+
> Parsers#OtherParsers-XMLQ
Hello,
What is the best way to search for a token at the start of a field?
For example, the field contains these values:
Document1: ABC DEF GHI
Document2: DEF GHI JKL
When I search for DEF, I want to get Document2 only. Is that possible?
Thanks,
Mahmoud
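One common approach to this is to prepend a sentinel token at index time and then issue a phrase query anchored on it. The sketch below illustrates only the matching semantics in plain Python (the sentinel name is arbitrary); in Solr this would be done with an analysis-chain change plus a phrase query, not application code.

```python
# Hedged sketch of the "anchor token" trick: prepend a sentinel token at
# index time, then require it in a phrase with the query term.
SENTINEL = "__start__"

def index_value(text: str) -> str:
    """Simulate index-time analysis that prepends the sentinel."""
    return f"{SENTINEL} {text}"

def starts_with_term(indexed: str, term: str) -> bool:
    """Simulate the phrase query: sentinel immediately followed by the term."""
    tokens = indexed.split()
    return tokens[:2] == [SENTINEL, term]

docs = {
    "Document1": index_value("ABC DEF GHI"),
    "Document2": index_value("DEF GHI JKL"),
}
hits = [name for name, value in docs.items() if starts_with_term(value, "DEF")]
# hits == ["Document2"]
```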
Hello,
We always update the same document many times using DataImportHandler. Can I
add a field for the time the document was first inserted into the index and
another field for the last time the document was updated?
Thanks,
Mahmoud
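For the last-updated part, an update request processor chain can stamp every incoming document with the index time; solr.TimestampUpdateProcessorFactory is a real Solr processor, but the chain name and field name below are assumptions. A "first inserted" field is harder: a full re-add replaces the document, so such a value only survives if updates are atomic or if the existing value is copied forward explicitly.

```xml
<!-- sketch: stamp every incoming document with the index time;
     chain and field names are hypothetical -->
<updateRequestProcessorChain name="add-timestamp">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">last_updated_dt</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```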
g, most of the
> index is held in memory anyway and the source of
> the data (SSD or spinning) is not a huge issue. SSDs
> certainly are better/faster, but have you measured whether
> they are _enough_ faster to be worth the added
> complexity?
>
> Best,
> Erick
>
>
Hi,
We have SolrCloud 6.0 installed on 4 i2.2xlarge instances with 4 shards. We
store the indices on EBS volumes attached to these instances. Fortunately,
these instances are equipped with temporary SSDs. We need to store the indices
on the SSDs, but they are not safe.
The index is updated every fi
Hello,
We have a cluster of Solr 4.8.1 installed on the Tomcat servlet container, and
we're able to use DIH Schedule by adding these lines to web.xml in the
installation directory:
org.apache.solr.handler.dataimport.scheduler.ApplicationListener
Now we are planning to migrate to Solr 6 an
Hi all,
I have two cores (core1, core2). core1 contains fields (f1, f2, f3, date1) and
core2 contains fields (f2, f3, f4, date2).
I want to search the two cores by the date field. Is there an alias to
query the two fields in a distributed search?
For example, q=dateField:NOW would perform a searc
eating a special field type you can query just like
> any other; this was also presented at Lucene/Solr Revolution last month (
> http://lucenerevolution.org/sessions/simple-fuzzy-name-matching-in-solr/).
>
> Best,
> David Murgatroyd
> (VP, Engineering, Basis Technology)
>
>
> Regards,
>Alex.
>
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 10 November 2015 at 08:46, Mahmoud Almokadem
> wrote:
> > Thanks Paul,
> >
> > The Arabic analyser applies filters of norma
:
> Mahmoud,
>
> there is an arabic analyzer:
> https://wiki.apache.org/solr/LanguageAnalysis#Arabic
> doesn't it do what you describe?
> Synonyms probably work there too.
>
> Paul
>
> > Mahmoud Almokadem <mailto:prog.mahm...@gmail.com>
> > 9 nove
> This will index the combined word in addition to the separate words.
>
> -- Jack Krupansky
>
> On Mon, Nov 9, 2015 at 4:48 AM, Mahmoud Almokadem
> wrote:
>
>> Hello,
>>
>> We are indexing Arabic content and facing a problem for tokenizing multi
>
Hello,
We are indexing Arabic content and facing a problem tokenizing multi-term
phrases like 'عبد الله' ('Abd Allah'): users will search for
'عبدالله' ('Abdallah') without a space and need to get the results for 'عبد
الله' with a space. We are using StandardTokenizer.
Is there any configurations
as a bug since
> either it is an actual bug or some behavior nuance that needs to be
> documented better.
>
> -- Jack Krupansky
>
> On Wed, Nov 4, 2015 at 8:24 AM, Mahmoud Almokadem
> wrote:
>
>> I removed the q.op=“AND” and add the mm=2
>>
Mahmoud
> On Nov 4, 2015, at 3:04 PM, Alessandro Benedetti
> wrote:
>
> Here we go :
>
> Title^200 TotalField^1
>
> + Jack explanation and you have the parsed query explained !
>
> Cheers
>
> On 4 November 2015 at 12:56, Mahmoud Almokadem
> wrote
can you send us the solrconfig.xml snippet of your request handler please ?
>
> It's kinda strange you get a boost factor for the Title field and that
> parsing query, according to your config.
>
> Cheers
>
> On 4 November 2015 at 08:39, Mahmoud Almokadem
> wrote:
>
Hello,
I'm using Solr 4.8.1. Using edismax as the parser, we get undesirable parsed
queries and results. The following are two different cases with strange
behavior. Searching with these parameters:
"mm":"2",
"df":"TotalField",
"debug":"true",
"indent":"true",
"fl":"Title",
"start
Thanks all for your responses.
But the parsed_query and number of results stay the same when changing the mm
parameter; the following results are for mm=100% and mm=0%:
http://solrserver/solr/collection1/select?q=%2B(word1+word2)&rows=0&fl=Title&wt=json&indent=true&debugQuery=true&defType=edismax&qf=title&mm=100%25
ery. They don't magically
> distribute to all nested queries.
>
> Let's see you full set of query parameters, both on the request and in
> solrconfig.
>
> -- Jack Krupansky
>
> On Thu, Apr 2, 2015 at 7:12 AM, Mahmoud Almokadem
> wrote:
>
> > Hello,
>
Hello,
I see strange behaviour using edismax with multiple words. When
passing q=+(word1 word2) I get:
"rawquerystring": "+(word1 word2)", "querystring": "+(word1 word2)", "
parsedquery": "(+(+(DisjunctionMaxQuery((title:word1))
DisjunctionMaxQuery((title:word2)/no_coord",
"parsedquery_t
You may have field types in your schema that use a stopwords.txt file
like this: so, you must have the files *stopwords_ar.txt* and
*stopwords_en.txt* in
INSTA
Thanks Shawn.
What do you mean by "important parts of the index"? And how do I calculate
their size?
Thanks,
Mahmoud
Sent from my iPhone
> On Dec 29, 2014, at 8:19 PM, Shawn Heisey wrote:
>
>> On 12/29/2014 2:36 AM, Mahmoud Almokadem wrote:
>> I've the same inde
using LVM?
Thanks
On Mon, Dec 29, 2014 at 2:00 AM, Toke Eskildsen
wrote:
> Mahmoud Almokadem [prog.mahm...@gmail.com] wrote:
> > We've installed a cluster of one collection of 350M documents on 3
> > r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is
&g
Dears,
We've installed a cluster of one collection of 350M documents on 3
r3.2xlarge (60GB RAM) Amazon servers. The size of the index on each shard is
about 1.1TB, and the maximum storage on Amazon is 1TB, so we added 2
general-purpose SSD EBS volumes (1x1TB + 1x500GB) to each instance. Then we
created a logical volume
Hi, you can search using this sample URL:
http://localhost:8080/solr/core1/select?q=*:*&shards=localhost:8080/solr/core1,localhost:8080/solr/core2,localhost:8080/solr/core3
Mahmoud Almokadem
On Thu, Jun 5, 2014 at 8:13 AM, Anurag Verma wrote:
> Hi,
> Can you please he
"base_url":"http://10.0.1.237:8080/solr";,
"core":"academicfull",
"node_name":"10.0.1.237:8080_solr",
"leader":"true"}}},
"shard2":{
"range":"0-7fff",
"state":"active",
"replicas":{"10.0.1.6:8080_solr_academicfull":{
"state":"active",
"base_url":"http://10.0.1.6:8080/solr";,
"core":"academicfull",
"node_name":"10.0.1.6:8080_solr",
"leader":"true",
"maxShardsPerNode":"1",
"router":{"name":"compositeId"},
"replicationFactor":"1",
"autoCreated":"true"},
"tvprograms":{
"shards":{
"shard1":{
"range":"8000-d554",
"state":"active",
"replicas":{"10.0.1.237:8080_solr_tvprograms":{
"state":"active",
"base_url":"http://10.0.1.237:8080/solr";,
"core":"tvprograms",
"node_name":"10.0.1.237:8080_solr",
"leader":"true"}}},
"shard2":{
"range":"d555-2aa9",
"state":"active",
"replicas":{"10.0.1.6:8080_solr_tvprograms":{
"state":"active",
"base_url":"http://10.0.1.6:8080/solr";,
"core":"tvprograms",
"node_name":"10.0.1.6:8080_solr",
"leader":"true"}}},
"shard3":{
"range":"2aaa-7fff",
"state":"active",
"replicas":{"10.0.1.205:8080_solr_tvprograms":{
"state":"active",
"base_url":"http://10.0.1.205:8080/solr";,
"core":"tvprograms",
"node_name":"10.0.1.205:8080_solr",
"leader":"true",
"maxShardsPerNode":"1",
"router":{"name":"compositeId"},
"replicationFactor":"1",
"autoCreated":"true"},
"paeb":{
"shards":{
"shard1":{
"range":"8000-",
"state":"active",
"replicas":{"10.0.1.237:8080_solr_paeb":{
"state":"active",
"base_url":"http://10.0.1.237:8080/solr";,
"core":"paeb",
"node_name":"10.0.1.237:8080_solr",
"leader":"true"}}},
"shard2":{
"range":"0-7fff",
"state":"active",
"replicas":{"10.0.1.6:8080_solr_paeb":{
"state":"active",
"base_url":"http://10.0.1.6:8080/solr";,
"core":"paeb",
"node_name":"10.0.1.6:8080_solr",
"leader":"true",
"maxShardsPerNode":"1",
"router":{"name":"compositeId"},
"replicationFactor":"1",
"autoCreated":"true"},
"shoghlanty":{
"shards":{
"shard1":{
"range":"8000-",
"state":"active",
"replicas":{"10.0.1.237:8080_solr_shoghlanty":{
"state":"active",
"base_url":"http://10.0.1.237:8080/solr";,
"core":"shoghlanty",
"node_name":"10.0.1.237:8080_solr",
"leader":"true"}}},
"shard2":{
"range":"0-7fff",
"state":"active",
"replicas":{"10.0.1.6:8080_solr_shoghlanty":{
"state":"active",
"base_url":"http://10.0.1.6:8080/solr";,
"core":"shoghlanty",
"node_name":"10.0.1.6:8080_solr",
"leader":"true",
"maxShardsPerNode":"1",
"router":{"name":"compositeId"},
"replicationFactor":"1",
"autoCreated":"true"},
"news_english":{
"shards":{
"shard1":{
"range":"8000-",
"state":"active",
"replicas":{"10.0.1.237:8080_solr_news_english":{
"state":"active",
"base_url":"http://10.0.1.237:8080/solr";,
"core":"news_english",
"node_name":"10.0.1.237:8080_solr",
"leader":"true"}}},
"shard2":{
"range":"0-7fff",
"state":"active",
"replicas":{"10.0.1.6:8080_solr_news_english":{
"state":"active",
"base_url":"http://10.0.1.6:8080/solr";,
"core":"news_english",
"node_name":"10.0.1.6:8080_solr",
"leader":"true",
"maxShardsPerNode":"1",
"router":{"name":"compositeId"},
"replicationFactor":"1",
"autoCreated":"true"}}
So, what should I do to create new collections with 3 shards?
Thanks to all, and sorry for my poor English.
Mahmoud Almokadem