Re: _version_ field missing in schema?

2019-01-23 Thread Aleksandar Dimitrov

Hi Alex,

thanks for your answer. I took the lines directly from the
managed-schema, deleted the managed-schema, and pasted those lines
into my schema.xml.

If I have other errors in the schema.xml (such as a missing field
type), solr complains about those until I fix them. So I would guess
that the schema is at least *read*, but unsure if it is in fact used.
I've not used solr before.

I cannot use the admin UI, at least not while the core with the
faulty schema is used.

I wanted to use schema.xml because it allows for version control, and
because it's easier for me to just use xml to define my schema. Is
there a preferred approach? I don't (want to) use solr cloud, as for
our use case a single instance of solr is more than enough.

Thanks for your help,
Aleks

Alexandre Rafalovitch  writes:

What do you mean schema.xml from managed-schema? schema.xml is the
old non-managed approach. If you have both, schema.xml will be
ignored.

I suspect you are not running with the schema you think you do. You
can check that with the API or in the Admin UI if you get that far.

Regards,
Alex

On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <
a.dimit...@seidemann-web.com wrote:


Hi,

I'm using solr 7.5, in my schema.xml I have this, which I took from
the managed-schema:

  [the four <field .../> definitions, including _version_, were
  stripped by the mailing-list archive]

However, on startup, solr complains:

 Caused by: org.apache.solr.common.SolrException: _version_ field
 must exist in schema and be searchable (indexed or docValues) and
 retrievable (stored or docValues) and not multiValued (_version_
 does not exist)
  at org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:95)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateLog.init(UpdateLog.java:404)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:161)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:116)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:119)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
  at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
  at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
  at java.lang.reflect.Constructor.newInstance(Constructor.java:488) ~[?:?]
  at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:861)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1114)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:984)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138)
     ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  ... 7 more

Anyone know what I'm doing wrong?
I've tried having the _version_ field be string, and indexed and
stored, but that didn't help.

Thanks!

Aleks







Re: difference in behavior of term boosting between Solr 6 and Solr 7

2019-01-23 Thread Elaine Cario
I predicted some colleague would come to me 2 minutes after I sent this
with some finding - I was wrong, it was a few hours! It seems there was a
change in a custom similarity class (I think because of an API change in
Solr), which caused the query boost to not be applied.  We're looking at
this angle, so please ignore this for now.
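(For reference, the constant-score workaround mentioned below can use the ^=
operator available since Solr 5.1; a sketch with the same three terms:

    ("one"^=1) OR ("two"^=2) OR ("three"^=3)

With ^= each matching clause contributes exactly the given score, independent
of the similarity implementation.)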

On Tue, Jan 22, 2019 at 11:16 AM Elaine Cario  wrote:

> We're preparing to upgrade from Solr 6.4.2 to Solr 7.6.0, and found an
> inconsistency in scoring. It appears that term boosts in the query are not
> applied in Solr 7.
>
> The query itself against both versions is identical (removed unimportant
> params):
>
> ("one"^1) OR ("two"^2) OR ("three"^3)
> edismax
> max_term
> AND
> dictionary_id:"WKUS-TAL-DEPLURALIZATION-THESAURUS"
> 100
> xml
> on
> 
>
> 3 documents are returned, but in Solr 6 results the docs are returned in
> order of the boosts (three, two, one), as the boosts account for the
> entirety of the score, while in Solr 7 they are returned randomly, as all
> the scores are 1.0.
>
> Looking at the debug and explains, in Solr 6 the boost is multiplied to
> the rest of the score:
>
> 
> ("one"^1) OR ("two"^2) OR ("three"^3)
> ("one"^1) OR ("two"^2) OR ("three"^3)
> (+(DisjunctionMaxQuery((max_term:" one
> "))^1.0 DisjunctionMaxQuery((max_term:" two "))^2.0
> DisjunctionMaxQuery((max_term:" three "))^3.0))/no_coord
> +(((max_term:" one "))^1.0
> ((max_term:" two "))^2.0 ((max_term:" three "))^3.0)
> 
> 
> 3.0 = sum of:
>   3.0 = weight(max_term:" three " in 658) [WKSimilarity], result
> of:
> 3.0 = score(doc=658,freq=1.0 = phraseFreq=1.0
> ), product of:
>   3.0 = boost
>   1.0 = idf(), for phrases, always set to 1
>   1.0 = tfNorm, computed as (freq * (k1a + 1)) / (freq + k1b)
> [WKSimilarity] from:
> 1.0 = phraseFreq=1.0
> 1.2 = k1a
> 1.2 = k1b
> 0.0 = b (norms omitted for field)
> 
>
> But in Solr 7, the boost is not there at all:
>
> 
> ("one"^1) OR ("two"^2) OR ("three"^3)
> ("one"^1) OR ("two"^2) OR ("three"^3)
> +((+DisjunctionMaxQuery((max_term:" one
> "))^1.0) (+DisjunctionMaxQuery((max_term:" two "))^2.0)
> (+DisjunctionMaxQuery((max_term:" three "))^3.0))
> +((+((max_term:" one "))^1.0)
> (+((max_term:" two "))^2.0) (+((max_term:" three
> "))^3.0))
> 
> 
> 1.0 = sum of:
>   1.0 = weight(max_term:" two " in 436) [WKSimilarity], result of:
> 1.0 = score(doc=436,freq=1.0 = phraseFreq=1.0
> ), product of:
>   1.0 = idf(), for phrases, always set to 1
>   1.0 = tfNorm, computed as (freq * (k1a + 1)) / (freq + k1b)
> [WKSimilarity] from:
> 1.0 = phraseFreq=1.0
> 1.2 = k1a
> 1.2 = k1b
> 0.0 = b (norms omitted for field)
> 
>
> I noted a subtle difference in the parsedquery between the 2 versions as
> well, not sure if that is causing the boost to drop out in Solr 7:
>
> SOLR 6:  +(((max_term:" one "))^1.0 ((max_term:" two
> "))^2.0 ((max_term:" three "))^3.0)
> SOLR 7:  +((+((max_term:" one "))^1.0) (+((max_term:" two
> "))^2.0) (+((max_term:" three "))^3.0))
> For our use case, I think we can work around it using a constant score
> query, but it would be good to know if this is a bug or expected behavior,
> or we're missing something in the query to get boost to work again.
>
> Thanks!
>
>
>
>
>


Re: The parent shard will never be delete/clean?

2019-01-23 Thread Andrzej Białecki
Solr 7.4.0 added a periodic maintenance task that cleans up old inactive parent 
shards left after the split. “Old” means 2 days by default.
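(For older versions, or to clean up sooner, the inactive parent shard can also
be removed by hand with the Collections API; a sketch, assuming a collection
named mycoll — note DELETESHARD only succeeds on inactive shards:

    /admin/collections?action=DELETESHARD&collection=mycoll&shard=shard1
)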

> On 22 Jan 2019, at 15:31, Jason Gerlowski  wrote:
> 
> Hi,
> 
> You might want to check out the documentation, which goes over
> split-shard in a bit more detail:
> https://lucene.apache.org/solr/guide/7_6/collections-api.html#CollectionsAPI-splitshard
> 
> To answer your question directly though, no.  Split-shard creates two
> new subshards, but it doesn't do anything to remove or cleanup the
> original shard.  The original shard remains with its data and will
> delegate future requests to the result shards.
> 
> Hope that helps,
> 
> Jason
> 
> On Tue, Jan 22, 2019 at 4:17 AM zhenyuan wei  wrote:
>> 
>> Hi,
>>   If I split shard1 into shard1_0 and shard1_1, will the parent
>> shard1 never be cleaned up?
>> 
>> 
>> Best,
>> Tinswzy
> 



Adding and deleting documents in the same update request

2019-01-23 Thread Andreas Nilsson
Hi all,

I am updating a Solr Collection (Solr 7.3.1 in Cloud mode using SolrJ Java API) 
with requests that include both adding new documents as well as deleting 
existing ones (by query). The deletion part is meant to make sure any earlier 
revisions of the indexed source are deleted as part of the index update. This 
has worked well for a long time, but in some rare cases there have been issues 
where the update process returns success but the added document(s) are nowhere 
to be found in the collection.

After some investigation, I'm suspecting that there is an edge case where the 
delete query can actually overlap the documents added in the same update. 
Obviously the first suspect to look at here is the delete query, but I also had 
to start looking into what the documented semantics (if any) for the 
multi-command update API (JSON update command) actually are. I cannot find any 
documentation that seems to even touch on this subject.

I've looked through most of the online Solr documentation chapters 
(https://lucene.apache.org/solr/guide/7_3/), though only as an overview. The 
documentation detailing multi-operation JSON update requests 
(https://lucene.apache.org/solr/guide/7_3/uploading-data-with-index-handlers.html#solr-style-json
 - JSON Update Command) doesn't seem to have any details or even link to 
further reading. I've also read the javadoc for 
org.apache.solr.client.solrj.request.UpdateRequest (part of SolrJ).

Is there a specific order in which operations in an update request will be 
executed? Is the order guaranteed for any of the possible operations (add, 
delete by id / query, optimize, commit) in a single update command? Since I 
cannot find any details, I have to assume it's undefined and that I should 
never rely on any order.
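For illustration, a minimal sketch of such a multi-command JSON update body
(the field names source_id and rev are hypothetical), POSTed to /update:

    {
      "add":    { "doc": { "id": "doc-A-rev8", "source_id": "doc-A", "rev": 8 } },
      "delete": { "query": "source_id:doc-A AND rev:[* TO 7]" }
    }

Whether the delete is applied before or after the add is exactly the ordering
question raised above.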

I suspect that the developers that did this part of our code either assumed it 
would always be performed in the same order or that the delete query could 
never overlap. Or perhaps it was just an oversight and we've been lucky so far.

Related: in the case where I cannot rely on the operations order in a single 
update request, is there a recommended way to do these kinds of updates 
"atomically" in a single request? Ideally, I obviously don't want the 
collection to be left in a state where the deletion has happened but not the 
additions or the other way around.

Thanks in advance,
Andreas



Re: Adding and deleting documents in the same update request

2019-01-23 Thread Shawn Heisey

On 1/23/2019 5:58 AM, Andreas Nilsson wrote:

Related: in the case where I cannot rely on the operations order in a single update 
request, is there a recommended way to do these kinds of updates "atomically" 
in a single request? Ideally, I obviously don't want the collection to be left in a state 
where the deletion has happened but not the additions or the other way around.


Assuming that you have a uniqueKey field and that you are replacing an 
existing document, do not issue a delete for that document at all.  When 
you index a document with the same value in the uniqueKey field as an 
existing document, Solr will handle the delete of the existing document 
for you.


When a uniqueKey is present, you should only issue delete commands for 
documents that will be permanently deleted.


Alternatively, send deletes in their own request, separate from 
inserts.  If you take this route, wait for acknowledgement from the 
delete before sending the insert.
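A minimal SolrJ sketch of the first approach (collection and field names are
hypothetical) — re-indexing with the same uniqueKey replaces the old document,
so no explicit delete is needed:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    SolrClient client = new HttpSolrClient.Builder(
            "http://localhost:8983/solr/mycoll").build();
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "source-123");       // same uniqueKey as the old revision
    doc.addField("title", "new revision");
    client.add(doc);                        // old doc with id source-123 is replaced
    client.commit();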


Thanks,
Shawn



Re: _version_ field missing in schema?

2019-01-23 Thread Alexandre Rafalovitch
If you do not use the API or Admin UI to change the schema, it will not get
automatically rewritten. So you can just stay with the managed-schema file
and version that. You can even disable write changes in solrconfig.xml:
http://lucene.apache.org/solr/guide/7_6/schema-factory-definition-in-solrconfig.html#solr-uses-managed-schema-by-default
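For reference, a sketch of that setting (keeping the managed schema but making
it read-only):

    <schemaFactory class="ManagedIndexSchemaFactory">
      <bool name="mutable">false</bool>
      <str name="managedSchemaResourceName">managed-schema</str>
    </schemaFactory>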


Also, the file format of managed-schema is XML even if it does not have the
appropriate extension.

Finally, since you are trying to really tweak the schema and general
configuration right from the start, you may find some of my presentations
useful, as they show the minimal configuration. Not perfect for your needs,
as I do skip _version, but as an additional data point. The recent one is:
https://www.slideshare.net/arafalov/rapid-solr-schema-development-phone-directory
and the Git repo is at:
https://github.com/arafalov/solr-presentation-2018-may . This one may be
useful as well:
https://www.slideshare.net/arafalov/from-content-to-search-speeddating-apache-solr-apachecon-2018-116330553

Regards,
   Alex.

On Wed, Jan 23, 2019, 5:50 AM Aleksandar Dimitrov <
a.dimit...@seidemann-web.com wrote:

> Hi Alex,
>
> thanks for your answer. I took the lines directly from the
> managed-schema, deleted the managed-schema, and pasted those lines
> into my schema.xml.
>
> If I have other errors in the schema.xml (such as a missing field
> type), solr complains about those until I fix them. So I would guess
> that the schema is at least *read*, but unsure if it is in fact used.
> I've not used solr before.
>
> I cannot use the admin UI, at least not while the core with the
> faulty schema is used.
>
> I wanted to use schema.xml because it allows for version control, and
> because it's easier for me to just use xml to define my schema. Is
> there a preferred approach? I don't (want to) use solr cloud, as for
> our use case a single instance of solr is more than enough.
>
> Thanks for your help,
> Aleks
>
> Alexandre Rafalovitch  writes:
>
> > What do you mean schema.xml from managed-schema? schema.xml is the
> > old non-managed approach. If you have both, schema.xml will be
> > ignored.
> >
> > I suspect you are not running with the schema you think you do. You
> > can check that with the API or in the Admin UI if you get that far.
> >
> > Regards,
> > Alex
> >
> > On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <
> > a.dimit...@seidemann-web.com wrote:
> >
> >> Hi,
> >>
> >> I'm using solr 7.5, in my schema.xml I have this, which I took
> >> from the managed-schema:
> >>
> >>   [the four <field .../> definitions, including _version_, were
> >>   stripped by the mailing-list archive]
> >>
> >> However, on startup, solr complains:
> >>
> >>  Caused by: org.apache.solr.common.SolrException: _version_ field
> >>  must exist in schema and be searchable (indexed or docValues) and
> >>  retrievable (stored or docValues) and not multiValued (_version_
> >>  does not exist)
> >>  [full stack trace quoted above]

Re: _version_ field missing in schema?

2019-01-23 Thread Shawn Heisey

On 1/23/2019 3:49 AM, Aleksandar Dimitrov wrote:

Hi Alex,

thanks for your answer. I took the lines directly from the
managed-schema, deleted the managed-schema, and pasted those lines into
my schema.xml.


Unless you have changed the solrconfig.xml to refer to the classic 
schema, the file named schema.xml is not used.
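(For reference, "refer to the classic schema" means a schemaFactory entry like
the following in solrconfig.xml, which makes Solr read schema.xml:

    <schemaFactory class="ClassicIndexSchemaFactory"/>
)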


With the standard schema factory, on core startup, if schema.xml is 
found, it is copied to managed-schema and then renamed to a backup 
filename.  This would also happen on reload, I believe.


Recommendation: unless you're using the classic schema, never use the 
schema.xml file.  Only work with managed-schema.


Thanks,
Shawn



Solr indexing raises error while posting PDF

2019-01-23 Thread sonam mittal
I am using Solr 6.6.4 on Ubuntu 16. I have created a collection in Solr
using the configuration files of the Solr example *techproducts*. I am
trying to post a PDF to Solr but it is raising some errors. I have also
installed Apache Tika through Maven but it is still showing the following
error.

SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/ifarm_tech/update...
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file Types.pdf (application/pdf) to [base]/extract
SimplePostTool: WARNING: Solr returned an error #500 (Server Error)
for url: 
http://localhost:8983/solr/ifarm_tech/update/extract?resource.name=%2Fhome%2Fubuntu%2Fpdf_cancer%2FTypes.pdf&literal.id=%2Fhome%2Fubuntu%2Fpdf_cancer%2FTypes.pdf
SimplePostTool: WARNING: Response:

Error 500 Server Error

HTTP ERROR 500
Problem accessing /solr/ifarm_tech/update/extract. Reason:
    Server Error

Caused by: java.lang.NoClassDefFoundError: Could not initialize
class org.apache.pdfbox.pdmodel.PDDocument
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:149)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
at 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)




SimplePostTool: WARNING: IOException while reading response:
java.io.IOException: Server returned HTTP response c

Solr times out connecting to Zookeeper

2019-01-23 Thread marotosg
Hi,

I am having some trouble trying to increase the Zookeeper client timeout
used when Solr connects to Zookeeper at startup.

I have already tried updating the following properties, with no luck:
1) Update the zkClientTimeout on solr.xml
${zkClientTimeout:9}

2) Uncomment ZK_CLIENT_TIMEOUT in solr.in.sh
# Set the ZooKeeper client timeout (for SolrCloud mode)
ZK_CLIENT_TIMEOUT="9"

Still getting this message:
2019-01-23 15:56:15.543 ERROR org.apache.solr.core.SolrCore -
null:org.apache.solr.common.SolrException:
java.util.concurrent.TimeoutException: Could not connect to ZooKeeper
dv08t02qslrc01:2181,dv08t02qslrc02:2181,dv08t02qslrc03:2181 within
*3 ms*

Is there any other place where I need to update that info?

Thanks in advance
Sergio




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Indexing in one collection affect index in another collection

2019-01-23 Thread Zheng Lin Edwin Yeo
Hi,

I am using Solr 7.5.0, and currently I am facing an issue where, when I am
indexing in collection2, the indexing affects the records in collection1.
Although the records are still intact, it seems that the settings of the
termVectors get wiped out, and the index size of collection1 reduced from
3.3GB to 2.1GB after I do the indexing in collection2. Also, the search in
collection1, which was originally very fast, becomes very slow after the
indexing is done in collection2.

Anyone has faced such issues before or have any idea on what may have gone
wrong?

Regards,
Edwin


Re: Indexing in one collection affect index in another collection

2019-01-23 Thread Shawn Heisey

On 1/23/2019 10:01 AM, Zheng Lin Edwin Yeo wrote:

I am using Solr 7.5.0, and currently I am facing an issue where, when I am
indexing in collection2, the indexing affects the records in collection1.
Although the records are still intact, it seems that the settings of the
termVectors get wiped out, and the index size of collection1 reduced from
3.3GB to 2.1GB after I do the indexing in collection2.


This should not be possible.  Indexing in one collection should have 
absolutely no effect on another collection.


If logging has been left at its default settings, the solr.log file 
should have enough info to show what actually happened.



Also, the search in
collection1, which was originally very fast, becomes very slow after the
indexing is done in collection2.


If the two collections have data on the same server(s), I can see this 
happening.  More memory is consumed when there is additional data, and 
when Solr needs more memory, performance might be affected.  The 
solution is generally to install more memory in the server.  If the 
system is working, there should be no need to increase the heap size 
when the memory size increases ... but there can be situations where the 
heap is a little bit too small, where you WOULD want to increase the 
heap size.


Thanks,
Shawn



How to clear old versions of a documents

2019-01-23 Thread Eyyub Çil
Hi All,

I have a nested-document based Solr schema. Solr keeps multiple _version_s of
a parent document when any update occurs. I know that comes with the
NoSQL feature of Solr.


My question is: how do I clean up older versions of documents, and display
only the last version of a document in search results?

thank you


Lucene Solr custom tokenizer - How to include delimiter special characters as tokens?

2019-01-23 Thread rina joseph
Hello Solr users,

I have a need to write a tokenizer for source code files in Solr, but don't
have the option of including custom JARs. So for ex:

Input: foo.bar

Tokens: 'foo', '.', 'bar'

How can I have a custom tokenizer or filter in schema.xml that can split on
some characters, but also not drop those characters?

I tried Regex pattern tokenizer but that drops the delimiters.
Thanks in advance!
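One stock option that may cover this: solr.PatternTokenizerFactory with
group="0", which emits every regex match as a token instead of splitting on
the pattern. A sketch (the fieldType name is made up; the pattern matches
either word runs or single punctuation characters):

    <fieldType name="text_code" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.PatternTokenizerFactory"
                   pattern="\w+|[^\w\s]" group="0"/>
      </analyzer>
    </fieldType>

With this, foo.bar tokenizes to 'foo', '.', 'bar'.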


API to convert a SolrInputDocument to JSON

2019-01-23 Thread Pushkar Raste
Hi,
We are setting up cluster with new version Solr and going to reindex data.
However, until all the data is indexed I need keep indexing data in the old
cluster as well. We are currently using the Solrj client and constructing
SolrInputDocument objects to index data.

To avoid conflicts between the old and new jars, I am planning to use
HttpClient and a JSON payload. Is there a convenient API to convert a
SolrInputDocument to JSON?
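For anyone wanting to hand-roll it, a sketch that works for flat documents
(Jackson is an assumed dependency; child documents and atomic updates are not
handled):

    import java.util.*;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.common.SolrInputField;

    static String toJson(SolrInputDocument doc) throws Exception {
        Map<String, Object> map = new LinkedHashMap<>();
        for (SolrInputField f : doc) {
            // multi-valued fields become JSON arrays
            map.put(f.getName(),
                    f.getValueCount() > 1 ? f.getValues() : f.getValue());
        }
        // wrap in a list: the /update JSON body is an array of documents
        return new ObjectMapper().writeValueAsString(
                Collections.singletonList(map));
    }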


Solr admin UI new features

2019-01-23 Thread Dwane Hall
Hi user community,


I recently upgraded a single node solr cloud environment from 7.3.1 to 7.6. 
While traversing through the release notes for solr 7.5 to identify any 
important changes to consider for the upgrade I noticed two excellent additions 
to the Admin UI that took effect in solr 7.5 (SOLR-8207 – Add Nodes view to 
Admin UI “Cloud” tab and SOLR-7767 ZK Status under “Cloud” tab). After 
completing my upgrade all collections are online and functioning as expected 
and solr is working without issue however these two new menu items do not 
appear to work (the urls are hit https://server:port/solr/#/~cloud?view=nodes, 
https://server:port/solr/#/~cloud?view=zkstatus) but the pages are blank.  The 
original menu items all function without issue (Tree, Graph, Graph (Radial)).

I’ve cleared my browser cache and checked the logs which are all clean (with 
the log level set to DEBUG on org.apache.jetty.server.*).  Are there any 
additional configuration changes I’m overlooking that are needed to take 
advantage of these new features?


Environment

Chrome 70, and Firefox 45
Solr 7.6 (Cloud, single node)
Https, basic auth plugin enabled
Zookeeper 3.4.6

As always any advice is appreciated,

Thanks

Dwane


Re: API to convert a SolrInputDocument to JSON

2019-01-23 Thread Shawn Heisey

On 1/23/2019 5:05 PM, Pushkar Raste wrote:

We are setting up cluster with new version Solr and going to reindex data.
However, until all the data is indexed I need keep indexing data in the old
cluster as well. We are currently using the Solrj client and constructing
SolrInputDocument objects to index data.

  To avoid conflicts with the old and new jars, I am planning to use
HttpClient and json payload . Is there a convenient API to convert a
SolrInputDocument to json


First question: Is this SolrCloud?  If so, are both versions running 
SolrCloud?


Second question: What are the old and new Solr versions on the server side?

The answers to those questions will determine where I go with further 
questions and assistance.


Thanks,
Shawn



Re: API to convert a SolrInputDocument to JSON

2019-01-23 Thread Pushkar Raste
Thanks for the quick response Shawn. It is a migration from Solr 4.10
master/slave to Solr Cloud 7.x.

On Wed, Jan 23, 2019 at 7:41 PM Shawn Heisey  wrote:

> On 1/23/2019 5:05 PM, Pushkar Raste wrote:
> > We are setting up cluster with new version Solr and going to reindex
> data.
> > However, until all the data is indexed I need keep indexing data in the
> old
> > cluster as well. We are currently using the Solrj client and constructing
> > SolrInputDocument objects to index data.
> >
> >   To avoid conflicts with the old and new jars, I am planning to use
> > HttpClient and json payload . Is there a convenient API to convert a
> > SolrInputDocument to json
>
> First question: Is this SolrCloud?  If so, are both versions running
> SolrCloud?
>
> Second question: What are the old and new Solr versions on the server side?
>
> The answers to those questions will determine where I go with further
> questions and assistance.
>
> Thanks,
> Shawn
>
>


Re: API to convert a SolrInputDocument to JSON

2019-01-23 Thread Shawn Heisey

On 1/23/2019 5:49 PM, Pushkar Raste wrote:

Thanks for the quick response Shawn. It is a migration from Solr 4.10
master/slave to Solr Cloud 7.x.


In that case, use SolrJ 7.x, with CloudSolrClient to talk to the new 
version and HttpSolrClient to talk to the old version. Use the same 
SolrInputDocument objects for both.
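A sketch of that setup (ZooKeeper address, collection and core names are
hypothetical):

    import java.util.Collections;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    CloudSolrClient newCluster = new CloudSolrClient.Builder(
            Collections.singletonList("zk1:2181"), Optional.empty()).build();
    newCluster.setDefaultCollection("mycoll");
    HttpSolrClient oldCluster = new HttpSolrClient.Builder(
            "http://old-host:8983/solr/mycore").build();

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "42");
    newCluster.add(doc);   // the same document object goes to both clusters
    oldCluster.add(doc);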


Thanks,
Shawn



Re: API to convert a SolrInputDocument to JSON

2019-01-23 Thread Pushkar Raste
You mean I can use SolrJ 7.x for indexing documents to both Solr 4 and
Solr 7, as well as the SolrInputDocument class from SolrJ 7.x?

Wouldn’t there be issues if there are any backwards-incompatible changes?

On Wed, Jan 23, 2019 at 8:09 PM Shawn Heisey  wrote:

> On 1/23/2019 5:49 PM, Pushkar Raste wrote:
> > Thanks for the quick response Shawn. It is a migration from Solr 4.10
> > master/slave to Solr Cloud 7.x.
>
> In that case, use SolrJ 7.x, with CloudSolrClient to talk to the new
> version and HttpSolrClient to talk to the old version. Use the same
> SolrInputDocument objects for both.
>
> Thanks,
> Shawn
>
>


Re: API to convert a SolrInputDocument to JSON

2019-01-23 Thread Walter Underwood
Only use CloudSolrClient if you don’t care about the error return from updates. 
For us, that is a fatal flaw that makes CloudSolrClient unacceptable for prod 
use.

We use HttpSolrClient directed at the load balancer for the cluster and I 
haven’t noticed any speed issues. I expect that forwarding a document to the 
right leader is a small overhead.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jan 23, 2019, at 5:09 PM, Shawn Heisey  wrote:
> 
> On 1/23/2019 5:49 PM, Pushkar Raste wrote:
>> Thanks for the quick response Shawn. It is a migration from Solr 4.10
>> master/slave to Solr Cloud 7.x.
> 
> In that case, use SolrJ 7.x, with CloudSolrClient to talk to the new version 
> and HttpSolrClient to talk to the old version. Use the same SolrInputDocument 
> objects for both.
> 
> Thanks,
> Shawn
> 



Re: Solr admin UI new features

2019-01-23 Thread Erick Erickson
Hmmm, is there any chance you somehow have your Solr instances
pulling some things, particularly browser-related from your old
install? Or from some intermediate cache between your browser and the
Solr instances? Or perhaps "something got copied somewhere" and is
being picked up from the old install? I'm really grasping at straws
here, admittedly...

Here's what I'd do. Install a fresh Solr 7.6 somewhere, your laptop
would be fine, some node in your system that doesn't have anything on
it you can use temporarily, I don't really care as long as you can
guarantee your new install is the only Solr install being referenced
and you can point it at your production ZooKeeper ensemble. Do you
still have the same problem? If not, I'd guess that your production
system has somehow mixed-and-matched...

Best,
Erick

On Wed, Jan 23, 2019 at 4:36 PM Dwane Hall  wrote:
>
> Hi user community,
>
>
> I recently upgraded a single node solr cloud environment from 7.3.1 to 7.6. 
> While traversing through the release notes for solr 7.5 to identify any 
> important changes to consider for the upgrade I noticed two excellent 
> additions to the Admin UI that took effect in solr 7.5 (SOLR-8207 – Add Nodes 
> view to Admin UI “Cloud” tab and SOLR-7767 ZK Status under “Cloud” tab).
> After completing my upgrade all collections are online and functioning as 
> expected and solr is working without issue however these two new menu items 
> do not appear to work (the urls are hit 
> https://server:port/solr/#/~cloud?view=nodes, 
> https://server:port/solr/#/~cloud?view=zkstatus) but the pages are blank.  
> The original menu items all function without issue (Tree, Graph, Graph 
> (Radial)).
>
> I’ve cleared my browser cache and checked the logs which are all clean (with 
> the log level set to DEBUG on org.apache.jetty.server.*).  Are there any 
> additional configuration changes I’m overlooking that I need to take 
> advantage of these new features?
>
>
> Environment
>
> Chrome 70, and Firefox 45
> Solr 7.6 (Cloud, single node)
> Https, basic auth plugin enabled
> Zookeeper 3.4.6
>
> As always any advice is appreciated,
>
> Thanks
>
> Dwane


Re: API to convert a SolrInputDocument to JSON

2019-01-23 Thread Erick Erickson
Walter:

Don't know if it helps, but have you looked at:
https://issues.apache.org/jira/browse/SOLR-445

I have _not_ worked with this personally in prod SolrCloud systems, so
I can't say much more
than it exists. It's only available in Solr 6.1+

Best,
Erick

On Wed, Jan 23, 2019 at 5:55 PM Pushkar Raste  wrote:
>
> You mean I can use SolrJ 7.x for both indexing documents to both Solr 4 and
> Solr 7 as well as the SolrInputDocument class from Solrj 7.x
>
> Wouldn’t there be issues if there are any backwards incompatible changes.
>
> On Wed, Jan 23, 2019 at 8:09 PM Shawn Heisey  wrote:
>
> > On 1/23/2019 5:49 PM, Pushkar Raste wrote:
> > > Thanks for the quick response Shawn. It is a migration from Solr 4.10
> > > master/slave to Solr Cloud 7.x.
> >
> > In that case, use SolrJ 7.x, with CloudSolrClient to talk to the new
> > version and HttpSolrClient to talk to the old version. Use the same
> > SolrInputDocument objects for both.
> >
> > Thanks,
> > Shawn
> >
> >


Re: Indexing in one collection affect index in another collection

2019-01-23 Thread Zheng Lin Edwin Yeo
Hi Shawn,

Thanks for your reply.

The log only shows a list of the following, and I don't see any other logs
besides these.

2019-01-24 02:47:57.925 INFO  (qtp2131952342-1330) [c:collectioin1 s:shard1
r:core_node4 x:policies_shard1_replica_n2]
o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd:
id=13245417
2019-01-24 02:47:57.957 INFO  (qtp2131952342-1330) [c:collectioin1 s:shard1
r:core_node4 x:policies_shard1_replica_n2]
o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd:
id=13245430
2019-01-24 02:47:57.957 INFO  (qtp2131952342-1330) [c:collectioin1 s:shard1
r:core_node4 x:policies_shard1_replica_n2]
o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd:
id=13245435

There is no change to the segments info, but the slowdown in the first
collection is very drastic.
Before the indexing of collection2, the collection1 query QTime was in the
range of 4 to 50 ms. However, after indexing collection2, the collection1
query QTime increases to more than 1000 ms. The indexing is done in CSV
format, and the size of the index is 3GB.

Regards,
Edwin



On Thu, 24 Jan 2019 at 01:09, Shawn Heisey  wrote:

> On 1/23/2019 10:01 AM, Zheng Lin Edwin Yeo wrote:
> > I am using Solr 7.5.0, and currently I am facing an issue where, when I am
> > indexing in collection2, the indexing affects the records in collection1.
> > Although the records are still intact, it seems that the settings of the
> > termVectors get wiped out, and the index size of collection1 reduced from
> > 3.3GB to 2.1GB after I do the indexing in collection2.
>
> This should not be possible.  Indexing in one collection should have
> absolutely no effect on another collection.
>
> If logging has been left at its default settings, the solr.log file
> should have enough info to show what actually happened.
>
> > Also, the search in
> > collection1, which was originally very fast, becomes very slow after the
> > indexing is done in collection2.
>
> If the two collections have data on the same server(s), I can see this
> happening.  More memory is consumed when there is additional data, and
> when Solr needs more memory, performance might be affected.  The
> solution is generally to install more memory in the server.  If the
> system is working, there should be no need to increase the heap size
> when the memory size increases ... but there can be situations where the
> heap is a little bit too small, where you WOULD want to increase the
> heap size.
>
> Thanks,
> Shawn
>
>


Re: Solr admin UI new features

2019-01-23 Thread Dwane Hall
Thanks Erick, very helpful as always ...we're up and going now. Before the 
install I spun up a standalone instance to check compatibility, and the process 
did not shut down cleanly from the looks of things. I'm guessing Solr was a 
little confused about which instance of ZooKeeper to use (the bundled version or 
our production instance). Thanks again for your assistance, it's very much 
appreciated.

Dwane

From: Erick Erickson 
Sent: Thursday, 24 January 2019 1:23:15 PM
To: solr-user
Subject: Re: Solr admin UI new features

Hmmm, is there any chance you somehow have your Solr instances
pulling some things, particularly browser-related from your old
install? Or from some intermediate cache between your browser and the
Solr instances? Or perhaps "something got copied somewhere" and is
being picked up from the old install? I'm really grasping at straws
here, admittedly...

Here's what I'd do. Install a fresh Solr 7.6 somewhere, your laptop
would be fine, some node in your system that doesn't have anything on
it you can use temporarily, I don't really care as long as you can
guarantee your new install is the only Solr install being referenced
and you can point it at your production ZooKeeper ensemble. Do you
still have the same problem? If not, I'd guess that your production
system has somehow mixed-and-matched...

Best,
Erick

On Wed, Jan 23, 2019 at 4:36 PM Dwane Hall  wrote:
>
> Hi user community,
>
>
> I recently upgraded a single node solr cloud environment from 7.3.1 to 7.6. 
> While traversing through the release notes for solr 7.5 to identify any 
> important changes to consider for the upgrade I noticed two excellent 
> additions to the Admin UI that took effect in solr 7.5 (SOLR-8207 – Add Nodes 
> view to Admin UI “Cloud” tab and SOLR-7767 ZK Status under “Cloud” tab).
> After completing my upgrade all collections are online and functioning as 
> expected and solr is working without issue however these two new menu items 
> do not appear to work (the urls are hit 
> https://server:port/solr/#/~cloud?view=nodes, 
> https://server:port/solr/#/~cloud?view=zkstatus) but the pages are blank.  
> The original menu items all function without issue (Tree, Graph, Graph 
> (Radial)).
>
> I’ve cleared my browser cache and checked the logs which are all clean (with 
> the log level set to DEBUG on org.apache.jetty.server.*).  Are there any 
> additional configuration changes I’m overlooking that I need to take 
> advantage of these new features?
>
>
> Environment
>
> Chrome 70, and Firefox 45
> Solr 7.6 (Cloud, single node)
> Https, basic auth plugin enabled
> Zookeeper 3.4.6
>
> As always any advice is appreciated,
>
> Thanks
>
> Dwane


Per-field slop param in eDisMax

2019-01-23 Thread Yasufumi Mizoguchi
Hi,

I am struggling to set a per-field slop param in the eDisMax query parser with
Solr 6.0 and 7.6.
What I want to do with eDisMax is similar to the following in the default query
parser.

* Query string : "aaa bbb"
* Target fields : fieldA(TextField), fieldB(TextField)

q=fieldA:"aaa bbb"~2 OR fieldB:"aaa bbb"~5

Anyone have good ideas?

Thanks,
Yasufumi.


Re: Solr Size Limitation upto 32 kb limitation

2019-01-23 Thread Jörn Franke
I guess "self-written loader" means that you split up the file into 32 kB chunks 
or smaller and then post each of those 32 kB chunks to the multivalued field.
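A rough SolrJ sketch of that idea (field names and chunk size are made up;
note the 32766 limit is in bytes, so multi-byte characters need extra margin):

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.solr.common.SolrInputDocument;

    String content = new String(
            Files.readAllBytes(Paths.get("big.txt")), StandardCharsets.UTF_8);
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "big.txt");
    int chunkSize = 8000;  // characters; stays under 32766 bytes even as UTF-8
    for (int i = 0; i < content.length(); i += chunkSize) {
        // each chunk becomes one value of the multiValued field
        doc.addField("content_chunks",
                content.substring(i, Math.min(content.length(), i + chunkSize)));
    }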

> On 18.01.2019 at 11:51, Kranthi Kumar K 
> wrote:
> 
> Hi team,
>  
> Thank you Erick Erickson, Bernd Fehling, and Jan Hoydahl for your suggested 
> solutions. I’ve tried the suggested ones and still we are unable to import 
> files having size > 32 kb; it is displaying the same error.
>  
> Below link has the suggested solutions. Please have a look once.
>  
> http://lucene.472066.n3.nabble.com/Solr-Size-Limitation-upto-32-KB-files-td4419779.html
>  
> As per Erick Erickson, I’ve changed the string type to a text-based type and 
> still the issue occurs.
> I’ve changed from:
>  
> [field definition stripped by the mailing-list archive]
>  
> Changed to:
>  
> [field definition stripped by the mailing-list archive]
>  
> If we do so, it is showing an error in the log, please find the error in the 
> attachment.
>  
> If I change to:
>  
> [field definition stripped by the mailing-list archive]
>  
> It is not showing any error, but the issue still exists.
>  
> As per Jan Hoydahl, I have gone through the link that you have provided and 
> checked ‘requestParsers’ tag in solrconfig.xml,
>  
> RequestParsers tag in our application is as follows:
>  
> ‘<requestParsers multipartUploadLimitInKB="2048000"
> formdataUploadLimitInKB="2048"
> addHttpRequestToContext="false"/>’
> The request parsers which we are using and in the link you have provided are 
> similar. And still we are unable to import files of size > 32 kb.
>  
> As per Bernd Fehling, we are using Solr 4.10.2. You have mentioned:
> ‘If you are trying to add larger content then you have to "chop" that 
> by yourself and add it as multivalued. Can be done within a self written 
> loader.’
>  
> I’m a newbie to Solr and I didn’t get what exactly a ‘self written loader’ is.
>  
> Could you please provide us sample code, that helps us to go further?
>  
>  
> 
> 
> Thanks & Regards,
> Kranthi Kumar.K,
> Software Engineer,
> Ccube Fintech Global Services Pvt Ltd.,
> Email/Skype: kranthikuma...@ccubefintech.com,
> Mobile: +91-8978078449.
>  
>  
> From: Kranthi Kumar K  
> Sent: Thursday, January 17, 2019 12:43 PM
> To: d...@lucene.apache.org; solr-user@lucene.apache.org
> Cc: Ananda Babu medida ; Srinivasa Reddy 
> Karri ; Michelle Ngo 
> 
> Subject: Re: Solr Size Limitation upto 32 kb limitation
>  
> Hi Team,
> 
>  
> 
> Can we have any updates on the below issue? We are awaiting your reply.
> 
>  
> 
> Thanks,
> 
> Kranthi kumar.K
> 
> From: Kranthi Kumar K
> Sent: Friday, January 4, 2019 5:01:38 PM
> To: d...@lucene.apache.org
> Cc: Ananda Babu medida; Srinivasa Reddy Karri
> Subject: Solr Size Limitation upto 32 kb limitation
>  
> Hi team,
> 
>  
> 
> We are currently using Solr 4.2.1 version in our project and everything is 
> going well. But recently, we are facing an issue with Solr Data Import. It is 
> not importing the files with size greater than 32766 bytes (i.e., 32 kb) and 
> showing 2 exceptions:
> 
>  
> 
> java.lang.IllegalArgumentException
> org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException
>  
> 
> Please find the attached screenshot for reference.
> 
>  
> 
> We have searched for solutions in many forums and didn’t find the exact 
> solution for this issue. Interestingly, we found an article suggesting that 
> changing the type of the ‘field’ from string to ‘text_general’ might solve 
> the issue. Please have a look at the below forum:
> 
>  
> 
> https://stackoverflow.com/questions/29445323/adding-a-document-to-the-index-in-solr-document-contains-at-least-one-immense-t
>   
> 
>  
> 
> Schema.xml:
> 
> Changed from:
> 
> ‘ multiValued="true" />’
> 
>  
> 
> Changed to:
> 
> ‘ multiValued="true" />’
> 
>  
> 
> We have tried it but still it is not importing the files > 32 KB or 32766 
> bytes.
> 
>  
> 
> Could you please let us know the solution to fix this issue? We’ll be 
> awaiting your reply.
> 
>  
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org


Frange Alternative in the filter query

2019-01-23 Thread Aman deep singh

Hi,
I have created a value source parser. To use the parser in a filter query I
was using the frange function, but using the frange function gives really
bad performance (4x the current time); my value source parser's performance
is almost the same when used in sort and fl. Performance only degrades when
I try to use it in a filter query via frange.
Is there any alternative to frange, or any other thing I can do, so that
performance doesn’t degrade?

I have already tried introducing the cost factor (cost=200) in the frange as
well, so that the filter will be applied as a post filter.
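For reference, a sketch of the post-filter form of such a query — note that
cache=false is also required for cost to take effect as a post filter
(myParser stands in for the custom value source; the field and bounds are
made up):

    fq={!frange l=0 u=10 cost=200 cache=false}myParser(fieldA)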

Regards,
Aman Deep Singh

RE: Solr Size Limitation upto 32 kb limitation

2019-01-23 Thread Kranthi Kumar K
Thank you Bernd Fehling for your suggested solution. I've tried the same by 
changing the type and adding multiValued="true" in the Schema.xml file, i.e.,
changed from:

[field definition stripped by the mailing-list archive]

Changed to:

[field definition stripped by the mailing-list archive]

Even after changing it we are still unable to import files of size > 32 kb. 
Please find the solution suggested by Bernd in the below url:

http://lucene.472066.n3.nabble.com/Re-Solr-Size-Limitation-upto-32-kb-limitation-td4421569.html

Bernd Fehling, could you please suggest an alternative solution to resolve 
our issue? It would help us a lot.

Please let me know for any questions.

Thanks & Regards,
Kranthi Kumar.K,
Software Engineer,
Ccube Fintech Global Services Pvt Ltd.,
Email/Skype: 
kranthikuma...@ccubefintech.com,
Mobile: +91-8978078449.


From: Kranthi Kumar K
Sent: Friday, January 18, 2019 4:22 PM
To: d...@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida ; Srinivasa Reddy 
Karri ; Michelle Ngo 
; Ravi Vangala 
Subject: RE: Solr Size Limitation upto 32 kb limitation

[quoted history trimmed; identical to the earlier mails quoted above]
RE: Solr Size Limitation upto 32 kb limitation

2019-01-23 Thread Michelle Ngo
Thanks @Kranthi Kumar K for following up

From: Kranthi Kumar K 
Sent: Thursday, 24 January 2019 4:51 PM
To: d...@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida ; Srinivasa Reddy 
Karri ; Michelle Ngo 
; Ravi Vangala ; 
Suresh Malladi ; Vijay Nandula 

Subject: RE: Solr Size Limitation upto 32 kb limitation

[remainder of quoted message trimmed; identical to Kranthi Kumar's mail above]