Hi,
I am trying to add metadata and files to Solr, but am experiencing some
problems.
The data is divided in two: cases and files. For each case the metadata is
given in an XML document, while the metadata for the files is given in another
XML document, and the actual files are kept in yet a
Which function of the SKG are you using? significantTerms?
On Thu, Nov 15, 2018 at 7:09 PM Alexandre Rafalovitch
wrote:
> I think the underscore actually comes from the Shingles (parameter
> fillerToken). Have you tried setting it to empty string?
>
> Regards,
> Alex.
> On Thu, 15 Nov 2018 a
Hi Martin,
For a complex use case like this I would recommend writing a separate indexer
application that crawls the files, looks up the correct metadata XMLs based on
the given business rules, and then constructs the full Solr document to send to
Solr.
Even parsing full-text from PDF etc I would r
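As a rough sketch of the kind of external indexer described here, assuming SolrJ
and Tika on the classpath; the "cases" collection, the /data/files root, and the
field names are placeholders for the actual business rules:

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.Tika;

public class CaseFileIndexer {
    public static void main(String[] args) throws Exception {
        Tika tika = new Tika();
        List<Path> files;
        try (Stream<Path> walk = Files.walk(Paths.get("/data/files"))) {  // assumed file root
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            for (Path file : files) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", file.toString());
                // Stub: look up the case metadata XML and the file metadata XML for this
                // file according to your own rules and copy the fields over,
                // e.g. doc.addField("case_id", caseId);
                doc.addField("content", tika.parseToString(file.toFile()));  // full text via Tika
                solr.add("cases", doc);  // "cases" is an assumed collection name
            }
            solr.commit("cases");
        }
    }
}

The metadata lookup is deliberately left as a stub, since that is exactly where
the case/file business rules belong.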
Hi Jan,
Thanks for your quick reply!
I was afraid you would suggest this 😉 I have already moved much of the
indexing application out of Solr, which gives me the desired flexibility, but I
am a bit concerned about the time this will consume.
Right now I have about 20,000 XML documents a
@Markus @Walter, @Alexandre is right. The culprit was not the StopWord Filter;
it was the ShingleFilter. I could not find the parameter fillerToken in the
documentation; is it a new addition? BTW, I tried that and it works. Thanks!
I still ended up using a pattern replacement filter because I did not want
any singl
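For reference, a minimal Java sketch of the analysis chain being discussed,
built with Lucene's CustomAnalyzer and with the ShingleFilterFactory fillerToken
parameter set to an empty string. The tokenizer, filter order, and shingle sizes
are assumptions, not the poster's actual field type:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class ShingleFillerExample {
    public static void main(String[] args) throws Exception {
        // Stopwords are removed first, then shingles are built with an empty
        // fillerToken so removed stopwords contribute an empty string instead
        // of the default "_" placeholder.
        Analyzer analyzer = CustomAnalyzer.builder()
                .withTokenizer("standard")
                .addTokenFilter("lowercase")
                .addTokenFilter("stop")
                .addTokenFilter("shingle",
                        "minShingleSize", "2",
                        "maxShingleSize", "2",
                        "outputUnigrams", "true",
                        "fillerToken", "")    // default is "_"
                .build();

        try (TokenStream ts = analyzer.tokenStream("body", "the quick brown fox")) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                System.out.println(term);
            }
            ts.end();
        }
    }
}

The same fillerToken="" setting goes on the solr.ShingleFilterFactory filter
element of the field type in the Solr schema.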
Thanks, I would be really curious to see your URL call if you don't mind. I
am just getting started with the SKG stuff, and finding this conversation in
particular has really helped.
On Fri, Nov 16, 2018 at 10:44 AM Pratik Patel wrote:
> @Markus @Walter, @Alexandre is right. The culprit was not S
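Since the question above concerns the SKG's significantTerms function, here is a
purely illustrative sketch of such a call sent through SolrJ's streaming API. It
is not the poster's actual URL call: the collection, query, field, and thresholds
are placeholders, the tuple field names follow the reference guide's
significantTerms examples, and the same expression can equally be sent to the
/stream handler as an expr URL parameter:

import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.client.solrj.io.stream.StreamContext;
import org.apache.solr.common.params.ModifiableSolrParams;

public class SignificantTermsExample {
    public static void main(String[] args) throws Exception {
        // Placeholder collection, query, and field; adjust to your own schema.
        String expr = "significantTerms(myCollection, q=\"text:solr\", field=\"body\","
                + " limit=20, minDocFreq=5, maxDocFreq=0.3, minTermLength=4)";

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("expr", expr);
        params.set("qt", "/stream");

        SolrStream stream = new SolrStream("http://localhost:8983/solr/myCollection", params);
        stream.setStreamContext(new StreamContext());
        try {
            stream.open();
            Tuple tuple;
            while (!(tuple = stream.read()).EOF) {
                System.out.println(tuple.getString("term") + "  score=" + tuple.get("score"));
            }
        } finally {
            stream.close();
        }
    }
}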
Good catch, Pratik.
It is in the Javadoc, but not in the reference guide:
https://lucene.apache.org/core/6_3_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilterFactory.html
I'll try to fix that later (SOLR-12996).
Regards,
Alex.
On Fri, 16 Nov 2018 at 10:44, Pratik Patel wrote:
>
Hi,
Before implementing an optimistic concurrency solution, I wrote a test case
to check whether two threads atomically writing two different fields (say
f1 and f2) of the same document (say d) run into a conflict.
Thread t1 atomically writes counter c1 to field f1 of document d, commits
and
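For context, a hedged SolrJ sketch of what such an atomic single-field update
typically looks like; the collection, document id, and field names are
placeholders, and the actual SolrDocWriter.atomicWrite() implementation may of
course differ:

import java.util.Collections;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdateSketch {
    // Atomically set a single field of an existing document; other fields are untouched.
    static void atomicSet(SolrClient solr, String collection,
                          String id, String field, Object value) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        // The modifier map ("set", "inc", "add", ...) marks this as an atomic update.
        doc.addField(field, Collections.singletonMap("set", value));
        // For optimistic concurrency, also add the last known "_version_" value;
        // Solr then rejects the update with HTTP 409 if the document changed meanwhile.
        solr.add(collection, doc);
    }

    public static void main(String[] args) throws Exception {
        try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            Thread t1 = new Thread(() -> {
                try { atomicSet(solr, "myCollection", "d", "f1", 1); }
                catch (Exception e) { throw new RuntimeException(e); }
            });
            Thread t2 = new Thread(() -> {
                try { atomicSet(solr, "myCollection", "d", "f2", 2); }
                catch (Exception e) { throw new RuntimeException(e); }
            });
            t1.start(); t2.start();
            t1.join(); t2.join();
            solr.commit("myCollection");
        }
    }
}

With the optional _version_ field included, a concurrent change to the document
makes Solr reject the update with HTTP 409, which is the optimistic-concurrency
check being discussed.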
1) Depending on the number of CPUs / load on your Solr server, it's
possible you're just getting lucky. It's hard to "prove" with a
multithreaded test that concurrency bugs exist.
2) A lot depends on what your updates look like (i.e. the impl of
SolrDocWriter.atomicWrite()), and what the field
Does a soft commit always open a new Searcher?
I’ve been reading all the documentation and articles I can find, and they all
say that soft commit makes documents visible for searching. They don’t
specifically say that they invalidate the caches and/or open a new Searcher. I
guess I can see a us
On 11/16/2018 11:54 AM, Walter Underwood wrote:
Does a soft commit always open a new Searcher?
In general, yes. To quote the oft-referenced blog post ... hard commits
are about durability, soft commits are about visibility.
I actually don't know if "openSearcher=false" would work on a soft
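To make the distinction concrete, explicit commits can be issued from SolrJ as
in the sketch below; the URL and collection name are placeholders, and the
softCommit flag is the last argument of SolrClient.commit():

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class CommitKinds {
    public static void main(String[] args) throws Exception {
        try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            String collection = "myCollection";  // placeholder collection name

            // Soft commit: makes recent documents visible by opening a new searcher,
            // without flushing segments to stable storage.
            solr.commit(collection, /*waitFlush*/ true, /*waitSearcher*/ true, /*softCommit*/ true);

            // Hard commit: flushes to stable storage (durability); for auto commits,
            // whether it also opens a new searcher is governed by openSearcher in
            // solrconfig.xml.
            solr.commit(collection, true, true, false);
        }
    }
}

In practice most setups rely on autoCommit/autoSoftCommit in solrconfig.xml
rather than explicit commits from the client.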
On 11/16/2018 12:21 PM, Shawn Heisey wrote:
On 11/16/2018 11:54 AM, Walter Underwood wrote:
I’ve been reading all the documentation and articles I can find, and
they all say that soft commit makes documents visible for searching.
They don’t specifically say that they invalidate the caches and/o
Thanks. I don’t need openSearcher=false on soft commits. I was just musing
about it. Keeping the same query result cache would be very similar to using an
HTTP cache in front of Solr, which means it should be done with an HTTP
cache, because those are straightforward and very fast.
It would
On Tue, Nov 13, 2018 at 6:36 AM John Thorhauer
wrote:
> Mikhail,
>
> Where do I implement the buffering? I can not do it in the collect()
> method.
Please clarify why exactly? Notice my statement about one segment only.
> I can not see how I can get access to what I need in the finish()
> me
Thanks for replying, Chris.
> 1) Depending on the number of CPUs / load on your Solr server, it's
> possible you're just getting lucky. It's hard to "prove" with a
> multithreaded test that concurrency bugs exist.
- Agreed. However, across 200k total calls, the race condition not happening
even once - I f