Solr541 Carriage Return Stripped Off In String Field ?

2016-02-02 Thread Kosila Yuichiro
Hello. I have a question regarding the "string" field type. [ Symptom ] When I pass a string value that includes a carriage return and line feed (\r\n) to a string field, it is stored; however, when I query that document and look at the value of the field, the carriage return has been stripped. [ Q

Re: Solr segment merging in different replica

2016-02-02 Thread Zheng Lin Edwin Yeo
Hi Emir, Thanks for your reply. Currently both my main and replica are on the same server, and since I am using the SolrCloud setup, both replicas merge concurrently, which drives the server's memory usage very high and affects other functions such as querying. T

Re: catch alls and nuances

2016-02-02 Thread John Blythe
yo, erick: thanks for the reply. yes, i was only meaning my own custom fieldType. my bad on not sticking w my original example. i've been using the StandardTokenizerFactory to break out the stream. While I understand the tokenization/stream on paper, perhaps I'm not connecting all the dots I need

Re: Using Tika that comes with Solr 5.2

2016-02-02 Thread Steven White
I found my issue. I need to include JARs off: \solr\contrib\extraction\lib\ Steve On Tue, Feb 2, 2016 at 4:24 PM, Steven White wrote: > I'm not using solr-app.jar. I need to stick with Tika JARs that come with > Solr 5.2 and yet get the full text extraction feature of Tika (all file > types i

Re: catch alls and nuances

2016-02-02 Thread Susheel Kumar
Hi John - You can take a closer look at the different options of WordDelimiterFilterFactory at https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory to see if they meet your requirement, and use the Analysis tab in the Solr Admin UI. If you still have questions, you can sha

Re: catch alls and nuances

2016-02-02 Thread Erick Erickson
bq: Have now begun writing my own. I hope by that you mean defining your own fieldType, at least until you're sure that none of the zillion things you can do with an analysis chain suit your needs. If you haven't already looked _seriously_ at the admin/analysis page (you have to choose a core to hav

Re: URI is too long

2016-02-02 Thread Shawn Heisey
On 2/2/2016 1:46 PM, Salman Ansari wrote: > OK then, if there is no way around this problem, can someone tell me the > maximum size a POST body can handle in Solr? It is configurable in solrconfig.xml. Look for the formdataUploadLimitInKB setting in the 5.x configsets. This setting defaults to 2

Re: Using Tika that comes with Solr 5.2

2016-02-02 Thread Steven White
I'm not using solr-app.jar. I need to stick with Tika JARs that come with Solr 5.2 and yet get the full text extraction feature of Tika (all file types it supports). At first, I started to include Tika JARs as needed; I now have all Tika related JARs that come with Solr and yet it is not working.

Re: catch alls and nuances

2016-02-02 Thread John Blythe
I had been using text_general at the time of my email's writing. Have tried a couple of the other stock ones (text_en, text_en_splitting, _tight). Have now begun writing my own. I began to wonder if simply doing one of the above, such as text_general, with a fuzzy distance (probably just ~1) would
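For readers unfamiliar with the "~1" notation mentioned above: it asks for terms within one edit of the query term. The following is a minimal sketch of plain Levenshtein distance to make that concrete. The class and method names are hypothetical, and this is not how Lucene implements fuzzy matching; Lucene's actual matching may also count edits differently (for example, transpositions).

```java
// Illustrative only: plain Levenshtein distance, not Lucene's implementation.
final class EditDistance {

    // Returns the number of single-character insertions, deletions,
    // or substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: would a "~1" style match accept this pair?
    static boolean withinOneEdit(String a, String b) {
        return levenshtein(a, b) <= 1;
    }
}
```

Under this definition, "solr" and "solar" are one edit apart, so a distance of 1 would accept that pair, while "kitten" and "sitting" (three edits) would not match.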

Re: URI is too long

2016-02-02 Thread Salman Ansari
OK then, if there is no way around this problem, can someone tell me the maximum size a POST body can handle in Solr? Regards, Salman On Tue, Feb 2, 2016 at 12:12 AM, Salman Ansari wrote: > That is what I have tried. I tried using POST with > application/x-www-form-urlencoded and I got the exce

Re: Data Import Handler takes different time on different machines

2016-02-02 Thread Troy Edwards
That is helpful! Thank you for the thoughts. On Tue, Feb 2, 2016 at 12:17 PM, Erick Erickson wrote: > Scratch that installation and start over? > > Really, it sounds like something is fundamentally messed up with the > Linux install. Perhaps something as simple as file paths, or you have > old ja

RE: Using Tika that comes with Solr 5.2

2016-02-02 Thread Allison, Timothy B.
Might not have the parsers on your path within your Solr framework? Which tika jars are on your path? If you want the functionality of all of Tika, use the standalone tika-app.jar, but do not use the app in the same JVM as Solr...without a custom class loader. The Solr team carefully prunes

Using Tika that comes with Solr 5.2

2016-02-02 Thread Steven White
Hi, I'm trying to use Tika that comes with Solr 5.2. The following code is not working: public static void parseWithTika() throws Exception { File file = new File("C:\\temp\\test.pdf"); FileInputStream in = new FileInputStream(file); AutoDetectParser parser = new AutoDetectParser();

RE: Multi-lingual search

2016-02-02 Thread Allison, Timothy B.
Three basic options: 1) one generic field that handles non-whitespace languages and normalization robustly (downside: no language specific stopwords, stemming, etc) 2) one field per language (hope lang id works and that you don't have many multilingual docs) 3) one Solr core for language (ditto)

RE: When does Solr plan to update its embedded Apache Tika version?

2016-02-02 Thread Allison, Timothy B.
Don't know what the answer from the Solr side is, but from the Tika side, I recently failed to get TIKA-1830 into Tika 1.12...so there may be a need to wait for Tika 1.13. No matter the answer on when there'll be an upgrade within Solr, I strongly encourage carving Tika into a separate JVM/serv

Re: Data Import Handler takes different time on different machines

2016-02-02 Thread Erick Erickson
Scratch that installation and start over? Really, it sounds like something is fundamentally messed up with the Linux install. Perhaps something as simple as file paths, or you have old jars hanging around that are mis-matched. Or someone manually deleted files from the Solr install. Or your disk f

Re: Data Import Handler takes different time on different machines

2016-02-02 Thread Troy Edwards
Rerunning the Data Import Handler on the Linux machine has started producing some errors and warnings: On the node on which DIH was started: WARN SolrWriter Error creating document : SolrInputDocument org.apache.solr.common.SolrException: No registered leader was found after waiting fo

Re: Solr+HDFS

2016-02-02 Thread Erick Erickson
Does this happen all the time or only when bringing up Solr on some of the nodes? My (naive) question is whether this message: AlreadyBeingCreatedException could indicate that more than one Solr is trying to access the same tlog Best, Erick On Tue, Feb 2, 2016 at 9:01 AM, Scott Stults wrote

Re: Nested documents and many-many relation

2016-02-02 Thread Jan Høydahl
The new Parallel SQL feature of 6.0? Also query-time on top of streaming, don’t know performance... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 1. feb. 2016 kl. 07.37 skrev Sathyakumar Seshachalam > : > > Thanks, query time joins are not an option for me, beca

Re: sorry, no dataimport-handler defined!

2016-02-02 Thread Jean-Jacques Monot
Exact. Newbie user ! OK i have seen what is missing ... Le 2 févr. 2016 15:40, "Davis, Daniel (NIH/NLM) [C]" a écrit : > > It sounds a bit like you are just exploring Solr for the first time.   To use > the Data Import Handler, you need to create an XML file that configures it, > data-config.

Re: Scripting server side

2016-02-02 Thread Scott Stults
Are you trying to manipulate the query with a script, or just the response? If it's the response you want to work with, I think your only options are using Velocity templates or XSLT. For working with the query you'll either have to make your own QueryParserPlugin or intercept the request before it

Re: upgrade SolrCloud

2016-02-02 Thread Scott Stults
That appears to be the case. If you're apprehensive because you had trouble upgrading to 5.4.0, there was a bug in that release (fixed in 5.4.1) that could've bitten you: https://issues.apache.org/jira/browse/SOLR-8561 k/r, Scott On Thu, Jan 28, 2016 at 1:36 PM, Oakley, Craig (NIH/NLM/NCBI) [C]

Re: Solr+HDFS

2016-02-02 Thread Scott Stults
It seems odd that the tlog files are so large. HDFS aside, is there a reason why you're not committing? Also, as far as disk space goes, if you dip below 50% free you run the risk that the index segments can't be merged. k/r, Scott On Fri, Jan 29, 2016 at 12:40 AM, Joseph Obernberger < joseph.ob

Re: Multi-lingual search

2016-02-02 Thread Scott Stults
The IndicNormalizationFilter appears to work with Tamil. Is it not working for you? k/r, Scott On Mon, Feb 1, 2016 at 8:34 AM, vidya wrote: > Hi > > My use case is to index and able to query different languages in solr > which > are not in-built languages supported by solr. How can i implemen

Re: plugging an analyzer

2016-02-02 Thread Scott Stults
There are a lot of things that can go wrong when you're wiring up a custom analyzer. I'd first check the simple things: * Custom jar is in Solr's classpath * Not using the custom factory in a field type's analysis chain * Not declaring a field with that type * Not using that field in a document *

Update to solr 5 - custom coordination factor implementation issue

2016-02-02 Thread Elodie Sannier
Hello, We are using solr 4.10.4 and we want to update to 5.4.1. With solr 4.10.4: - we extend BooleanQuery with a custom class in order to update the coordination factor behaviour (coord method) but with solr 5.4.1 this computation does not seem to be done by BooleanQuery anymore - in order to u

sorry, no dataimport-handler defined!

2016-02-02 Thread Davis, Daniel (NIH/NLM) [C]
It sounds a bit like you are just exploring Solr for the first time. To use the Data Import Handler, you need to create an XML file that configures it, data-config.xml by default. But before we go into details, what are you trying to accomplish with Solr? -Original Message- From: Jean

Re: Shard allocation across nodes

2016-02-02 Thread Tom Evans
Thank you both, those are exactly what I was looking for! If I'm reading it right, if I specify a "-Dvmhost=foo" when starting SolrCloud, and then specify a snitch rule like this when creating the collection: sysprop.vmhost:*,replica:<2 then this would ensure that on each vmhost there is at mo

Re: Update to solr 5 - custom phrase query implementation issue

2016-02-02 Thread Erik Hatcher
> On Feb 2, 2016, at 8:57 AM, Elodie Sannier wrote: > > Hello, > > We are using solr 4.10.4 and we want to update to 5.4.1. > > With solr 4.10.4: > - we extend PhraseQuery with a custom class in order to remove some > terms from phrase queries with phrase slop (update of add(Term term, int > po

Update to solr 5 - custom phrase query implementation issue

2016-02-02 Thread Elodie Sannier
Hello, We are using solr 4.10.4 and we want to update to 5.4.1. With solr 4.10.4: - we extend PhraseQuery with a custom class in order to remove some terms from phrase queries with phrase slop (update of add(Term term, int position) method) - in order to use our implementation, we extend Extende

RE: filters to work with dates

2016-02-02 Thread Markus Jelsma
Hello - i would opt for having a date field, and a custom update processor that converts a string date via DateUtils.parseDate() to an actual Date object. I think this would be a much simpler approach than a custom field or token filter. Markus -Original message- > From:Miguel Valencia
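The normalization step suggested above can be sketched as follows. This is a minimal illustration, not the poster's code: it uses the JDK's java.time classes in place of DateUtils.parseDate() so that it runs without extra JARs, the list of accepted formats is an assumption, the class name DateNormalizer is hypothetical, and the wiring into a Solr update processor is left out.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: tries several date formats and returns one canonical form.
final class DateNormalizer {

    // Assumed input formats; replace with the ones that occur in your data.
    private static final List<DateTimeFormatter> FORMATS = Arrays.asList(
            DateTimeFormatter.ofPattern("uuuu-MM-dd", Locale.ROOT),
            DateTimeFormatter.ofPattern("dd/MM/uuuu", Locale.ROOT),
            DateTimeFormatter.ofPattern("d MMM uuuu", Locale.ENGLISH));

    // Returns an ISO-8601 UTC timestamp at midnight, or null if no format matches.
    static String normalize(String raw) {
        String text = raw.trim();
        for (DateTimeFormatter format : FORMATS) {
            try {
                LocalDate date = LocalDate.parse(text, format);
                return date.atStartOfDay(ZoneOffset.UTC).toInstant().toString();
            } catch (DateTimeParseException e) {
                // This format did not match; try the next one.
            }
        }
        return null;
    }
}
```

With these assumed formats, both "2016-02-02" and "02/02/2016" normalize to 2016-02-02T00:00:00Z, which sorts correctly in a date field. The order of the list matters when inputs are ambiguous (for example, day-first versus month-first), so it is worth deciding that explicitly for your data.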

RE: When does Solr plan to update its embedded Apache Tika version?

2016-02-02 Thread Markus Jelsma
Hi - there is no open issue on upgrading Tika to 1.11, but you can always open one yourself. Markus -Original message- > From:Giovanni Usai > Sent: Tuesday 2nd February 2016 14:43 > To: solr-user@lucene.apache.org > Subject: When does Solr plan to update its embedded Apache Tika versio

When does Solr plan to update its embedded Apache Tika version?

2016-02-02 Thread Giovanni Usai
Hello, I would gladly welcome the reply of the community on the following subject: Up to the latest version (5.4.1), Solr embeds Tika artifacts (in contrib/extraction/lib) at version 1.7 and the dependent POI artifacts at version 3.11. Do you know when you plan to update the version of Tika to a

Re: KeepWord

2016-02-02 Thread John Blythe
nice tip. i appreciate it! -- *John Blythe* Product Manager & Lead Developer 251.605.3071 | j...@curvolabs.com www.curvolabs.com 58 Adams Ave Evansville, IN 47713 On Mon, Feb 1, 2016 at 4:55 PM, Erik Hatcher wrote: > And if you want to have the “kept” words stored, consider the trick used >

Re: Solr segment merging in different replica

2016-02-02 Thread Emir Arnautovic
Hi Edwin, Do you see any signs of network being bottleneck that would justify such setup? I would suggest you monitor your cluster before deciding if you need separate interfaces for external and internal communication. Sematext's SPM (http://sematext.com/spm) allows you to monitor SolrCloud,

Re: Export request handler via SolrJ

2016-02-02 Thread Joel Bernstein
Take a look at SolrStream and CloudSolrStream. These are available since Solr 5.1 but the 6.0 release will greatly improve on the streaming capabilities. http://joelsolr.blogspot.com/2015/04/the-streaming-api-solrjio-basics.html Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Feb 2, 2016 a

filters to work with dates

2016-02-02 Thread Miguel Valencia Zurera
Hi everybody, I'm looking for a filter or similar function to solve the following problem in my Solr index: I have a string field that contains a date, but each record can hold it in a different format. Now I have to sort by this field, and for that I have to normalize it. I've thou

Re: Solr highlight

2016-02-02 Thread Anil
no df, but hl.fl is * and docId is a string field. On 2 February 2016 at 11:01, Zheng Lin Edwin Yeo wrote: > Do you have any setting for "df" and "hl.fl" under your /highlight request > handler in your solrconfig.xml? And which version of Solr are you using? > > Regards, > Edwin > > On 2 February

Re: SolrCloud with large synonym files

2016-02-02 Thread Janit Anjaria
Hi Vincenzo, That seems to be a great solution as well. We had actually tried to move all our synonym files to the solr config file, but that did not work for us. I think we can try moving it to our collection config and check as well. Thanks for the input anyways :) Janit -- View this message

RE: Error configuring UIMA

2016-02-02 Thread Gian Maria Ricci - aka Alkampfer
Agree, a better error message could help to resolve the problem in no time, instead of forcing the user to double-check every setting until they find the wrong one. Also the official page in the wiki is outdated because it refers to Solr4, while for Solr5 you need to do some little modification

plugging an analyzer

2016-02-02 Thread Roxana Danger
Hello, I would like to use some code embedded in an analyzer. The problem is that I need to pass some parameters to initialize it. My thought was to create a plugin and initialize the parameters with the init( Map args ) or init( NamedList args ) methods as explained in http://wiki.apache.org/sol