Hello.
I have a question regarding the "string" type field.
[ Symptom ]
When a string value containing a carriage return line feed (\r\n)
is passed to a string field, it is stored; however,
when I query that document and look at the value of the field,
the carriage return is stripped away.
[ Q
Hi Emir,
Thanks for your reply.
As both my main and replica are currently on the same server, and as I
am using the SolrCloud setup, both replicas are doing the merging
concurrently, which causes the memory usage of the server to be very high
and affects other functions like querying. T
yo,
erick: thanks for the reply. yes, i was only meaning my own custom
fieldType. my bad on not sticking w my original example. i've been using
the StandardTokenizerFactory to break out the stream. While I understand
the tokenization/stream on paper, perhaps I'm not connecting all the dots I
need
I found my issue. I need to include the JARs from: \solr\contrib\extraction\lib\
Steve
On Tue, Feb 2, 2016 at 4:24 PM, Steven White wrote:
> I'm not using solr-app.jar. I need to stick with Tika JARs that come with
> Solr 5.2 and yet get the full text extraction feature of Tika (all file
> types i
Hi John - You can take a closer look at the different options of
WordDelimiterFilterFactory at
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
to see if they meet your requirements, and use the Analysis tab in the Solr Admin UI.
If you still have questions, you can sha
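For reference, a sketch of a field type wiring in that filter; the option values below are illustrative, not a recommendation:

```xml
<!-- Example analysis chain using WordDelimiterFilterFactory.
     Adjust the generate/catenate options to your data. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" splitOnCaseChange="1"
            preserveOriginal="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The Analysis tab in the Admin UI will show token-by-token what each option does to a sample input.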
bq: Have now begun writing my own.
I hope by that you mean defining your own fieldType,
at least until you're sure that none of the zillion things
you can do with an analysis chain suits your needs.
If you haven't already looked _seriously_ at the admin/analysis
page (you have to choose a core to hav
On 2/2/2016 1:46 PM, Salman Ansari wrote:
> OK then, if there is no way around this problem, can someone tell me the
> maximum size a POST body can handle in Solr?
It is configurable in solrconfig.xml. Look for the
formdataUploadLimitInKB setting in the 5.x configsets. This setting
defaults to 2
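The setting lives on the requestParsers element; a sketch (the values shown are illustrative, not the defaults for every release):

```xml
<!-- In solrconfig.xml: formdataUploadLimitInKB caps the size of a
     form-encoded POST body, in kilobytes. Values here are illustrative. -->
<requestDispatcher>
  <requestParsers enableRemoteStreaming="false"
                  multipartUploadLimitInKB="2048000"
                  formdataUploadLimitInKB="4096"/>
</requestDispatcher>
```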
I'm not using solr-app.jar. I need to stick with Tika JARs that come with
Solr 5.2 and yet get the full text extraction feature of Tika (all file
types it supports).
At first, I started to include Tika JARs as needed; I now have all Tika
related JARs that come with Solr and yet it is not working.
I had been using text_general at the time of my email's writing. Have tried
a couple of the other stock ones (text_en, text_en_splitting, _tight). Have
now begun writing my own. I began to wonder if simply doing one of the
above, such as text_general, with a fuzzy distance (probably just ~1) would
OK then, if there is no way around this problem, can someone tell me the
maximum size a POST body can handle in Solr?
Regards,
Salman
On Tue, Feb 2, 2016 at 12:12 AM, Salman Ansari
wrote:
> That is what I have tried. I tried using POST with
> application/x-www-form-urlencoded and I got the exce
That helps!
Thank you for the thoughts.
On Tue, Feb 2, 2016 at 12:17 PM, Erick Erickson
wrote:
> Scratch that installation and start over?
>
> Really, it sounds like something is fundamentally messed up with the
> Linux install. Perhaps something as simple as file paths, or you have
> old ja
Might not have the parsers on your path within your Solr framework?
Which tika jars are on your path?
If you want the functionality of all of Tika, use the standalone tika-app.jar,
but do not use the app in the same JVM as Solr...without a custom class loader.
The Solr team carefully prunes
Hi,
I'm trying to use Tika that comes with Solr 5.2. The following code is not
working:
public static void parseWithTika() throws Exception
{
    File file = new File("C:\\temp\\test.pdf");
    FileInputStream in = new FileInputStream(file);
    AutoDetectParser parser = new AutoDetectParser();
Three basic options:
1) one generic field that handles non-whitespace languages and normalization
robustly (downside: no language specific stopwords, stemming, etc)
2) one field per language (hope lang id works and that you don't have many
multilingual docs)
3) one Solr core per language (ditto)
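Option 2 in schema terms might look like this; the field and type names are illustrative:

```xml
<!-- Sketch of option 2: one field per language, routed by language id.
     Field and type names are illustrative. -->
<field name="title_en" type="text_en"      indexed="true" stored="true"/>
<field name="title_fr" type="text_fr"      indexed="true" stored="true"/>
<field name="title_xx" type="text_general" indexed="true" stored="true"/>
```

The downside stated above applies: this only works well if language identification is reliable and documents are mostly monolingual.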
Don't know what the answer from the Solr side is, but from the Tika side, I
recently failed to get TIKA-1830 into Tika 1.12...so there may be a need to
wait for Tika 1.13.
No matter the answer on when there'll be an upgrade within Solr, I strongly
encourage carving Tika into a separate JVM/serv
Scratch that installation and start over?
Really, it sounds like something is fundamentally messed up with the
Linux install. Perhaps something as simple as file paths, or you have
old jars hanging around that are mis-matched. Or someone manually
deleted files from the Solr install. Or your disk f
Rerunning the Data Import Handler again on the Linux machine has
started producing some errors and warnings:
On the node on which DIH was started:
WARN SolrWriter Error creating document : SolrInputDocument
org.apache.solr.common.SolrException: No registered leader was found
after waiting fo
Does this happen all the time or only when bringing up Solr on some of
the nodes?
My (naive) question is whether this message: AlreadyBeingCreatedException
could indicate that more than one Solr is trying to access the same tlog
Best,
Erick
On Tue, Feb 2, 2016 at 9:01 AM, Scott Stults
wrote
The new Parallel SQL feature of 6.0? Also query-time on top of streaming,
don’t know performance...
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> 1. feb. 2016 kl. 07.37 skrev Sathyakumar Seshachalam
> :
>
> Thanks, query time joins are not an option for me, beca
Exactly. Newbie user!
OK i have seen what is missing ...
On 2 Feb 2016 at 15:40, "Davis, Daniel (NIH/NLM) [C]"
wrote:
>
> It sounds a bit like you are just exploring Solr for the first time. To use
> the Data Import Handler, you need to create an XML file that configures it,
> data-config.
Are you trying to manipulate the query with a script, or just the response?
If it's the response you want to work with, I think your only options are
using Velocity templates or XSLT. For working with the query you'll either
have to make your own QueryParserPlugin or intercept the request before it
That appears to be the case. If you're apprehensive because you had trouble
upgrading to 5.4.0, there was a bug in that release (fixed in 5.4.1) that
could've bitten you:
https://issues.apache.org/jira/browse/SOLR-8561
k/r,
Scott
On Thu, Jan 28, 2016 at 1:36 PM, Oakley, Craig (NIH/NLM/NCBI) [C]
It seems odd that the tlog files are so large. HDFS aside, is there a
reason why you're not committing? Also, as far as disk space goes, if you
dip below 50% free you run the risk that the index segments can't be merged.
k/r,
Scott
On Fri, Jan 29, 2016 at 12:40 AM, Joseph Obernberger <
joseph.ob
The IndicNormalizationFilter appears to work with Tamil. Is it not working
for you?
k/r,
Scott
On Mon, Feb 1, 2016 at 8:34 AM, vidya wrote:
> Hi
>
> My use case is to index and able to query different languages in solr
> which
> are not in-built languages supported by solr. How can i implemen
There are a lot of things that can go wrong when you're wiring up a custom
analyzer. I'd first check the simple things:
* The custom jar is on Solr's classpath
* The custom factory is used in a field type's analysis chain
* A field is declared with that type
* That field is used in a document
*
Hello,
We are using solr 4.10.4 and we want to update to 5.4.1.
With solr 4.10.4:
- we extend BooleanQuery with a custom class in order to update the
coordination factor behaviour (coord method) but with solr 5.4.1 this
computation does not seem to be done by BooleanQuery anymore
- in order to u
It sounds a bit like you are just exploring Solr for the first time. To use
the Data Import Handler, you need to create an XML file that configures it,
data-config.xml by default.
But before we go into details, what are you trying to accomplish with Solr?
-Original Message-
From: Jean
Thank you both, those are exactly what I was looking for!
If I'm reading it right, if I specify a "-Dvmhost=foo" when starting
SolrCloud, and then specify a snitch rule like this when creating the
collection:
sysprop.vmhost:*,replica:<2
then this would ensure that on each vmhost there is at mo
> On Feb 2, 2016, at 8:57 AM, Elodie Sannier wrote:
>
> Hello,
>
> We are using solr 4.10.4 and we want to update to 5.4.1.
>
> With solr 4.10.4:
> - we extend PhraseQuery with a custom class in order to remove some
> terms from phrase queries with phrase slop (update of add(Term term, int
> po
Hello,
We are using solr 4.10.4 and we want to update to 5.4.1.
With solr 4.10.4:
- we extend PhraseQuery with a custom class in order to remove some
terms from phrase queries with phrase slop (update of add(Term term, int
position) method)
- in order to use our implementation, we extend Extende
Hello - I would opt for having a date field, and a custom update processor that
converts a string date via DateUtils.parseDate() to an actual Date object. I
think this would be a much simpler approach than a custom field or token filter.
Markus
-Original message-
> From:Miguel Valencia
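The update-processor approach Markus suggests can also be done with Solr's stock ParseDateFieldUpdateProcessorFactory rather than custom code; a sketch for solrconfig.xml, where the chain name and format list are illustrative:

```xml
<!-- Tries each format in order; on a match the string is replaced
     with a real Date value. Chain name and formats are illustrative. -->
<updateRequestProcessorChain name="parse-date">
  <processor class="solr.ParseDateFieldUpdateProcessorFactory">
    <arr name="format">
      <str>yyyy-MM-dd</str>
      <str>dd/MM/yyyy</str>
    </arr>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Select the chain per request with update.chain=parse-date, or set it as the default on the /update handler.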
Hi - there is no open issue on upgrading Tika to 1.11, but you can always open
one yourself.
Markus
-Original message-
> From:Giovanni Usai
> Sent: Tuesday 2nd February 2016 14:43
> To: solr-user@lucene.apache.org
> Subject: When does Solr plan to update its embedded Apache Tika versio
Hello,
I would gladly welcome the reply of the community on the following subject:
As of the latest version (5.4.1), Solr embeds Tika artifacts (in
contrib/extraction/lib) version 1.7, along with the dependent POI artifacts
version 3.11.
Do you know when you plan to update the version of Tika to a
nice tip. i appreciate it!
--
*John Blythe*
Product Manager & Lead Developer
251.605.3071 | j...@curvolabs.com
www.curvolabs.com
58 Adams Ave
Evansville, IN 47713
On Mon, Feb 1, 2016 at 4:55 PM, Erik Hatcher wrote:
> And if you want to have the “kept” words stored, consider the trick used
>
Hi Edwin,
Do you see any signs of network being bottleneck that would justify such
setup? I would suggest you monitor your cluster before deciding if you
need separate interfaces for external and internal communication.
Sematext's SPM (http://sematext.com/spm) allows you to monitor
SolrCloud,
Take a look at SolrStream and CloudSolrStream. These are available since
Solr 5.1, but the 6.0 release will greatly improve on the streaming
capabilities.
http://joelsolr.blogspot.com/2015/04/the-streaming-api-solrjio-basics.html
Joel Bernstein
http://joelsolr.blogspot.com/
On Tue, Feb 2, 2016 a
Hi everybody
I'm looking for a filter or similar function to solve the following problem
in my Solr index:
I have a string field that contains a date, but each record of this
field can be in a different format. Now I have to sort by this field, and
for this I have to normalize it. I've thou
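Outside of Solr, the normalization logic itself can be sketched in plain Java; the helper name and the list of input patterns below are illustrative. Each known format is tried in order, and the first match is re-emitted in the ISO-8601 form that Solr date fields sort on:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper: tries several known input formats and emits
// the ISO-8601 form Solr's date fields expect, so values sort correctly.
public class DateNormalizer {
    private static final String[] PATTERNS = {
        "yyyy-MM-dd", "dd/MM/yyyy", "MM-dd-yyyy"
    };

    public static String normalize(String raw) {
        for (String p : PATTERNS) {
            SimpleDateFormat in = new SimpleDateFormat(p);
            in.setLenient(false);
            in.setTimeZone(TimeZone.getTimeZone("UTC"));
            try {
                Date d = in.parse(raw);
                SimpleDateFormat out =
                    new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
                out.setTimeZone(TimeZone.getTimeZone("UTC"));
                return out.format(d);
            } catch (ParseException ignored) {
                // try the next pattern
            }
        }
        return null; // unparseable; caller decides how to handle it
    }
}
```

The same per-format loop is what a custom update processor (or the stock date-parsing processor) would do at index time, which is generally preferable to normalizing at query time.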
no df, but hl.fl is *
and docId is string field.
On 2 February 2016 at 11:01, Zheng Lin Edwin Yeo
wrote:
> Do you have any setting for "df" and "hl.fl" under your /highlight request
> handler in your solrconfig.xml? And which version of Solr are you using?
>
> Regards,
> Edwin
>
> On 2 February
Hi Vincenzo,
That seems to be a great solution as well. We had actually tried to move all
our synonym files to the solr config file, but that did not work for us. I
think we can try moving it to our collection config and check as well.
Thanks for the input anyways :)
Janit
--
View this message
Agree, a better error message could help resolve the problem in no time,
instead of forcing the user to double-check every setting until they find the
wrong one.
Also, the official page in the wiki is outdated because it refers to Solr 4,
while for Solr 5 you need to do some little modification
Hello,
I would like to use some code embedded on an analyser. The problem is that
I need to pass some parameters to initialize it. My thought was to create
a plugin and initialize the parameters with the init( Map<String,String>
args ) or init( NamedList args ) methods as explained in
http://wiki.apache.org/sol