Erick, thanks for your input. You are right that the "miraculous connection" is
not always that miraculous ;)
In your example the extraction is done on the client side. But as I said,
I'd ideally like to put the burden of Tika extraction on the Solr process.
All fields, but the file-cont
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
?
Clone, not Copy.
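For reference, a minimal sketch of that factory inside a chain (untested; the
field names are placeholders):

    <processor class="solr.CloneFieldUpdateProcessorFactory">
      <str name="source">content</str>
      <str name="dest">content_copy</str>
    </processor>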
Regards,
Alex.
P.s. I welcome private email feedback on that resource page as well :-)
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Hi Craig,
Have you seen SOLR-4722 (https://issues.apache.org/jira/browse/SOLR-4722)?
This was my attempt at something similar.
Regards,
Tricia
On Fri, Sep 12, 2014 at 2:23 PM, Craig Longman wrote:
> In order to take our Solr usage to the next step, we really need to
> improve its highlighting
Hi Alex,
That seems fair. The only downside I can think of is that I have to include
the copyField URP in every request handler that imports data. Not convenient
but not a big problem either.
Do you happen to know the name of the URP that performs the copyField
functionality? I looked throu
If you do the copyField equivalent in the request processor chain (there is a
URP for that) before the script one, you would then not need the copyField in
the schema. So, a move, not a duplicate.
Or are things more complicated than that?
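Something like this, as an untested sketch (chain, field, and script names are
placeholders):

    <updateRequestProcessorChain name="clone-then-script">
      <processor class="solr.CloneFieldUpdateProcessorFactory">
        <str name="source">content</str>
        <str name="dest">content_copy</str>
      </processor>
      <processor class="solr.StatelessScriptUpdateProcessorFactory">
        <str name="script">update-script.js</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

The clone step runs before the script, so the script sees the copied field and
schema.xml needs no copyField at all.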
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
In order to take our Solr usage to the next step, we really need to
improve its highlighting abilities. What I'm trying to do is to be able
to write a new component that can return the fields that matched the
search (including numeric fields) and the start/end positions for the
alphanumeric matches.
I'm using the StatelessScriptUpdateProcessorFactory to run a script against
data as it is imported.
Some relevant pieces from solrconfig.xml:
data-config.xml
textCopyByLang
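A minimal sketch of what a textCopyByLang-style update script could look like
(assuming JavaScript; the field names are placeholders):

    function processAdd(cmd) {
      var doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
      var text = doc.getFieldValue("content");
      if (text != null) {
        // whatever per-language routing is needed would go here
        doc.addField("content_en", text);
      }
    }
    // the factory also looks for these hooks; no-op bodies are fine
    function processDelete(cmd) { }
    function processCommit(cmd) { }
    function processRollback(cmd) { }
    function processMergeIndexes(cmd) { }
    function finish() { }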
Right. Consider your situation where you have 12 shards on 12
machines. 1/12 of your index is stored on each, so by definition, if
one of them is down, you cannot get at that 1/12 of your data.
shards.tolerant is the only option here. Although I'm surprised
shards.tolerant makes things slow; perhaps there's
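For the record, the parameter just goes on the request, e.g. (host and
collection name are placeholders):

    http://localhost:8983/solr/collection1/select?q=*:*&shards.tolerant=true

With it set, Solr returns whatever the live shards can provide instead of
failing the whole query.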
Hi, thanks for the feedback, Erick.
What do you mean, all the nodes in a shard are down? I have 12 shards, and I
suppose I have 12 nodes, right? Correct me if I am wrong.
Now whenever any one of them is down, say 1 down and 11 active, I still get the
same error... is there any way to get results ignoring
Solr can be customized to return query results through a message broker
such as ActiveMQ. How: define a listener event, as in my example. A sample use
case: a searchable/sortable log display where the front end gets Solr entries
via messages through a predefined topic of a message broker (such as
ActiveMQ
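One possible shape for that hook in solrconfig.xml; the listener class and its
params here are hypothetical stand-ins, not a stock Solr class:

    <updateHandler class="solr.DirectUpdateHandler2">
      <listener event="postCommit" class="com.example.ActiveMQSolrEventListener">
        <str name="brokerURL">tcp://localhost:61616</str>
        <str name="topic">solr.updates</str>
      </listener>
    </updateHandler>

The custom class would implement org.apache.solr.core.SolrEventListener and
publish to the topic from its postCommit() callback.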
Hi
Thank you very much. We made the change to lower the heap size, and we are
watching the effect of this change. We will let you know the result.
It is really helpful.
Best Regards
2014-09-12 23:00 GMT+08:00 Walter Underwood :
> I agree about the 80Gb heap as a possible problem.
>
> A GC i
Usually, I recommend
1> you configure autocommit on the server. Make it reasonable as in
multiple seconds.
2> do NOT commit from the client.
If you must commit from the client, then consider the
server.add(docs, commitWithin) call.
Under no circumstances should you commit from the client, IMO, excep
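As a sketch, the server-side piece would look like this in solrconfig.xml (the
15-second value is only an example):

    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

and on the client, server.add(docs, 10000) asks Solr to make those docs visible
within 10 seconds without an explicit commit.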
bq: I could of course push in the filename(s) in a field, but this
would require Solr (due to field-type e.g. "filecontent") to extract
the content from the given file.
Why? If you're already dealing with SolrJ, you can do all the work you
need there by adding fields to a SolrInputDocument, includi
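A bare-bones SolrJ sketch of that approach, assuming a SolrServer named server
(the field names and the extraction step are placeholders):

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");
    doc.addField("filecontent", extractedText); // e.g. run Tika client-side
    server.add(doc);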
Hmmm, if all the nodes for a shard are down, shards.tolerant=true
shouldn't be slow unless there's some kind of bug. Solr should be
smart enough not to wait for a timeout. So I'm a bit surprised by that
statement; how sure of it are you? Do you have a test case?
bq: but this is slow and dont gives
John:
Glad it worked. But be a little careful with large slops. As the slop
increases, you approach the same result set as
vis AND dis AND dur
so choosing the appropriate slop is something of a balancing act.
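As a quick illustration (query syntax only; the terms and slop values are
arbitrary):

    q="vis dis dur"~2     <- small slop: terms must stay near each other
    q="vis dis dur"~100   <- huge slop: effectively vis AND dis AND dur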
Best,
Erick
On Fri, Sep 12, 2014 at 2:10 AM, John Nielsen wrote:
> I didn't know about s
On 9/12/2014 9:10 AM, Joshi, Shital wrote:
> We're updating Solr cloud from a java process using UpdateRequest API.
>
> UpdateRequest req = new UpdateRequest();
> req.setResponseParser(new XMLResponseParser());
> req.setParam("_shard_", shard);
> req.add(docs);
>
> We see too many searcher open
Hi,
We're updating Solr cloud from a java process using UpdateRequest API.
UpdateRequest req = new UpdateRequest();
req.setResponseParser(new XMLResponseParser());
req.setParam("_shard_", shard);
req.add(docs);
We see too many "searcher open" errors in the log and are wondering if
frequent updates from
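A common mitigation sketch, assuming these are the maxWarmingSearchers kind of
errors (the values are illustrative only):

    <maxWarmingSearchers>2</maxWarmingSearchers>
    <autoSoftCommit>
      <maxTime>5000</maxTime>
    </autoSoftCommit>

i.e. let Solr handle visibility via soft commits rather than committing on
every update.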
I agree about the 80Gb heap as a possible problem.
A GC is essentially a linear scan of memory. More memory means a longer scan.
We run with an 8Gb heap. I’d try that. Test it by replaying logs from
production against a test instance. You can use JMeter and the Apache access
log sampler.
https
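For a Solr 4.x instance under the bundled Jetty, the heap settings would go on
the start command, along these lines (the flags are illustrative, not a
recommendation for every workload):

    java -Xms8g -Xmx8g -XX:+UseConcMarkSweepGC -jar start.jar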
Hi,
Is there any semantic analyzer in Solr?
On Sep 12, 2014 10:51 AM, "Sandeep B A" wrote:
> Hi All,
> Sorry for the delayed response.
> I was out of office for last few days and was not able to reply.
> Thanks for the information.
>
> We have a use case where one sentence is the unit token with
On 9/12/2014 7:36 AM, YouPeng Yang wrote:
> We built our SolrCloud with Solr 4.6.0 and JDK 1.7.0_60; our cluster holds
> 360GB*3 of data (one core with 2 replicas).
> Our cluster has become unstable: occasionally it runs into long
> full GCs. This is awful; the full GC takes so long that the
Hi
We built our SolrCloud with Solr 4.6.0 and JDK 1.7.0_60; our cluster holds
360GB*3 of data (one core with 2 replicas).
Our cluster has become unstable: occasionally it runs into long full GCs.
This is awful; the full GC takes so long that SolrCloud considers the node
down.
Normally full
Thanks Alex,
> Do you just care about document content?
content only.
The documents (not necessarily coming from a DB) are being pushed in (through
SolrJ). This is at least the initial idea, mainly due to the dynamic nature of
our index/search architecture.
I could of course push in the filename(s
Basis Technology's toolset includes sentence boundary detectors. Please
contact me for more details.
On Fri, Sep 12, 2014 at 1:15 AM, Sandeep B A
wrote:
> Hi All,
> Sorry for the delayed response.
> I was out of office for last few days and was not able to reply.
> Thanks for the information.
>
Do you just care about document content? Not metadata, such as file
name, date, author, etc.?
Does it have to be push into Solr, or can it be pull? If pull,
DataImportHandler should be able to do what you want with a nested-entities
design, as sketched below.
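A sketch of that nested-entities shape using the Tika processor (paths,
file-name patterns, and field names are placeholders):

    <dataConfig>
      <dataSource type="BinFileDataSource"/>
      <document>
        <entity name="files" processor="FileListEntityProcessor"
                baseDir="/data/docs" fileName=".*\.(pdf|doc)" recursive="true"
                rootEntity="false" dataSource="null">
          <entity name="tika" processor="TikaEntityProcessor"
                  url="${files.fileAbsolutePath}" format="text">
            <field column="text" name="content"/>
          </entity>
        </entity>
      </document>
    </dataConfig>

The outer entity lists the files; the inner one extracts each file's content
and maps it to the content field.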
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
In an ideal world, how often would you be running such cleanup and how
many documents would you expect to delete each time?
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community:
Looks like I haven't finished "I know":
I know I could extract the content on our server's side, but I'd really like to
take that burden off of it.
That said:
Can I hand in the path-to-the-file in a "specific field" which would yield an
extraction in Solr?
-----Original Message-----
From: Cle
I'm working on a production system that is indexing users' interaction
events as documents in a Solr index.
Each document looks like: {user_id, event_data, update_time}
The index size increases monotonically over time, so documents need to be
deleted from the index at fixed intervals.
A re
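A sketch of that periodic cleanup in SolrJ, assuming update_time is a date
field and a SolrServer named server (the 30-day window is just an example):

    // drop events older than 30 days
    server.deleteByQuery("update_time:[* TO NOW-30DAYS]");
    server.commit();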
Just a dumb question, but how can I make SolrCloud fault tolerant for queries?
The reason I am asking is that I have 12 different physical servers and I am
running 12 Solr shards on them; whenever any one of them goes down for any
reason, it gives me the error below. I have 3 ZooKeeper
First of all I'd like to say hello to the Solr world/community ;) So far we
have been using Lucene as-is and now intend to go for Solr.
Say I have a document which in one field should have the content of a file
(indexed only, not stored), in order to make the document searchable due to the
fi
I didn't know about sloppy queries. This is great stuff!
I solved it with a &qs=100.
Thank you for the help.
On Thu, Sep 11, 2014 at 11:36 PM, Erick Erickson
wrote:
> just skimmed, but:
>
> bq: I would get a hit for "vis dis dur", but "vis dur dis" no longer
> returns anything. This is not
I am out of the office until 09/15/2014.
I'll be out of the office this Friday, but I will be back on Monday.
Please contact Jason Brown for anything JAS Team related.
Note: This is an automated response to your message "Re: Is there any
sentence tokenizers in sold 4.9.0?" sent on 9/12/2014 1
The reason why I'm asking this is that I have no influence on the fields
that are indexed; the CMS does that automatically. So there is no way for me to
split languages into separate fields.
I can change the schema.xml, but I don't know if there is a way to copy
fields into separate language
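For what it's worth, Solr also ships a language-identification URP (in the
langid contrib) that can detect the language and map a field to
language-specific fields at index time; a sketch, with placeholder field names:

    <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
      <str name="langid.fl">content</str>
      <str name="langid.langField">language</str>
      <bool name="langid.map">true</bool>
    </processor>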
Sorry for the dumb question, but how do you integrate ActiveMQ and Solr? What
is the purpose/use case?
Thanks, in advance,
Flavio
On Thu, Sep 11, 2014 at 11:00 PM, vivison wrote:
> Solr works fine with ActiveMQ provided a good solrconfig.xml. I was
> omitting
> the required property "java.naming.prov