Hi,
Spell check is not working for alphanumeric and numeric words.
For example, I have indexed alphanumeric words like
BL5C
BL4C
BL5F
Spellcheck for the word "BL" does not suggest any of the above results.
I have the same problem even with numeric words.
For example, indexed words-
Nokia Lumia 520
Nokia
Hi Mike,
What is your exact use case?
What do you mean by "controlling the fields used for phrase queries"?
Rgds
AJ
> On 12-Dec-2014, at 20:11, Michael Sokolov
> wrote:
>
> Doug - I believe pf controls the fields that are used for the phrase queries
> *generated by the parser*.
>
> What I
RTFineM:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates
The default separator is ',' (a comma). If you want a semicolon, you need
to use the 'separator' parameter to tell Solr to do so. It's not quite
magic, esp
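The underlying behavior is easy to see with a quick sketch. This is not Solr code, just Python's csv module standing in for Solr's CSV loader; the `delimiter` argument here plays the role of the 'separator' request parameter (e.g. appending `&separator=%3B` to the update URL):

```python
import csv
import io

data = "id;name\n1;first widget\n2;second widget\n"

# With the default comma delimiter, each semicolon-delimited line
# parses as one big field -- nothing is actually split.
default_rows = list(csv.reader(io.StringIO(data)))
assert default_rows[0] == ["id;name"]

# Telling the parser the real separator splits the fields correctly,
# just as the 'separator' parameter does for Solr's CSV handler.
fixed_rows = list(csv.reader(io.StringIO(data), delimiter=";"))
assert fixed_rows[0] == ["id", "name"]
assert fixed_rows[1] == ["1", "first widget"]
```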
Hi. I tried setting up and running Solr on a PC. Then I tried to index a
document that was semicolon-delimited although it has a file extension of
.csv, and got the following:
C:\Users\Owner\Downloads\SOLR\solr-4.10.2>java -classpath
dist/solr-core-4.10.2.jar -Dauto org.apache.solr.util.SimplePostT
Sounds like a bug report. Can you be very specific about what the broken
definition looked like, so we can replicate it?
Regards,
Alex
On 12/12/2014 6:54 pm, "solr-user" wrote:
> I did find out the cause of my problems. Turns out the problem wasn't due
> to
> the solrconfig.xml file; it was in the schem
I did find out the cause of my problems. Turns out the problem wasn't due to
the solrconfig.xml file; it was in the schema.xml file
I spent a fair bit of time making my solrconfig closer to the default
solrconfig.xml in the Solr download; when that didn't get rid of the error I
went back to the on
After I applied the LUCENE-2899.patch file to the lucene-solr 4.10.2 release I
tried to run an ant compile pursuant to the following directions under
'Installation':
https://wiki.apache.org/solr/OpenNLP
And I received the following error indicating a dependency is missing - how
do I find that depend
The AMIs are Red Hat (not Amazon's) and the instances are properly sized
for the environment (t1.micro for ZK, m3.xlarge for Solr). I do plan to add
hooks for a clean shutdown of Solr when the VM is shut down, but if Solr
takes too long, AWS may clobber it anyway. One frustrating part of auto
scali
> The Solr leader should stop sending requests to the stopped replica once
> that replica's live node is removed from ZK (after session expiry).
Fwiw, here's the Zookeeper log entry for a graceful shutdown of the Solr
replica:
2014-12-12 15:04:21,304 [myid:2] - INFO [ProcessThread(sid:2
cport:81
: No, I wasn't aware of these. I will give that a try. If I stop the Solr
: jetty service manually, things recover fine, but the hang occurs when I
: 'stop' or 'terminate' the EC2 instance. The Zookeeper leader reports a
I don't know squat about AWS Auto-Scaling, (and barely anything about AWS)
Ted,
Thanks a lot, I had gone through your blogs but the white space issue
slipped out of my mind. "replaceWhitespaceWith" addressed the issue. I think
it's a great filter to have, surely takes care of an important use case.
Appreciate your help.
-Shamik
--
View this message in context:
ht
Ted, nice work on this filter. I happened to read the blog article
yesterday, it definitely addresses a common pain-point in a lot of
relevancy work. I had been doing something similar with a combination of
shingles and a keepwords list and my own query parser (or using
hon-lucene-synonyms)
Are th
Thanks!
--
View this message in context:
http://lucene.472066.n3.nabble.com/Have-anyone-used-Automatic-Phrase-Tokenization-AutoPhrasingTokenFilterFactory-tp4173808p4174109.html
Sent from the Solr - User mailing list archive at Nabble.com.
Yes, I'll submit a pull request back to the LucidWorks GitHub. For the
specific "replaceWhitespaceWith nothing" issue, look at my changesets:
https://github.com/jstrassburg/auto-phrase-tokenfilter/commit/a9450f2500d864539b3e5632c6cd47b283f3a481
and
https://github.com/jstrassburg/auto-phrase-tokenfilt
Hi James:
Could you send me the fix? I would be interested in merging this in for my
submission to Solr/Lucene, and any other bugs that you found would be much
appreciated.
Ted
--
View this message in context:
http://lucene.472066.n3.nabble.com/Have-anyone-used-Automatic-Phrase-Tokenization-A
Yes, actually that was one of the bugs I fixed so that we could
replaceWhitespaceWith nothing.
On Fri, Dec 12, 2014 at 3:34 PM, Ted Sullivan
wrote:
>
> Hi Shamik:
>
> One thing that might help is to use the "replaceWhitespaceWith" parameter
> of
> the QParserPlugin and in your index-time Autophra
Have a look at the documentation for the rootEntity attribute.
https://wiki.apache.org/solr/DataImportHandler
If you set it on the outer entity, I think it should give you what you
want with the nested entity structure. Then the outside entity will
load from the constant table and the inside from
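A minimal sketch of what that nested-entity structure might look like (table names, column names, and queries are made up for illustration; only the `rootEntity="false"` attribute on the outer entity is the point here):

```xml
<document>
  <!-- Outer entity loads the constant table; rootEntity="false"
       means it does not produce Solr documents itself. -->
  <entity name="constants" rootEntity="false"
          query="SELECT const_id FROM constant_table">
    <!-- Inner entity produces the actual documents, reusing the
         outer entity's id in its WHERE clause. -->
    <entity name="docs"
            query="SELECT id, title FROM docs
                   WHERE const_id = '${constants.const_id}'">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
    </entity>
  </entity>
</document>
```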
Hi Shamik:
The link to the second blog post which discusses this problem is
https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
Ted
--
View this message in context:
http://lucene.472066.n3.nabble.com/Have-anyone-used-Automatic-Phra
Hi Shamik:
One thing that might help is to use the "replaceWhitespaceWith" parameter of
the QParserPlugin and of your index-time AutoPhrase TokenFilter. So in my
solrconfig.xml I have
autophrases.txt
_
then if in your fieldType in schema.xml if you have:
The reason for this
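The configuration being described might look roughly like this. This is an illustrative sketch, not the poster's actual (stripped) config; the class and attribute names follow the LucidWorks auto-phrase-tokenfilter repository and may differ in forks:

```xml
<!-- solrconfig.xml: query parser plugin (sketch) -->
<queryParser name="autophrasingParser"
             class="com.lucidworks.analysis.AutoPhrasingQParserPlugin">
  <str name="phrases">autophrases.txt</str>
  <str name="replaceWhitespaceWith">_</str>
</queryParser>

<!-- schema.xml: matching index-time analysis chain (sketch) -->
<fieldType name="text_autophrase" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory"
            phrases="autophrases.txt" replaceWhitespaceWith="_"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```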
I'm on 4.8.1
On Fri, Dec 12, 2014 at 3:11 PM, shamik wrote:
>
> Jim,
>
> Thanks for your response. I've tried including
> AutoPhrasingTokenFilterFactory as part of the query analyzer, but didn't
> make any difference.
>
> positionIncrementGap="100">
>
>
Jim,
Thanks for your response. I've tried including
AutoPhrasingTokenFilterFactory as part of the query analyzer, but didn't
make any difference.
Ted,
Here's the query I'm using and the debug info. It's still returning all 5
results back, as if it's simply looking for either of the terms with q.op set
to OR (the default).
http://localhost:8983/solr/autophrase?q=text:seat+cushions&wt=xml&debugQuery=true
Debug
text:seat cushions
text:seat
Also, Shamik:
I believe you need to configure the AutoPhrasingTokenFilterFactory in your
query analyzer for your text_autophrase field type.
JiM
On Fri, Dec 12, 2014 at 2:39 PM, James Strassburg
wrote:
>
> Hello,
>
> I've been using auto-phrasing. I believe it was my company's query to
> LucidW
Sorry I should have specified. These timeouts go inside the
section and apply for inter-shard update requests only. The socket and
connection timeout inside the shardHandlerFactory section apply for
inter-shard search requests.
On Fri, Dec 12, 2014 at 8:38 PM, Peter Keegan
wrote:
> Btw, are the
Hello,
I've been using auto-phrasing. I believe it was my company's query to
LucidWorks that got that initial implementation created.
In working with it I found a few issues and forked the repo and simplified
some code (where I didn't need features) and expanded the testing quite a
bit. I've got m
Okay, that should solve the hung threads on the leader.
When you stop the jetty service, it is a graceful shutdown where
existing requests finish before the searcher thread pool is shut down
completely. An EC2 terminate probably just kills the processes, and the leader
threads just wait due to a lack
Btw, are the following timeouts still supported in solr.xml, and do they
only apply to distributed search?
${socketTimeout:0}
${connTimeout:0}
Thanks,
Peter
On Fri, Dec 12, 2014 at 3:14 PM, Peter Keegan
wrote:
> No, I wasn't aware of these. I will give that a try. If I stop the S
Hi Shamik:
Can you send me a JSON output using debugQuery=true so I can help
troubleshoot this?
As to the question about edismax features - yes I *think* so :) but it would
be great if you could give me some specific examples of queries as I am
currently writing the test cases for this. General
No, I wasn't aware of these. I will give that a try. If I stop the Solr
jetty service manually, things recover fine, but the hang occurs when I
'stop' or 'terminate' the EC2 instance. The Zookeeper leader reports a
15-sec timeout from the stopped node, and expires the session, but the Solr
leader n
Do you mean with inner entity something like
Yes, that I could use. But I would always use the same entity in the
WHERE clause of the sub-entity.
I would like to do something like
Do you have distribUpdateConnTimeout and distribUpdateSoTimeout set to
reasonable values in your solr.xml? These are the timeouts used for
inter-shard update requests.
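In a 4.x-era solr.xml, that would look something like the sketch below (the timeout values are placeholders, not recommendations):

```xml
<solr>
  <solrcloud>
    <!-- timeouts for inter-shard *update* requests -->
    <int name="distribUpdateConnTimeout">60000</int>
    <int name="distribUpdateSoTimeout">600000</int>
  </solrcloud>
  <shardHandlerFactory name="shardHandlerFactory"
                       class="HttpShardHandlerFactory">
    <!-- timeouts for inter-shard *search* requests -->
    <int name="socketTimeout">${socketTimeout:600000}</int>
    <int name="connTimeout">${connTimeout:60000}</int>
  </shardHandlerFactory>
</solr>
```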
On Fri, Dec 12, 2014 at 2:20 PM, Peter Keegan
wrote:
> We are running SolrCloud in AWS and using their auto scaling groups to sp
A couple of options:
1> physically copy the index over
2> (what I prefer) use the ADDREPLICA command from the
Collections API to bring up a new node on the new
machine as a replica of one of your splits. It'll
automatically synchronize, and after it's done you can
shut down the original split.
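The ADDREPLICA call is just an HTTP request to the Collections API. As a sketch, here is how the URL might be assembled; all the names (collection, shard id, node name) are hypothetical placeholders to substitute with your own:

```python
from urllib.parse import urlencode

# Hypothetical names for illustration only -- use your own collection,
# the shard produced by your split, and the new machine's node name.
params = {
    "action": "ADDREPLICA",
    "collection": "mycollection",
    "shard": "shard1_0",
    "node": "newhost:8983_solr",
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```

Issuing that URL (e.g. with curl or a browser) against a live cluster triggers the replica creation and the automatic sync described above.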
Hello,
We have a 2-shard (S1, S2), 2-replica (R1, R2) SolrCloud setup using
version 4.10.2. Each shard and replica resides on its own node (so, a total
of 4 nodes).
As the data increased, we would like to split the shards. So, we are
thinking about creating 4 more nodes (2 for shards (S3, S4)
Anyone ?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Have-anyone-used-Automatic-Phrase-Tokenization-AutoPhrasingTokenFilterFactory-tp4173808p4174069.html
Sent from the Solr - User mailing list archive at Nabble.com.
Sounds like a case for nested entity definitions with the inner entity
being the one that's actually indexed? Just need to remember that all
the parent mapping is also applicable to all children.
Have you tried that?
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr r
Tom,
take note of https://issues.apache.org/jira/browse/SOLR-6559 and
https://issues.apache.org/jira/browse/SOLR-3585. They seem relevant.
On Fri, Dec 12, 2014 at 7:31 PM, Tom Burton-West wrote:
> Thanks everybody for the information.
>
> Shawn, thanks for bringing up the issues around making sure
Hello,
I would like to load an entity before the document import in DIH starts.
I want to use the entity id for a sub-select in the document entity.
Can I achieve something like that?
Thanks for helping me
Per
Thanks everybody for the information.
Shawn, thanks for bringing up the issues around making sure each document
is indexed ok. With our current architecture, that is important for us.
Yonik's clarification about streaming really helped me to understand one of
the main advantages of CUSS:
>>When
Hi everyone,
I am trying to use the Open NLP and Solr integration described here:
https://wiki.apache.org/solr/OpenNLP
I have followed all of the steps under the 'Installation' component of the
wiki (I have completed running the ant test-contrib command); however, I do
not have an 'opennlp' folde
On Fri, Dec 12, 2014 at 5:31 PM, Shawn Heisey wrote:
> Using a database view that does the JOIN on the server side is pretty
> much guaranteed to have far better performance. Database software is
> very good at doing joins efficiently when proper DB indexes are
> available ... the dataimport han
Doug - I believe pf controls the fields that are used for the phrase
queries *generated by the parser*.
What I am after is controlling the fields used for the phrase queries
*supplied by the user* -- ie surrounded by double-quotes.
-Mike
On 12/12/2014 08:53 AM, Doug Turnbull wrote:
Michael,
gzcat may do the job by streaming as it expands.
Another option is to use DataImportHandler and write a custom FileSystem
data source that will do the expansion.
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstar
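The streaming idea is the key point: the compressed file never has to be expanded on disk. A self-contained sketch using Python's gzip module (the in-memory buffer stands in for a `.gz` file on disk):

```python
import csv
import gzip
import io

# Build a small gzipped tab-delimited file in memory so the sketch is
# self-contained; in practice this would be a .gz file on disk.
buf = io.BytesIO()
with gzip.open(buf, "wt", newline="") as f:
    f.write("id\tname\n1\twidget\n")
buf.seek(0)

# gzip.open streams the decompressed bytes as you read -- the file is
# never fully expanded, which is exactly what `gzcat file.gz | ...` does.
with gzip.open(buf, "rt", newline="") as f:
    rows = list(csv.reader(f, delimiter="\t"))

assert rows == [["id", "name"], ["1", "widget"]]
```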
Searching is all about speed, and relevance calculations
can be very expensive. As Shawn says, when you explicitly
specify sort criteria you are, in effect, taking explicit control
of ranking so scores don't need to be calculated and that
expense can be avoided.
If you need score, just specify it
On 12/12/2014 5:16 AM, Tomoko Uchida wrote:
> I cannot find out your table structure and Solr schema,
> but if your requirement is too complex to handle by DIH, you could handle
> it by rich database functionality.
>
> I think creating a database view is good choice...
>
> (Of course, other exper
On 12/12/2014 3:49 AM, Michael Della Bitta wrote:
> I seem to remember being able to do something about errors with the
> handleError method, but I must have had to do it in a custom subclass to
> actually have visibility into what exactly went wrong.
Although it may be possible to override the ha
On 12/11/2014 11:56 PM, Sithik wrote:
> I have a compressed text file (gz) which holds tab delimited data. Is it
> possible for me to index this file directly without doing any pre
> processing of uncompressing the file on my own? if so, can you please tell
> me the steps/config changes I am suppos
On 12/11/2014 9:51 PM, eakarsu wrote:
> I am having difficulty with my sort function. With the following sort,
> documents are not sorted by score, as you can see. Why is the sort function
> not able to sort properly?
I don't know why this is surprising. If you don't use the sort
parameter at all,
We are running SolrCloud in AWS and using their auto scaling groups to spin
up new Solr replicas when CPU utilization exceeds a threshold for a period
of time. All is well until the replicas are terminated when CPU utilization
falls below another threshold. What happens is that index updates sent t
Michael,
I typically solve this problem by using a copyField and running different
analysis on the destination field. Then you could use this field as pf
instead of qf. If I recall, fields in pf must also be mentioned in qf for
this to work.
-Doug
On Fri, Dec 12, 2014 at 8:13 AM, Michael Sokolov
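As a sketch, the copyField-plus-edismax setup described above might look like this (all field and type names here are illustrative, not from the poster's schema):

```xml
<!-- schema.xml: copy the text into a second field that gets its own
     phrase-oriented analysis (e.g. a shingling chain) -->
<field name="text" type="text_general" indexed="true" stored="true"/>
<field name="text_phrase" type="text_shingled" indexed="true" stored="false"/>
<copyField source="text" dest="text_phrase"/>

<!-- solrconfig.xml: edismax defaults; note text_phrase appears in qf
     as well as pf, per the caveat that pf fields must also be in qf -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">text text_phrase</str>
    <str name="pf">text_phrase</str>
  </lst>
</requestHandler>
```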
Yes, I guess it's a common expectation that searches work this way. It
was actually almost trivial to add as an extension to the edismax
parser, and I have what I need now; I opened SOLR-6842; if there's
interest I'll try to find the time to contribute back to Solr
-Mike
On 12/11/14 5:20 PM,
I did not know about LUCENE-3080 (I'll take a look now).
So the discussion seems to be beyond the user mailing list, but thank you
for the information.
Regards,
Tomoko
2014-12-12 21:04 GMT+09:00 Pawel :
>
> Hi again,
> When I removed those lines from DefaultSolrHighlighter and rebuilt Solr it
> seems to
I cannot make out your table structure and Solr schema,
but if your requirement is too complex to handle with DIH, you could handle
it with rich database functionality.
I think creating a database view is a good choice...
(Of course, other experts may have ideas using DIH?)
2014-12-12 20:43 GMT+09:0
Hi again,
When I removed those lines from DefaultSolrHighlighter and rebuilt Solr it
seems to work.
final SchemaField schemaField = schema.getFieldOrNull(fieldName);
if (schemaField != null && ((schemaField.getType() instanceof
org.apache.solr.schema.TrieField)
|| (schemaField.
Hi,
As Mike has pointed out, there is no way to highlight numeric fields.
If you want highlighting, you have to index them as a *text* field. It's not
about Solr, but Lucene.
(Maybe it's possible to "highlight" them in the application layer, in a
server-side program or client-side JavaScript, rather than in Solr?
Yes, the two entities are children of the first one. I've gone through the
link. But what I can get out of the configuration given in that link is that
I could get an array for all the individual fields defined in the
sub-entities. For e.g., if my sub-entity has 3 fields name, id, desc, I'm
getting a list for
Thank you for the config information.
Do the three tables have relations (by foreign key)?
You might want to have one nested tag rather than 3
separate ones.
By using a nested tag, you may be able to merge the tables *before*
importing them to Solr, with all the work done by SQL.
Have you already seen this wiki? If not, ex
Shawn:
I seem to remember being able to do something about errors with the
handleError method, but I must have had to do it in a custom subclass to
actually have visibility into what exactly went wrong.
On Dec 11, 2014 9:28 PM, "Shawn Heisey" wrote:
> On 12/11/2014 9:19 AM, Michael Della Bitta w
Hi,
Thanks for your response. Do you maybe have an idea how to handle integers
(even at a low level, in Lucene) in the highlighter?
--
Paweł
On Fri, Dec 12, 2014 at 12:28 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
>
> So the short answer to your original question is "no." Highlighting i
Thanks for your reply, Tomoko. My data-config file looks like the below.
Each entity represents a table in the DB. Now, if I want to join these three
tables, can I make use of the Solr join functionality?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Join-in-SOLR-
On Wed, Dec 10, 2014 at 10:12 PM, Tom Burton-West
wrote:
> I have very large XML documents, and the examples I see all build documents
> by adding fields in Java code. Is there an example that actually reads XML
> files from the file system?
>
Tom,
What's the possible architecture, can you let S