Sure, I think it's fine to raise a JIRA, especially if you can include
a patch, even a preliminary one to solicit feedback... which I'll
leave to people who are more familiar with that code...
I'm not sure how generally useful this would be, and if it comes
at a cost to normal searching there's su
That should be the case.
Best,
Erick
On Tue, Aug 25, 2015 at 8:55 PM, Modassar Ather wrote:
> Thanks Erick,
>
> I saw the link. So is it that the grouping functionality works fine in
> distributed search except the two cases mentioned in the link?
>
> Regards,
> Modassar
>
> On Tue, Aug 25,
Hello,
I indexed 2 million documents, and after completing the indexing I
tried searching. It throws an IOException and a Connection Timeout error.
"error":{
"msg":"org.apache.solr.client.solrj.SolrServerException:
IOException occured when talking to server at:
http://192.168.1.25:8983/so
On Wed, 2015-08-26 at 10:10 +0800, Zheng Lin Edwin Yeo wrote:
> I'm currently trying out the Carrot2 Workbench, getting it to call Solr
> to see how they did the clustering. Although it still takes some time to do
> the clustering, the results of the clustering are much better than mine. I
> thin
Thanks Erick,
I saw the link. So is it that the grouping functionality works fine in
distributed search except the two cases mentioned in the link?
Regards,
Modassar
On Tue, Aug 25, 2015 at 10:40 PM, Erick Erickson
wrote:
> That's not really the case. Perhaps you're confusing
> group.ngroups a
Can you elaborate a bit more on the setup, what do the custom plugins do,
what error do you get ? It seems like a classloader/classpath issue to me
which doesn't really relate to Shard splitting.
On Tue, Aug 25, 2015 at 7:59 PM, Jeff Courtade
wrote:
> I am getting failures when trying to split
I am getting failures when trying to split shards on Solr 4.2.7 with
custom plugins.
It fails regularly; it cannot find the jar files for the plugins when
creating the new cores/shards.
Ideas?
--
Thanks,
Jeff Courtade
M: 240.507.6116
Looks like I have something basic working for Trie fields. I am doing
exactly what I said in my previous email, so good news there. I think this
is a big step as there are only a few field types left that I need to
support, those being date (should be similar to Trie) and Spatial fields,
which at
Hi Toke,
Thank you for your reply.
I'm currently trying out the Carrot2 Workbench, getting it to call Solr
to see how they did the clustering. Although it still takes some time to do
the clustering, the results of the clustering are much better than mine. I
think it's probably due to the differe
When using the new Suggester component (with AnalyzingInfixSuggester) in
Solr trunk with solrj, the suggest.build command seems to be executed only
on one of the solr cloud nodes.
I had to add shards.qt=/suggest and
shards=host1:port2/solr/mycollection,host2:port2/solr/mycollection... to
distribut
: updates? i can't do this because i have delta-import queries which also
: should be able to assign uuid when it is needed
You really need to give us a full and complete picture of what exactly you
are currently doing, what's working, what's not working, and when it's not
working what is it
Right, I had assumed (obviously here is my problem) that I'd be able to
specify payloads for the field regardless of the field type. Looking at
TrieField that is certainly non-trivial. After a bit of digging it appears
that if I wanted to do something here I'd need to build a new TrieField,
overr
Well, you're going down a path that hasn't been trodden before ;).
If you can treat your primitive types as text types you might get
some traction, but that makes a lot of operations like numeric
comparison difficult.
Hmmm, another idea from left field. For single-valued types,
what about a side
Hmmm, this sounds like a nonsensical question, but "what do you mean
by arbitrary substring"?
Because if your substrings consist of whole _tokens_, then ngramming
is totally unnecessary (and gets in the way). Phrase queries with no slop
fulfill this requirement.
But let's assume you need to match
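For the ngram route, a field type might be sketched roughly like this. This is a hypothetical sketch, not from the thread: KeywordTokenizer keeps each line whole, including whitespace and parentheses, with no stemming; the grams index every short substring.

```xml
<!-- Hypothetical sketch (not from the thread): a field type for grep-like
     substring matching. KeywordTokenizer keeps the whole line intact;
     NGramFilter indexes every substring of length 3..10. Queries no longer
     than maxGramSize match directly; longer ones would have to be rewritten
     by the client as a phrase of their ngrams. -->
<fieldType name="substring_text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="10"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The field type name and gram sizes are assumptions; as noted above, whole-token substrings need no ngrams at all, only phrase queries with no slop.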
Hi
I'm trying to build an index for technical documents that basically
works like "grep", i.e. the user gives an arbitrary substring somewhere
in a line of a document and the exact matches will be returned. I
specifically want no stemming etc. and keep all whitespace, parentheses
etc. because they
On Tue, Aug 25, 2015 at 5:02 PM, O. Klein wrote:
> I'm trying to find the best way to search for stores that are open NOW.
It's probably not the *best* way, but assuming it's currently 4:10pm,
you could do
+open:[* TO 1610] +close:[1610 TO *]
And to account for days of the week have different f
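The 4:10pm example above can be sketched as a small helper that derives the HHMM integer from the current time. This is a sketch only: the integer `open`/`close` fields are the ones assumed in the example, and day-of-week handling (which the truncated message goes on to discuss) is omitted.

```python
from datetime import datetime

def open_now_fq(now=None):
    """Build a Solr filter query for stores open at the given time.

    Assumes integer fields `open` and `close` storing times as HHMM
    (4:10pm -> 1610), as in the example above.
    """
    now = now or datetime.now()
    hhmm = now.hour * 100 + now.minute
    return "+open:[* TO {0}] +close:[{0} TO *]".format(hhmm)

# At 4:10pm this reproduces the query from the example:
print(open_now_fq(datetime(2015, 8, 25, 16, 10)))
# +open:[* TO 1610] +close:[1610 TO *]
```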
We were originally using this approach, i.e. run things through the
KeywordTokenizer -> DelimitedPayloadFilter -> WordDelimiterFilter. Again
this works fine for text, though I had wanted to use the StandardTokenizer
in the chain. Is there an equivalent filter that does what the
StandardTokenizer
Have you seen:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3c1354991310424-4025359.p...@n3.nabble.com%3E
https://wiki.apache.org/solr/SpatialForTimeDurations
https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/
Regards,
Alex.
Solr Analyzers
I'm trying to find the best way to search for stores that are open NOW.
I have day of week, open and closing times.
I've seen some examples, but not an exact fit.
What is the best way to tackle this?
Thank you for any suggestions you have to offer.
--
View this message in context:
http
> So, usually, the middleware is the answer, just like with a database.
With applications backed by database systems, there is usually an application
server tier, and then a database tier. There may be a web server tier in
front of the application server tier. The search engine and database
Thanks Hoss! It's obvious what the problem(s) are when you lay it all out
that way.
On Tue, Aug 25, 2015 at 12:14 PM, Chris Hostetter
wrote:
>
> 1) The "terms" Query Parser (TermsQParser) has nothing to do with the
> "TermsComponent" (the first is for querying many distinct terms, the
> latter is
The Lucene PMC is pleased to announce the release of the Solr Reference
Guide for Solr 5.3.
This 577 page PDF is the definitive guide for using Apache Solr and can be
downloaded from:
https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
If you have
Well, if I remember correctly (I have no testing facility at hand)
WordDelimiterFilter maintains payloads on emitted sub-terms. So if you use a
KeywordTokenizer, input 'some text^PAYLOAD', and have a DelimitedPayloadFilter,
the entire string gets a payload. You can then split that string up agai
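The chain described above might look like this in schema.xml. This is a hedged sketch; the field type name and the `^` delimiter are assumptions, not from the thread.

```xml
<!-- Sketch only: KeywordTokenizer keeps the whole input as one token, the
     DelimitedPayloadTokenFilter attaches everything after '^' as a payload
     (identity encoder stores the bytes as-is), and WordDelimiterFilter then
     splits the token into sub-terms, which keep the payload. -->
<fieldType name="payloaded_text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory"
            delimiter="^" encoder="identity"/>
    <filter class="solr.WordDelimiterFilterFactory"/>
  </analyzer>
</fieldType>
```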
1) The "terms" Query Parser (TermsQParser) has nothing to do with the
"TermsComponent" (the first is for querying many distinct terms, the
latter is for requesting info about low-level terms in your index)
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParse
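As a minimal illustration of the distinction (request fragments only; the field names `id` and `name` are hypothetical):

```python
# "terms" query parser: an actual query, matching documents whose `id`
# equals any of the listed values.
terms_qparser_query = "q={!terms f=id}doc1,doc2,doc3"

# TermsComponent: a request for information about the raw indexed terms
# in a field; it does not select documents at all.
terms_component_request = "/terms?terms.fl=name&terms.prefix=doc"

print(terms_qparser_query)
print(terms_component_request)
```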
Oh My. What fun!
bq: I need a way to specify the payload on the other field types
Not to my knowledge. The payload mechanism is built on
the capability of having a filter in the analysis chain. And there's
no analysis chain with primitive types (string, numeric and the like).
Hmmm. Totally off t
Hi,
We've encountered a strange situation, I'm hoping someone might be able to
shed some light. We're using Solr 4.9 deployed in Tomcat 7.
We build a query that has these params:
'params'=>{
'fl'=>'id',
'sort'=>'system_create_dtsi asc',
'indent'=>'true',
'start'=>'0',
Hello,
I'm trying to guess what you are doing. It's not clear so far.
I found http://stackoverflow.com/questions/11951695/embedded-solr-dih
My conclusion: if you play with DIH and EmbeddedSolrServer, you'd better
avoid the third beast; you don't need to bother with tests.
I guess that main() is
To be clear, we are using payloads as a way to attach authorizations to
individual tokens within Solr. The payloads are normal Solr payloads,
though we are not using floats; we are using the identity payload encoder
(org.apache.lucene.analysis.payloads.IdentityEncoder), which allows for
storing a by
What you want to do is basically named entity recognition. We have a quite
similar use case (medical/scientific documents, need to look for disease
names /drug names /MeSH terms, etc).
Take a look at David Smiley's Solr Text Tagger (
https://github.com/OpenSextant/SolrTextTagger ) which we've been
That's not really the case. Perhaps you're confusing
group.ngroups and group.facet with just grouping?
See the ref guide:
https://cwiki.apache.org/confluence/display/solr/Result+Grouping#ResultGrouping-DistributedResultGroupingCaveats
Best,
Erick
On Tue, Aug 25, 2015 at 4:51 AM, Modassar Ather
This really sounds like an XY problem. Or when you use
"payload" it's not the Solr payload.
So Solr Payloads are a float value that you can attach to
individual terms to influence the scoring. Attaching the
_same_ payload to all terms in a field is much the same
thing as boosting on any matches in
I would like to specify a particular payload for all tokens emitted from a
tokenizer, but don't see a clear way to do this. Ideally I could specify
that something like the DelimitedPayloadTokenFilter be run on the entire
field and then standard analysis be done on the rest of the field, so in
the
Hey Guys,
I wanted to create a simple, attribute-based food recommender with Solr.
The User makes his choice concerning ingredients, cooking time, difficulty
and so on.
It is based on a SQL database where the recipes are stored.
So, for example the user likes tomatoes, then the recipes with toma
You could also look at an integrated product such as DataStax Enterprise
which fully integrates the Cassandra database and Solr - you execute your
database transactions in Cassandra and then DSE Search automatically
indexes the data in the embedded version of Solr.
See:
http://www.datastax.com/pro
Zach,
As an alternative to 'copyField', you might want to consider the
CloneFieldUpdateProcessorFactory:
http://lucene.apache.org/solr/5_0_0/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
It supports specification of field names with regular expressions,
exclusion
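A configuration sketch of the regex form (the field names here are hypothetical examples, following the pattern in the linked javadoc):

```xml
<!-- Sketch: copy every incoming field matching *_txt into text_all at
     update time. Field names are hypothetical examples. -->
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <lst name="source">
    <str name="fieldRegex">.*_txt</str>
  </lst>
  <str name="dest">text_all</str>
</processor>
```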
We use CAS as well, and are also not using ZooKeeper/SolrCloud. We may move
to SolrCloud after getting our current very-basic setup into production.
We'll definitely take a look at the rule-based authorization plugin and see how
we can leverage that.
-Original Message-
From: LeZotte, T
On 8/25/2015 3:18 AM, Modassar Ather wrote:
> Kindly help me understand the query time allowed attribute. The following
> is set in solrconfig.xml.
> 30
>
> Does this setting stop the query from running after the timeAllowed is
> reached? If not is there a way to stop it as it will occupy resou
On Tue, Aug 25, 2015, at 01:21 PM, Simer P wrote:
> http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
> .
>
> *Question:* How can I get "guarantee commits" with Apache SOLR where
> persisting data to disk and visibility are both equally imp
UUIDUpdateProcessorFactory - "An update processor that adds a newly
generated UUID value to any document being added that does not already have
a value in the specified field."
See:
http://lucene.apache.org/solr/5_2_1/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html
-- J
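Wired into an update chain in solrconfig.xml, this might look roughly like the following. A sketch only: the chain name and the target field `id` are assumptions, not from the thread.

```xml
<updateRequestProcessorChain name="uuid" default="true">
  <!-- Generates a UUID only when the document has no value in `id` -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```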
The standard answer is that exposing the API is a REALLY bad idea. To
start with, you can issue delete commands through the API. And
they can be escaped in multiple different ways.
Plus, you have the admin UI there as well, to manipulate the cores and
to see the configuration files for them.
Is there an example of integration testing with EmbeddedSolrServer that loads
data from a DataImportHandler and then queries the data? I've tried doing this
based on
org.apache.solr.client.solrj.embedded.TestEmbeddedSolrServerConstructors,
but no data is being imported. Here is the test class iv
Hello,
I plan to expose Solr search REST API to the world, so it can be called
from my web page directly, without additional server layer. I'm
concerned about bots, so I plan to add CAPTCHA to my page. Surely, I'd
like to do it with as little effort as possible. Does Solr provide
CAPTCHA support o
Max,
If you know the entire list of words you want to spellcheck against, you can
use FileBasedSpellChecker. See
http://wiki.apache.org/solr/FileBasedSpellChecker .
If, however, you have a field you want to spellcheck against but also want
additional words added, consider using a copy of the
http://stackoverflow.com/questions/32138845/what-is-the-best-approach-to-guarantee-commits-in-apache-solr
.
*Question:* How can I get "guarantee commits" with Apache SOLR where
persisting data to disk and visibility are both equally important ?
*Background:* We have a website which requires high
Hi Erick,
Up to now, all the tests were based on randomly generated requests.
In reality, many requests will get executed more than twice since this is to
support the advertising project. On the other hand, new queries could be
generated daily. So some of the filter queries will be used frequent
I am honestly not familiar enough to say. Best to try it
On Aug 25, 2015 7:59 AM, "CrazyDiamond" wrote:
> It sounds like you need to control when the uuid is and is not created,
> just feels like you'd get better mileage doing this outside of solr
> Can I simply insert a condition(blank or not )
It sounds like you need to control when the uuid is and is not created,
just feels like you'd get better mileage doing this outside of solr
Can I simply insert a condition (blank or not) in the uuid update-chain?
Hi,
As per my understanding, to group on a field all documents with the same
value in the field have to be in the same shard.
Can we group by a field where the documents with the same value in that
field will be distributed across shards?
Please let me know what are the limitations, feature not a
It sounds like you need to control when the uuid is and is not created,
just feels like you'd get better mileage doing this outside of solr
On Aug 25, 2015 7:49 AM, "CrazyDiamond" wrote:
> Why not generate the uuid client side on the initial save and reuse this on
> updates? i can't do this beca
Why not generate the uuid client side on the initial save and reuse this on
updates? I can't do this because I have delta-import queries which also
should be able to assign the uuid when it is needed
Thanks for your response Jonathon.
Please correct me if I am wrong in following points.
-query actually ceases to run once time allowed is reached and releases
all the resources.
-query expansion is stopped and the query is terminated from execution
releasing all the resources.
Thanks,
Moda
Why not generate the uuid client side on the initial save and reuse this on
updates?
On Aug 25, 2015 4:22 AM, "CrazyDiamond" wrote:
> I have a uuid field. It is not set as unique, but nevertheless I want it
> not to be changed every time I call /update. It might be because I added
> a request
I had seen this as well. If I overrode this by extending
SolrIndexSearcher, how do I have my extension used? I didn't see a way that
could be plugged in.
On Aug 25, 2015 7:15 AM, "Mikhail Khludnev"
wrote:
> On Tue, Aug 25, 2015 at 2:03 PM, Jamie Johnson wrote:
>
> > Thanks Mikhail. If I'm rea
On Tue, Aug 25, 2015 at 2:03 PM, Jamie Johnson wrote:
> Thanks Mikhail. If I'm reading the SimpleFacets class correctly, it
> delegates to DocValuesFacets when the facet method is FC, what used to be
> FieldCache I believe. DocValuesFacets either uses DocValues or builds them
> using the Uninverti
timeAllowed applies to the time taken by the collector in each shard
(TimeLimitingCollector). Once timeAllowed is exceeded the collector terminates
early, returning any partial results it has and freeing the resources it was
using.
From Solr 5.0 timeAllowed also applies to the query expansion ph
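For reference, timeAllowed is set in milliseconds in the handler defaults, roughly like this (a sketch; 30000 is an assumed example value, not the one from the thread, whose markup was stripped in transit):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- abort collection after 30 seconds, returning partial results -->
    <int name="timeAllowed">30000</int>
  </lst>
</requestHandler>
```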
Thanks Mikhail. If I'm reading the SimpleFacets class correctly, it
delegates to DocValuesFacets when the facet method is FC, what used to be
FieldCache I believe. DocValuesFacets either uses DocValues or builds them
using the UninvertingReader.
I am not seeing a clean extension point to add a cust
Hi,
Kindly help me understand the query time allowed attribute. The following
is set in solrconfig.xml.
30
Does this setting stop the query from running after the timeAllowed is
reached? If not is there a way to stop it as it will occupy resources in
background for no benefit.
Thanks,
Modass
I have a uuid field. It is not set as unique, but nevertheless I want it not
to be changed every time I call /update. It might be because I added a
request handler with the name "/update" which contains the uuid update chain.
But if I don't do this I have no uuid at all. Maybe I can configure the uuid
update-chain
On Tue, 2015-08-25 at 10:40 +0800, Zheng Lin Edwin Yeo wrote:
> Would like to confirm, when I set rows=100, does it mean that it only build
> the cluster based on the first 100 records that are returned by the search,
> and if I have 1000 records that matches the search, all the remaining 900
> rec