Attached a patch to the JIRA issue.
Reviews are welcome.
On Thu, Dec 19, 2013 at 7:24 PM, Isaac Hebsh wrote:
> Roman, do you have any results?
>
> created SOLR-5561
>
> Robert, if I'm wrong, you are welcome to close that issue.
>
>
> On Mon, Dec 9, 2013 at 10:50 PM
Roman, do you have any results?
created SOLR-5561
Robert, if I'm wrong, you are welcome to close that issue.
On Mon, Dec 9, 2013 at 10:50 PM, Isaac Hebsh wrote:
> You can see the norm value, in the "explain" text, when setting
> debugQuery=true.
> If the same item ge
created SOLR-5560
On Tue, Dec 10, 2013 at 8:48 AM, William Bell wrote:
> Sounds like a bug.
>
>
> On Mon, Dec 9, 2013 at 1:16 PM, Isaac Hebsh wrote:
>
> > If so, can someone suggest how a query should be escaped (securely and
> > correctly)?
> > Should I esc
my case is field aliasing of edismax.
Consider this request, which is sent to the example configuration:
http://localhost:8983/solr/collection1/select?defType=edismax&q.myalias.qf=text&q=myalias:1234&facet=true&facet.query=myalias:1234
the result is:
undefined field myalias
400
when disabling th
--roman
>
>
> On Mon, Dec 9, 2013 at 2:42 PM, Isaac Hebsh
> >
> wrote:
>
> > Hi Robert and Manuel.
> >
> > The DefaultSimilarity indeed sets discountOverlap to true by default.
> > BUT, the *factory*, aka DefaultSimilarityFactory, when called by
> >
created SOLR-5542.
Anyone else want it?
On Thu, Dec 5, 2013 at 8:55 PM, Isaac Hebsh wrote:
> Hi,
>
> It seems that a facet query does not use the global query parameters (for
> example, field aliasing for edismax parser).
> We have an intensive use of facet queries (in some c
If so, can someone suggest how a query should be escaped (securely and
correctly)?
Should I escape the quote mark (and backslash mark itself) only?
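For illustration, here is a minimal sketch of that idea: escape only the backslash and the double quote before placing the text inside a quoted "v" local param. The helper names (quote_local_param, nested_query) are hypothetical, and the sketch only covers the local-param quoting layer; it assumes the inner query text is already valid for the target parser.

```python
def quote_local_param(value: str) -> str:
    """Wrap a value in double quotes for use as a local-param value.

    Backslashes are escaped first, so the backslashes added for the
    quote marks are not escaped a second time.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def nested_query(parser: str, df: str, query: str) -> str:
    """Build a nested query such as {!lucene df=text v="..."}."""
    return "{!" + parser + " df=" + df + " v=" + quote_local_param(query) + "}"


if __name__ == "__main__":
    print(nested_query("lucene", "text", 'TERM2 TERM3 "TERM4 TERM5"'))
```

The result still has to be URL-encoded before it is sent as a request parameter.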
On Fri, Dec 6, 2013 at 2:59 PM, Isaac Hebsh wrote:
> Obviously, there is the option of external parameter ({...
> v=$nestedq}&a
Hi Robert and Manuel.
The DefaultSimilarity indeed sets discountOverlap to true by default.
BUT, the *factory*, aka DefaultSimilarityFactory, when called by
IndexSchema (the getSimilarity method), explicitly sets this value to the
value of its corresponding class member.
This class member is initi
Obviously, there is the option of external parameter ({...
v=$nestedq}&nestedq=...)
This is a good solution, but it is not practical when there are a lot of such
nested queries.
Any ideas?
On Friday, December 6, 2013, Isaac Hebsh wrote:
> We want to set a LocalParam on a nested quer
We want to set a LocalParam on a nested query. When querying with the "v" inline
parameter, it works fine:
http://localhost:8983/solr/collection1/select?debugQuery=true&defType=lucene&df=id&q=TERM1AND
{!lucene df=text v="TERM2 TERM3 \"TERM4 TERM5\""}
the parsedquery_toString is
+id:TERM1 +(text:term2 t
its
> broken.
>
> On Thu, Dec 5, 2013 at 1:53 PM, Isaac Hebsh wrote:
> > Hi,
> > we implemented a morphologic analyzer, which stems words at index time.
> > For some reason, we index both the original word and the stem (on the
> same
> > position, of course).
>
at 9:48 AM, Ahmet Arslan wrote:
> Hi Isaac,
>
> Did you consider omitting norms completely for that field? omitNorms="true"
> Are you using solr.RemoveDuplicatesTokenFilterFactory?
>
>
>
> On Thursday, December 5, 2013 8:55 PM, Isaac Hebsh
> wrote:
>
> Hi,
Hi,
It seems that a facet query does not use the global query parameters (for
example, field aliasing for edismax parser).
We use facet queries intensively (in some cases, we have a lot of
facet.query parameters for a single q), and using LocalParams for each
facet.query is not convenient.
D
Hi,
we implemented a morphologic analyzer, which stems words at index time.
For some reason, we index both the original word and the stem (at the same
position, of course).
The stemming is done on a specific language, so other languages are not
stemmed at all.
Because of that, two documents with
Hi,
Try using facet.query on each part, you will get the number of total hits
for every OR.
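As a rough sketch of that suggestion, the request parameters could be built like this, with one facet.query per OR clause. The function name and the sample clauses are made up for illustration; only the parameter names (q, facet, facet.query) come from Solr.

```python
from urllib.parse import urlencode


def facet_params(clauses):
    """Build a query string with one facet.query per OR clause.

    A list of (key, value) pairs is used instead of a dict, because
    facet.query has to appear once for every clause.
    """
    params = [("q", " OR ".join(clauses)), ("facet", "true")]
    params += [("facet.query", clause) for clause in clauses]
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical clauses; append the result to .../select?
    print(facet_params(["text:a", "text:b"]))
```

Each facet.query count in the response is then the total number of hits for that clause.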
If you need this info per document, the answers might appear when
specifying debugQuery=true. If that info is useful, try adding
"[explain]" to fl param (probably requires registering the augmenter plugin
e for reading the index, or more
CPUs because the merging process might be more CPU intensive).
Isn't it possible?
On Wed, Oct 2, 2013 at 12:42 AM, Shawn Heisey wrote:
> On 10/1/2013 2:35 PM, Isaac Hebsh wrote:
>
>> Hi Dmitry,
>>
>> I'm trying to examine your s
Hi Dmitry,
I'm trying to examine your suggestion to create a frontend node. It sounds
pretty useful.
I saw that every node in a Solr cluster can serve requests for any collection,
even if it does not hold a core of that collection. Because of that, I
thought that adding a new node to the cluster (ak
Hi Greg, Did you get an answer?
I'm interested in the same question.
More generally, what are the benefits of HdfsDirectoryFactory, besides the
transparent restore of the shard contents in case of a disk failure, and
the ability to rebuild index using MR?
Is the next statement exact? blocks of a p
Hi,
Trying to solve a query performance issue, we suspect the number of index
segments, which might slow the query (due to I/O seeks, which happen for each
term in the query, multiplied by the number of segments).
We are on Solr 4.3 (TieredMergePolicy with mergeFactor of 4).
We can reduce the number of se
ira/browse/SOLR-5053
What would you do?
On Tue, Sep 17, 2013 at 10:31 PM, Isaac Hebsh wrote:
> Hi everyone,
>
> We developed a TokenFilter.
> It should act differently, depending on a parameter supplied in the
> query (for query chain only, not the index one, of course).
>
Hi everyone,
We developed a TokenFilter.
It should act differently, depending on a parameter supplied in the
query (for query chain only, not the index one, of course).
We found no way to pass that parameter into the TokenFilter flow. I guess
that the root cause is because TokenFilter is a pure luce
Thanks Hoss.
1. We currently use Solr 4.3.0.
2. I understand this architecture of LazyFields, but I did not understand
why multiple LazyFields should be created for the multivalued field. You
can't load a part of them. If you request the field, you will get ALL of
its values. so 100 (or more) plac
Hi,
We've investigated a memory dump, which was taken after some frequent OOM
incidents.
The main issue we found was millions of LazyField instances,
taking ~2GB of memory, even though queries request about 10 small fields
only.
We've found that LazyDocument creates a LazyField object fo
Thanks to Ryan Ernst, my issue is a duplicate of SOLR-4449.
I think that this proposal might be very useful (some supporting links are
attached there. worth reading..)
On Tue, Jul 30, 2013 at 11:49 PM, Isaac Hebsh wrote:
> Hi,
> I submitted a new JIRA for this:
> https://issues.apache
know that's where the request is sent out.
>
> I'd think that would be better than changing Solr itself
> since if you found that this was useful you wouldn't
> be patching your Solr release, just keeping your client
> up to date.
>
> Best
> Erick
>
>
olution or not.. :)
On Sun, Jul 28, 2013 at 1:06 AM, Shawn Heisey wrote:
> On 7/27/2013 3:33 PM, Isaac Hebsh wrote:
> > I have about 40 shards. repFactor=2.
> > The cause of slower shards is very interesting, and this is the main
> > approach we took.
> > Note that in e
y to figure out _why_
> you so often have a slow shard and whether the problem could
> be cured with, say, better warming queries on the shards...
>
> Best
> Erick
>
> On Fri, Jul 26, 2013 at 8:23 AM, Isaac Hebsh
> wrote:
> > Hi!
> >
> > When SolrCloud
Hi!
When SolrCloud executes a query, it creates shard requests, which are sent
to one replica of each shard. Total QTime is determined by the slowest
shard response (plus some extra time). [For simplicity, let's assume that
no stored fields are requested.]
I suffer from a situation where in every
Hi,
There was a thread about viewing the Solr Wiki offline about 6 months ago. I'm
interested, too.
It seems that a manual (cron?) dump will do the work...
Would it be too much to ask that one of the admins will manually create
such a dump? (http://moinmo.in/HelpOnMoinCommand/ExportDump)
Otis, is t
.
>
>
> You could try with higher solr versions too. If it does not work, please
> let us know.
>
>
> https://issues.apache.org/jira/secure/attachment/12579832/ComplexPhrase-4.2.1.zip
>
>
>
> ____
> From: Isaac Hebsh
> To: solr-use
terms of whether you wanted
> to use these for production.
>
> I confess I don't know what state they were left in or why they were
> never committed.
>
> FWIW,
> Erick
>
> On Wed, Jun 19, 2013 at 10:08 AM, Isaac Hebsh
> wrote:
> > Hi,
> >
> >
Hi,
I'm trying to understand the status of enabling wildcards in phrase
queries.
Lucene JIRA issue: https://issues.apache.org/jira/browse/LUCENE-1486
Solr JIRA issue: https://issues.apache.org/jira/browse/SOLR-1604
It looks like these issues are not going to be solved in the near future
Hi everyone,
My SolrCloud cluster (4.3.0) went into production a few days ago.
Docs are being indexed into Solr using "/update" requestHandler, as a POST
request, containing text/xml content-type.
The collection is sharded into 36 pieces, each shard has two replicas.
There are 36 nodes (each
n Tue, May 28, 2013 at 7:08 AM, Isaac Hebsh wrote:
> I don't want to affect the (correctness of the) real query parsing, so
> creating a QParserPlugin is risky.
> Instead, if I parse the query in my search component, it will be
> detached from the real query parsin
be at the same place,
> or just above, the wildcard processor
>
> also make sure you are setting your qparser for FQ queries, ie.
> fq="{!nw}foo"
>
>
> On Mon, May 27, 2013 at 5:01 PM, Isaac Hebsh
> wrote:
>
> > Thanks Roman.
> > Based on some
> raise error etc
>
> this way, you are changing semantics - but don't need to touch the syntax
> definition; of course, you may also change the grammar and allow only one
> instance of wildcard (or some combination) but for that you should probably
> use LUCENE-5014
>
Hi.
Searching for terms with a leading wildcard is solved with
ReversedWildcardFilterFactory. But what about terms with a wildcard at both
the start AND the end?
This query is heavy, and I want to disallow such queries from my users.
I'm looking for a way to cause these queries to fail.
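One possible approach, sketched here only as a client-side pre-check and not as a Solr-side solution, is to reject such queries before they reach Solr. The function name is hypothetical, and the tokenization is deliberately naive (whitespace split, optional field prefix); it does not understand phrases, parentheses, or escaped characters.

```python
import re

# A term that both starts and ends with a wildcard character (* or ?).
_BOTH_ENDS = re.compile(r"[*?].*[*?]")


def has_double_ended_wildcard(query: str) -> bool:
    """Return True if any whitespace-separated term has a wildcard at both ends."""
    for token in query.split():
        # Drop an optional field prefix such as "text:".
        term = token.rsplit(":", 1)[-1]
        if _BOTH_ENDS.fullmatch(term):
            return True
    return False


if __name__ == "__main__":
    for q in ["*foo*", "text:*foo*", "foo*", "*:*"]:
        print(q, has_double_ended_wildcard(q))
```

Note that the match-all query *:* is not flagged, because after the field prefix is dropped only a single "*" remains.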
I guess there i
Hi everyone..
I'm indexing docs into Solr using the update request handler, by POSTing
data to the REST endpoint (not SolrJ, not DIH).
My indexer should return an indication of whether the document existed in the
collection before or not, based on its ID.
The obvious solution is to perform a query
p a query in a way that leverages Solr's field
> type analysis settings, but it is a technologically possible technique
> maybe worth considering.
>
> Erik
>
>
>
> On May 16, 2013, at 16:38 , Isaac Hebsh wrote:
>
> Hi,
>>
>> I'm trying to use Surround Qu
Hi,
I'm trying to use Surround Query Parser for two reasons, which are not
covered by proximity slops:
1. find documents with two words within a given distance, *unordered*
2. given two lists of words, find documents with (at least) one word from
list A and (at least) one word from list B, within
Let's say you have machine A and machine B. You want to shut down B.
If all the shards on B have replicas (on A), you can shut down B immediately.
If there is a shard on B that has no replica, you should create one on
machine A (using Core API), let it replicate the whole shard contents, and
then you a
Hi Tim,
Are you running Solr 4.2? (In 4.0 and 4.1, the Collections API didn't
return any failure message. see SOLR-4043 issue).
As far as I know, you can't tell Solr to use authentication credentials
when communicating with other nodes. It's a bigger issue.. for example, if you
want to protect the "/up
Hi,
The example schema.xml in Solr 4.2 does not define "id" field
as docValues=true.
Any good reason? (other than backward compat for index for previous
version...)
If my common case is fl=id (and no other field), DocValues is classic for
me. Am I right?
Hi,
I'm trying to monitor some Solr behaviour, using JMX.
It looks like a great job was done there, but I can't find any
documentation on the MBeans themselves.
For example, DirectUpdateHandler2 attributes. What is the difference
between "adds" and "cumulative_adds"? Does "adds" count the last X se
if exists).
This solution exactly covers my case. Thank you!
On Wed, Feb 20, 2013 at 11:33 PM, Isaac Hebsh wrote:
> Nobody responded to my JIRA issue :(
> Should I commit this patch into SVN's trunk, and set the issue as Resolved?
>
>
> On Sun, Feb 17, 2013 at 9:26 PM, Isaac He
Hi.
I add documents to Solr by POSTing them to the UpdateHandler, as bulks of
commands (DIH is not used).
If one document contains any invalid data (e.g. string data in a numeric
field), Solr returns HTTP 400 Bad Request, and the whole bulk fails.
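A client-side workaround, sketched below, is to split a rejected bulk in half and retry each half until the invalid documents are isolated. This is not a Solr feature; the function name and the send callback (which should POST a batch and return True on success) are assumptions made for illustration. The sketch also assumes that re-sending a document that was already accepted is harmless.

```python
from typing import Callable, List, Sequence, TypeVar

Doc = TypeVar("Doc")


def index_with_bisect(
    docs: Sequence[Doc],
    send: Callable[[Sequence[Doc]], bool],
) -> List[Doc]:
    """Send docs in bulk; on failure, bisect until the bad documents are found.

    `send` is a caller-supplied function (hypothetical here) that posts a
    batch and returns True if the whole batch was accepted.
    Returns the documents that were rejected on their own.
    """
    if not docs or send(docs):
        return []
    if len(docs) == 1:
        return [docs[0]]
    mid = len(docs) // 2
    return index_with_bisect(docs[:mid], send) + index_with_bisect(docs[mid:], send)


if __name__ == "__main__":
    # Stand-in for a real HTTP call: only integer "documents" are valid.
    fake_send = lambda batch: all(isinstance(d, int) for d in batch)
    print(index_with_bisect([1, "x", 3, "y"], fake_send))
```

When invalid documents are rare, most bulks still cost a single request; the extra requests are only paid for bulks that actually fail.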
I'm searching for a way to tell Solr to accept
Nobody responded to my JIRA issue :(
Should I commit this patch into SVN's trunk, and set the issue as Resolved?
On Sun, Feb 17, 2013 at 9:26 PM, Isaac Hebsh wrote:
> Thank you Alex.
> Atomic Update allows you to "add" new values into multivalued field, for
> example... It
Thank you Alex.
Atomic Update allows you to "add" new values into multivalued field, for
example... It means that the original document is being read (using
RealTimeGet, which depends on updateLog).
There is no reason that the list of operations (add/set/inc) will not
include a "create-only" operat
7 AM, Upayavira wrote:
> I think what Walter means is make the thing that sends it to Solr set
> the timestamp when it does so.
>
> Upayavira
>
> On Sat, Feb 16, 2013, at 08:56 PM, Isaac Hebsh wrote:
> > Hi,
> > I do have an externally-created timestamp, but some minut
tem? I think an external
> create timestamp would be a lot more useful.
>
> wunder
>
> On Feb 16, 2013, at 12:37 PM, Isaac Hebsh wrote:
>
> > I opened a JIRA for this improvement request (attached a patch to
> > DistributedUpdateProcessor).
> > It's my firs
I opened a JIRA for this improvement request (attached a patch to
DistributedUpdateProcessor).
It's my first JIRA. please review it...
(Or, if someone has an easier solution, tell us...)
https://issues.apache.org/jira/browse/SOLR-4468
On Fri, Feb 15, 2013 at 8:13 AM, Isaac Hebsh wrote:
them as a MUST clause, like
> +(original query) +id:(1 2 3 4).
>
> Third possibility, see https://issues.apache.org/jira/browse/SOLR-2429,
> but
> the short form is:
> fq={!cache=false}restoffq
>
>
> On Mon, Feb 11, 2013 at 2:41 PM, Isaac Hebsh
> wrote:
>
> > Hi e
Hi everyone.
I have queries that should be bounded to a set of IDs (the uniqueKey field
of my schema).
My client front-end sends two Solr requests:
In the first one, it wants to get the top X IDs. This result should return
very fast. No time to "waste" on highlighting. This is a very standard
query
Shawn, what about 'flush to disk' behaviour on MMapDirectoryFactory?
On Fri, Feb 8, 2013 at 11:12 AM, Prakhar Birla wrote:
> Great explanation Shawn! BTW soft-committed documents will not be
> recovered on JVM crash.
>
> On 8 February 2013 13:27, Shawn Heisey wrote:
>
> > On 2/7/2013 9:29 PM,
Small addition:
To support query, I probably have to implement an analyzer (query time)...
Can an analyzer be configured on a numeric (i.e. non-TEXT) field?
On Thu, Feb 7, 2013 at 6:48 PM, Isaac Hebsh wrote:
> Hi.
>
> I have to index a field which contains an IP address.
> Users want t
2/4/2013 12:06 PM, Isaac Hebsh wrote:
>
>> LBHttpSolrServer is only a SolrJ feature, isn't it?
>>
>> I think that Solr does not balance queries among cores in the same server.
>> You can claim that it's a non-issue, if a single core can completely serve
>
es nothing. I feel that we can achieve some improvement in this
case...
On Mon, Feb 4, 2013 at 12:45 AM, Shawn Heisey wrote:
> On 2/3/2013 3:24 PM, Isaac Hebsh wrote:
>
>> Thanks Shawn for your quick answer.
>>
>> When using collection name, Solr will choose the leader
ithreading works well here, wouldn't utilizing all
the cores be useful?
On Sun, Feb 3, 2013 at 11:49 PM, Shawn Heisey wrote:
> On 2/3/2013 1:18 PM, Isaac Hebsh wrote:
>
>> Hi.
>>
>> I have a SolrCloud cluster, which contains some servers. each server runs
>> multip
nd the boost is pretty
> impressive (roughly 2-5x faster for a complicated query)
>
> Ming
>
>
> On Mon, Jan 28, 2013 at 10:54 AM, Isaac Hebsh
> wrote:
>
> > Does adding replicas (on additional servers) help to improve search
> > performance?
> >
> >
You can define a security filter in WEB-INF\web.xml, on specific url
patterns.
You might want to set the url pattern to "/admin/*".
[find examples here:
http://stackoverflow.com/questions/7920092/how-can-i-bypass-security-filter-in-web-xml
]
On Sun, Jan 27, 2013 at 8:07 PM, Mingfeng Yang wrote:
http://sematext.com/
> On Jan 23, 2013 2:53 PM, "Isaac Hebsh" wrote:
>
> > Hi,
> >
> > In my use case, Solr has to return only the "id" field, as a response
> > for queries. However, it should return 1000 docs at once (rows=1000).
> >
flagpole and try it. Rely on the OS to do its job
> (http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html).
> Find a bottleneck _then_ tune. Premature optimization and all
> that
>
> Several tens of millions of docs isn't that large unless the text
> field
it, so you won't see the new indexed data and caches
> won't be flushed. openSearcher=false makes sense when you are using
> hard-commits together with soft-commits, as the "soft-commit" is dealing
> with opening/closing searchers, you don't need hard commit
ransaction log to
> > assure index integrity. Not to mention that your tlog will be huge.
> > Not to mention that there is some memory usage for each document in
> > the tlog. Hard commits roll over the tlog, flush the in-memory tlog
> > pointers, close index segments, etc.
>