Thanks, Erick, for replying. Actually, I need to configure this for SolrCloud,
so I am confused here.
mcat and cat have different schemas in my case.
With Regards
Aman Tandon
On Tue, May 20, 2014 at 6:05 AM, Erick Erickson wrote:
> You can actually just remove those entries from solr.xml (and a
Thanks, Shawn, for the quick reply.
I am trying to change the code (removing the errors from the code shown in the
image), will test the filter after that, and will post an update here.
Thanks
Kamal Kishore
On Mon, May 19, 2014 at 10:17 PM, Shawn Heisey wrote:
> On 5/19/2014 1:10 AM, Kamal Kishore Aggarwal wrote
I am using the following field type with Solr 4.2 and it's working fine.
But when I upgrade to Solr 4.7.1, it reports the following
errors while posting docs:
Caused by: com.spatial4j.core.exception.InvalidShapeException:
java.lang.NumberFormatException: For input string: "78.42968,30.7333
You may want to review the series of articles about CJK support for
Solr: http://discovery-grindstone.blogspot.ca/ . It probably answers more
questions than you dare to ask.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating yo
SolrCloud configuration contains a single shard and 2 Solr servers, therefore
one acts as a leader and one as a replica.
Through a series of events(*) I've ended up with one Solr server being in
"Active" status and the leader of the shard while the other one is in "Recovery
failed" status which canno
Thanks, Erick!
On Tue, May 20, 2014 at 3:55 AM, Erick Erickson wrote:
> This might be useful:
> http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
>
> Best,
> Erick
>
> On Mon, May 19, 2014 at 12:09 AM, Dmitry Kan wrote:
> > Thanks, Jack, Alex and Shawn.
> >
> > This makes proper
Hi,
I have a special case of grouping multivalued fields and I wonder if
this is possible with SOLR.
I have a field "foo" that is generally multivalued. But for a restricted
set of documents this field has one value or is not present. So normally
grouping should work.
Sadly SOLR is failing
You may want to investigate the group.func option. This would allow you to
plug in your own logic to return the group by key. I don't think there is
an existing function that does exactly what you need so you may have to
write a custom function.
Joel Bernstein
Search Engineer at Heliosearch
On W
Just to re-emphasize the point - when provisioning Solr, you need to ASSURE
that the system has enough system memory so that the Solr index on that
system fits entirely in the OS file system cache. No ifs, ands, or buts. If
you fail to follow that RULE, all bets are off for performance and don't
On 21.05.2014 at 15:07, Joel Bernstein wrote:
You may want to investigate the group.func option. This would allow you to
plug in your own logic to return the group by key. I don't think there is
an existing function that does exactly what you need so you may have to
write a custom function.
I th
I have 2 cores.
One with active data and one with historical data (for documents which were
removed from the active one).
I want to run Distributed Search on both and get the unified result (as
supported by Solr Distributed Search, I'm not using Solr Cloud).
My problem is that the query for each
Hi,
What is the preferred way to do searches with date truncation with respect to a
specific time zone? The dates are stored correctly, i.e. I can see the UTC date
in the index and if I add 1 or 2 hours (depending on daylight saving time or
not) I get the time in our time zone (CET/CEST). But when
Well for CEST, which is 2 hours ahead, I would think you could just do...
datefield:[* TO NOW/MONTH-2HOURS]
That would give you everything up to 2014-04-30 22:00:00 GMT, which is
2014-05-01 00:00:00 CEST.
Always always always store the correct value.
-Michael
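The offset arithmetic in this reply can be checked with a small Python sketch. It is only an illustration: CEST is modeled as a fixed UTC+2 offset (no daylight saving handling), and the sample "now" of 2014-05-21 is an assumed value chosen to match the dates mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CEST modeled as a fixed UTC+2 offset, no DST transitions.
CEST = timezone(timedelta(hours=2), "CEST")

def local_month_start_utc(now_utc, tz):
    """Return the start of the current month in tz, expressed in UTC."""
    local_now = now_utc.astimezone(tz)
    local_start = local_now.replace(day=1, hour=0, minute=0,
                                    second=0, microsecond=0)
    return local_start.astimezone(timezone.utc)

# Assumed sample instant, used only for illustration.
sample_now = datetime(2014, 5, 21, 12, 0, tzinfo=timezone.utc)
boundary = local_month_start_utc(sample_now, CEST)
print(boundary.isoformat())  # 2014-04-30T22:00:00+00:00
```

The printed boundary corresponds to the upper bound that NOW/MONTH-2HOURS yields for that sample instant.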
-Original Message-
From:
OK. Feels a bit strange that one would have to do this manual calculation in
every place that performs searches like this.
Would be much more logical if solr supported specifying the timezone in the
query (with a default setting being possible to configure in solrconfig), and
that solr itself di
On 5/21/2014 7:28 AM, Jack Krupansky wrote:
> Just to re-emphasize the point - when provisioning Solr, you need to
> ASSURE that the system has enough system memory so that the Solr index
> on that system fits entirely in the OS file system cache. No ifs,
> ands, or buts. If you fail to follow that
Two possibly unrelated things:
1. Don't commit until the end.
2. Consider not optimizing at all.
You might want to look at your autocommit settings in your solrconfig.xml.
You probably want soft commits set at something north of 10 seconds, and
hard commits set to openSearcher=false with a maxTi
How much memory have you allocated to the JVMs? Also, what does the
Solr log show on the machine that isn't coming up? Sounds like the
node went down and perhaps went into recovery
And how are you indexing?
Best,
Erick
On Tue, May 20, 2014 at 11:54 PM, Tim Burner wrote:
> Hi Everyone,
>
> I
then don't worry about cores. Use the collections API to create your
collections.
Note: you use the ZkCli script to push configuration files up to ZK
and give them a name, so you can have your "mcat" and "cat"
configurations. Then when you create the collection, you tell it which
set of configurat
What version of Solr?
On Wed, May 21, 2014 at 2:23 AM, zzT wrote:
> SolrCloud configuration contains a single shard and 2 Solr servers, therefore
> one acts as a leader and one as a replica.
>
> Through a series of events(*) I've ended up with one Solr server being in
> "Active" status and the le
I suppose you could, but I _really_ question whether it's a wise
investment in time. Personally I'd treat them as two different
collections and have the app layer fire off two queries and do the
aggregation (this is a variant of "federated search" I think). This
removes your issue with having the c
Hi,
Currently, I'm building my search as follows:
q=(search string ...) AND (type:type_a OR type:type_b OR type:type_c OR ...)
Which means anything I search for will be AND'ed to be in either fields that
have "type_a", "type_b", "type_c", etc. (I have defaultOperator set to "AND")
Now
Is it possible to boost values of the same field? For example, in a query like
this:
category_id:(2271578^0.5 22718986^0.4 475101^0.2)
--
View this message in context:
http://lucene.472066.n3.nabble.com/boosting-multivalued-fields-tp4137409.html
Sent from the Solr - User mailing list archive at
Try the TZ parameter on the query, as blah&TZ=GMT-4
There's a good discussion of why PDT is ambiguous here:
https://issues.apache.org/jira/browse/SOLR-2690.
You can define whatever default parameters you want in your handler in
the defaults section, the TZ parameter included.
Best
Erick
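As a rough sketch of passing TZ with a request, the Python below builds a query string. The field name "datefield" and the zone ID "Europe/Stockholm" are placeholders chosen for illustration, not values from this thread.

```python
from urllib.parse import urlencode

# Assumptions: "datefield" and the time zone ID are illustrative placeholders.
params = {
    "q": "*:*",
    "fq": "datefield:[* TO NOW/MONTH]",
    # TZ changes how date math rounding (e.g. /MONTH) is computed.
    "TZ": "Europe/Stockholm",
}
query_string = urlencode(params)
print(query_string)
```

With TZ set, the /MONTH rounding is done in that zone, so the manual -2HOURS adjustment should no longer be needed.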
On Wed, May 21
Sending 32K docs at a time is a bit of overkill, I usually stay around 1,000.
I agree with Michael, it's rarely a Good Thing to do the commit in
line, except (perhaps) at the very end. Just let the autocommit
settings take care of it for you.
Your log shows numbers of "overlapping ondeck searcher
Great ! This solution worked for me
Jothi Sivathanupillai,
IT Programmer Analyst Principal I
On 5/21/2014 9:26 AM, johnmu...@aol.com wrote:
> Currently, I'm building my search as follows:
>
>
> q=(search string ...) AND (type:type_a OR type:type_b OR type:type_c OR
> ...)
>
>
> Which means anything I search for will be AND'ed to be in either fields that
> have "type_a", "type_b", "ty
How would you characterize the differences that you see when you try
"q=search string ...&fq=type:(type_a OR type_b OR type_c OR ...)"? That does
look like the right way to do it. Is the count of documents different? Are
some documents missing or added? Or is it just the ordering of documents
Unfortunately the same query will be sent to all cores if you use the shards
parameter to query multiple cores.
Is there some characteristic of the first core that is distinct from the
second core so that you could OR the differences between the two?
-- Jack Krupansky
-Original Message--
Thanks, Erick, for the suggestion. Yes, you are right, I am in a bit of a hurry to
implement it. I will invest some time to properly understand the workings of
SolrCloud :)
With Regards
Aman Tandon
On Wed, May 21, 2014 at 8:38 PM, Erick Erickson wrote:
> then don't worry about cores. Use the collections AP
Answering Jack's question first: the result is different, by a few counts, but I
found my problem: I was using the wrong syntax in my code vs. what I posted here:
I was using
q=(search string ...) AND (type:type_a OR type_b OR type_c OR ...)
(see how I left out "type:" from "type_b" and "t
: Try the TZ parameter on the query, as blah&TZ=GMT-4
Docs...
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates
: There's a good discussion of why PDT is ambiguous here:
: https://issues.apache.org/jira/browse/SOLR-2690.
-Hoss
http://www.lucidworks.com/
Hi,
I have a list of 1000 values for some field which is a sort of ID (essentially
unique between documents)
(let's say firstname_lastname).
I need to get the document for each ID (to know which document is for which ID,
not just a list of responses).
Is there some support for multiple queries in sin
Hello,
I am trying to use the collections API to add in a secondary collection
similar to what is described in this article
http://heliosearch.org/solrcloud-assigning-nodes-machines/
It works fine on a jetty setup where bootstrap config dir is specified.
When I try to run this on a tomcat solr that
Chris,
I wasn't able to reproduce the error with a stock install of the
HelioSearch Distribution for Solr (HDS). HDS runs under Tomcat.
Looking at the stack trace this looks like the problem:
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain
timed out: NativeFSLock@/ebs-d
Thanks! I totally forgot to add the word "math" (as in 'solr date math time
zone') when searching for a solution on this, so I never stumbled upon that
JIRA issue. Will give it a try.
/Jimi
> -Original message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent:
Hello,
Luke 4.8.0 has been released. Download it here:
https://github.com/DmitryKey/luke/releases/tag/4.8.0
The release was tested against the solr-4.8.0 index.
This is a trivial update (maven, no code changes), but a required one:
index version changed since the last release.
This version has als
The whole point of a filter query is to hide data but without impacting the
scoring for the non-hidden data. A second goal is performance since the
filter query can be cached.
So, the immediate question for you is whether you really want a true filter
query, or if you actually do what the filt
Joel,
Yes we use that to force the data dir to be someplace other than solr.home.
Removing the data dir field just puts the data in solr home, which isn't
desirable.
solr home is specified in conf/Catalina/localhost/solr.xml
Is there any way to use the collections api with a data.dir set?
Th
Hi Jack,
I'm going after speed per:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefq%28FilterQuery%29Parameter
If ranking will now be different when using "fq", I need to understand why. Even
more, I'm now wondering which ranking is correct th
I think of it this way:
fq: select, do not score
q: select, score
bq/boost: do not select, score
This looks better in a 2x3 table.
wunder
On May 21, 2014, at 2:05 PM, "Jack Krupansky" wrote:
> The whole point of a filter query is to hide data but without impacting the
> scoring for the non-h
As I indicated in my original response, the fq query terms do not
participate in any way in the scoring of documents - they merely filter
(eliminate or keep) documents.
If you actually do want the fq terms to participate in the scoring of
documents, either keep them on the original q query, or
Nothing special for this use case.
This seems to be a use case that I would call "bulk data retrieval - based
on ID".
I would suggest "batching" your requests - limit each request query to, say,
50 or 100 IDs.
-- Jack Krupansky
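A minimal Python sketch of this batching idea follows. The field name "person_id", the ID format, and the batch size of 100 are assumptions for illustration. Each batch becomes one OR query, and the ID field returned with each document tells you which document belongs to which ID.

```python
def batches(ids, size=100):
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def id_query(field, ids):
    """Build field:("id1" OR "id2" ...) with backslashes and quotes escaped."""
    quoted = " OR ".join(
        '"' + i.replace("\\", "\\\\").replace('"', '\\"') + '"' for i in ids
    )
    return f"{field}:({quoted})"

# Hypothetical IDs in a firstname_lastname style.
ids = [f"first{i}_last{i}" for i in range(1000)]
queries = [id_query("person_id", batch) for batch in batches(ids)]
print(len(queries))  # 10
```

Each request would also need rows set to at least the batch size so that every matching document is returned.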
-Original Message-
From: Pavel Belenkovich
Sent: Wed
Yes.
-- Jack Krupansky
-Original Message-
From: vit
Sent: Wednesday, May 21, 2014 11:20 AM
To: solr-user@lucene.apache.org
Subject: boosting multivalued fields
Is it possible to boost values of the same field? For example, in a query
like
this:
category_id:(2271578^0.5 22718986^0.4 4
Interesting!! I did not know that using "fq" means the result will NOT be
scored.
When you say "add a boosting query using the bq parameter" can you give me an
example? I read on "bq" but could not figure out how to convert:
q=(searchstring ...) AND (type:type_a OR type:type_b OR type:ty
What I did to finally get this working was remove the
Catalina/localhost/solr.xml which was allowing the binary to be run from a
different location,
I made sure the solr.solr.home=/ebs-data/solr/conf was set to the file mount
I wanted and I then had
to remove the Dsolr.data.dir=/ebs-data/solr/data
The results will be scored, but only based on terms in q, not terms in fq.
-- Jack Krupansky
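To make the split concrete, here is a small Python sketch that puts the search text in q and the type restriction in fq. The type values are the placeholders used in this thread, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

def build_params(search, types):
    """Hypothetical helper: q carries the scored terms, fq only filters."""
    type_filter = "type:(" + " OR ".join(types) + ")"
    return {"q": search, "fq": type_filter}

params = build_params("search string", ["type_a", "type_b", "type_c"])
print(params["fq"])       # type:(type_a OR type_b OR type_c)
print(urlencode(params))
```

Note that the field name is written once, outside the parentheses, which avoids the mistake of leaving "type:" off individual values.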
-Original Message-
From: johnmu...@aol.com
Sent: Wednesday, May 21, 2014 6:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Using fq as OR
Interesting!! I did not know that using "fq" means
Hi
I am running DIH against one of the shards in my SolrCloud collection. I notice that every
time a commit is done in that shard, all the other shards commit too.
I have checked the source code of DistributedUpdateProcessor.processCommit; it
indicates
that processCommit is extended to all the shards in the collection.
W
You can probably do a custom update request processor chain and skip
the distributed component. No idea of the consequences though.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
On Thu, May 22, 2
Hi,
I have a scenario whereby I apply boosting in the following two cases
- Usual search, by user selection
- Keyword search. I have a field *keyword* that is a copy/combination of many
fields
When the user does the usual query, my boosting works fine; this is how I do the
boosting
/select?q=featured:t
The *fq* is used for searching more deterministic results, something like
WHERE type={}
whereas *q* is something like WHERE type like '%%'
Use *fq* if you are sure of what you're going to search for;
use *q* if you are not sure what you're trying to search for.
If you are using fq and if you do not get any matchin
Just add the boost to the keyword: q=toyota^100.
Or, use the dismax or edismax query parsers and then the boost can be
specified for the field: qf=keyword^100.
-- Jack Krupansky
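The two options can be sketched as request parameters in Python. This is only an illustration of the parameter shapes; the boost value 100 is taken from the message above and would normally need tuning.

```python
from urllib.parse import urlencode

# Option 1: boost the term itself in the main query.
term_boost = {"q": "toyota^100"}

# Option 2: use edismax and boost the field via qf.
field_boost = {"defType": "edismax", "q": "toyota", "qf": "keyword^100"}

print(urlencode(term_boost))   # q=toyota%5E100
print(urlencode(field_boost))
```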
-Original Message-
From: manju16832003
Sent: Thursday, May 22, 2014 12:04 AM
To: solr-user@lucene.apache.
Has anyone had issues with indexing pdf files? Some pdfs are bringing down
Solr completely so that it actually needs to be manually restarted. We are
using Solr 4.4 and thought that upgrading to Solr 4.8 would solve the
problem because the release notes associated with the new tika version and
also
Hi Jack,
Thanks for your help.
I do not want to boost the *keyword* field. I apply full-text search on the keyword
field and boost based on another field, *featured*.
Also, the qf field allows us to boost the field without values. I would like to
boost with a value.
Ex: qf=featured:true^100 - I don't think this
Run Tika in a client instead (or as a standalone server listening over a
TCP socket)? Ship only the extractions to Solr. This is more efficient as
well.
I suspect, there would always be PDFs that cause strange behaviour,
even if just based on memory requirements (e.g. embedded images). If
that becomes a
Yeah, PDF extraction has always been at least somewhat problematic. It has
improved over the years, but still not likely to be perfect.
That said, I'm not aware of any specific PDF extraction issue that would
bring down Solr - as opposed to causing a 500 status with an exception in
PDF extract
Yes, there is.
But since the real query is very long and complex per core, I don't want each
core to work very hard on irrelevant query parts of other cores.
Perhaps I can write some query plugin which will strip the unnecessary parts on
each core?
Thanks,
Avner
-Original Message-
Fro
Your original message had "q=toyota featured:true^100" and also using bq -
both are valid. If either is not working for you, please be specific about
what exactly is not behaving as you expected - what the symptom is.
Sometimes you have to experiment with the boost factor.
-- Jack Krupansky
-
I believe unifying multiple query results including facets, paging, sorts and
other extra features on my own in the application is complex as well.
Is there some Solr code I can use at the application level to unify multiple
results? (This could actually be an interesting direction.)
The queries wer