Hi. We have run into an interesting situation when searching for words that are
within double-quotes in our documents. For example, when we enter the following
search: promulgation AND peace
The document in question has this text exactly (with the double quotes): "The
Promulgation of Universal
The pattern you are using in the PatternTokenizerFactory does not
contain double quotes, so indexing the text "The Promulgation of
Universal Peace" will result in the following tokens: "The /
Promulgation / of / Universal / Peace" (the quote characters stay attached
to the first and last tokens). That's why the query term Peace will not
match the indexed token Peace".
On 02/26/2013 08:0
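One way to avoid this is to include the double quote in the tokenizer's
split pattern. A minimal sketch (the field type name and the exact pattern
are illustrative; keep whatever other delimiters you already split on):

    <fieldType name="text_pattern" class="solr.TextField">
      <analyzer>
        <!-- split on runs of whitespace or double quotes, so the quote
             characters never end up inside tokens -->
        <tokenizer class="solr.PatternTokenizerFactory" pattern="[\s&quot;]+"/>
      </analyzer>
    </fieldType>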
Hi,
Thanks, that seems to be the quickest way. But I did not get the part about
building a DisjunctionMaxQuery from the clauses. I would need to keep it as a
BooleanQuery, wouldn't I, and compare the weights from each clause and nullify
all but the max weight clause?
--
Jan Høydahl, search solution architect
Good suggestion. That would work for simple queries, but it won't work in this
case, because the score is deep down in a complex query tree and should not
always be applied at the root level, only if other boolean conditions nearby match.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Hi, the full stack trace is below.
-
SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:794)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:607)
at
Could you check the virtual memory limit (ulimit -v; check this for the
operating system user that runs Solr)?
It should report "unlimited".
André
From: zqzuk [ziqizh...@hotmail.co.uk]
Sent: Tuesday, 26 February 2013 13:22
To: solr-user@lucene.apache
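For example (assuming the Solr process runs as a user named solruser):

    $ su - solruser -c "ulimit -v"
    unlimited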
Still problems running from source, see
https://github.com/jmlucjav/vifun/issues/27
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
On 26 Feb 2013, at 09:42, jmlucjav wrote:
> To anyone that has tested this and is having an error:
>
So, I have had great success as a Solr user so far; I love its ease of
setup. However, I have run into this problem (and this may be more
of a Tomcat error; if so, please point me in that direction).
My Tomcat application containers have an older Lucene library in the
common library "lib" folder (
I think this can be achieved by boosting the fields and then sorting by
the score.
http://wiki.apache.org/solr/SolrRelevancyFAQ#Field_Based_Boosting
On 02/26/2013 01:55 PM, David Philip wrote:
Hi Team,
Is it possible to get search results in the order of the field names set?
Ex: say,
Check out edismax (http://wiki.apache.org/solr/ExtendedDisMax)
q="John Hopkins"&defType=edismax&qf=Author^1000 Editors^500 Raw_text^1
It's not strictly layered, but by playing with the numbers you can achieve that
effect
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Spinning the discussion into a separate thread.
What is the actual business need here (and in the elasticshell project)? Do
we want users to have something good for easy examples? Or do we want
some sort of command-line REPL tool running against a Solr instance that may
not even have admin handl
Thanks Amit, that's cool! So it will also be fixed on Solr 4.2, right?
On Mon, Feb 25, 2013 at 6:04 PM, Amit Nithian wrote:
> Yeah I had a similar problem. I filed and submitted this patch:
> https://issues.apache.org/jira/browse/SOLR-4310
>
> Let me know if this is what you are looking for!
> A
Hi,
Thank you for the references. I used edismax and it works. Thanks a lot.
David
On Tue, Feb 26, 2013 at 7:33 PM, Jan Høydahl wrote:
> Check out edismax (http://wiki.apache.org/solr/ExtendedDisMax)
>
> q="John Hopkins"&defType=edismax&qf=Author^1000 Editors^500 Raw_text^1
>
> It's not stric
My feeling so far is that there are lots of Solr users with fairly simple
and small-scale setups where a fully distributed SolrCloud system is not
actually needed. I'm sure things are constantly shifting, and in 12 months
the same poll will yield more SolrCloud yes answers.
Otis
--
Solr & ElasticSearch
hi, liwei
I think you'd better ask questions in English, or most people here may not
understand what you ask.
I'm confused by the class
cn.antvision.eagleattack.nest.analyzer.CIKTokenizerFactory. What does it do,
exactly? What extra functions do you add in this class?
If you are using IKAnalyzer's de
Hi,
Is there a way to drop slow queries in distributed search?
In other words, is there a way to tell SolrCloud to wait x ms for the
responses from the shards in the cloud and to return the results that were
returned during the specified period of time (x ms)?
For example x = 10 ms. There are 4 s
Hi Colin,
I think a filter is definitely the way to go. Moreover, you should
look into Solr's PostFilter concept which is intended to work with
"expensive" filters. Have a look at Yonik's blog post on this topic:
http://yonik.com/posts/advanced-filter-caching-in-solr/
Cheers,
Tim
On Tue, Feb 26,
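A rough sketch of the shape of a PostFilter (class and method names are
illustrative; see Yonik's post for the real details). The filter only runs
after the main query and cheaper filters when its cost is >= 100 and
caching is off:

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.solr.search.DelegatingCollector;
    import org.apache.solr.search.ExtendedQueryBase;
    import org.apache.solr.search.PostFilter;

    public class ExpensiveFilter extends ExtendedQueryBase implements PostFilter {

        public ExpensiveFilter() {
            setCache(false); // post filters must not be cached
            setCost(100);    // cost >= 100 makes Solr run this as a post filter
        }

        @Override
        public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
            return new DelegatingCollector() {
                @Override
                public void collect(int doc) throws IOException {
                    if (isAllowed(doc)) {   // hypothetical expensive per-doc check
                        super.collect(doc); // pass the doc down the chain
                    }
                }
            };
        }

        private boolean isAllowed(int doc) {
            return true; // placeholder for the expensive logic
        }
    }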
Hi,
I'm wondering if it's possible to use a default field on a filter query?
This is what I'm doing (shortened example in SolrJ):
String qry = "{!qJoin}" + searchText;
sq.setQuery(qry);
sq.setParam("df", "pageTxt");
sq.addFilterQuery("{!join fromInd
I'm exploring a switch from Lucene to Solr in a Java EE webapp.
We have a method called getHitIds() that accepts as a parameter a
Lucene "Query" object:
http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/api/core/org/apache/lucene/search/Query.html
My IDE is telling me I can't simpl
Hi Phil,
It seems like you're treating Solr like a library and not a service.
Solr is a service that abstracts away all the lower-level Lucene work
like working with hits. If you want to work with hits, then you're
better off staying with Lucene.
So yes, you're going to need to change your client app
Hey Carlos,
What version of Solr are you running and what version of openxml4j did you
import?
Swati
-Original Message-
From: Carlos Alexandro Becker [mailto:caarl...@gmail.com]
Sent: Tuesday, February 26, 2013 12:04 PM
To: solr-user
Subject: Re: POI error while extracting docx documen
4.0.0 and 1.0-beta
On Tue, Feb 26, 2013 at 2:12 PM, Swati Swoboda
wrote:
> Hey Carlos,
>
> What version of Solr are you running and what version of openxml4j did you
> import?
>
> Swati
>
> -Original Message-
> From: Carlos Alexandro Becker [mailto:caarl...@gmail.com]
> Sent: Tuesday, Fe
Thanks, Tim, you've confirmed what I've suspected.
I appreciate the reality check. :)
Phil
On Tue, Feb 26, 2013 at 11:49 AM, Timothy Potter wrote:
> Hi Phil,
>
> It seems like you're treating Solr like a library and not a service.
> Solr is a service that abstracts away all the lower-level Luce
We want to treat all optional query terms as required. I thought setting
q.op=AND and mm=100% (using the edismax request handler) would achieve this
result, but I get unexpected results:
1) medical => 4,425
2) medical retired => 272
3) medical retired -working => 5,041
4) medical AND retired AND -wor
Hi
sorry, I couldn't do this directly... the way I do this is by subscribing to a
cluster of computers in our organisation and sending the job with the required
memory. It gets randomly allocated to a node (one single server in the
cluster) once executed, and it is not possible to connect to that specific
node
Any ideas?
On Tue, Feb 26, 2013 at 2:15 PM, Carlos Alexandro Becker wrote:
> 4.0.0 and 1.0-beta
>
>
> On Tue, Feb 26, 2013 at 2:12 PM, Swati Swoboda wrote:
>
>> Hey Carlos,
>>
>> What version of Solr are you running and what version of openxml4j did
>> you import?
>>
>> Swati
>>
>> -Orig
It really should be unlimited: this setting has nothing to do with how
much RAM is on the computer.
See http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Mike McCandless
http://blog.mikemccandless.com
On Tue, Feb 26, 2013 at 12:18 PM, zqzuk wrote:
> Hi
> sorry, I couldn't d
I've composed a stackoverflow question also..
On Tue, Feb 26, 2013 at 2:23 PM, Carlos Alexandro Becker wrote:
> Any ideas?
>
>
> On Tue, Feb 26, 2013 at 2:15 PM, Carlos Alexandro Becker <
> caarl...@gmail.com> wrote:
>
>> 4.0.0 and 1.0-beta
>>
>>
>> On Tue, Feb 26, 2013 at 2:12 PM, Swati Swobo
sorry:
http://stackoverflow.com/questions/15095202/extracting-docx-files-with-tika-in-apache-solr-gives-nosuchmethod-error
On Tue, Feb 26, 2013 at 2:40 PM, Carlos Alexandro Becker wrote:
> I've composed a stackoverflow question also..
>
>
>
> On Tue, Feb 26, 2013 at 2:23 PM, Carlos Alexandro Be
Hi Clint,
Nice to see you on this list!
What about treating each article as the indexed unit (i.e. each
article is a document) with structure:
articleID
publishDate
source
company_name
company_desc
contents
Then you can do grouping by company_name field.
I happen to know you're very familiar w
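For example, a grouped query against such a schema could look like this
(host, port and query value are illustrative; group=true, group.field and
group.limit are the standard result-grouping parameters):

    http://localhost:8983/solr/select?q=contents:acquisition&group=true&group.field=company_name&group.limit=5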
Tim, thanks for the response. I definitely owe you a beer next time you're
in Austin.
I hadn't thought of your approach of turning things around. But, I don't
think it will work because of some stuff I left out in my original email.
First, the relationship between Company and Article is many-to-ma
Hi.
I add documents to Solr by POSTing them to the UpdateHandler as bulk
commands (DIH is not used).
If one document contains any invalid data (e.g. string data in a numeric
field), Solr returns HTTP 400 Bad Request, and the whole bulk fails.
I'm searching for a way to tell Solr to accept
I have a field (non-tokenized) that I want to search and then highlight on:
Schema-
Query-
q=creates%20a%20new%20layer
qf=text_exact
fl=*
defType=edismax
hl=true
hl.fl=text_exact
hl.simple.pre=BBB
hl.simple.post=EEE
hl.fragsize=10
hl.snippets=1000
The highlighting comes back as tokenized,
Ok - I suspected grouping by company_name was too obvious here ;-)
A couple of tricks to think about (not claiming any of these will help) are:
1) Document transformer - you can return any company fields you need
in the response from a database lookup using a Document transformer.
This lets you a
Here's what I do to work around failures when processing batches of updates:
On client side, catch the exception that the batch failed. In the
exception handler, switch to one-by-one mode for the failed batch
only.
This allows you to isolate the *bad* documents as well as get the
*good* docum
I've done exactly the same thing. On error, set the batch size to one and try
again.
wunder
On Feb 26, 2013, at 12:27 PM, Timothy Potter wrote:
> Here's what I do to work around failures when processing batches of updates:
>
> On client side, catch the exception that the batch failed. In the
>
Jan,
No, you wouldn't. Let's say that a BooleanQuery with SHOULD clauses is
equal to a DisjunctionMaxQuery with the same clauses up to scores, i.e. you
can assert that they return exactly the same documents, but with
different scores (max vs. sum).
The idea about dropping clauses' weights reminds m
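A minimal sketch of the rewrapping being discussed, using the Lucene 4.x
API (a tie-breaker of 0.0f gives the pure max behaviour):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.DisjunctionMaxQuery;
    import org.apache.lucene.search.Query;

    public class DisMaxRewrap {
        // Rewrap the clauses of a BooleanQuery into a DisjunctionMaxQuery so
        // that only the highest-scoring clause contributes to the score.
        public static Query toDisMax(BooleanQuery bq) {
            List<Query> disjuncts = new ArrayList<Query>();
            for (BooleanClause clause : bq.clauses()) {
                disjuncts.add(clause.getQuery());
            }
            return new DisjunctionMaxQuery(disjuncts, 0.0f);
        }
    }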
Ideally you would want to use SolrJ or another interface which can catch
exceptions/errors and retry them.
On Tue, Feb 26, 2013 at 3:45 PM, Walter Underwood wrote:
> I've done exactly the same thing. On error, set the batch size to one and
> try again.
>
> wunder
>
> On Feb 26, 2013, at 12:27 PM,
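A minimal SolrJ sketch of this retry strategy (class, method and variable
names are illustrative; assumes the Solr 4.x SolrJ API):

    import java.util.List;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {
        // Try the batch first; on failure, retry one-by-one so only the
        // truly bad documents are skipped and logged.
        public static void addBatch(SolrServer solr, List<SolrInputDocument> batch) {
            try {
                solr.add(batch);
            } catch (Exception batchFailed) {
                for (SolrInputDocument doc : batch) {
                    try {
                        solr.add(doc);
                    } catch (Exception badDoc) {
                        System.err.println("Skipping bad doc: " + badDoc.getMessage());
                    }
                }
            }
        }
    }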
I tried to load the search index for the first time using full-import
yesterday. However, before it completed, the server came down for
maintenance. I had the unix admin delete the index and tlog directories
so I could start over. Now, when I go to solr/admin I'm getting the
following error. Th
Here is the full error:
[2/26/13 15:08:17:762 EST] 002e SolrDispatchF I
org.apache.solr.servlet.SolrDispatchFilter init
SolrDispatchFilter.init()
[2/26/13 15:08:17:861 EST] 002e SolrResourceL I
org.apache.solr.core.SolrResourceLoader locateSolrHome Using JNDI
solr.home: /usr/local/pfs/
Hello all,
I am facing a problem with how to structure our Solr cluster and indexes.
Current problem:
We developed a DMP (Data Management Platform) for online advertisement
purposes... and we are currently using Solr in order to index all the data
we collect and provide "ultra-fast user segmentation"
I agree with darren here... setting up SolrCloud is way too complicated,
especially if you are using Tomcat. Do we have any ticket to simplify the
SolrCloud installation? I would love to include my suggestions in it.
Thanks
Varun
On Mon, Feb 25, 2013 at 7:24 PM, darren wrote:
> Ok. But its way
Most of my customers switch to 4.0 without jumping on SolrCloud, the reason
being that it is currently harder to set up. And when all you need is one shard
with 1-2 replicas, it's pretty simple with the old style.
I think when managing the Solr config in ZK becomes easier more will want to
migrate
If you add debugQuery=true to your search you will see how it is parsed:
1. +(text:medical)
2. +(((text:medical) (text:retired))~2)
3. +((text:medical) (text:retired) -(text:working))
4. +(+(text:medical) +(text:retired) -(text:working))
I think you're seeing an effect of
https://issues.apache
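For example (host, port and core name are illustrative):

    http://localhost:8983/solr/collection1/select?q=medical+retired+-working&defType=edismax&mm=100%25&debugQuery=true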
I need to write some tests which I hope to do tonight and then I think
it'll get into 4.2
On Tue, Feb 26, 2013 at 6:24 AM, Nicholas Ding wrote:
> Thanks Amit, that's cool! So it will also be fixed on Solr 4.2, right?
>
> On Mon, Feb 25, 2013 at 6:04 PM, Amit Nithian wrote:
>
> > Yeah I had a si
On Feb 26, 2013, at 4:35 PM, varun srivastava wrote:
> Do we have any ticket to simplify the
> solr cloud installation ? I would love to include my suggestions in it.
Please, throw some thoughts out on the list or start a new JIRA issue.
- Mark
On Feb 26, 2013, at 5:25 PM, varun srivastava wrote:
> Hi All,
> I have some questions regarding the role of ZooKeeper in the SolrCloud
> runtime while processing queries.
>
> 1) Is the ZooKeeper cluster consulted by Solr shards for processing every
> request, or is it only used to copy config on startup?
hi,
Does anyone know if this patch works in a distributed environment and if it is reliable?
https://issues.apache.org/jira/browse/SOLR-2242
Hi all,
I am using solrmarc + VuFind to index MARC records.
The Solr version is 3.5.0.
I am having issues trying to make solrmarc suggest possible
alternative spellings when querying the database, but using a non-Latin-based
alphabet.
Querying *ounix* suggests *unix* all right;
however, when query
Hi Mark,
One more question
While doing a Solr doc update/add, what information is required from ZooKeeper?
Can you tell me what information is stored in ZooKeeper other than the
startup configs?
Thanks
Varun
On Tue, Feb 26, 2013 at 3:09 PM, Mark Miller wrote:
>
> On Feb 26, 2013, at 5:25 PM, v
Thanks for the heads up. I'll keep an eye on the issue. Based on the
comments, there's no workaround, I suppose?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Unexpected-Results-tp4043107p4043239.html
Sent from the Solr - User mailing list archive at Nabble.com.
ZooKeeper
/
/clusterstate.json - info about the layout and state of the cluster -
collections, shards, urls, etc
/collections - config to use for the collection, shard leader voting zk nodes
/configs - sets of config files
/live_nodes - ephemeral nodes, one per Solr node
/overseer - work queue
So does it mean that while doing a "document add" the state of the cluster is
fetched from ZooKeeper, and then, depending upon the hash of the docid, the
target shard is decided?
Assume we have 3 shards (with no replicas) of which 1 went down while
indexing; will all the documents be routed to the remaining 2 shards
Is there any page like the following for SolrCloud?
http://wiki.apache.org/solr/SolrTomcat
Can we set -zkHost and -zkTimeout in
tomcat/webapps/solr/META_INF/context.xml?
Thanks
Varun
On Tue, Feb 26, 2013 at 3:04 PM, Mark Miller wrote:
>
> On Feb 26, 2013, at 4:35 PM, varun srivastava
> wrote:
>
> >
On Feb 26, 2013, at 7:01 PM, varun srivastava wrote:
> Is there any page like the following for SolrCloud?
> http://wiki.apache.org/solr/SolrTomcat
>
Not that I know of. The main hitch with Tomcat is that the hostPort in solr.xml
is set up to be filled by the jetty.port system property. So you either ne
I don't like setting parameters as system properties, but I am happy if I
can set up these fields inside solr.xml. So you mean the following config will
work?
Thanks
Varun
On Tue, Feb 26, 2013 at 4:09 PM, Mark Miller wrote:
>
> On Feb 26, 2013, at 7:01 PM, varun srivastava
> wrote:
>
> > Is ther
On Feb 26, 2013, at 7:15 PM, varun srivastava wrote:
> I don't like setting parameters as system properties,
They are nice for the example, and often handy if you are using shell scripts or
something to manage your cluster when you are screwing around, but yeah, many
people will be happy to just put
Hi, we have just upgraded our Dev lab to Solr 4.1 with no Cloud, so our
implementation is like
Master - Repeater - Slaves (2). In production we have a large cluster, so there
will be 8 slaves.
What we observed is:
1. Slave 1 replicating the index from the master shows the correct version number
2. Slave 2 replica
Hi Mark,
specifying zkHost in solr.xml is not working. It seems only the system
property -DzkHost works. Can you confirm the param name is zkHost in
solr.xml?
Thanks
Varun
On Tue, Feb 26, 2013 at 4:24 PM, Mark Miller wrote:
>
> On Feb 26, 2013, at 7:15 PM, varun srivastava
> wrote:
>
> > I dont
On Feb 26, 2013, at 8:17 PM, adityab wrote:
> the commit happening on Repeater after replication is increasing the version
> number as repeater
Yeah, there was a problem that is fixed in 4.2, and that commit no longer
happens. A couple of other little bugs around master->slave were also fixed (an
Yup, unless there is some bug (I'm not seeing one visually looking at the
code), that should work:
zkHost = cfg.get("solr/@zkHost", null);
- Mark
On Feb 26, 2013, at 8:26 PM, varun srivastava wrote:
> Hi Mark,
> specifying zkHost in solr.xml is not working. It seems only system
> property
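For reference, that XPath corresponds to zkHost as an attribute on the
top-level <solr> element. A sketch of a legacy-style solr.xml (host list
and core names are illustrative):

    <solr persistent="true" zkHost="zk1:2181,zk2:2181,zk3:2181">
      <cores adminPath="/cores" hostPort="8080" defaultCoreName="collection1">
        <core name="collection1" instanceDir="collection1"/>
      </cores>
    </solr>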
Hi Mark,
How do I provide a Solr plugin directory to a Solr collection? I have my plugins
in the solr_home/lib directory, but the collection creation command is still
failing, as it's not finding the plugin classes
(/solr/admin/collections?action=CREATE&name=europe-collection&numShards=2&replicationFactor=1)
Thanks
I see. It's set for the
On Feb 26, 2013, at 8:33 PM, Mark Miller wrote:
> Yup, unless there is some bug (I'm not seeing one visually looking at the
> code), that should work:
>
> zkHost = cfg.get("solr/@zkHost", null);
>
> - Mark
>
> On Feb 26, 2013, at 8:26 PM, varun srivastava wrote
You should create a new thread for a new question.
If it's in solr_home/lib, make sure that's the case on every node, and make
sure the sharedLib option in solr.xml is turned on.
- Mark
On Feb 26, 2013, at 8:35 PM, varun srivastava wrote:
> Hi Mark,
> How to provide solr-plugin directory to so
Thanks for the update, Mark. Looking forward to this fix. Will try trunk 4.x.
Is it safe to assume that even with the incorrect version numbers the data
on the slave (replicated from the Repeater) is the same as what's on the master?
At least the files I see in the index dir are the same.
As per our schedule we are
Yes, you can have the same indexes and different versions. The version is
basically a timestamp added to the index at commit time. If you just do a
commit, you can avoid changing data, but update the version. So if the files
look the same, you should be good.
- Mark
On Feb 26, 2013, at 9:25 PM
Hey guys,
I wanted to see who's running SolrCloud out there, and at what scales?
I'd start the thread off but I am merely at the R&D phases.
Cheers!
Tim
What I really miss in the SimpleFaceting component is the ability to get
facets not on the full term, but grouped by the first letter(s). I wrote a
Jira issue on this ( https://issues.apache.org/jira/browse/SOLR-4496). I
also wrote a patch with a rather simplistic first try of an implementation.
N
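For comparison, a common schema-level workaround (field and type names are
illustrative) is to copy the value into a field whose analyzer keeps only
the first letter, and then facet on that field with facet.field:

    <fieldType name="first_letter" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.PatternReplaceFilterFactory"
                pattern="^(.).*$" replacement="$1"/>
      </analyzer>
    </fieldType>

    <field name="title_letter" type="first_letter" indexed="true" stored="false"/>
    <copyField source="title" dest="title_letter"/>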
I'm now having a different problem. In my master-repeater-2-slaves
architecture I have these generation numbers:
Master: 29147
Repeater: 29147
Slaves: 29037
When I go to the slaves' logs, they show "Slave in sync with master". That is
apparently because if I do
http://localhost:17045/solr/replication?co
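For reference, two ReplicationHandler commands that help when comparing
nodes (host and port as in the URL above):

    http://localhost:17045/solr/replication?command=indexversion
    http://localhost:17045/solr/replication?command=details

command=indexversion returns the latest replicable version/generation,
while command=details shows the full replication state of the node.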