Tried my book? It should explain that. You can see the collections with
examples in GitHub:
https://github.com/arafalov/solr-indexing-book/tree/master/published
Start from collection1.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandre
Hi Erick,
Thanks a lot for the pointer.
I looked at the LowerCaseFilterFactory class [1] and its parent abstract
class AbstractAnalysisFactory API [2], and modified my custom filter
factory class as below:
public class ContentFilterFactory extends TokenFilterFactory {
public ContentFilterFacto
Hi All, I need your help to find a solution to one of the issues I am facing
with the keyword search. We have to provide a keyword search functionality
on our website, i.e. searching for a word will get you all the indexed
documents where a match is found for that word. (Not specific to any
particular
Erick,
Thanks for replying :-).
If I were to do that, we are trying to set string value to int and Solr
throws an error.
Oh wait, I guess it works because Solr would automatically parse based on
the data type of the field. :-).
As I could see from the exception
java.lang.NumberFormatException.
Eric,
Just a question :-), wouldn't it be easy to use DIH to pull data from
multiple data sources?
I do use DIH to do that comfortably. I have three data sources:
- MySQL
- URLDataSource that returns XML from a .NET application
- URLDataSource that connects to an API and returns XML
Here is par
On 11/7/2013 6:40 PM, Bill Bell wrote:
So no Jetty 9 until Solr 5? Java 7 is at rel 40. Is that our commitment to
not require Java 7 until Solr 5?
Most people are probably already on Java 7...
Solr 4.x runs perfectly on Java 6 and has from day one. That doesn't
affect you, me, or the like
So no Jetty 9 until Solr 5? Java 7 is at rel 40. Is that our commitment to
not require Java 7 until Solr 5?
Most people are probably already on Java 7...
Bill Bell
Sent from mobile
> On Nov 7, 2013, at 1:29 AM, Furkan KAMACI wrote:
>
> Here is an issue that points to that:
> https://issues.ap
Digging a bit more, I think I have answered my own questions. Can someone
please say if this sounds right?
http://wiki.apache.org/solr/LotsOfCores looks like a pretty good solution. If
I give each user his own shard, each query can be run in only one shard. The
effect of the filter query wil
On 11/7/2013 4:34 PM, Software Dev wrote:
I too want to be in control of everything that is created.
Here is what I'm trying to do.
1) Start up a cluster of 5 Solr Instances
2) Import the configuration to Zookeeper
3) Manually create a collection via the collections api with number of
shards an
I too want to be in control of everything that is created.
Here is what I'm trying to do.
1) Start up a cluster of 5 Solr Instances
2) Import the configuration to Zookeeper
3) Manually create a collection via the collections api with number of
shards and replication factor
Now there are some iss
On 11/7/2013 2:52 PM, Software Dev wrote:
Sorry about the confusion. I meant I created my config via the ZkCLI and
then I wanted to create my core via the CollectionsAPI. I *think* I have it
working but was wondering why there are a crazy amount of core names under
the admin "Core Selector"?
Whe
Well, the example you linked to is based on 3.6, and things have
changed assuming you're using 4.0.
It's probably that your ContentFilter isn't implementing what it needs to
or it's not subclassing from the correct class for 4.0.
Maybe take a look at something simple like LowerCaseFilterFactory
a
I'm trying to boost results slightly on a price (not currency) field that are
closer to a certain value. I want results that are not too expensive or too
inexpensive to be favored. Here is what we currently are trying:
bf=sub(1,abs(sub(15,price)))^0.2
where 15 is that "median" I want to boost
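To see how this boost behaves, here is a small Python sketch that evaluates the same expression, 1 - |15 - price|, for a few prices. The function name and the sample prices are made up for illustration; it only mirrors the arithmetic of sub(1,abs(sub(15,price))) and does not call Solr.

```python
def price_boost(price, median=15.0):
    """Mirror of sub(1,abs(sub(median,price))): 1 - |median - price|."""
    return 1.0 - abs(median - price)

# Hypothetical sample prices, chosen only to show the shape of the curve.
for price in (5, 14, 15, 16, 25):
    print(price, price_boost(price))
```

The value peaks at 1 when price equals 15, reaches 0 one unit away, and turns negative beyond that, which may matter when the result is used as an additive boost.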
Right, the other thing to be wary of is special characters. It _might_
also have worked to escape the colon since that's a meta-character.
Quoting the string should be fine too
Best,
Erick
On Thu, Nov 7, 2013 at 1:07 PM, Jack Park wrote:
> Spoke too soon. Hacking rocks!
> Finally landed on
Hmmm, not really, you have to kind of take it on faith I'm afraid.
You can check the Solr logs and you should see messages about
cores unloading, but that's not very satisfactory.
Actually sounds like a JIRA. See SOLR-5430
On Thu, Nov 7, 2013 at 12:43 PM, Vinay B, wrote:
> As I understand it,
Thanks Michael, haven't tried that yet.
Does anybody have suggestions on what might be the problem there? SOLR cache?
Disk I/O? Something else?
--roman
On Wed, Nov 6, 2013 at 9:41 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:
> Wow, that's pretty weird. Have you tried tur
Hi Erick,
About the size of filter cache, previously we set it to 4,000.
After we faced this problem, we changed it to 10,000.
Still at size of 10,000 (always full), hitratio was 0.78 and "eviction" was as high as
"insertion".
About 100% Cpu, yes, it was Solr using it.
I profiled an app, it was
Thanks, that link is very helpful, especially the section, "Leapfrog, anyone?"
This actually seems quite slow for my use case. Suppose we have 10,000 users
and 1,000,000 documents. We search for "hello" for a particular user and let's
assume that the fq set for the user is cached. "hello" is
Sorry about the confusion. I meant I created my config via the ZkCLI and
then I wanted to create my core via the CollectionsAPI. I *think* I have it
working but was wondering why there are a crazy amount of core names under
the admin "Core Selector"?
When I create X amount of shards via the bootst
A few options:
1) Check what the response times are if you return only a small number of
fields from the query (e.g, just the "id" field). If the response times improve
greatly, you are probably returning some very long fields, and you may be able
to drop some of these from the query result.
2
On 11/7/2013 1:58 PM, Mark wrote:
If I create my collection via the ZkCLI
(https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities) how
do I configure the number of shards and replicas?
I was not aware that you could create collections with zkcli. I did
not think that was p
If I create my collection via the ZkCLI
(https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities) how
do I configure the number of shards and replicas?
Thanks
I'm trying to use a normalized score in a query as I described in a recent
thread titled "Re: How to get similarity score between 0 and 1 not relative
score"
I'm using this query:
select?qq={!edismax v='news' qf='title^2
body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$s
I have a Solr Cloud setup with 220 million records. They are separated into 2
shards without any replicas. I have not changed any caching and every setting
is the default one. In one case I have to return the top 5 candidates from
Solr. The response time is approximately 50 seconds, which is too high
Hi All,
I'm a novice in Solr and I'm continuously bumping into problems with my
custom filter I'm trying to use for analyzing a fieldType during indexing
as below;
Below is my custom FilterFactory class;
*public class ContentFilterFactory extends TokenFilterFactory {*
* publ
Spoke too soon. Hacking rocks!
Finally landed on this heuristic, and it works:
resourceURL:"http://someotherserver.org/"
On Thu, Nov 7, 2013 at 9:52 AM, Jack Park wrote:
> Figuring out a google query to gain an answer seems difficult given
> the ambiguity;
>
> I have a field:
>
>
>
> into whic
You can, of course, use a function range query:
select?q=text:news&fq={!frange l=0 u=100}sum(x,y)
http://lucene.apache.org/solr/4_5_1/solr-core/org/apache/solr/search/FunctionRangeQParserPlugin.html
This will give you a bit more flexibility to meet your goal.
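A minimal Python sketch of how that request could be assembled so the local-params syntax is URL-encoded correctly. The field names x and y come from the example above; the helper name is an assumption, and nothing is sent to a server.

```python
from urllib.parse import urlencode, parse_qs

def build_frange_query(q, func, lower, upper):
    """Return a query string for a select request with a function range filter."""
    params = {
        "q": q,
        # {!frange l=.. u=..} keeps documents whose function value lies in [l, u].
        "fq": "{!frange l=%s u=%s}%s" % (lower, upper, func),
    }
    return urlencode(params)

query_string = build_frange_query("text:news", "sum(x,y)", 0, 100)
print("select?" + query_string)
```

Encoding the parameters this way avoids having to escape the braces, spaces and parentheses by hand.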
On Nov 7, 2013, at 7:26 AM, Erik Hat
David,
I find Mike McCandless’ blog article to be very informative. Give it a go and
let us know if you are still seeking clarification:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
Jason
On Nov 7, 2013, at 5:09 AM, david.dav...@correo.aeat.es wrote:
> Hi,
>
Figuring out a google query to gain an answer seems difficult given
the ambiguity;
I have a field:
into which I store a URL
which, when displayed as a result of a query, looks like this in the
admin console:
"resourceURL": "http://someotherserver.org/",
The query "resourceURL:*" will find a
As I understand it, the "lots of cores" feature enables dynamic loading and
unloading of cores
This is how I set up my solr.xml for a test where I created more cores than
the transientCacheSize.
Here is a link to the config in case it doesn't format well via this post.
https://gist.github.com/ano
Hi Eric,
Solr configuration can certainly be confusing at first. And for some time
after. :P
If you're running start.jar from the example folder (which is fine for
testing, and I've known some people to use it for production systems) then
the default solr home is example/solr. This contains solr
I would like to make a facet on a date field with the following tree:
2013
    4th Quarter
        December
        November
        October
    3rd Quarter
        September
        August
        July
    2nd Quarter
        June
        May
        April
    1st Quarter
        March
        February
        January
2012
    (same as above)
So far I have this in solrconfig.xml:
Function queries score (all) documents, but don't filter them. All documents
effectively match a function query.
Erik
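A toy Python model of that behavior, not Solr code: the documents, scores and function names are invented for illustration. It assumes query($qq) contributes 0 for documents that do not match the embedded query, so every document still gets a value and is returned.

```python
# Hypothetical corpus: doc id -> relevance score for the embedded query,
# or None when the document does not match it.
embedded_scores = {"doc1": 1.7, "doc2": None, "doc3": 0.4}

def func_score(doc_id):
    """Model of sum(query($qq),0): non-matching docs get the default value 0."""
    score = embedded_scores[doc_id]
    return (score if score is not None else 0.0) + 0

def run_function_query():
    """Every document gets a score; nothing is filtered out."""
    return {doc_id: func_score(doc_id) for doc_id in embedded_scores}

results = run_function_query()
print(results)
```

To restrict the result set as well, one option is to apply the embedded query as a filter too, for example through fq.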
On Nov 7, 2013, at 1:48 PM, Peter Keegan wrote:
> Why does this function query return docs that don't match the embedded
> query?
> select?qq=text:news&q={!func}sum
Ah, thanks Markus. I think I'll just add the Boolean operators to the
stopwords list in that case.
Tom
On 7 November 2013 12:01, Markus Jelsma wrote:
> This is an ancient problem. The issue here is your mm-parameter, it gets
> confused because for separate fields different amount of tokens ar
Sorry if this is obvious (because it isn't for me)
I want to build a solr (4.5.1) + nutch (1.7.1) environment. I'm doing this on
amazon linux (I may put nutch on a separate server eventually).
Please let me know if my thinking is sound or off base
in the example folder are a lot of files and f
Hi,
I have a very big index, about 337 G. I am using Solr 4.2.
The problem we have is related to the size of the segments. These are the
sizes of the biggest ones:
324 G, 3.7G, 3.6 G, 1.6 G, 1.6 G, 465 M ... We have
LogByteSizeMergePolicy with 10 as MergeFactor in our solrconfig.
Reall
Why does this function query return docs that don't match the embedded
query?
select?qq=text:news&q={!func}sum(query($qq),0)
Hi;
I've written a patch to get Statistics from SolrCloud. However, my
implementation was based on SolrJ, and after I got feedback from Shalin
Shekhar I decided to write a new patch that is based on distributed search
components. I can add that capability and improve my patch with that.
--
Thanks;
Fur
Vikram:
An experiment I've found useful: Just comment out the
server.add() bit and run it. That won't index anything, but if
that's also slow then your problem is acquiring the data and
you know where to concentrate your efforts. I've seen this
be the problem with slow indexing more often than not
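The experiment can be sketched as a small timing harness. This is Python rather than SolrJ, and fetch_rows and send_to_solr are hypothetical stand-ins for the data-acquisition code and the server.add() call; the point is only to run the same loop with and without the indexing step.

```python
import time

def fetch_rows():
    """Hypothetical stand-in for reading records from the source system."""
    for i in range(1000):
        yield {"id": str(i), "title": "record %d" % i}

def send_to_solr(doc, sink):
    """Hypothetical stand-in for server.add(doc); here it only collects docs."""
    sink.append(doc)

def timed_run(do_index):
    """Walk all rows, optionally indexing them, and return (seconds, docs sent)."""
    sink = []
    start = time.perf_counter()
    for doc in fetch_rows():
        if do_index:
            send_to_solr(doc, sink)
    return time.perf_counter() - start, len(sink)

acquire_only, sent_without = timed_run(do_index=False)
with_indexing, sent_with = timed_run(do_index=True)
print("acquire only: %.4fs, with indexing: %.4fs" % (acquire_only, with_indexing))
```

If the acquire-only run takes nearly as long as the full run, the time is going into fetching the data rather than into Solr.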
Hey Erick,
I have tried upping the timeouts quite a bit now, and have tried upping the
zkTimeout setting in Solr itself (I found a few old posts on the mailing list
suggesting this).
I realise this is a sort of weird situation, where we are actually trying to
work around some horrible hardware
First, please start a new thread when changing
topics, see "thread hijacking" here
http://people.apache.org/~hossman/#threadhijack
But do be aware that scores are NOT comparable
between different queries on the _same_ corpus.
A score of .75 on one query has no relation to a
score of .75 on another
Yeah, Solr's fq cache is pretty simple-minded,
order matters. There's no good way to improve
that except try to write your fq queries in the
same order. It's actually quite tricky to
disassemble/reassemble arbitrary queries to fix
this problem.
But in your case, you could write a custom query
comp
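One way to keep fq clauses in a consistent order on the client side, sketched in Python. It assumes the filter is a flat list of clauses that are all AND-ed together, so reordering does not change which documents match; the helper name and sample clauses are hypothetical.

```python
def canonical_fq(clauses):
    """Join AND-ed filter clauses in sorted order so equal filters yield equal strings."""
    unique = sorted(set(c.strip() for c in clauses if c.strip()))
    return " AND ".join(unique)

a = canonical_fq(["type:book", "lang:en", "year:2013"])
b = canonical_fq(["year:2013", "type:book", "lang:en"])
print(a)
print(a == b)
```

Clauses that contain their own boolean operators would need to be wrapped in parentheses before being joined this way.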
Did you solve this eventually?
Aditya Sakhuja wrote
> How does one recover from an index corruption ? That's what I am trying to
> eventually tackle here.
>
> Thanks
> Aditya
>
> On Thursday, September 19, 2013, Aditya Sakhuja wrote:
>
>> Hi,
>>
>> Sorry for the late followup on this. Let me p
Your servlet container logs often have this number, or your
app can easily record it. I don't know of another way
to do that.
The variant here is that what's actually being reported is "QTime",
which is also exclusive of actually gathering up the data to put
in the return packet, it's just the r
Right, can you up your ZK timeouts significantly? It sounds like
your ZK timeout is short enough that when your system slows
down, the timeout is exceeded and it's throwing Solr
into a tailspin
See zoo.cfg.
Best,
Erick
On Tue, Nov 5, 2013 at 3:33 AM, Henrik Ossipoff Hansen <
h...@entertainm
Rob:
What I think you're missing is that you are responsible
for pulling the data from your separate sources and
pushing it to solr via an update command. You can
do this in SolrJ, PHP, or any other package that supports
a Solr client. You simply address your requests (both
update and query) to th
This is an ancient problem. The issue here is your mm-parameter, it gets
confused because for separate fields different amount of tokens are
filtered/emitted so it is never going to work just like this. The easiest
option is not to use the stopfilter.
http://lucene.472066.n3.nabble.com/Dismax-M
Um, put a valid number in your default, not the empty string? Like
default="5"
Best,
Erick
On Thu, Nov 7, 2013 at 2:57 AM, manju16832003 wrote:
> How do I set default value for int fields
> ex
>
> multiValued="false" default=""/>
>
> While indexing lets say if I have not set the value for m
Hi all,
Thanks for the help and advice I've got here so far!
Another question - I want to support stopwords at search time, so that e.g.
the query "oscar and wilde" is equivalent to "oscar wilde" (this is with
lowercaseOperators=false). Fair enough, I have stopword "and" in the query
analyser cha
You have several things here.
First, changing the number of replicas is easy, just create
another node and associate it with a shard of an existing
collection. See the shard= param on the solrcloud page
when creating nodes. If you don't specify a shard, it'll
just be assigned to one of the existin
Sorry about the reposts here, but I can't seem to get on the mailing list...
Hi
I've been trying to play around with block join queries in the Solr 4.5
release and I was wondering if anyone else has any experience doing this?
Basically I'm trying to create a parent->child->grandchild structure a
Hi All,
When following the above tutorial [1], to write a custom FilterFactory, I
had to extend TokenFilterFactory instead of BaseTokenFilterFactory as per
the API change in new lucene-analyzer-common library.
Below is my custom TokenFilterFactory class:
public class ContentFilterFactory extends
Hi,
I'm having a problem with one of my shards. Since yesterday, SOLR keeps
repeating the same exception over and over for this shard.
The web interface for this SOLR instance is also not working (it hangs on the
Loading indicator).
Nov 7, 2013 9:08:12 AM org.apache.solr.update.processor.LogUpda
Thanks Anuj,
The jar containing the class can be found here :
http://www.java2s.com/Code/JarDownload/lucene/lucene-analyzers-common-4.2.0.jar.zip
On Thu, Nov 7, 2013 at 2:18 PM, Anuj Kumar wrote:
>
> http://stackoverflow.com/questions/13149627/where-did-basetokenfilterfactory-go-in-solr-4-0
>
>
Hi Rob,
The multi-core approach is different. You could have two cores, let's say
marketing-core [Has its own schema.xml and data-config.xml]
magento-core [Has its own schema.xml and data-config.xml]
Each core has its own schema.xml and data-config.xml.
If you go by multi-core approach I guess you won'
http://stackoverflow.com/questions/13149627/where-did-basetokenfilterfactory-go-in-solr-4-0
On Thu, Nov 7, 2013 at 1:05 PM, Dileepa Jayakody
wrote:
> Hi All,
>
> I am writing a custom TokenFilter to post a token value to Apache Stanbol
> for enhancement. In this Custom TokenFilter I'm trying to
Here is an issue that points to that:
https://issues.apache.org/jira/browse/SOLR-4839
2013/11/7 William Bell
> When are we moving Solr to Jetty 9?
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
>
58 matches