Hi Dan,
1) Are you looking
for http://wiki.apache.org/solr/HighlightingParameters#hl.fragsize ?
2) Hundreds of words in a field should not be a problem for highlighting. But
it sounds like this long field may contain content that corresponds to N
different pages in a publication and you would
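For reference, hl.fragsize is a plain request parameter; a minimal example, where the host and the field name "content" are illustrative rather than from this thread:

http://localhost:8983/solr/select?q=page&hl=true&hl.fl=content&hl.fragsize=200

hl.fragsize sets the size in characters of the fragments considered for highlighting; hl.fragsize=0 returns the whole field value as a single fragment.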
Hey Brendan, Hey Hector,
That was very helpful. :)
Thanks,
Shiv Deepak
On 17-Dec-2011, at 07:52 , Hector Castro wrote:
> Hi Shiv,
>
> For me, a combination of the following has helped me learn a lot about Solr
> in a short period of time:
>
> * Apache Solr 3 Enterprise Search Server:
> ht
Hi Shiv,
For me, a combination of the following has helped me learn a lot about Solr in
a short period of time:
* Apache Solr 3 Enterprise Search Server:
http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
* Solr Wiki: http://wiki.apache.org/solr/
* Pretty much every single pos
There is an update to that book for Solr 3:
http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
I actually bought it recently, but haven't looked at it yet.
Good luck.
Brendan
On Dec 16, 2011, at 9:01 PM, Shiv Deepak wrote:
> I am looking for a good book to read from and get a
I am looking for a good book to read from and get a better understanding of
Solr.
On Amazon, all the books on Solr have average ratings (which I suppose means no
one has tried them or bothered to post a review), but this one, "Solr 1.4
Enterprise Search Server by David Smiley, Eric Pugh", has a pretty dec
I am very, very sorry. My mail client was not working from work and it looked
like my mail was not being delivered; that's why I tried a few times. Sorry,
everybody!
-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Friday, December 16, 2011 3:23 PM
To: solr-user
Right, are you falling afoul of the recursive shard thing? That is,
if your shards parameter points back to the core itself. As far as I
understand, the shards parameter in your request handler shouldn't point back
to itself.
But I'm guessing here.
Best
Erick
On Fri, Dec 16, 2011 at 4:27 PM, ku3ia wrote:
>>> OK
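For illustration, a shards default in a request handler normally lists only the other cores; the hosts and core names below are made up:

<requestHandler name="/distrib" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="shards">host1:8983/solr/core1,host2:8983/solr/core2</str>
  </lst>
</requestHandler>

If the handler's own core appears in that list, every request re-enters the same handler and recurses.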
: So if we use some sort of weekly or daily sharding, there needs to be
: some mechanism in place to dynamically add the new shard when the
: current one fills up. (Which would also ideally know where to put the
: new shards on what server, etc.) Since SOLR does not implement that I
: was thi
Hi!
I have a solrconfig.xml like:
all
0
10
ABC
score desc,rating asc
CUSTOM FQ
2.2
CUSTOM FL
validate
CUSTOM ABC QUERY COMPONENT
stats
debug
all
0
1
XYZ
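The archive has stripped the XML tags from the snippet above, so only the values survive. Assuming a standard SearchHandler with defaults, the original may have looked roughly like the sketch below; the tag and parameter names are guesses, only the values come from the post:

<requestHandler name="ABC" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">all</str>
    <int name="start">0</int>
    <int name="rows">10</int>
    <str name="sort">score desc,rating asc</str>
    <str name="fq">CUSTOM FQ</str>
    <str name="version">2.2</str>
    <str name="fl">CUSTOM FL</str>
  </lst>
  <arr name="components">
    <str>validate</str>
    <str>CUSTOM ABC QUERY COMPONENT</str>
    <str>stats</str>
    <str>debug</str>
  </arr>
</requestHandler>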
: The list would be unreadable if everyone spammed at the bottom their
: email like Otis'. It's just bad form.
If you'd like to debate project policy on what is/isn't acceptable on any
of the Lucene mailing lists, please start a new thread on general@lucene
(the list that exists precisely for
Hey Vikram,
I finally got around to getting Solr-RA installed but I'm having trouble
getting the NRT to work. Could you help me out?
I added these four lines immediately after in solrconfig.xml:
true
rankingalgorithm
true
rankingalgorithm
Is that correct? I also read something about
Maria: sending the same email 4 times in less than 48 hours isn't really a
good way to encourage people to help you -- it just means more total mail
people have to wade through, which slows them down and makes them less
likely to want to help.
: In ABC QUERY COMPONENT, I customize prepare() an
Wow ... either i'm a huge idiot and everyone has just been really polite
about it in most threads, or something about this thread in particular
made me really stupid.
(Luis: i'm sorry for all the things i have said so far in this email
thread that were a complete waste of your time - hopefully
Hi!
I have a solrconfig.xml like:
all
0
10
ABC
score desc,rating asc
CUSTOM FQ
2.2
CUSTOM FL
validate
CUSTOM ABC QUERY COMPONENT
stats
debug
all
0
1
XYZ
I've been doing a fair amount of reading and experimenting with Solr
lately. I find that it does a good job of indexing very structured
documents. However, the application I have in mind is built around
long EPUB documents.
Of course, I found the Extract components useful for indexing the
EPUBs. H
>> OK, so your speed differences are pretty much dependent upon whether you
specify
>> rows=2000 or rows=10, right? Why do you need 2,000 rows?
Yes, the big difference is 10 vs. 2K records. The limit of 2K rows was set by a
manager and I can't decrease it; it is the minimum row count needed to process
the data.
We still disagree.
On Fri, Dec 16, 2011 at 12:29 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> Ted,
>
> The list would be unreadable if everyone spammed at the bottom their
> email like Otis'. It's just bad form.
>
> Jason
>
> On Fri, Dec 16, 2011 at 12:00 PM, Ted Dunning
> wrote:
Ted,
The list would be unreadable if everyone spammed at the bottom their
email like Otis'. It's just bad form.
Jason
On Fri, Dec 16, 2011 at 12:00 PM, Ted Dunning wrote:
> Sounds like we disagree.
>
> On Fri, Dec 16, 2011 at 11:56 AM, Jason Rutherglen <
> jason.rutherg...@gmail.com> wrote:
>
On Fri, Dec 16, 2011 at 8:14 AM, Jamie Johnson wrote:
> What is the most appropriate way to configure Solr when deploying in a
> cloud environment? Should the core name on all instances be the
> collection name or is it more appropriate that each shard be a
> separate core, or should each solr i
: Chris, you replied:
:
: > : But there is a workaround:
: > : 1) Do a normal query without facets (you only need to request doc ids
: > : at this point)
: > : 2) Collect all the IDs of the documents returned
: > : 3) Do a second query for all fields and facets, adding a filter to
: > : restrict
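Sketched as two requests, with the query, facet field, and IDs all invented for illustration:

1) /select?q=foo&rows=10&fl=id
2) /select?q=foo&rows=10&facet=true&facet.field=category&fq=id:(17 OR 42 OR 99)

The fq in the second request restricts both the result set and the facet counts to the documents whose IDs were collected in step 1.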
OK, so your speed differences are pretty much dependent upon whether you specify
rows=2000 or rows=10, right? Why do you need 2,000 rows?
Or is the root question why there's such a difference when you specify
qt=requestShards? In which case I'm curious to see that request
handler definition...
Best,
Erick
Sounds like we disagree.
On Fri, Dec 16, 2011 at 11:56 AM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> Ted,
>
> "...- FREE!" is stupid idiot spam. It's annoying and not suitable.
>
> On Fri, Dec 16, 2011 at 11:45 AM, Ted Dunning
> wrote:
> > I thought it was slightly clumsy, but it
Ted,
"...- FREE!" is stupid idiot spam. It's annoying and not suitable.
On Fri, Dec 16, 2011 at 11:45 AM, Ted Dunning wrote:
> I thought it was slightly clumsy, but it was informative. It seemed like a
> fine thing to say. Effectively it was "I/we have developed a tool that
> will help you so
Ha, sorry Hoss. Thought I hit user@nutch, Gmail did the replace and I
wasn't paying attention.
-- Chris
On Fri, Dec 16, 2011 at 2:46 PM, Chris Hostetter
wrote:
>
> : http://wiki.apache.org/nutch/Crawl
> :
> : This script no longer works. See:
>
> If you have a question about something on the
: http://wiki.apache.org/nutch/Crawl
:
: This script no longer works. See:
If you have a question about something on the nutch wiki, or included in
the nutch release, i would suggest you email the nutch user list.
-Hoss
I thought it was slightly clumsy, but it was informative. It seemed like a
fine thing to say. Effectively it was "I/we have developed a tool that
will help you solve your problem". That is responsive to the OP and it is
clear that it is a commercial deal.
On Fri, Dec 16, 2011 at 10:02 AM, Jason
Hi Erick, thanks for your reply.
Yeah, you are right - the document cache is at its defaults, but I tried
decreasing and increasing the values and didn't get the desired result.
I tried the tests. Here are results:
>>1> try with "&rows=10"
successfully started at 19:48:34
Queries interval is: 10 queries per mi
http://wiki.apache.org/nutch/Crawl
This script no longer works. See:
echo "- Index (Step 5 of $steps) -"
$NUTCH_HOME/bin/nutch index crawl/NEWindexes crawl/crawldb crawl/linkdb \
crawl/segments/*
The "index" call doesn't existso what does this line get replaced
with? Is there an
Wow, the shameless plugging of product (footer) has hit a new low, Otis.
On Fri, Dec 16, 2011 at 7:32 AM, Otis Gospodnetic
wrote:
> Hi Yury,
>
> Not sure if this was already covered in this thread, but with N smaller cores
> on a single N-CPU-core box you could run N queries in parallel over small
Nice!
May be good to upload some screenshots there...
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performance-monitoring/index.html
- Original Message -
> From: Alexander Valet | edelight
> To: solr-user@lucene.apache.org
> Cc:
> Sent: Thursday, De
That was a little confusing!
" there's always exactly one token at position 0."
Of course. What I meant to say was there is
always exactly one token in a non-tokenized
field and its offset is always exactly 0. There
will never be tokens at position 1.
So asking to match phrases, which is based
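Put differently, a non-tokenized field behaves like this illustrative schema fragment, where the field and type names are assumptions:

<field name="status" type="string" indexed="true" stored="true"/>

With type "string" (solr.StrField) the entire field value is indexed as a single token at position 0, so a phrase query, which needs tokens at successive positions, has nothing at position 1 to match.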
Please start another thread and provide some details, there's not enough
information here to say anything. You might review:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Thu, Dec 15, 2011 at 11:50 PM, Pawan Darira wrote:
> Thanks. I re-started from scratch & at least things have s
A side note: specifying qt and defType on the same query is probably
not what you
intend. I'd just omit the qt bit since you're essentially passing all
the info you intend
explicitly...
I see the same behavior when I specify a non-tokenized field in 3.5
But I don't think this is a bug since it do
We actually have a system that uses weekly shards but that is all .NET
(Lucene.NET) and has lots of code to manage adding new indexes. We want to
move to SOLR for performance and maintenance reasons.
So if we use some sort of weekly or daily sharding, there needs to be some
mechanism in plac
The thing that jumps out at me is "&rows=2000". If your documentCache in
solrconfig.xml still has the defaults, it only holds 512. So you're running
all over your disk gathering up the fields to return, especially since
you also specified "fl=*,score". And if you have large fields stored, you're
doin
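For reference, the documentCache defaults being referred to look like this in the example solrconfig.xml:

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>

Raising size above your largest rows value keeps a single request from cycling documents through the whole cache.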
Just to add to it, I'm using Suggester component to implement Auto Complete
http://wiki.apache.org/solr/Suggester
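For context, a minimal Suggester setup along the lines of that wiki page looks roughly like this; the source field name is an assumption:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">name</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>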
Hi,
We've done a fair number of such things over the years. :)
If daily shards don't work for you, why not weekly or monthly?
Have a look at Zoie's Hourglass concept/code.
Some Solr alternatives are currently better suited to handle this sort of
setup...
Otis
Performance Monitoring SaaS fo
Hi,
>I'm using 3.2 because I can't get velocity to run on 3.5.
Maybe this is worth asking about in a separate thread or maybe you already
did that.
>I've changed my writeLockTimeout from 1000 to 1, and my
>commitLockTimeout from 1 to 5
>
>Running on a large ec2 box, which has
Hi Yury,
Not sure if this was already covered in this thread, but with N smaller cores
on a single N-CPU-core box you could run N queries in parallel over smaller
indices, which may be faster than a single query going against a single big
index, depending on how many concurrent query requests t
Hi Otis,
I'm using 3.2 because I can't get velocity to run on 3.5.
I've changed my writeLockTimeout from 1000 to 1, and my
commitLockTimeout from 1 to 5
Running on a large ec2 box, which has 2 virtual cores. I don't know how to
find out the # of concurrent indexer threads. Is that
Hi,
I used to think this, too, but have learned this not to be entirely true. We
had a customer with a query rate of a few hundred QPS and 32 or 64 GB RAM
(don't recall which any more) and a pretty large JVM heap. Most queries were
very fast, but once in a while a query would be very slow. G
Hi,
Hm, I don't know what this could be caused by. But if you want to get rid of
it, remove that Linux server from the load balancer pool, stop Solr, remove
the index, and restart Solr. Then force replication and put the server back in
the load balancer pool.
If you use SPM (see link in my
Hi Eric,
And you are using the latest version of Solr, 3.5.0?
What is the timeout in solrconfig.xml?
How many CPU cores does the machine have and how many concurrent indexer
threads do you have running?
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performance-m
You can disable stemming in a copy field. So you need to define one field
with your input data on which stemming will be done and the other field
(copy field), on which stemming will not be done. Then on the client you
can decide which field to search against.
Dmitry
On Fri, Dec 16, 2011 at 2:00
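A minimal sketch of that setup in schema.xml; all field and type names here are illustrative:

<fieldType name="text_stemmed" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="body" type="text_stemmed" indexed="true" stored="true"/>
<field name="body_exact" type="text_exact" indexed="true" stored="false"/>
<copyField source="body" dest="body_exact"/>

Queries against body are stemmed; queries against body_exact are not.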
Hi,
I'm doing a lot of reads and writes into a single Solr server (on the
order of 50ish per second), and have around 300,000 documents in the
index.
Now every 5 minutes I get this exception:
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain
timed out: NativeFSLock@./solr/da
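The write lock timeout being asked about is set in solrconfig.xml; in the 3.x example configuration it appears as follows, with the value in milliseconds:

<indexDefaults>
  <writeLockTimeout>1000</writeLockTimeout>
</indexDefaults>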
On 12/16/2011 5:57 AM, mechravi25 wrote:
I would like to know how we can disable the commit and optimize operations
that are called by default after adding a few documents through the dataimport
handler.
Add this to the url you use to call the handler:
&commit=false&optimize=false
Thanks,
Shawn
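A full request might then look like this, with the host and handler path as placeholders for your own setup:

http://localhost:8983/solr/dataimport?command=full-import&commit=false&optimize=false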
What is the most appropriate way to configure Solr when deploying in a
cloud environment? Should the core name on all instances be the
collection name or is it more appropriate that each shard be a
separate core, or should each solr instance be a separate core (i.e.
master1, master1-replica are 2
Hi,
I would like to know how we can disable the commit and optimize operations
that are called by default after adding a few documents through the dataimport
handler.
In our application, the master Solr instance is used for indexing and the
slave Solr is for user search requests. Hence the replic
Oh, yes, on Windows, using Java 1.6 and Solr 1.4.1.
Ok let me try that one...
Thank you so much.
Regards,
Rajani
2011/12/16 Tomás Fernández Löbbe
> Are you on Windows? There is a JVM bug that makes Solr keep the old files,
> even if they are not used anymore. The files are going to be eventu
Are you on Windows? There is a JVM bug that makes Solr keep the old files,
even if they are not used anymore. The files are going to be eventually
removed, but if you want them out of there immediately, try optimizing
twice; the second optimize doesn't do much, but it will remove the old files.
On F
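The optimize can be triggered over HTTP; the URL below is illustrative:

curl 'http://localhost:8983/solr/update?optimize=true'

The second run rewrites almost nothing, but it lets the JVM release and delete the files left over from the first.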
My full-data import stopped working all of a sudden. Afaik I have not made
any changes that would cause this.
The response is:
status: 0
QTime: 0
config: wedding-data-config.xml
command: full-import
status: busy
importResponse: A command is still running...
Time Elapsed: 0:6:4.112
Total Requests made to DataSource: 1
Total Rows Fetched: 0
Total Documents Processed: 0
Total Documents Skipped: 0
Full Dump Started: 2011-12-16 13:12:29
This response format is experimental. It is likely to change in the future.
These parameters are commented out in my solrconfig.xml;
see the parameters attached.
When I do an optimize on an index of size 400 MB, it reduces the size of the
data folder to 200 MB. But when the data is huge it doubles it.
Why is that so?
Optimization: it actually should reduce the size of the dat
Maybe you are generating a snapshot of your index attached to the optimize ???
Look for post-commit or post-optimize events in your solr-config.xml
From: Rajani Maski [rajinima...@gmail.com]
Sent: Friday, December 16, 2011 11:11
To: solr-user@l
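The kind of event hook to look for resembles this fragment from the old example solrconfig.xml, with illustrative paths:

<listener event="postOptimize" class="solr.RunExecutableListener">
  <str name="exe">solr/bin/snapshooter</str>
  <str name="dir">.</str>
  <bool name="wait">true</bool>
</listener>

If such a listener is present, every optimize spawns a snapshot alongside the index, which would explain the extra disk usage.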
Hi All,
I am using stemming in my Solr, but I don't want to apply stemming for
every search request. I am thinking of disabling stemming for one
specific query parser; can I do this?
Any help much appreciated.
Thanks in Advance
Hi,
When we do an optimize, it actually reduces the data size, right?
I have an index of size 6 GB (5 million documents). The index was created
with commits for every 1 documents.
Now I was trying to do the optimization with the HTTP optimize command. When I
did that, the data size became 12 GB. Why th