Thanks Ron. Actually, I'm developing a Web search engine. Would that
matter?
Thanks.
2010/2/16 Ron Chan
>
> I'd doubt if a performance benchmark would be very useful, it ultimately
> depends on what you are trying to do and what you are comfortable with.
>
> We've had successful deployments on
: > I believe Koji was mistaken. Looking at DocumentBuilder.toDocument, the
: > boosts have been propagated to copyField destinations since that method was
: > added in 2007 (initially it didn't deal with copyFields at all, but once
: > that was fixed it copied the boosts as well.)
...
: Hm
Hi,
Solr home: 1.3.0/examples/multicore
Type of Queries: Recursive, e.g. I search the index for some name, which
returns some rows. For each row there is a field called parentid, which is a
unique key for some other row in the index. The next queries search the
index for the parentid. This continues
I've actually run into this issue: huge, 30-minute warm-up times. I've
found that reducing the auto-warm count on caches (and the general size
of the cache) helped a -lot-, as did making sure my warm-up query wasn't
something like:
q=*:*&facet=true&facet.field=somethingWithAWholeLotOfTerms
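The cache settings mentioned above live in solrconfig.xml; a rough sketch with illustrative sizes (tune for your own index):

```xml
<!-- sketch: smaller caches and autowarm counts shorten warm-up; values are illustrative -->
<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
```

Lowering autowarmCount directly bounds how many cached entries get re-executed against each new searcher.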
T
On Wed, Feb 17, 2010 at 8:03 AM, Chris Hostetter
wrote:
>
> : I have a small worry though. When I call the full-import functions, can
> : I configure Solr (via the XML files) to make sure there are rows to
> : index before wiping everything? What worries me is if, for some unknown
> : reason, we h
These are some very large numbers. 700k ms is 700 seconds, or almost 12
minutes; 4M ms is 4k seconds, or about 66 minutes. No Solr installation
should take this long to warm up.
There is something very wrong here. Have you optimized lately? What
queries do you run to warm it up? And, the basics: how many documents,
how much da
Norms are generally not calculated. You need to change the field you
want with this attribute: omitNorms="false".
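A sketch of what that looks like in schema.xml (the field name and type are illustrative):

```xml
<!-- sketch: re-enable norms on a field so length normalization and index-time boosts apply -->
<field name="title" type="text" indexed="true" stored="true" omitNorms="false"/>
```

Note that changing omitNorms only takes effect for documents indexed after the change, so a re-index is needed.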
On Tue, Feb 16, 2010 at 2:38 PM, Ahmet Arslan wrote:
>> After getting aware of all
>> these combinations, it seems not
>> wise to proceed blindly by punishing whatever we want.
>> Th
This is the CheckIndex program in Lucene. I don't have a link handy
for running it, but it is in the lucene-core jar file in solr/lib.
On Tue, Feb 16, 2010 at 11:08 AM, dipti khullar wrote:
> Hi All
>
> Is there any tool to analyze corrupted data in Solr. I am aware of luke.
> But does it shows s
When you change an index you do not have to copy the entire index
again. The new part of the index is in separate files and the
replication code knows to only pull the differences.
Indexing on a master and copying to slaves works very well - there are
thousands of Solr installations using that tec
The data copied from title to content is exactly the strings that you
give. The data is copied around, then each field is analyzed. Changing
'title' from text to string makes no difference.
On Mon, Feb 15, 2010 at 6:48 AM, adeelmahmood wrote:
>
> I am just trying to understand the difference betw
You can add a static content container to the jetty example. This is
a patch against example/etc/jetty.xml. You then make a directory
example/webapp/ROOT. This works the same as ROOT in tomcat:
http://localhost:8983/image.png comes from webapp/ROOT/image.png. It
is static and the files are not cop
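A rough sketch of the kind of handler addition meant here (untested; class names assume the Jetty 6 that ships with the Solr example):

```xml
<!-- sketch: serve static files from example/webapp/ROOT at the root context -->
<Item>
  <New class="org.mortbay.jetty.handler.ContextHandler">
    <Set name="contextPath">/</Set>
    <Set name="resourceBase">webapp/ROOT</Set>
    <Set name="handler">
      <New class="org.mortbay.jetty.handler.ResourceHandler"/>
    </Set>
  </New>
</Item>
```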
A problem is that your profanity list will not stop growing, and with
each new word you will want to rescrub the index.
We had a thousand-word NOT clause in every query (a filter query would
be true for 99% of the index) until we switched to another
arrangement.
Another small problem was that I k
It's generally a bad idea to try to think of
various Solr/Lucene indexes in a database-like
way; Lucene isn't built to do RDBMS-like stuff. The
first suggestion is usually to consider flattening
your data. That would be something like
adding NY and "New York" in each document.
If that's not possib
Thanks for bringing closure.
Erick
On Tue, Feb 16, 2010 at 7:13 PM, uwdanny wrote:
>
> update - found the answer
>
> API getExplainList in org.apache.solr.util.SolrPluginUtils
>
> works.
>
>
>
>
> uwdanny wrote:
> >
> > Hi,
> >
> > I was trying to get the detailed "explain" info in (java) code
: no but you can set a default for the qf parameter with the same value
good call...
https://issues.apache.org/jira/browse/SOLR-1776
-Hoss
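Setting that default looks roughly like this in solrconfig.xml (the handler name and field boosts are illustrative):

```xml
<!-- sketch: a default qf for the dismax parser, overridable per request -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^2.0 body</str>
  </lst>
</requestHandler>
```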
: But still i cant stop thinking about this.
: i deleted my entire index and now i have 0 documents.
:
: Now if i make a query with accrd i still get a suggestion of accord even
: though there are no document returned since i deleted my entire index. i
: hope it also clear the spell check index fi
: I'm interested in using Solr with a custom Lucene Filter (like the one
: described in section 6.4.1 of the Lucene In Action, Second Edition
: book). I'd like to filter search results from a Lucene index against
: information stored in a relational database. I don't want to move the
: relat
: I have a small worry though. When I call the full-import functions, can
: I configure Solr (via the XML files) to make sure there are rows to
: index before wiping everything? What worries me is if, for some unknown
: reason, we have an empty database, then the full-import will just wipe
: t
Chris Hostetter wrote:
: According to this email exchange between Koji and Mat Brown,
:
: http://www.mail-archive.com/solr-user@lucene.apache.org/msg23759.html
:
: The boost value from copyField's shouldn't be accumulated into the boost for
: the text field, can anyone else verify this? This s
Jan Hoydal / Otis,
First off, Thanks for mentioning us. We do use some utility functions from
SOLR but our index engine is built on top of Lucene only, there are no Solr
cores involved. We do have a JOIN operator that allows us to perform
relational searches while still acting like a search en
:i have indexed some data on solr 1.3.0. Now i wanna upgrade to solr
: 1.4.0 but on the same data.
: so here are the following steps i performed:
: 1. extract solr 1.4.0
: 2. copied the conf and data folder of my index from solr
: 1.3.0/examples/multicore to solr1.4.0/examples/multicore/
:
: I want to know How can I set request timeout through perl by
: webservice::solr end or solr end so that I could handle request timeout
I've never used WebService::Solr, but its docs say it takes in a user
agent object (i.e. LWP::UserAgent), so that's where you can specify the
client side time
Greetings,
It's time for another awesome Seattle Hadoop/Lucene/Scalability/NoSQL Meetup!
As always, it's at the University of Washington, Allen Computer
Science building, Room 303 at 6:45pm. You can find a map here:
http://www.washington.edu/home/maps/southcentral.html?cse
Last month, we had a g
: I need to do a search that will search 3 different fields and combine
: the results. First, it needs to not break the phrase into tokens, but
: rather treat it is a phrase for one field. The other fields need to be
: parsed with their normal analyzers.
your description of your goal is a littl
: According to this email exchange between Koji and Mat Brown,
:
: http://www.mail-archive.com/solr-user@lucene.apache.org/msg23759.html
:
: The boost value from copyField's shouldn't be accumulated into the boost for
: the text field, can anyone else verify this? This seem to go against what
I
> It seems that when I do a search with a wildcard (eg,
> +text:abc*) the Solr
> standard SearchHandler will construct a ConstantScoreQuery
> passing in a
> Filter, so all the documents in the result set are scored
> the same. Is there
> a way to make Solr construct a BooleanQuery instead so that
>
Hi,
It seems that when I do a search with a wildcard (eg, +text:abc*) the Solr
standard SearchHandler will construct a ConstantScoreQuery passing in a
Filter, so all the documents in the result set are scored the same. Is there
a way to make Solr construct a BooleanQuery instead so that scoring ba
update - found the answer
API getExplainList in org.apache.solr.util.SolrPluginUtils
works.
uwdanny wrote:
>
> Hi,
>
> I was trying to get the detailed "explain" info in (java) code using the
> APIs, see codes below,
>
> -
> ResponseBuilder rb (from some inherited proces
> After getting aware of all
> these combinations, it seems not
>> wise to proceed blindly by punishing whatever we want.
> Thank you very
> much for letting me know.
Generally, most people are happy with the default Solr scoring, especially
in web-like search.
I am not sure, but you can find t
Hi Israel (et al),
I don't think that I need an Update Handler; I don't intend to change the
values in the search index (in fact, the goal is to build a Lucene index with
Hadoop and then point a Solr instance at it).
What I'm trying to do is split the document into two locations: one is the
Lu
Hi all!
I'm trying to join 2 indexes together to produce a final result using only Solr
+ Velocity Response Writer.
The problem is that each "hit" of the main index contains references to some
common documents located in another index. For example, the hit could have a
field that describes in
Hi Jon,
You will need to write a plugin:
a custom Query parser and an Update Handler, depending on what
you are doing.
Implementing an Update Handler or Update Request Processor is not
recommended because it is considered advanced.
Take a look at the following links for
Thanks Hoss,
Apologies for flooding the post.
But still I can't stop thinking about this.
I deleted my entire index and now I have 0 documents.
Now if I make a query with accrd I still get a suggestion of accord even
though there are no documents returned, since I deleted my entire index. I
hope it al
Problem solved. I wasn't quoting the value. Since I was using names such as
'Gary Bettman' solr must have been giving all the Garys.
-Original Message-
From: Nagelberg, Kallin [mailto:knagelb...@globeandmail.com]
Sent: Tuesday, February 16, 2010 3:22 PM
To: 'solr-user@lucene.apache.org'
Hi
Oops, sorry. I didn't recognize the answer because it was in the bulk
folder.
I thought with this procedure it would be a lot faster and less overhead.
Just two lines of shell script.
What do you think?
Regards,
Peter.
This should work on Linux. The rsync based replication scripts used to
Hi,
I've read very interesting interview with Ryan,
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Podcasts-and
-Videos/Interview-Ryan-McKinley
Another finding is
https://issues.apache.org/jira/browse/SOLR-773
(lucene/contrib/spatial)
Is there any more stuff going on for SOLR
Hi everyone,
I am attempting to implement a faceted drill down feature with Solr. I am
having problems explaining some results of the fq parameter.
Let's say I have two fields, 'people' and 'category'. I do a search for 'dog'
and ask to facet on the people and category fields.
I am told that t
Hello,
Thanks for your detailed explanation.
> Do you want to punish *more* long documents?
Not a lot, but a bit more than the default implementation. It seems
"lengthNorm" is field-based, and punishing lengthy fields fits most
of the cases in our project.
> There will be a trade-off
Hello,
I'm interested in using Solr with a custom Lucene Filter (like the one
described in section 6.4.1 of the Lucene In Action, Second Edition book). I'd
like to filter search results from a Lucene index against information stored in
a relational database. I don't want to move the relationa
I've set up a simple DIH import handler with Solr that connects to my
data via a database.
I have a small worry though. When I call the full-import functions, can I
configure Solr (via the XML files) to make sure there are rows to index before
wiping everything? What worries me is if, for some u
Hi Erick, thanks for the reply.
My query URL includes "debugQuery=on" and the result page is correctly
showing all the debug/explain info. The problem I'm facing is that I
cannot get the same debug/explain info in code. I've been trying the
IndexSearcher.explain(Weight, int) API, as well as Search
Hi All
Is there any tool to analyze corrupted data in Solr? I am aware of Luke,
but does it show somehow that the data is corrupted?
Like some segments are missing, or whether some documents have been corrupted
- not fully indexed?
Thanks
Dipti
Any details? This is pretty ambiguous.
Tacking debugQuery=true onto a URL brings back some stuff.
In Lucene, IndexSearcher.explain()?
Erick
On Tue, Feb 16, 2010 at 1:21 PM, uwdanny wrote:
>
> any hints?
> --
> View this message in context:
> http://old.nabble.com/How-to-retrieve-relevance
Hi,
any hints or suggestions?
Does anyone do the updating this way?
Regards,
Peter.
Hi solr community!
Is it recommended to replace the data directory of a heavy used solr
instance?
(I am aware of the http queries, but that will be too slow)
I need a fast way to push development data to pr
any hints?
--
View this message in context:
http://old.nabble.com/How-to-retrieve-relevance-%22debug-explain%22-info-in-code--tp27602530p27612814.html
Sent from the Solr - User mailing list archive at Nabble.com.
I've got a task open to upgrade to 0.6. Will try to get to it this week.
Upgrading is usually pretty trivial.
On Feb 14, 2010, at 12:37 AM, Liam O'Boyle wrote:
> Afternoon,
>
> I've got a large collections of documents which I'm attempting to add to
> a Solr index using Tika via the Extracti
It definitely had something to do with omitTermFreqAndPositions. As soon as I
disabled the option and re-indexed, my queries started working as expected. I
suspect it has something to do with terms occupying the same position and
losing that information by using omitTermFreqAndPositions, but I
Hi all,
I am getting the same recursive-concatenated results as the guys in the
comments (http://issues.apache.org/jira/browse/SOLR-64). I couldn't get
hierarchical facets working with either release-1.4.0 or branch-1.4.0. I've got
a 1.4.0-dev incl. SOLR-64 running and in parallel a 1.4.0-final. I w
Cool, thanks - just wanted to make sure I'm not insane. Makes sense
that there would be a difference if the index is built fresh in that
case.
On Tue, Feb 16, 2010 at 11:59, Mark Miller wrote:
> Mat Brown wrote:
>> Hi all,
>>
>> Trying to debug a very sneaky bug in a small Solr extension that I
>
> Hello ,
> Thanks. That clears my
> doubts. Coming to the point two, Can
> you please tell me which part of the Similarity takes care
> of the
> same. Is it possible to implement in such a way that we
> give more
> preference to "number of found terms".
public float coord(int overlap, int maxOverlap)
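For context, Lucene's DefaultSimilarity computes coord() as the fraction of query terms a document matches, which is the hook for preferring the "number of found terms". A standalone sketch of that formula (not the actual Lucene class):

```java
// Standalone sketch of DefaultSimilarity's coord() formula: the score
// factor grows with the number of query terms found in the document.
public class CoordSketch {
    // overlap: number of query terms found in the document
    // maxOverlap: total number of query terms in the query
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // Matching 3 of 3 terms outscores matching 1 of 3.
        System.out.println(coord(1, 3));
        System.out.println(coord(3, 3));
    }
}
```

Overriding coord() in a Similarity subclass (and registering that subclass in schema.xml) is where you would weight this factor more aggressively.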
Mat Brown wrote:
> Hi all,
>
> Trying to debug a very sneaky bug in a small Solr extension that I
> wrote, and I've come across an odd situation. Here's what my test
> suite does:
>
> deleteByQuery("*:*");
> // add some documents
> commit();
> // test the search
>
> This works fine. The test suite
Hi all,
Trying to debug a very sneaky bug in a small Solr extension that I
wrote, and I've come across an odd situation. Here's what my test
suite does:
deleteByQuery("*:*");
// add some documents
commit();
// test the search
This works fine. The test suite that exposed the error (which is
actua
On a related note: maybe it'd be good to have a wiki page of
experiences, and possibly stats, of various SSD drives? Either on the
Lucene or Solr wiki sites?
2010/2/16 Tim Terlegård :
> 2010/2/15 Toke Eskildsen :
>> From: Tim Terlegård [tim.terleg...@gmail.com]
>>> If the index size is more than you can
Thanks.
Is it possible to do date faceting on multiple solr shards?
I am using index created in two different shards to do date faceting on
field "DATE"
*
http://localhost:8983/solr/1_13_1_3/select?&shards=localhost:8983/solr/index1/,localhost_two:8983/solr/index/&start=0&rows=20&q=*&facet=true&
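For reference, a complete single request with the standard date-faceting parameters would look something like this (hostnames, core names, and the date range are illustrative; whether the shards parameter distributes date facets depends on the Solr version):

```
http://localhost:8983/solr/select?shards=host1:8983/solr/index1,host2:8983/solr/index2
    &q=*:*&rows=20&facet=true&facet.date=DATE
    &facet.date.start=2009-01-01T00:00:00Z
    &facet.date.end=2010-01-01T00:00:00Z
    &facet.date.gap=%2B1MONTH
```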
That has answered my concern about the index size/duplicated data.
But the other one is about presenting the search results: results should be
one with a list of files. So in this case I would need to write some logic
before showing the results, right? (maybe like comparing each result
solrdocument/
Unless you have *evidence* that indexing each PDF with
the form data as a single Solr document is a problem,
I would just index the fields with each document rather
than try to index the PDFs as multivalued. The space
used by duplicating the form field data is probably a
tiny fraction of the da
Hi,
Can anybody tell me if [1] still applies as of version trunk 03/02/2010 ? I
am removing documents from my index using deletedPkQuery and a deltaimport.
I can tell from the logs that the removal seems to be working:
16-Feb-2010 15:32:54 org.apache.solr.handler.dataimport.DocBuilder
collectDelt
Hello,
Thanks. That clears my doubts. Coming to point two, can
you please tell me which part of the Similarity takes care of the
same? Is it possible to implement it in such a way that we give more
preference to the "number of found terms"? Also, here in our case we need
to give more import
Hi,
When we index using Solr, we have an option called multiValued. How does
that work with multiple files associated with the same document?
For example: submiting a form with some fields + list of pdf files
index process:
1) considering all the form fields as individual solr input document fields
(
How can we get instance of IndexSchema object in Tokenizer subclass?
I'd doubt if a performance benchmark would be very useful, it ultimately
depends on what you are trying to do and what you are comfortable with.
We've had successful deployments on both.
Any difference in performance is far outweighed by the ease of setup/support
that you personally find in each
On Tue, Feb 16, 2010 at 2:04 PM, NarasimhaRaju wrote:
> Hi,
>
> using filterQuery(fq) is more efficient because SolrIndexSearcher will make
> use of filterCache
> and in your case it returns entire set from the cache instead of searching
> from the entire index.
> more info about solrCaches at
Hi,
Using filterQuery (fq) is more efficient because SolrIndexSearcher will make use
of the filterCache,
and in your case it returns the entire set from the cache instead of searching
the entire index.
More info about Solr caches at
http://wiki.apache.org/solr/SolrCaching#filterCache
Regards,
P.N
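To make the difference concrete, a sketch of the two request shapes (the field name is from the question; parameters are illustrative):

```
q=*:*&fq=brand:xxx   <- restriction cached as a bitset in filterCache; reused across queries
q=brand:xxx          <- scored query; its cached result is tied to this exact q
```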
Hi there,
Is there any analysis out there that may help to choose between Tomcat and
Jetty to deploy Solr? I wonder whether there's a significant difference
between them in terms of performance.
Any advice would be much appreciated,
-Steve
Hi,
I have indexed some data on Solr 1.3.0. Now I want to upgrade to Solr
1.4.0 but keep the same data.
so here are the following steps i performed:
1. extract solr 1.4.0
2. copied the conf and data folder of my index from solr
1.3.0/examples/multicore to solr1.4.0/examples/multicore/
3. started
Hi Shalin!
Thanks for the quick response. Sadly it tells me that I have to look
elsewhere to fix the problem.
Anyone an idea what could cause the increasing warmup-Times? If required I can
post some stats.
Thanking you in anticipation!
Regards,
Sven
Hi everyone,
in our app we sometimes use Solr programmatically to retrieve all the
elements that have a certain value in a single-valued, single-token
field (brand:xxx).
Since we are not interested in scoring these results, I was thinking
that maybe this should be performed as a filterQuery (fq="br
Hi,
I have to set up a SOLR cluster with some availability concept (it is
allowed to require manual intervention on a fault; however, if there is a
better way, I'd be interested in recommendations).
I have two servers (A and B for the example) at my disposal.
What I was thinking about was the follo
2010/2/15 Toke Eskildsen :
> From: Tim Terlegård [tim.terleg...@gmail.com]
>> If the index size is more than you can have in RAM, do you recommend
>> to split the index to several servers so it can all be in RAM?
>>
>> I do expect phrase queries. Total index size is 107 GB. *prx files are
>> total
On Mon, Feb 15, 2010 at 3:30 PM, Peter Karich wrote:
> Hi solr community!
>
> Is it recommended to replace the data directory of a heavy used solr
> instance?
> (I am aware of the http queries, but that will be too slow)
>
> I need a fast way to push development data to production servers.
> I tr
On Tue, Feb 16, 2010 at 1:06 PM, Bohnsack, Sven wrote:
> Hey IT-Crowd!
>
> I'm dealing with some performance issues during warmup of the
> queryResultCache. Normally it took about 11 minutes (~700,000 ms), but
> now it takes about 4 MILLION ms and more. All I can see in the solr.log
> is that the
On Tue, Feb 16, 2010 at 4:23 AM, wojtekpia wrote:
>
> Is there a way to 'discover' slaves using ReplicationHandler? I'm writing a
> quick dashboard, and don't have access to a list of slaves, but would like
> to show some stats about their health.
>
No, the master does not know about any slave.