I'd like to boost every query using {!boost b=log(popularity)}. But I'd rather
not have to prepend that to every query. It'd be much cleaner for me to
configure Solr to use that as default.
My plan is to make DisMaxRequestHandler the default handler and add the
following to solrconfig.xml:
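Something along these lines might work (a sketch only; the handler name, qf field, and defaults are assumptions, and note that bf *adds* the function value to the score, whereas {!boost} *multiplies* by it):

```xml
<!-- Hypothetical sketch: make dismax the default handler and boost by popularity.
     bf adds log(popularity) to the score; for a truly multiplicative boost
     you would still need {!boost b=log(popularity)} in the query itself. -->
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">text</str>
    <str name="bf">log(popularity)</str>
  </lst>
</requestHandler>
```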
Avlesh:
I am currently working on some kind of rules engine in front
(application side) of our Solr instance. These rules are more application
specific and are not general. Like deciding which fields to facet, which
fields to return in response, which fields to highlight, boost value for
each f
On Wed, Jan 6, 2010 at 2:51 AM, Giovanni Fernandez-Kincade
wrote:
> http://wiki.apache.org/solr/SolrReplication
>
> I've been looking over this replication wiki and I'm still unclear on two
> points about Solr Replication:
>
> 1. If there have been small changes to the index on the master,
Jars are not replicated. That is by design, but that is not to say that
we can't do it. Please open an issue.
On Wed, Jan 6, 2010 at 6:20 AM, Ryan Kennedy wrote:
> Will the built-in Solr replication replicate extension JAR files in
> the "lib" directory? The documentation appears to indicate that only
>
>
> Your question appears to be an "XY Problem" ... that is: you are dealing
> with "X", you are assuming "Y" will help you, and you are asking about "Y"
> without giving more details about the "X" so that we can understand the full
> issue. Perhaps the best solution doesn't involve "Y" at all? Se
Hoss,
Thanks for your reply.
As you pointed out, the Terms Component along with terms.maxcount did the
trick for single terms.
And ShingleFilter did the trick for phrases.
I have not ventured into Hadoop just yet - any examples you could point me
to of simple map/reduce jobs?
Thanks - I was overlooking the Terms Component and given I can specify
terms.maxcount I can live without the ascending order.
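For anyone else following along, a request of that shape might look like this (host, port, and field name are assumptions):

```
http://localhost:8983/solr/terms?terms.fl=text&terms.maxcount=1000&terms.limit=-1
```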
-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: Tuesday, January 05, 2010 2:56 AM
To: solr-user@lucene.apache.org
Subject: Re
Will the built-in Solr replication replicate extension JAR files in
the "lib" directory? The documentation appears to indicate that only
the index and any specified configuration files will be replicated,
however if your solrconfig.xml references a class in a JAR file added
to the lib directory the
Hello community,
I wrote another mail today, but I think something went wrong (I can't find
my post in the mailing list) - if not, I am sorry for the double post
- I am using a mailing list for the first time.
I have created a custom analyzer, which consists of a LowerCaseTokenizer, a
StopFilt
I've tracked this problem down to the fact that I'm using the
WordDelimiterFilter. I don't quite understand what's happening, but if I
add preserveOriginal="1" as an option, everything looks fine. I think it has
to do with the period being stripped in the token stream.
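For reference, that option goes on the filter in schema.xml; a sketch (the surrounding analyzer chain here is just the stock example, not necessarily the poster's):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- preserveOriginal="1" keeps the unsplit token in the stream
         alongside the parts the filter generates -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```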
On Tue, Jan 5, 2010 at 2:05
> Thanks to both of you for the quick
> answers,
>
> analysis.jsp shows that the WordDelimiterFilterFactory is
> performing the
> split
>
> I was experimenting around with the delimiters for the last
> two days but am
> still unable to obtain the desired result.
>
> I tried entirely kicking sol
Hi,
On Tue, Jan 5, 2010 at 5:17 PM, Erick Erickson wrote:
> We need to back up, this is looking like an XY problem. That is,
> you're asking for specifics when what would probably be more
> helpful is for you to describe *what* the problem you're trying
> to solve is rather than *how* to make a s
http://wiki.apache.org/solr/SolrReplication
I've been looking over this replication wiki and I'm still unclear on two
points about Solr Replication:
1. If there have been small changes to the index on the master, does the
slave copy the entire contents of the index files that were affecte
: For ex : which jar of Solr contains the org.xml.sax package.
none of them. it's an "endorsed standard" provided by the JRE (but
overridable at runtime if you'd like to use an alternate
implementation) ...
http://java.sun.com/j2se/1.5.0/docs/guide/standards/
-Hoss
Hello..,
1) Yeah, I have found that before. But which .jar file of
Solr (it should be one of the jars inside Solr) contains all the
supporting classes related to XML parsing?
For ex : which jar of Solr contains the org.xml.sax package.
2) Do you mean, I can straightly use SAX api, f
: I am planning to build a rules engine on top of search. The rules are database
: driven and can't be stored inside Solr indexes. These rules would ultimately
: do two things -
:
:1. Change the order of Lucene hits.
:2. Add/remove some results to/from the Lucene hits.
:
: What should be my
: > So, in general, there is no *significant* performance difference with using
: > dynamic fields. Correct?
:
: Correct. There's not even really an "insignificant" performance difference.
: A dynamic field is the same as a regular field in practically every way on the
: search side of things.
: Subject: Reload synonyms
: References: <00b501ca8db9$7e119c70$0301a...@cgifederal.com>
: <69de18141001042355l4c98e147r8cd0ae73d3836...@mail.gmail.com>
: In-Reply-To: <69de18141001042355l4c98e147r8cd0ae73d3836...@mail.gmail.com>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking o
Really? Doesn't it have to be delimited differently, if both the file contents
and the document metadata will be part of the POST data? How does Solr Cell
tell the difference between the literals and the start of the file? I've tried
this before and haven't had any luck with it.
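For what it's worth, with the multipart form approach the literals ride as URL query parameters while the file travels in the POST body, so Solr Cell never has to disentangle them from the file contents. A sketch (host, id, and filename are assumptions):

```
curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" \
     -F "myfile=@tutorial.html"
```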
-Original
Hello,
I'm using Solr 1.4, and I'm trying to get the regex fragmenter to parse
basic sentences, and I'm running into a problem.
I'm using the default regex specified in the example solr configuration:
[-\w ,/\n\"']{20,200}
But I am using a larger fragment size (140) with a slop of 1.0.
Given th
Config.java (which parses e.g. solrconfig.xml) in the solr core code has:
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.util.DOMUtil;
import javax.xml.parsers.*;
import javax.xml.xpa
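As those imports suggest, everything Solr needs for XML parsing ships with the JRE itself; a minimal JRE-only parse (no Solr or JDOM jars, names here are just for illustration) would be:

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class JreXmlDemo {
    // Parse an XML string using only classes bundled with the JRE.
    public static String rootName(String xml) throws Exception {
        DocumentBuilder db =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // prints "config"
        System.out.println(rootName("<config><luceneMatchVersion/></config>"));
    }
}
```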
I haven't tried it, but you might be able to use either (and this is just me
thinking aloud):
DataImportHandler with the FileEntityProcessor
Remote Streaming - (you might have to write out Solr XML or do something else)
-Grant
On Jan 5, 2010, at 4:05 AM, Mark N wrote:
> SolrInputDocument doc1
The attached screenshot shows the transition on a master search server
when we updated from a Solr 1.4 dev build (revision 779609 from
2009-05-28) to the Solr 1.4.0 released code. Every 3 hours we have a
cron task to log some of the data from the stats.jsp page from each
core (about 100 cores, mos
The version of Tika in the 1.4 release definitely parses the most current
Office formats (.docx, .pptx, etc.) and they index as expected.
-Jay
On Mon, Jan 4, 2010 at 6:02 PM, Peter Wolanin wrote:
> You must have been searching old documentation - I think Tika 0.3+ has
> support for the new MS f
We need to back up, this is looking like an XY problem. That is,
you're asking for specifics when what would probably be more
helpful is for you to describe *what* the problem you're trying
to solve is rather than *how* to make a specific behavior
happen. Although re-reading your original e-mail do
Thanks to both of you for the quick answers,
analysis.jsp shows that the WordDelimiterFilterFactory is performing the
split
I was experimenting around with the delimiters for the last two days but am
still unable to obtain the desired result.
I tried entirely kicking solr.WordDelimiterFilterFact
Hello ,
There are some project-specific schema XML files which need to
be parsed. I have used the JDOM API for this, but it seems cleaner
to switch to the XML parser used by Solr itself. I have gone through the
source code; it's a bit confusing. I have found the javax.xml package and also
org.xml.sax
(In Lucene) I break the document into smaller pieces, then add each
piece to the Document field in a loop. This seems to work better, but
will interfere with analysis details like term offsets.
This should work in your example.
In Lucene, you can also add the field using a Reader to the file in question
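The chunking idea above can be sketched in plain Java (the class name, method name, and chunk size are made up for illustration; the Field-adding loop is only hinted at in a comment):

```java
import java.util.ArrayList;
import java.util.List;

public class DocChunker {
    // Split a large document string into fixed-size pieces so each piece
    // can be added as a separate value of the same Lucene field.
    public static List<String> split(String text, int chunkSize) {
        List<String> chunks = new ArrayList<String>();
        for (int i = 0; i < text.length(); i += chunkSize) {
            chunks.add(text.substring(i, Math.min(text.length(), i + chunkSize)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // In real code each piece would go into the document, e.g.
        // doc.add(new Field("Fulltext", piece, Field.Store.NO, Field.Index.ANALYZED));
        for (String piece : split("abcdefghij", 4)) {
            System.out.println(piece);
        }
    }
}
```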
Hi Yonik!
I've tried recreating the problem now to get some log output and the problem
just doesn't seem to be there anymore... This puzzles me a bit, as the
problem WAS definitely there before.
I've done one change and that is to optimize the index on one of the
servers. But should that impact thi
The issue was sometimes a null result during facet navigation or simple
search; results were back after a refresh. We tried to change the cache to
. But same behaviour.
That is strange. Just to make sure, you were using the same LBHttpSolrServer
instance for all requests, weren't you?
Is there any way to tell Solr to bring back only facet filter options
whose frequency is less than the total number of results found? I find facet
values which match the result count are not helpful to the user, and produce
noise within the UI when filtering results.
I can obviously do this within the vie
On Mon, Jan 4, 2010 at 7:13 PM, Patrick Sauts wrote:
> The issue was sometimes a null result during facet navigation or simple
> search; results were back after a refresh. We tried to change the cache to
> . But same behaviour.
>
>
That is strange. Just to make sure, you were using the same LBHttpS
On Tue, Jan 5, 2010 at 2:24 PM, Peter A. Kirk wrote:
> Thanks for the answer. How does one "reload" a core? Is there an API, or a
> url one can use?
>
I think this should be it - http://wiki.apache.org/solr/CoreAdmin#RELOAD
--
- Siddhant
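Concretely, the reload is just an HTTP call to the CoreAdmin handler; something like this (host, port, and core name are assumptions):

```
http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0
```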
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "Fulltext", strContent);
strContent is a string variable which contains contents of text file.
( assume that text file is located in c:\files\abc.txt )
In my case abc.txt ( text files ) could be very huge ~ 2 GB so it is not
a
Thanks for the answer. How does one "reload" a core? Is there an API, or a url
one can use?
Med venlig hilsen / Best regards
Peter Kirk
E-mail: mailto:p...@alpha-solutions.dk
-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: 5. januar 2010 21:46
To:
On Tue, Jan 5, 2010 at 2:03 PM, Peter A. Kirk wrote:
>
> Is it possible to reload the synonym list, if for example "synonyms.txt" is
> changed, without having to restart the server? Is the same possible with
> stop-words?
>
>
Yes you can reload a core but there are two catches:
1. Reloading a
Hi
Is it possible to reload the synonym list, if for example "synonyms.txt" is
changed, without having to restart the server? Is the same possible with
stop-words?
Thanks,
Peter