looking at http://wiki.apache.org/solr/SpatialSearchDev
I would think I could index a lat,lon pair into a GeoHashField (that
works) and then retrieve the field value to see the computed geohash.
however, that doesn't seem to work. If I index: 21.4,33.5
The retrieved value is not a hash, but ap
When I retrieve the value the lat/lon pair that comes out is not
exactly the same as what I indexed, which made me think it was
actually stored as the hash and then transformed back?
Anyhow - I'm trying to understand the actual use case for the field as
it exists - essentially you are saying I cou
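The round-trip imprecision described in this thread is inherent to geohashes: decoding returns the center of the final bounding box, not the original point. A minimal sketch in Python of the standard geohash algorithm (not Solr's GeoHashField implementation):

```python
# Standard geohash: interleave longitude/latitude bisection bits, 5 bits
# per base-32 character. Decoding recovers only the cell center, which is
# why an indexed lat,lon does not come back exactly as entered.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=12):
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, even, ch, bit = [], True, 0, 0
    while len(chars) < precision:
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            ch = (ch << 1) | 1
            rng[0] = mid
        else:
            ch <<= 1
            rng[1] = mid
        even = not even
        bit += 1
        if bit == 5:                 # 5 bits -> one base-32 character
            chars.append(BASE32[ch])
            bit, ch = 0, 0
    return "".join(chars)

def geohash_decode(gh):
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    even = True
    for c in gh:
        cd = BASE32.index(c)
        for mask in (16, 8, 4, 2, 1):
            rng = lon_range if even else lat_range
            if cd & mask:
                rng[0] = (rng[0] + rng[1]) / 2
            else:
                rng[1] = (rng[0] + rng[1]) / 2
            even = not even
    # center of the final cell -- close to, but not exactly, the input
    return ((lat_range[0] + lat_range[1]) / 2,
            (lon_range[0] + lon_range[1]) / 2)
```

Decoding geohash_encode(21.4, 33.5) yields a point within a fraction of a degree's millionth of the input at 12-character precision, but not bit-identical, matching the behavior observed above.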
I've seen a number of users fail to get Solr working correctly in
combination with the Drupal client code when using the .deb installer
so I have been strongly recommending against it personally.
It's also a rather stale version of Solr, generally.
-Peter
On Sun, Oct 2, 2011 at 4:04 AM, Gora Moh
A colleague came to me with a problem that intrigued me. I can see
partly how to solve it with Solr, but looking for insight into solving
the last step.
The problem:
1) Start from a set of text transcriptions of videos where there is a
timestamp associated with each word.
2) Index into Solr wit
Assuming you are using Drupal for the website, you can have Solr set
up and integrated with Drupal in < 5 minutes for local development
purposes.
See: https://drupal.org/node/1358710 for a pre-configured download.
-Peter
On Mon, Dec 5, 2011 at 11:46 AM, Achebe, Ike, JCL
wrote:
> Hi,
> My name i
In IRC trying to help someone find Polish-language support for Solr.
Seems lucene has nothing to offer? Found one stemmer that looks to be
compatibly licensed in case someone wants to take a shot at
incorporating it: http://www.getopt.org/stempel/
-Peter
--
Peter M. Wolanin, Ph.D.
Momentum Sp
We have a content access control system that works well for the actual
search results, but we see that the spellcheck suggestions include
words that are not within the set of documents the current user is
allowed to access. Does anyone have an approach to this problem for
Solr 1.4.x? Anything new
t Group
> (615) 213-4311
>
>
> -Original Message-
> From: Peter Wolanin [mailto:peter.wola...@acquia.com]
> Sent: Thursday, October 07, 2010 9:00 AM
> To: solr-user@lucene.apache.org
> Subject: access control for spellcheck suggestions?
>
> We have a content access c
Trying to maintain the Drupal integration module across multiple versions
of 3.x, we've gotten a bug report suggesting that Solr 3.6 needs this
change to solrconfig:
-  <mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>
+  <mergePolicy class="org.apache.lucene.index.LogByteSizeMergePolicy"/>
I don't see this mentioned in the release notes - is the second format
use
browse/SOLR-1052. The second format works
> in all 3.x versions.
>
> -Michael
>
> -Original Message-
> From: Peter Wolanin [mailto:peter.wola...@acquia.com]
> Sent: Friday, April 13, 2012 12:32 PM
> To: solr-user@lucene.apache.org
> Subject: mergePolicy el
Does your servlet container have the URI encoding set correctly, e.g.
URIEncoding="UTF-8" for tomcat6?
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
Older versions of Jetty use ISO-8859-1 as the default URI encoding,
but jetty 6 should use UTF-8 as default:
http://docs.codehaus.org/d
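For Tomcat, the setting above goes on the HTTP connector in server.xml; a sketch (port and protocol values illustrative):

```xml
<!-- server.xml: decode request URIs as UTF-8 so multi-byte query
     strings reach Solr intact -->
<Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8"/>
```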
Looking at the example schema:
http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml
the solr.PointType field type uses double (is this just an example
field, or used for geo search?), while the solr.LatLonType field uses
tdouble and it's unclear ho
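For reference, the two declarations being compared look roughly like this in the example schema (a sketch, with the conventional sub-field suffixes, not an exact copy):

```xml
<!-- PointType: generic n-dimensional point, sub-fields typed "double" -->
<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>

<!-- LatLonType: lat/lon pair, sub-fields held in a tdouble dynamic field -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
```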
l 27, 2011 at 9:01 AM, Peter Wolanin
> wrote:
>> Looking at the example schema:
>>
>> http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml
>>
>> the solr.PointType field type uses double (is this just an example
>
Looking at how we could upgrade some of our infrastructure to Solr 4.0
- I would really like to take advantage of distributed updates to get
NRT, but we want to keep our fixed master and slave server roles since
we use different hardware appropriate to the different roles.
Looking at the solr 4.0
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Sun, Nov 11, 2012 at 7:42 PM, Peter Wolanin
> wrote:
>
>> Looking at how we could upgrade some of our infrastruct
down, but adding
HA there would be helpful in some cases.
-Peter
On Tue, Nov 13, 2012 at 9:12 PM, Peter Wolanin wrote:
> Yes, basically I want to at least avoid leader election and the other
> dynamic behaviors. I don't have any experience with ZK, and a lot of
> "magic" beha
Sadly, I had to miss the meetup in NYC, but looking over the slides
(http://files.meetup.com/1482573/YonikSeeley_NYCMeetup_solr14_features.pdf)
I see:
Solr Cell:
Integrates Apache Tika (v0.4) into Solr
My current checkout of solr still has tika 0.3, and I don't see a jira
issue for updating to 0
Looks like we better update our schema for the Drupal module - what
rev of Solr incorporates this change?
-Peter
On Fri, Jul 24, 2009 at 8:38 AM, Koji Sekiguchi wrote:
> David,
>
> Try to change solr.CharStreamAwareWhitespaceTokenizerFactory to
> solr.WhitespaceTokenizerFactory
> in your schema.
I just copied this information to the wiki at
http://wiki.apache.org/solr/SolrRequestHandler
-Peter
On Fri, Sep 11, 2009 at 7:43 PM, Jay Hill wrote:
> RequestHandlers are configured in solrconfig.xml. If no components are
> explicitly declared in the request handler config then the defaults are u
There are some open issues (not for 1.4 at this point) to make dismax
more flexible or add wildcard handling, e.g:
https://issues.apache.org/jira/browse/SOLR-756
https://issues.apache.org/jira/browse/SOLR-758
You might participate in those to try to get this in a future version
and/or get a worki
This fairly recent blog post:
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
describes the use of the solr.EdgeNGramFilterFactory as the tokenizer
for the index. I don't see any mention of that tokenizer on the Solr
wiki - is it just waiting t
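The approach in that post amounts to an analyzer chain along these lines (a sketch; the field-type name and gram sizes are illustrative):

```xml
<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emit prefixes "s", "so", "sol", ... so partial input matches -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```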
analyzer in the wild a few months
> back.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> - Original Message
>> From: Peter Wolanin
>> To: so
fferent than the normal n-gram
> tokenizer.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> - Original Message
>> From: Peter Wolanin
>> To: sol
different than the normal n-gram
>> tokenizer.
>> >
>> > Otis
>> > --
>> > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>> >
>> >
>>
I'm testing out the final release of Solr 1.4 as compared to the build
I have been using from around June.
I'm using the dismax handler for searches. I'm finding that
highlighting is completely broken as compared to previously. Much
more text is returned than it should be for each string in , but t
Apparently one of my conf files was broken - odd that I didn't see any
exceptions. Anyhow - excuse my haste, I don't see the problem now.
-Peter
On Fri, Nov 13, 2009 at 11:06 PM, Peter Wolanin
wrote:
> I'm testing out the final release of Solr 1.4 as compared to the build
>
Take a look at the example schema - you can have dynamic fields that
are used based on wildcard matching to the field name if a field
doesn't match the name of an existing field.
-Peter
On Sun, Nov 15, 2009 at 10:50 AM, yz5od2 wrote:
> Thanks for the reply:
>
> I follow the schema.xml concept, b
I'm trying to take advantage of the Solr 1.4 Xinclude feature to
include a different xml fragment (e.g. a different analyzer chain in
schema.xml) for each core in a multi-core setup. When the Xinclude
operates on a relative path, it seems to NOT be acting relative to the
xml file with the Xinclude
I'm trying to determine if it's possible to use Xinclude to (for
example) have a base schema file and then substitute various pieces.
It seems that the schema fieldTypes throw exceptions if there is an
unexpected attribute?
SEVERE: java.lang.RuntimeException: schema fieldtype
text(org.apache.solr
Follow-up: it seems the schema parser doesn't barf if you use
xinclude with a single analyzer element, but so far seems like it's
impossible for a field type. So this seems to work:
...
...
On Sat, Nov 28, 2009 at 1:40 PM, Peter Wola
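Based on the observation above, a working sketch would XInclude the analyzer element inside a fieldType rather than the fieldType itself (included file name hypothetical):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <!-- pull in a shared analyzer definition; including a whole fieldType
       this way throws exceptions, but a single analyzer element parses -->
  <xi:include href="analyzer-index.xml"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</fieldType>
```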
I've recently started working on the Drupal integration module for
SOLR, and we are looking for suggestions for how to address this
question: how do we boost the importance of a subset of terms within
a field.
For example, we are using the standard request handler for queries,
and the default fie
nt way of encoding the byte array and putting it into the XML
> format, such that one can send in payloads when indexing. It's not
> particularly hard, but no one has done it yet.
>
> -Grant
>
>
> On Nov 29, 2008, at 10:45 PM, Peter Wolanin wrote:
>
>> I've
We have been having this problem also, and have resorted to just
stripping control characters before sending the text for indexing:
preg_replace('@[\x00-\x08\x0B\x0C\x0E-\x1F]@', '', $text);
-Peter
On Tue, Dec 9, 2008 at 7:59 AM, knietzie <[EMAIL PROTECTED]> wrote:
>
> hi joshua,
>
> i'm having
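The same stripping can be sketched in Python (an illustrative port of the PHP preg_replace above, not code from any particular Solr client):

```python
import re

# Control characters in the ranges below are invalid in XML 1.0, so Solr
# rejects documents containing them; tab (\x09), LF (\x0A) and CR (\x0D)
# are legal and are deliberately kept.
_CONTROL = re.compile(r'[\x00-\x08\x0B\x0C\x0E-\x1F]')

def strip_control_chars(text):
    """Remove XML-illegal control characters before indexing."""
    return _CONTROL.sub('', text)
```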
I'm seeing a weird effect with a '*' field. In the example
schema.xml, there is a commented out sample:
We have this un-commented, and in the schema browser via the admin
interface I see that all non-dynamic fields get a type of "ignored".
I see this in the Solr admin interface:
Field:
-Yonik
>
>
> On Thu, Dec 18, 2008 at 3:20 PM, Peter Wolanin
> wrote:
>> I'm seeing a weird effect with a '*' field. In the example
>> schema.xml, there is a commented out sample:
>>
>>
>>
>>
>> We have this un-commented, and
For documents we are indexing via the PHP client, we are currently
using the following regex to strip control characters from each field
that might contain them:
function apachesolr_strip_ctl_chars($text) {
// See: http://w3.org/International/questions/qa-forms-utf-8.html
// Printable utf-8 d
We have been trying to figure out how to construct, for example, a
directory page with an overview of available facets for several
fields.
Looking at the issue and wiki
http://wiki.apache.org/solr/TermsComponent
https://issues.apache.org/jira/browse/SOLR-877
It would seem like this component wou
It *looks* as though Solr supports returning the results of arbitrary
calculations:
http://wiki.apache.org/solr/SolrQuerySyntax
However, I am so far unable to get any example working except in the
context of a dismax bf. It seems like one ought to be able to write a
query to return the doc match
Sure, we are doing essentially that with our Drupal integration module
- each search result contains a link to the "real" content, which is
stored in MySQL, etc, and presented via the Drupal CMS.
http://drupal.org/project/apachesolr
-Peter
On Tue, Feb 17, 2009 at 11:57 AM, roberto wrote:
> Hell
In the example schema.xml, there is a field type 'ignored' which it is
suggested can be used with the wildcard * to prevent errors when a
document contains fields that don't match any in the schema. My
experience recently in using this is that it does not work as
desired if the unmatched field
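For context, the example-schema pattern being discussed is approximately:

```xml
<!-- catch-all: fields matching no other declaration are silently dropped -->
<fieldtype name="ignored" stored="false" indexed="false"
           multiValued="true" class="solr.StrField"/>
<dynamicField name="*" type="ignored" multiValued="true"/>
```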
If some stuff is asked over and over again, it would be great to grab
some reasonable responses and add them to the wiki.
I've edited it a few times when I've struggled with what's there and
found something that wasn't covered or was out of date - even the best
forum or mailing list will not repli
My colleague Paul opened this issue and supplied a patch and I
commented on it regarding a potential security weakness in the admin
interface:
https://issues.apache.org/jira/browse/SOLR-1031
--
Peter M. Wolanin, Ph.D.
Momentum Specialist, Acquia. Inc.
peter.wola...@acquia.com
We are working on integration with the Drupal CMS, and so are writing
code that carries out operations that might only be relevant for only
a small subset of the sites/indexes that might use the integration
module. In this regard, I'm wondering if adding to the query (using
the dismax or mlt handl
We are using Solr trunk (1.4) - currently " nightly exported - yonik
- 2009-02-05 08:06:00"
-Peter
On Mon, Feb 23, 2009 at 8:07 AM, Koji Sekiguchi wrote:
> Jacob,
>
> What Solr version are you using? There is a bug in SolrHighlighter of Solr
> 1.3,
> you may want to look at:
>
> https://issues.
- und Anwender-)
die Entstehungsgeschichte des Portals) auch dokumentiert worden, denn
Ihr vermutet schon richtig, daß da
You can see the "strong" tags each get offset one character more from
where they are supposed to be.
-Peter
On Mon, Feb 23, 2009 at 8:24 AM, Peter Wolan
can see the "strong" tags each get offset one character more from
> where they are supposed to be.
>
>
> -Peter
>
>
>
> On Mon, Feb 23, 2009 at 8:24 AM, Peter Wolanin
> wrote:
>> We are using Solr trunk (1.4) - currently " nightly exported - yonik
>>
, but looks like the
real bug is in Solr.
-Peter
On Tue, Feb 24, 2009 at 4:28 PM, Peter Wolanin wrote:
> So - something in the highlighting code is counting bytes when it
> should be counting characters. Looks like a lucene bug, so I'm
> surprised others have not hit this before.
Trying to set up a server to host multiple Solr cores, we have run
into the issue of too many open files a few times. The 2nd ed "Lucene
in Action" book suggests using the compound file format to reduce the
required number of files when having multiple indexes, but mentions a
possible ~10% slow-do
This doesn't seem to match what I'm seeing in terms of using bq -
using any value > 0 increases the score. For example, with no bq,
q=solr with fl=title,score,type returns (response version 2.2):

  score 1.6885357  "Building a killer search for Drupal"  (wikipage)
  score 1.5547959  "New Solr module available for testing"  (story)
I had problems with this when trying to set this up with multiple
cores - I had to set the shared lib as sharedLib="lib" in the <solr>
element of example/solr/solr.xml in order for it to find the jars in
example/solr/lib
-Peter
On Wed, Apr 22, 2009 at 11:43 AM, Grant Ingersoll wrote:
>
> On Apr 20, 2009, at 12:46 PM, francisco t
For the Drupal Apache Solr Integration module, we are exploring the
possibility of doing facet browsing - since we are using dismax as
the default handler, this would mean issuing a query with an empty q
and falling back to q.alt='*:*' or some other q.alt that matches
all docs.
However, I noti
Possibly this issue is related: https://issues.apache.org/jira/browse/SOLR-825
Though it seems that might affect the standard handler, while what I'm
seeing is more specific to the dismax handler.
-Peter
On Thu, May 7, 2009 at 8:27 PM, Peter Wolanin wrote:
> For the Drupal Apa
.alt using the params q and qf. Highlight will work in that case (I
> sorted it out doing that)
>
> Peter Wolanin-2 wrote:
>>
>> Possibly this issue is related:
>> https://issues.apache.org/jira/browse/SOLR-825
>>
>> Though it seems that might affect the standard
Indeed - that looks nice - having some kind of conditional includes
would make many things easier.
-Peter
On Wed, May 13, 2009 at 4:22 PM, Otis Gospodnetic
wrote:
>
> This looks nice and simple. I don't know enough about this stuff to see any
> issues. If there are no issues...?
>
> Otis
>
I think that if you have in your index any documents with norms, you
will still use norms for those fields even if the schema is changed
later. Did you wipe and re-index after all your schema changes?
-Peter
On Fri, May 15, 2009 at 9:14 PM, vivek sar wrote:
> Some more info,
>
> Profiling the
Building Solr last night from updated svn, I'm now getting the
exception below when I use any fq parameter searching a pre-existing
index. So far, I cannot fix it by tweaking config files, but I had to
delete and re-index.
I note that Solr was recently updated to the latest lucene build, so
maybe so
you can use the lucene jar with solr to invoke the CheckIndex method -
this will possibly allow you to recover if you pass the -fix param.
You may lose some docs, however, so this is only viable if you can,
for example, query to check what's missing.
The command looks like (from the root of the
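A hedged sketch of the invocation (jar name and paths vary by Solr/Lucene version; -fix rewrites the index, so back it up first):

```shell
java -cp lucene-core-2.9.1.jar org.apache.lucene.index.CheckIndex \
    /path/to/solr/data/index -fix
```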
Is this a known bug? When I try to unload a core that does not exist,
Solr throws a NullPointerException
java.lang.NullPointerException
at
org.apache.solr.handler.admin.CoreAdminHandler.handleUnloadAction(CoreAdminHandler.java:319)
at
org.apache.solr.handler.admin.CoreAdminHandl
I did not find any relevant issue, so here's a new issue with a patch:
https://issues.apache.org/jira/browse/SOLR-1200
-Peter
On Wed, Jun 3, 2009 at 4:56 PM, Peter Wolanin wrote:
> Is this a known bug? When I try to unload a core that does not exist,
> Solr throws a NullPoint
I had the same problem - I think the answer is that highlighting is
not currently supported with q.alt and dismax.
http://www.nabble.com/bug--No-highlighting-results-with-dismax-and-q.alt%3D*%3A*-td23438048.html#a23438048
-Peter
On Sun, Jun 7, 2009 at 7:51 AM, Fouad Mardini wrote:
> Hello,
>
>
Looking at the new examples of solr.TrieField
http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/schema.xml
I see that all have indexed="true" stored="false" in the field type
definition. Does this mean that you cannot ever store a value for one
of these fields? I.e. if I want t
A question for anyone familiar with the details of the time-based
autocommit mechanism in Solr:
if I am running several cores on the same server and send updates to
each core at the same time, what happens? If all the cores have
their autocommit time run out at the same time, will every core try
So for now would it make sense to spread out the autocommit times for
the different cores?
Thanks.
-Peter
On Thu, Jun 18, 2009 at 7:07 PM, Yonik Seeley wrote:
> On Thu, Jun 18, 2009 at 4:27 PM, Peter Wolanin
> wrote:
>> I think I understand
>> that all the pending changes a
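Staggering the per-core commit windows could look like this in each core's solrconfig.xml (maxTime values are illustrative):

```xml
<!-- core A: flush pending docs at most every 60s -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>

<!-- core B: offset by 15s so the commits don't coincide -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>75000</maxTime>
  </autoCommit>
</updateHandler>
```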
Seems like this might be approached using a Lucene payload? For
example where the original string is stored as the payload and
available in the returned facets for display purposes?
Payloads are byte arrays stored with Terms on Fields. See
https://issues.apache.org/jira/browse/LUCENE-755
Solr se
I had been assuming that I could choose among possible tika output
formats when using the extracting request handler in extract-only mode
as if from the CLI with the tika jar:
  -x or --xml     Output XHTML content (default)
  -h or --html    Output HTML content
  -t or --text    Output plain text content
ee SOLR-284)
> A quick patch to specify the output format should make it into 1.4 -
> but you may want to wait until I finish.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Sat, Jul 11, 2009 at 5:39 PM, Peter Wolanin
> wrote:
>> I had been assuming that I could cho
I have been getting exceptions thrown when users try to send boolean
queries into the dismax handler. In particular, with a leading 'OR'.
I'm really not sure why this happens - I thought the dismax parser
ignored AND/OR?
I'm using rev 779609 in case there were recent changes to this. Is
this a k
t issue (is there another)?
https://issues.apache.org/jira/browse/SOLR-874
-Peter
On Mon, Jul 13, 2009 at 4:12 PM, Mark Miller wrote:
> It doesn't ignore OR and AND, though it probably should. I think there is a
> JIRA issue for it somewhere.
>
> On Mon, Jul 13, 2009 at 4:10 PM, Peter W
I can still generate this error with Solr built from svn trunk just now.
http://localhost:8983/solr/select/?qt=dismax&q=OR+vti+OR+foo
I'm doubly perplexed by this since 'or' is in the stopwords file.
-Peter
On Mon, Jul 13, 2009 at 3:15 PM, Peter Wolanin wrote:
> I have b
Assuming that you know the unique ID when constructing the query
(which it sounds like you do) why not try a boost query with a high
boost for 2 and a lower boost for 1 - then the default sort by score
should match your desired ordering, and this order can be further
tweaked with other bf or bq ar
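The boost-query suggestion above might translate to request parameters like these (document IDs and boost values are illustrative):

```
q=searchterms&defType=dismax&bq=id:2^10 OR id:1^5
```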
AWS provides some standard data sets, including an extract of all
wikipedia content:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2345&categoryID=249
Looks like it's not being updated often, so this or another AWS data
set could be a consistent basis for benchmarking?
-Pe
I think you can just tell the spellchecker to only supply "more
popular" suggestions, which would naturally omit these rare
misspellings:
<str name="spellcheck.onlyMorePopular">true</str>
-Peter
On Wed, Jul 15, 2009 at 7:30 PM, Jay Hill wrote:
> We had the same thing to deal with recently, and a great solution was posted
> to the lis
Actually, if you have a server enabled as a replication master, the
stats.jsp page reports the index size, so that information is
available in some cases.
-Peter
On Sat, Jul 18, 2009 at 8:14 AM, Erik Hatcher wrote:
>
> On Jul 17, 2009, at 8:45 PM, J G wrote:
>>
>> Is it possible to obtain the SOL
Related to the difference between rsync and native Solr replication -
we are seeing issues with Solr 1.4 where search queries that come in
during a replication request hang for excessive amount of time (up to
100's of seconds for a result normally that takes ~50 ms).
We are replicating pretty ofte
At the NOVA Apache Lucene/Solr Meetup last May, one of the speakers
from Near Infinity (Aaron McCurry I think) mentioned that he had a
patch for lucene that enabled unlimited depth memory-efficient paging.
Is anyone in contact with him?
-Peter
On Thu, Dec 24, 2009 at 11:27 AM, Grant Ingersoll w
You must have been searching old documentation - I think tika 0.3+ has
support for the new MS formats, but don't take my word for it - why
don't you build tika and try it?
-Peter
On Sun, Jan 3, 2010 at 7:00 PM, Roland Villemoes
wrote:
> Hi All,
>
> Anyone who knows how to index the latest MS o
The attached screenshot shows the transition on a master search server
when we updated from a Solr 1.4 dev build (revision 779609 from
2009-05-28) to the Solr 1.4.0 released code. Every 3 hours we have a
cron task to log some of the data from the stats.jsp page from each
core (about 100 cores, mos
Config.java (which parses e.g. solrconfig.xml) in the solr core code has:
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.util.DOMUtil;
import javax.xml.parsers.*;
import javax.xml.xpa
patch.
> So far he has 2 x +1 from Grant and me to stick his patch in JIRA.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>
>
>
> - Original Message
>> From: Peter Wolanin
>> To: solr-user@lucene.apache.org
>> Sent: S
I recently noticed the same sort of thing.
The attached screenshot shows the transition on a search server
when we updated from a Solr 1.4 dev build (revision 779609 from
2009-05-28) to the Solr 1.4.0 released code. Every 3 hours we have a
cron task to log some of the data from the stats.jsp page
ripped by
> ML manager. Maybe upload it somewhere?
> Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>
>
>
> ----- Original Message
>> From: Peter Wolanin
>> To: solr-user@lucene.apache.org
>> Sent: Thu, January 7, 2010 9:32:
Having worked quite a bit on the Drupal integration - here's my quick take:
If you have someone help you the first time, you can have a basic
implementation running in Jetty in about 15 minutes. On your own, a
couple hours maybe. For a non-public site (intranet) with modest
traffic and no require
Sorry for not following up sooner- been a busy last couple weeks.
We do see a significant insanity count - could this be due to
updating indexes from the dev Solr build? E.g. on one server I see
61
and entries like:
SUBREADER: Found caches for de
It doesn't really work with the schema.xml - I beat my head on it for
a few hours not long ago - maybe I sent an e-mail to this list about
it?
Yes, here:
http://www.lucidimagination.com/search/document/ba68aa6f2f7702c3/is_it_possible_to_use_xinclude_in_schema_xml
-Peter
On Wed, Jan 6, 2010 at
Yes, we do have some fields (like the creation date) that we use for
both sorting and faceting.
-Peter
On Tue, Jan 26, 2010 at 8:55 PM, Yonik Seeley
wrote:
> On Tue, Jan 26, 2010 at 8:49 PM, Peter Wolanin
> wrote:
>> Sorry for not following up sooner- been a busy last couple wee
Can you tell me more about the rord() performance issues? I'm one of
the maintainers of the Drupal module, so I'd like to switch if there
is a better option.
Thanks,
Peter
On Wed, Feb 10, 2010 at 12:00 AM, Lance Norskog wrote:
> The admin/form.jsp is supposed to prepopulate fl= with '*,score'
The Drupal schema and solrconfig and the example schema and solrconfig
have different fields and defaults, and likely Drupal won't find the
fields it's looking for and might not even be using the right query
parser.
-Peter
On Thu, Feb 11, 2010 at 3:19 PM, jaybytez wrote:
>
> So I got it to work b
Ran into an odd situation today searching for a string like a domain
name containing a '.', the Solr 1.4 analyzer tells me that I will get
a match, but when I enter the search either in the client or directly
in Solr, the search fails. Our default handler is dismax, but this
also fails with the st
Hi Mitch,
I am also seeing this locally with the exact same solr.war,
solrconfig.xml, and schema.xml running under Jetty, as well as on 2
different production servers with the same content indexed.
So this is really weird - this seems to be influenced by the surrounding text:
"would be great to
If I empty the stopword file and re-index, all expected matches
happen. So maybe that provides a further suggestion of where the
problem is. This certainly feels like a Solr bug (or lucene bug?).
-Peter
On Sat, Mar 27, 2010 at 3:05 PM, Peter Wolanin wrote:
> Hi Mitch,
>
> I am al
The output on the analysis screen does look correct. Here are 2 screen shots:
empty stopwords: http://img.skitch.com/20100327-rcsjdih4bn3y8ahajqa5wjwybd.png
standard stopwords:
http://img.skitch.com/20100327-1w5ct1wr25jkir4sji8kumefn1.png
-Peter
On Sat, Mar 27, 2010 at 4:13 PM, MitchK wrote:
>
ct to have that directive
here, or is this a bug?
-Peter
On Sat, Mar 27, 2010 at 4:25 PM, Peter Wolanin wrote:
> The output on the analysis screen does look correct. Here are 2 screen shots:
>
> empty stopwords: http://img.skitch.com/20100327-rcsjdih4bn3y8ahajqa5wjwybd.png
>
> s
ens, not a phrase).
-Peter
On Sat, Mar 27, 2010 at 4:32 PM, Peter Wolanin wrote:
> The stopwords stanza looks like:
>
> <filter class="solr.StopFilterFactory"
>         ignoreCase="true"
>         words="stopwords.txt"
>         enablePositionIncrements="true"/>
>
Created a new issue: https://issues.apache.org/jira/browse/SOLR-1852
further discussion there.
-Peter
On Sat, Mar 27, 2010 at 5:51 PM, Peter Wolanin wrote:
> Discussing this with Mark Miller in IRC - we are honing in on the problem.
>
> Looks as though Identi.ca is treated as phrase
I think it is clearly a bug - see comments on the issue by Robert
Muir. https://issues.apache.org/jira/browse/SOLR-1852
The patch is a backport by Mark Miller of Robert's fixes for other
problems for the WordDelimiterFilter in Solr trunk. Those fixes also
fix this bug as a side effect.
-Peter
A very abbreviated list of sites using Apache Solr + Drupal here:
http://drupal.org/node/447564
-Peter
On Thu, Apr 29, 2010 at 2:10 PM, Daniel Baughman wrote:
> Hi I'm new to the list here,
>
>
>
> I'd like to steer someone in the direction of Solr, and I see the list of
> companies using solr,