date:20080930

Re: Extending Solr with custom filter

2008-09-30 Thread Jarek Zgoda

Message written on 2008-09-12, at 17:58, by Andrzej Bialecki: ok .. that? I recommend using Stempelator (or Morfologik) for Polish stemming and lemmatization. It provides a superset of Stempel features, namely in addition to the algorithmic stemming it provide

Re: Question about facet.prefix usage

2008-09-30 Thread Erik Hatcher

If I'm not mistaken, doesn't facet.query accomplish what you want? Erik On Sep 29, 2008, at 5:43 PM, Simon Hu wrote: I also need the exact same feature. I was not able to find an easy solution and ended up modifying class SimpleFacets to make it accept an array of facet prefixes

Re: Running Solr1.3 with multicore support

2008-09-30 Thread RaghavPrabhu

Hi Saurabh Bhutyani, Does it show the two core links on your Solr home page, like Admin core0 and Admin core1? If not, the problem is that you are upgrading Solr from 1.2 to 1.3. Better to stop the server, delete all the folders in the %Tomcat_Home%\work\Catalina\localhost location, and restart it. Ho

Howto concatenate tokens at index time (without spaces)

2008-09-30 Thread Batzenmann

Hi, I'm looking for a way to create a fieldtype which will, apart from the whitespace-tokenized tokens, also store concatenated versions of the tokens. The ShingleFilter does something very similar but keeps spaces in between words. In German a shoe (Schuh) you wear in your 'spare time' (Freizeit) is ac
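To make the request concrete, here is a minimal Python sketch of the behaviour being asked for: keep the original tokens and also emit space-free joins of adjacent tokens. It is only an illustration of the idea, not a Solr or Lucene filter; the function name `with_concatenations` and the `max_size` parameter are assumptions made up for this example.

```python
def with_concatenations(tokens, max_size=2):
    """Return the original tokens plus space-free joins of adjacent runs.

    Hypothetical helper: max_size is the longest run of neighbouring
    tokens that gets glued together (2 = pairs only).
    """
    out = list(tokens)
    for size in range(2, max_size + 1):
        # Slide a window of `size` tokens over the input and join it
        # without any separator, unlike a shingle, which keeps spaces.
        for i in range(len(tokens) - size + 1):
            out.append("".join(tokens[i:i + size]))
    return out


if __name__ == "__main__":
    print(with_concatenations(["freizeit", "schuh"]))
```

In an actual Solr setup the same logic would have to live in a custom TokenFilter applied at index time; the sketch only shows which tokens such a filter would produce.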

spellcheck: substitutions, but no inserts or deletes

2008-09-30 Thread Jason Rennie

I've been testing the SpellCheckComponent for use on StyleFeeder. It seems to do a great job of suggesting character substitutions, but I haven't seen any deletion/insertion suggestions. I've tried decreasing the "accuracy" parameter to 0.5. Some queries I've tried are: bluea: suggests "blues"
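For reference, plain edit distance treats a substitution, an insertion, and a deletion as one edit each, so a deletion candidate is in principle just as close as a substitution candidate. The Python sketch below computes Levenshtein distance to show this. It does not model how the SpellCheckComponent selects its candidates, and the candidate word "blue" is an assumption added for illustration; only "bluea" and "blues" come from the message.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # "blues" is reached by one substitution, "blue" by one deletion.
    for candidate in ("blues", "blue"):
        print(candidate, levenshtein("bluea", candidate))
```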

Re: spellcheck: buildOnOptimize?

2008-09-30 Thread Jason Rennie

On Fri, Sep 26, 2008 at 9:33 AM, Shalin Shekhar Mangar < [EMAIL PROTECTED]> wrote: > Jason, can you please open a jira issue to add this feature? > Done. https://issues.apache.org/jira/browse/SOLR-795 Jason

Re: Indexing Multiple Fields with the Same Name

2008-09-30 Thread KyleMorrison

That was indeed the error, I apologize for wasting your time. Thank you very much for the help. Kyle Shalin Shekhar Mangar wrote: > > Is that a mis-spelling? > > mulitValued="true" > > On Thu, Sep 25, 2008 at 12:12 AM, KyleMorrison <[EMAIL PROTECTED]> wrote: > >> >> I'm trying to index fiel

Re: Howto concatenate tokens at index time (without spaces)

2008-09-30 Thread Otis Gospodnetic

I haven't used the German analyzer (either Snowball or the one we have in Lucene's contrib), but have you checked if that does the trick of keeping words together? Or maybe the compound tokenizer has this option? (check Lucene JIRA, not sure now where the compound tokenizer went) Otis -- Sema

French synonyms & Online synonyms

2008-09-30 Thread Pierre Auslaender

Hello, I'm sure these questions have been raised a million times, I'll try one more: 1/ Is there any general-purpose, free, French synonyms file out there? 2/ Is there a Solr or Lucene analyser class that could tap an on-line resource for synonyms at index-time? And by the same token, mainta

Re: French synonyms & Online synonyms

2008-09-30 Thread Otis Gospodnetic

Pierre, 1) I don't know, but a good place to check and see what previous answers to this question were is markmail.org 2) I don't think there is such a thing, but I also don't think there are sites that make this data freely available (answer to 1?) Otis -- Sematext -- http://sematext.com/ --

Re: French synonyms & Online synonyms

2008-09-30 Thread Walter Underwood

Synonyms are domain-specific, so general-purpose lists are not very useful. Ultraseek shipped a British-American synonym list as an example, but even that wasn't very general. One of our customers was a chemical company and was very surprised when the search "rocket fuel" suggested "arugula", even

Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread KyleMorrison

I apologize for spamming this mailing list with my problems, but I'm at my wits' end. I'll get right to the point. I have an xml file which is ~1GB which I wish to index. If that is successful, I will move to a larger file of closer to 20GB. However, when I run my data-config(let's call it dc.xml)

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Mark Miller

Exception indicates a threading bug, not a scaling issue... I'm sure the issue will be illuminated soon, though. KyleMorrison wrote: I apologize for spamming this mailing list with my problems, but I'm at my wits' end. I'll get right to the point. I have an xml file which is ~1GB which I wis

Re: commit not fired

2008-09-30 Thread Chris Hostetter

: When I check my commit.log nothings is runned commit.log is only updated by the bin/commit script ... not by Solr itself. you'll see Solr log commits in whatever logs are kept by your servlet container. : My snapshooter too: but no log in snapshooter.log : : : ./data/solr/b

Re: French synonyms & Online synonyms

2008-09-30 Thread Pierre Auslaender

True, synonyms can be grouped in cliques based on the strength of their "resemblance" given a specific context. But what I'm indexing is the text content of TV programs produced by a public television, so the context is very large and non-specific. What I want is to find "automobile" for "car"

Calculated Unique Key Field

2008-09-30 Thread Jim Murphy

My unique key field is an MD5 hash of several other fields that represent identity of documents in my index. We've been calculating this externally and setting the key value in documents but have found recurring bugs as the number and variety of inserting consumers has grown... So I wanted to mo
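As an illustration of the key described above, the following Python sketch derives an MD5 hex digest from several identity fields of a document. The field names, the separator, and the helper name `unique_key` are all assumptions for the example; the message does not say which fields are hashed or how they are combined.

```python
import hashlib


def unique_key(doc, identity_fields, sep="\x1f"):
    """Build an MD5 hex digest from the identity fields of a document.

    Hypothetical helper: a separator that cannot occur in the values
    keeps ("ab", "c") and ("a", "bc") from hashing to the same key.
    """
    parts = [str(doc.get(field, "")) for field in identity_fields]
    return hashlib.md5(sep.join(parts).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Field names below are made up for illustration only.
    doc = {"feed_url": "http://example.com/feed", "entry_id": "42", "title": "ignored"}
    print(unique_key(doc, ["feed_url", "entry_id"]))
```

Computing the key in one place on the server side, rather than in every inserting client, is the motivation given in the message; the sketch only shows the hashing step itself.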

Discarding undefined fields in query

2008-09-30 Thread Jérôme Etévé

Hi All, I wrote a customized query parser which discards non-schema fields from the query (I'm using the schema field names from req.getSchema().getFields().keySet() ) . This parser works fine in unit tests. But still I have an error from the webapp when I try to query my schema with non exis

Re: Calculated Unique Key Field

2008-09-30 Thread Jim Murphy

It may not be all that relevant but our Update handler extends from DirectUpdateHandler2. -- View this message in context: http://www.nabble.com/Calculated-Unique-Key-Field-tp19747955p19748032.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Dismax , "query phrases"

2008-09-30 Thread Chris Hostetter

: That's why I was wondering how Dismax breaks it all apart. It makes sense...I : suppose what I'd like to have is a way to tell dismax which fields NOT to : tokenize the input for. For these fields, it would pass the full q instead of : each part of it. Does this make sense? would it be useful at

Re: Applying Stop words for Field Type String

2008-09-30 Thread Chris Hostetter

: Question : Is it possible to do the same for String type or not, since the StrField doesn't support an analyzer like TextField does, but if you define "string" to be a TextField using KeywordTokenizer it will preserve the whole value as a single token and you can then use the StopWordFilterF

Re: Calculated Unique Key Field

2008-09-30 Thread Shalin Shekhar Mangar

On Wed, Oct 1, 2008 at 12:08 AM, Jim Murphy <[EMAIL PROTECTED]> wrote: > > Question1: Is this the best place to do this? This sounds like a job for http://wiki.apache.org/solr/UpdateRequestProcessor -- Regards, Shalin Shekhar Mangar.

Re: Monitoring solr stats with munin?

2008-09-30 Thread Chris Hostetter

: > has anyone had the need and maybe already written a munin plugin to graph : > some information from e.g. admin/stats.jsp ? : Something like that, though I haven't seen anything available publicly yet. Its Anything exposed via stats.jsp should also be available via JMX (if you enable JMX) ...

Re: Searching Question

2008-09-30 Thread Jake Conk

How would I write a custom Similarity factor that overrides the TF function? Is there some documentation on that somewhere? On Sat, Sep 27, 2008 at 5:14 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > > On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote: > >> It might be easiest to store the thr

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Shalin Shekhar Mangar

Hmm, strange. This is Solr 1.3.0, right? Do you have any transformers applied to these multi-valued fields? Do you have stream="true" in the entity? On Tue, Sep 30, 2008 at 11:01 PM, KyleMorrison <[EMAIL PROTECTED]> wrote: > > I apologize for spamming this mailing list with my problems, but I'm

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread KyleMorrison

Yes, this is the most recent version of Solr, stream="true" and stopwords, lowercase and removeDuplicate being applied to all multivalued fields? Would the filters possibly be causing this? I will not use them and see what happens. Kyle Shalin Shekhar Mangar wrote: > > Hmm, strange. > > This

Re: Integrating external stemmer in Solr and pre-processing text

2008-09-30 Thread Jaco

Hi, The suggested approach with a TokenFilter extending the BufferedTokenStream class works fine, performance is OK - the external stemmer is now invoked only once for the complete search text. Also, from a functional point of view, the approach is useful, because it allows for other filtering (i.

Re: Searching Question

2008-09-30 Thread Otis Gospodnetic

The easiest thing is to look at Lucene javadoc and look for Similarity and DefaultSimilarity classes. Then have a peek at Lucene contrib to get some other examples of custom Similarity. You'll just need to override one method, for example: -- Sematext -- http://sematext.com/ -- Lucene - Sol

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread KyleMorrison

As a follow up: I continued tweaking the data-config.xml, and have been able to make the commit fail with as little as 3 fields in the sdc.xml, with only one multivalued field. Even more strange, some fields work and some do not. For instance, in my dc.xml: . . . and in the schema.xml: . . .

Re: Searching Question

2008-09-30 Thread Otis Gospodnetic

I hit ctrl-S by mistake. This is the method you are after: http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/search/DefaultSimilarity.html#tf(float) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Otis Gospodnetic <[EMAIL PRO
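The linked method, `DefaultSimilarity.tf(float)`, returns the square root of the term frequency. The Python sketch below mirrors that default and shows the shape of an override. It is a stand-in for the idea only: in Lucene the subclass would be written in Java against `DefaultSimilarity`, and the flattened "binary" variant is an assumed example, not something proposed in the thread.

```python
import math


class DefaultTf:
    """Stand-in for the default behaviour: tf grows as sqrt(freq)."""

    def tf(self, freq):
        return math.sqrt(freq)


class BinaryTf(DefaultTf):
    """Hypothetical override: a term counts once, however often it repeats."""

    def tf(self, freq):
        return 1.0 if freq > 0 else 0.0


if __name__ == "__main__":
    for freq in (0, 1, 4, 9):
        print(freq, DefaultTf().tf(freq), BinaryTf().tf(freq))
```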

Re: Discarding undefined fields in query

2008-09-30 Thread Yonik Seeley

On Tue, Sep 30, 2008 at 2:42 PM, Jérôme Etévé <[EMAIL PROTECTED]> wrote: > But still I have an error from the webapp when I try to query my > schema with non existing fields in my query ( like foo:bar ). > > I'm wondering if the query q is parsed in a very simple way somewhere > else (and independe

Re: Are facet searches slower on large indexes?

2008-09-30 Thread Chris Hostetter

the time factor has more to do with the number of distinct values in the field being faceted on than it does the number of documents. with 1 million documents there are probably a lot more indexed terms in the "contents" field than there are with only 1000 documents. As an inverted index, the
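A simplified Python model of the point above: if facet counts are produced by visiting every distinct term of the field and intersecting its postings with the result set, the amount of work follows the number of distinct terms. This is a toy model under that assumption, not Solr's actual faceting code, and the function name `facet_counts` is made up for the example.

```python
def facet_counts(inverted_index, result_docs):
    """Count, for each term, how many result documents contain it.

    inverted_index maps term -> set of document ids (the postings).
    The loop runs once per distinct term, regardless of how many
    documents are in the result set.
    """
    counts = {}
    for term, postings in inverted_index.items():
        hits = len(postings & result_docs)
        if hits:
            counts[term] = hits
    return counts


if __name__ == "__main__":
    index = {"red": {1, 2}, "blue": {2, 3}, "green": {4}}
    print(facet_counts(index, {1, 2, 3}))
```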

Re: Question about facet.prefix usage

2008-09-30 Thread Simon Hu

not really. facet.query filters the result set. Here we need to filter the facet counts by multiple facet prefixes. facet.query would work only if the faceted field is not a multi-value field. Erik Hatcher wrote: > > If I'm not mistaken, doesn't facet.query accomplish what you want? > >

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Noble Paul നോബിള്‍ नोब्ळ्

I guess it is a threading problem. I can give you a patch. you can raise a bug --Noble On Wed, Oct 1, 2008 at 2:11 AM, KyleMorrison <[EMAIL PROTECTED]> wrote: > > As a follow up: I continued tweaking the data-config.xml, and have been able > to make the commit fail with as little as 3 fields in th

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Noble Paul നോബിള്‍ नोब्ळ्

this patch is created from 1.3 (may apply on trunk also) --Noble On Wed, Oct 1, 2008 at 9:56 AM, Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> wrote: > I guess it is a threading problem. I can give you a patch. you can raise a bug > --Noble > > On Wed, Oct 1, 2008 at 2:11 AM, KyleMorrison <[EMAIL

Re: How to select one entity at a time?

2008-09-30 Thread con

Hi guys, In the URL, http://localhost:8983/solr/select/?q= :bob&version=2.2&start=0&rows=10&indent=on&wt=json q=: applies to a field and not to an entity. So If I have 3 entities like:

Re: How to select one entity at a time?

2008-09-30 Thread Noble Paul നോബിള്‍ नोब्ळ्

The entity and the select query have no relationship. The entity comes into the picture when you do a dataimport, eg: http://localhost:8983/solr/dataimport?command=full-import&entity=user This is an indexing operation On Wed, Oct 1, 2008 at 11:26 AM, con <[EMAIL PROTECTED]> wrote: > > Hi guys, > In the

Does Solr Indexing Websites possible?

2008-09-30 Thread RaghavPrabhu

Hi all, I want to enable the search functionality in my website. Can I use Solr for indexing the website? Is there any option in Solr? Please let me know as soon as possible. Thanks in advance Prabhu.K -- View this message in context: http://www.nabble.com/Does-Solr-Indexing-Websites-possible--t

Re: How to select one entity at a time?

2008-09-30 Thread con

Of course I agree. But while performing a search, if I want to search only the data from USER table, how can I achieve it? Suppose I have a user name bob in both USER and MANAGER tables. So when I perform http://localhost:8983/solr/dataimport?command=full-import , all the USER and MANAGER values

38 matches
