Re: Consequences for using multivalued on all fields

2010-12-21 Thread J.J. Larrea
Someone please correct me if I am wrong, but as far as I am aware index format is identical in either case. One benefit of allowing one to specify a field as single-valued is similar to specifying that a field is required: Providing a safeguard that index data conforms to requirements. So maki

Re: Merging multiple Solr Indexes

2007-10-08 Thread J.J. Larrea
At 9:51 PM -0700 10/7/07, Chris Hostetter wrote: >: Thanks for the pointer. After two silent days waiting for reply, >: I decided to implement a command line for that. Works like a charm !!! > >well, sometimes people just don't post because they don't know the >answer to something (better than 50

Re: unable to figure out nutch type highlighting in solr....

2007-10-05 Thread J.J. Larrea
At 9:32 PM +1000 10/5/07, Adrian Sutton wrote: >From what people are suggesting though you'd be better off converting to plain >text before indexing it with Solr. Something like JTidy (http://jtidy.sf.net) >can parse most HTML that's around and you can iterate over the DOM to extract >the text f

Re: unable to figure out nutch type highlighting in solr....

2007-10-04 Thread J.J. Larrea
At 3:45 PM -0700 10/4/07, Mike Klaas wrote: >I'm actually somewhat surprised that several people are interested in this but >none have been sufficiently interested to implement a solution to >contribute: > >http://issues.apache.org/jira/browse/SOLR-42 I just devised a workaround earlier in

Re: Converting German special characters / umlaute

2007-09-27 Thread J.J. Larrea
At 12:13 PM -0400 9/27/07, Steven Rowe wrote: >Chris Hostetter wrote: >> : is there an analyzer which automatically converts all german special >> : characters to their specific dissected form, such as ü to ue and ä to >> : ae, etc.?! >> >> See also the ISOLatin1TokenFilter which does this regardle

Re: Solr and FieldCache

2007-09-20 Thread J.J. Larrea
At 5:30 PM +0200 9/20/07, Walter Ferrara wrote: >I have an index with several fields, but just one stored: ID (string, >unique). >I need to access that ID field for each of the tops "nodes" docs in my >results (this is done inside a handler I wrote), code looks like: > > Hits hits = searcher.se

Re: NFS Stale handle in a distributed SOLR deployment

2007-09-13 Thread J.J. Larrea
Sometimes one has to make things work in the environment one is handed (e.g. virtualized servers, ALL storage resources resident on a SAN and accessed via NFS, read-only mounts on the deployment instances with only the production indexers having write access). While I agree that fast local inde

Re: DismaxRequestHandler reports sort by score as invalid

2007-06-21 Thread J.J. Larrea
Because "score desc" is the default Lucene & Solr behavior when no explicit sort is specified, QueryParsing.parseSort() returns a null sort so that the non-sort versions of the query execution routines get called. However the caller SolrPluginUtils.parseSort issues that warning whenever it gets

Re: Wildcards / Binary searches

2007-06-06 Thread J.J. Larrea
Hi, Hoss. I have a number of things I'd like to post... but the generally-useful stuff is unfortunately a bit interwoven with the special-case stuff, and I need to get out of breathing-down-my-back deadline mode to find the time to separate them, clean up and comment, make test cases, etc. Hop

Re: Wildcards / Binary searches

2007-06-06 Thread J.J. Larrea
At 4:40 PM +0100 6/6/07, galo wrote: >1. I want to use solr for some sort of live search, querying with incomplete >terms + wildcard and getting any similar results. Radioh* would return >anything containing that string. The DisMax req. handler doesn't accept >wildcards in the q param so i'm tryi

Re: Using Solr without using a web-app

2007-05-16 Thread J.J. Larrea
At 4:29 PM -0400 5/15/07, Yonik Seeley wrote: >On 5/15/07, bhecht <[EMAIL PROTECTED]> wrote: >>[...] the function parseRules in SynonymFilterFactory is private > >If you start using Solr's configuration, you drag more of Solr in. > >You can add the synonyms to the SynonymMap yourself, or if you wan

Re: Results per user

2007-04-13 Thread J.J. Larrea
I wrote the following after hurriedly reading Grant Ingersoll's question, and I completely missed the "to remove results that have already been viewed" bit. Which leads me to think what I wrote may have no bearing on this issue... but perhaps it may have bearing on someone else's issue? - J

Re: Performance penalty for Multivalued field?

2007-03-16 Thread J.J. Larrea
Perhaps not relevant in this case, but for the record there is one more SOLR behavior affected by multiValued: 3) when faceting, a multiValued field always uses the TermEnum algorithm rather than the FieldCache algorithm. Depending on the data, this can have a dramatic effect on faceting perf

Re: listing/enumerating field information

2007-01-14 Thread J.J. Larrea
Hoss, I'm delighted to have annoyed you, if only *slightly*! ;-) - J.J. PS: +1 on Yonik's subsequent comment. At 8:04 PM -0800 1/14/07, Chris Hostetter wrote: >: - Apply the faceting criteria (e.g. facet.zeros, though facet.mincount >: would have been a more flexible option in all cases) > >yo

Re: listing/enumerating field information

2007-01-13 Thread J.J. Larrea
At 5:06 AM -0500 1/12/07, Erik Hatcher wrote: >What the user-interface needs is a way to ask Solr for terms that begin with a >specified prefix, as the user types. Paging via start/rows is necessary, and >also sorting by frequency given some specified constraints. I like the >start/end term i

Re: Searching multiple indices (solr newbie)

2007-01-09 Thread J.J. Larrea
+2 cents: At 2:43 PM +0530 1/9/07, Mekin Maheshwari wrote: >In general I felt that smaller indexes with different requirements >might be more flexible than 1 large index (Would a 3G index >considered large ?). eg. backing up the index, deploying a fresh >index, etc. But Solr does address most of

Re: Facet Performance

2006-12-08 Thread J.J. Larrea
Andrew Nagy, ditto on what Yonik said. Here is some further elaboration: I am doing much the same thing (faceting on Author etc.). When my Author field was defined as a solr.TextField, even using solr.KeywordTokenizerFactory so it wasn't actually tokenized, the faceting code chose the QueryFilt

Multi-Valued Faceting

2006-12-06 Thread J.J. Larrea
based on actual query results. Does anyone have any insight on how efficient that may or may not be? And if I have gotten something dreadfully wrong in my understanding of current implementation or proposed enhancement, I would appreciate getting straightened out. Thanks, J.J. Larrea

Re: Simple Faceted Searching out of the box

2006-09-22 Thread J.J. Larrea
Regarding XML databases, there is an excellent open-source XML database 'eXist' which currently uses indexes to speed up both structure-based and content-based retrieval via XQuery; there are plans on their development roadmap to replace parts of the indexing mechanism, particularly fulltext ana