Re: sort problem

2007-08-30 Thread Mike Klaas
the field you are trying to sort by, and what kinds of values are indexed therein? -Mike Any idea? thanks Java heap space java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.index.SegmentTermEnum.termInfo (SegmentTermEnum.java:170) at

Re: XML output for Analysis admin functionality

2007-08-30 Thread Mike Klaas
ou tried it? Otherwise, I would be tempted to copy the (java) code from analysis.jsp and use it directly. -Mike

Re: performance questions

2007-08-30 Thread Mike Klaas
r caches in most DBs. Another reason why people use stored procs is to prevent multiple round-trips in a multi-stage query operation. This is exactly what complex RequestHandlers do (and the equivalent to a custom stored proc would be writing your own handler). -Mike

Re: minimum occurances of term in document

2007-08-30 Thread Mike Klaas
queries is to "skip" past a document that doesn't meet the criteria you want and not return any score for it at all. Good to know. Thanks Hoss. -Mike

Re: minimum occurances of term in document

2007-08-30 Thread Mike Klaas
tially give you what you want. Note too that by default solr only indexes the first 10k tokens, so this should work for all documents in the index. -Mike

Re: Heap size error during indexing

2007-09-01 Thread Mike Klaas
you launch tomcat. You can also reduce indexing memory usage by reducing maxBufferedDocs in solrconfig.xml (say, from 1000 to 100), and by committing once in a while (e.g. autoCommit/maxDocs=50) -Mike

Re: Indexing longer documents using Solr...memory issue after index grows to about 800 MB...

2007-09-04 Thread Mike Klaas
't seem to have any effect. Does the field contain a match against one of the terms you are querying for? -Mike

Re: solr.py problems with german "Umlaute"

2007-09-06 Thread Mike Klaas
unicode character directly: >>> u'\u00e9' u'\xe9' >>> print u'\u00e9' é This is less complicated in the usual case of reading data from a file, because the encoding should be known (terminal encoding issues are much trickier). Use codecs.open() to get a unicode-output text stream. -Mike
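
A minimal sketch of the codecs.open() suggestion above, written for Python 3 (the thread itself concerns Python 2). The scratch file name, the sample text, and the latin-1 encoding are assumptions chosen for illustration only.

```python
import codecs
import os
import tempfile


def read_text(path, encoding):
    """Return the file's contents decoded to a unicode string."""
    # codecs.open() wraps the byte stream and decodes as it reads.
    with codecs.open(path, "r", encoding=encoding) as stream:
        return stream.read()


# Write some Latin-1 bytes to a scratch file, then read them back decoded.
scratch = os.path.join(tempfile.mkdtemp(), "umlaute.txt")
with open(scratch, "wb") as raw:
    raw.write("M\u00fcller \u00e9".encode("latin-1"))

text = read_text(scratch, "latin-1")
```

The point is that the encoding is stated once, where the file is opened, so everything downstream handles unicode strings rather than raw bytes.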

Re: Indexing very large files.

2007-09-06 Thread Mike Klaas
your jvm while indexing such hugeness. (Note that other input methods, like csv, might behave better, but I haven't examined them to verify.) -Mike java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2882) at java.lang.AbstractStringBuilder.expandCap

Re: Slow response

2007-09-06 Thread Mike Klaas
ed to only have one value per doc, this could greatly accelerate your faceting. There are probably fewer unique subjects, so strategy 1 is likely fine. To use strategy 2, just make sure that multiValued="false" is set for those fields in schema.xml -Mike

Re: Slow response

2007-09-06 Thread Mike Klaas
On 6-Sep-07, at 3:25 PM, Mike Klaas wrote: There are essentially two facet computation strategies: 1. cached bitsets: a bitset for each term is generated and intersected with the query result bitset. This is more general and performs well up to a few thousand terms. 2. field
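
Strategy 1 above can be sketched roughly as follows. This is an illustration of the idea only, not Solr's implementation: plain Python integers stand in for bitsets, and the terms and document ids are hypothetical.

```python
def to_bitset(doc_ids):
    """Pack a collection of document ids into an int used as a bitset."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def facet_counts(term_bitsets, query_bitset):
    """Count, per term, the documents matching both the term and the query."""
    counts = {}
    for term, bits in term_bitsets.items():
        # Intersect the cached term bitset with the query-result bitset,
        # then count the bits that are still set.
        counts[term] = bin(bits & query_bitset).count("1")
    return counts


# Hypothetical cache: which documents contain each facet term.
cached = {
    "history": to_bitset([0, 1, 4]),
    "science": to_bitset([2, 3, 4, 5]),
}
query_result = to_bitset([1, 2, 4])
counts = facet_counts(cached, query_result)
```

One intersection is needed per term, which is consistent with the remark that this approach performs well only up to a few thousand terms.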

Re: Indexing very large files.

2007-09-07 Thread Mike Klaas
ven't heard any suggestions as to how to do this with a stock Solr install, other than increase vm memory, I'll assume it will have to be done with a custom solution. Well, have you tried the CSV importer? -Mike

Re: adding without overriding dups - DirectUpdateHandler2.java does not implement?

2007-09-07 Thread Mike Klaas
overwrites that switching to DUH is probably a win.) DUH also does not implement many newer update features, like autoCommit. -Mike

Re: How to patch

2007-09-10 Thread Mike Klaas
che/solr/update' ) Patches should be generally applied from the top-level solr directory with 'patch -p0' -Mike

Re: Solr and KStem

2007-09-10 Thread Mike Klaas
button. It is also important to verify that you have the legal right to grant the code to ASF (since it is probably your employer's intellectual property). Legal issues are a hassle, but are unavoidable, I'm afraid. Thanks again, -Mike On 10-Sep-07, at 10:22 AM, Wagner,Har

Re: New user question: How to show all stored fields in a result

2007-09-10 Thread Mike Klaas
wn framework that is generating multiple entries for this input case? glad to hear you figured it out, -Mike

Re: DirectSolrConnection, write.lock and Too Many Open Files

2007-09-10 Thread Mike Klaas
ing). In the future, segment merging will occur in a separate thread, further improving concurrency. -Mike

Re: Removing lengthNorm from the calculation

2007-09-10 Thread Mike Klaas
ethod is filled with clauses like: } else if ("whatever".equals(fieldName)) { return super.lengthNorm(fieldName, Math.max(numTokens, MIN_LENGTH)); where MIN_LENGTH can be quite long for some fields. -Mike

Re: Is there a way to change default filter query parser operator

2007-09-11 Thread Mike Klaas
ery parser behaviour is mostly designed to make sense for parsing _user_-entered queries. You can achieve AND behaviour for filter queries by specifying multiple fq parameters, or by prefixing each clause in a series with +. -Mike
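
Both options can be sketched as request parameters. This is a hedged illustration: the field names (inStock, cat) and the query term are hypothetical, and only the query string is built, with no request sent.

```python
from urllib.parse import urlencode


def with_separate_filters(q, filters):
    """Option 1: one fq parameter per filter; Solr intersects them."""
    params = [("q", q)]
    params.extend(("fq", clause) for clause in filters)
    return urlencode(params)


def with_required_clauses(q, filters):
    """Option 2: a single fq whose clauses are each marked required with +."""
    combined = " ".join("+" + clause for clause in filters)
    return urlencode([("q", q), ("fq", combined)])


filters = ["inStock:true", "cat:electronics"]
separate = with_separate_filters("ipod", filters)
required = with_required_clauses("ipod", filters)
```

Separate fq parameters are cached individually in the filter cache, so option 1 tends to be the better fit when the same filters recur across requests.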

Re: Solr 1.2 or nightly?

2007-09-11 Thread Mike Klaas
work committed in the last week in trunk, so you may want to use a snapshot from two weeks ago if you want oozing- rather than bleeding-edge. -Mike

Re: Solr commit takes too long

2007-09-11 Thread Mike Klaas
akes a few milliseconds, but the commit takes about 1 minute. Could you please recommend what we should check for? Or perhaps some tuning parameters? Could be the cache auto-warming. Try reducing this to zero. -Mike

Re: multiple indices

2007-09-11 Thread Mike Klaas
names... There just might be something like that in 1.3... -Mike

Re: multiple indices

2007-09-11 Thread Mike Klaas
solr webapps within a single container/process/jvm. In the future (1.3 or further down the line), another option will be: 3. multiple indices within a single solr webapp, added/removed on the fly. -Mike

Re: Problem with word 'Repository' in facets

2007-09-13 Thread Mike Klaas
"Repositori". You are faceting on a field that is analyzed with a stemmer (PorterFilterStemmer). If you do not want that behaviour (but want it for searchign), use copyField to index in another field that does not have unexpected analysis (preferably, none). -Mike

Re: Batch indexing a large number of records

2007-09-14 Thread Mike Klaas
hours over http. Just batch a few (10) docs per http POST, and use around N+1 threads (N=# processors). -Mike
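
A rough sketch of the batching advice above, under stated assumptions: documents are plain dicts, and post is a caller-supplied function (hypothetical here) that sends one XML payload to the Solr update URL. Nothing is sent over the network by this code itself.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from xml.sax.saxutils import escape, quoteattr


def batches(docs, size=10):
    """Yield successive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def to_add_xml(batch):
    """Render one batch of dict documents as a Solr <add> message."""
    parts = ["<add>"]
    for doc in batch:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append("<field name=%s>%s</field>"
                         % (quoteattr(name), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def index_all(docs, post, size=10):
    """Send every batch through `post` using N+1 worker threads."""
    workers = (os.cpu_count() or 1) + 1  # N = number of processors
    payloads = [to_add_xml(batch) for batch in batches(docs, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the payloads.
        return list(pool.map(post, payloads))
```

Passing the sender in as a function keeps the sketch independent of any particular HTTP client; a real post would need error handling and a final commit.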

Re: Slow response

2007-09-14 Thread Mike Klaas
On 14-Sep-07, at 3:38 PM, Tom Hill wrote: Hi Mike, Thanks for clarifying what has been a bit of a black box to me. A couple of questions, to increase my understanding, if you don't mind. If I am only using fields with multiValued="false", with a type of "

Re: commit, concurrency, full text search

2007-09-17 Thread Mike Klaas
ed it does. see http://lucene.apache.org/java/docs/queryparsersyntax.html http://wiki.apache.org/solr/SolrQuerySyntax -Mike

Re: Indexing Speed

2007-09-17 Thread Mike Klaas
see above). 2) If docs are sent asynchronously, how well can Solr index? As long as you don't send 1.7 million docs at once, you should see a performance improvement. -Mike

Re: Formula for open file descriptors

2007-09-18 Thread Mike Klaas
On 18-Sep-07, at 5:39 PM, Lance Norskog wrote: Hi- In early June Mike Klaas posted a formula for the number of file descriptors needed by Solr: For each segment, 7 + num indexed fields per segment. There should be log_{base mergeFactor}(numDocs) * mergeFactor segments
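
The quoted formula can be turned into a small calculation. This is only a sketch of the arithmetic as quoted (the snippet is cut off, so any further qualifications in the original message are not reflected), and the input numbers in the usage line are hypothetical.

```python
import math


def estimate_segments(num_docs, merge_factor):
    """Segments per the quoted formula: log_mergeFactor(numDocs) * mergeFactor."""
    return math.log(num_docs, merge_factor) * merge_factor


def estimate_descriptors(num_docs, merge_factor, indexed_fields):
    """File descriptors: (7 + indexed fields) for each estimated segment."""
    per_segment = 7 + indexed_fields
    return estimate_segments(num_docs, merge_factor) * per_segment


# Hypothetical index: one million docs, mergeFactor 10, 13 indexed fields.
estimate = estimate_descriptors(1000000, 10, 13)
```

Treat the result as an order-of-magnitude guide when setting the process's open-file limit, not as an exact count.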

Re: Exact phrase highlighting

2007-09-19 Thread Mike Klaas
CENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526803), but it is currently not integrated. It would make a great project to get one's hands dirty contributing, though :) -Mike

Re: Exact phrase highlighting

2007-09-19 Thread Mike Klaas
On 19-Sep-07, at 2:39 PM, Marc Bechler wrote: Hi Mike, thanks for the quick response. > It would make a great project to get one's hands dirty contributing, though :) ... sounds like giving a broad hint ;-) Sounds challenging... I'm not sure about that--it is supposed to

Re: How can i make a distribute search on Solr?

2007-09-19 Thread Mike Klaas
want to know whether there is an existing component that can do distributed search based on Solr. https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel regards, -Mike

Re: Weird bug in query

2007-09-21 Thread Mike Klaas
ntax (+ == clause is required): articol_tag:pilonul ii AND articol_tag:facultative == +:ii +articol_tag:facultative articol_tag:pilonul articol_tag:facultative AND articol_tag:pilonul ii == +articol_tag:facultative +articol_tag:pilonul :ii try: articol_tag:facultative AND articol_tag:"pilonul ii" -Mike

Re: Scripts not working on cron - always asking for password

2007-09-21 Thread Mike Klaas
link, but trying to cover too many unix basics will clutter up the documentation. -Mike

Re: olap with solr (math operations on facets)

2007-09-21 Thread Mike Klaas
deparment1: 100 (the sum of each value) Is it clear? Currently this is not possible out of the box with Solr. -Mike

Re: olap with solr (math operations on facets)

2007-09-21 Thread Mike Klaas
On 21-Sep-07, at 2:42 PM, Rafael Rossini wrote: Thanks for the reply Mike. Are there any plans on doing something like this? Or some direction anyone could give? Probably the easiest thing to do is write a custom request handler that iterates over the field cache and computes the statistics

Re: dataset parameters suitable for lucene application

2007-09-26 Thread Mike Klaas
No search software can search 2.5 billion docs (assuming web-sized documents) in 5ms on a single server. You certainly could build such a system with Solr distributed over 100's of nodes, but this is not built into Solr currently. -Mike

Re: custom sorting

2007-09-26 Thread Mike Klaas
e not made on behalf of the firm. Sorry, I'm afraid the above email is already irrevocably publicly archived. -Mike

Re: Selecting Distinct values?

2007-09-27 Thread Mike Klaas
n try? Or, can I adjust the facets somehow to make this work? http://wiki.apache.org/solr/SimpleFacetParameters#head-1b281067d007d3fb66f07a3e90e9b1704cbc59a3 cheers, -Mike

Re: maxBufferedDocs vs autoCommit->maxDocs

2007-09-27 Thread Mike Klaas
maxBufferedDocs > autoCommit it does not have any effect. cheers, -Mike

Re: Index multiple languages with multiple analyzers with the same field

2007-09-28 Thread Mike Klaas
age-indexing-and-searching-tf3885324.html#a11012939> -Mike

Re: searching remote indexes

2007-09-28 Thread Mike Klaas
Solr's main interface is http, so you can connect to that remotely. Query each machine and combine the results using your own business logic. Alternatively, you can try out the query distribution code being developed in <http://issues.apache.org/jira/browse/SOLR-303> -Mike O

Re: Color search

2007-09-28 Thread Mike Klaas
tokenizer side of things. If there is a consensus on a sensible way of doing this, I could contribute the bits of code that I have. HTH, -Mike

Re: Letter-number transitions - can this be turned off

2007-10-01 Thread Mike Klaas
versa) and ignores it. Another approach that I am using locally is to maintain the transitions, but force tokens to be a minimum size (so r2d2 doesn't tokenize to four tokens but arrrdeee does). There is a patch here: http://issues.apache.org/jira/browse/SOLR-293 If you vote for it, I promise to get it in for 1.3 -Mike

Re: correlation between score and term frequency

2007-10-01 Thread Mike Klaas
/scoring.html to start out, in particular the link to the Similarity class javadocs. -Mike

Re: Indexing HTML

2007-10-04 Thread Mike Klaas
o strip html if you want). I recommend stripping the html yourself, and putting titles, anchors, etc in separate fields. I believe that it would be possible to write this as a Solr update-handler plugin, if you wanted it to all run in one place. -Mike

Re: unable to figure out nutch type highlighting in solr....

2007-10-04 Thread Mike Klaas
On 2-Oct-07, at 12:52 AM, Ravish Bhagdev wrote: I see that you're using the HTML analyzer. Unfortunately that does not play very well with highlighting at the moment. You may get garbled output. -Mike

Re: Solr - Lucene Query

2007-10-04 Thread Mike Klaas
should be found. If you want _only_ that document to match, you should try something like a phrase query with a bit of slop: trade1:"the appraisal station"~10 -Mike

Re: unable to figure out nutch type highlighting in solr....

2007-10-04 Thread Mike Klaas
for highlighting is: 1. hl=true 2. hl.fl=myfield _If_ that field matches one of the query terms, you should see snippets in the generated response. Even if not, you should see a section of the response (it will be empty). regards, -Mike

Re: unable to figure out nutch type highlighting in solr....

2007-10-04 Thread Mike Klaas
ewhat surprised that several people are interested in this but none have been sufficiently interested to implement a solution to contribute: http://issues.apache.org/jira/browse/SOLR-42 -Mike

Re: Handling empty query

2007-10-04 Thread Mike Klaas
which is used as the query if the query string is empty. To return all documents, set "q.alt=*:*" -Mike

Re: unable to figure out nutch type highlighting in solr....

2007-10-05 Thread Mike Klaas
inst improving Solr's handling of HTML data, but it is the type of thing that is unlikely to happen unless someone who cares about it steps up. Patches welcome :) -Mike

Re: Best way to change weighting based on the presence of a field

2007-10-05 Thread Mike Klaas
you know at index time that the document is shady, the easiest way to de-emphasize it globally is to set the document boost to some value other than one. ... cheers, -Mike

Re: Best way to change weighting based on the presence of a field

2007-10-05 Thread Mike Klaas
n the value stored in a field (which could represent a range of 'badness'). This can be used directly in the dismax handler using the bf (boost function) query parameter. -Mike

Re: Facets and running out of Heap Space

2007-10-09 Thread Mike Klaas
ugh quite close in space requirements for a 30-ary field on your index size). -Mike

Re: Facets and running out of Heap Space

2007-10-09 Thread Mike Klaas
On 9-Oct-07, at 7:53 PM, Stu Hood wrote: Using the filter cache method on the things like media type and location; this will occupy ~2.3MB of memory _per unique value_ Mike, how did you calculate that value? I'm trying to tune my caches, and any equations that could be used to dete

Re: Cache Memory Usage (was: Facets and running out of Heap Space)

2007-10-09 Thread Mike Klaas
how could this be when it is storing the same information as the filterCache, but with the addition of sorting? Solr caches only the top N documents in the queryResultCache (boosted by queryResultWindowSize), which amounts to 40-odd ints, 40-odd floats, and change. -Mike

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
g minDf to a very high value should always outperform such an approach. -Mike DW -Original Message- From: Stu Hood [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 09, 2007 10:53 PM To: solr-user@lucene.apache.org Subject: Re: Facets and running out of Heap Space Using the fi

Re: quick allowDups questions

2007-10-10 Thread Mike Klaas
false? If not, what do I need to do to make sure allowDups is set to false when I'm adding these docs? It is the normal mode of operation for Solr, so I'd be surprised if it wasn't the default in solrj (but I don't actually know). -Mike

Re: start tag not allowed in epilog

2007-10-10 Thread Mike Klaas
r end. If the deletes are doc ids, then you can collect a bunch at once and do id:xxx id:yyy id:zzz id:aaa id:bbb to perform them all at once. -Mike
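
A minimal sketch of collecting several ids into one delete-by-query message, as suggested above. It assumes the unique key field is named id, that the default query operator is OR, and that the ids contain no characters needing query-syntax escaping; none of that is confirmed by the snippet.

```python
from xml.sax.saxutils import escape


def delete_by_ids(ids, field="id"):
    """Build one <delete> message matching any of the given ids."""
    # Space-separated clauses are OR'ed under the assumed default operator.
    query = " ".join("%s:%s" % (field, doc_id) for doc_id in ids)
    return "<delete><query>%s</query></delete>" % escape(query)


message = delete_by_ids(["xxx", "yyy", "zzz"])
```

Sending one such message per batch replaces many single-document delete requests with a single round-trip.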

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
y have millions of unique values). It would be helpful to know which field is causing the problem. One way would be to do a sorted query on a quiescent index for each field, and see if there are any suspiciously large jumps in memory usage. -Mike -Original Message- From:

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
two fields. Have you tried setting multiValued=true without reindexing? I'm not sure, but I think it will work. -Mike

Re: showing results per facet-value efficiently

2007-10-11 Thread Mike Klaas
prefix. If you want to facet multiple doclists from different queries in one request, just write your own request handler that takes a multi-valued q param and facets on each. I didn't answer all the questions in your email, but I hope this clarifies things a bit. Good luck! -Mike

doubled/halved performance?

2007-10-11 Thread Mike Klaas
y processing: 101.0 total time 1.0 setup/query parsing 68.0 main query 30.0 faceting 0.0 pre fetch 2.0 debug 201.0 total time 1.0 setup/query parsing 138.0 main query 58.0 faceting 0.0 pre fetch 4.0 debug I can't really think of a plausible explanation. Fortuitous instruction pipelining? It is hard to imagine a cause that wouldn't exhibit consistency. -Mike

Re: doubled/halved performance?

2007-10-11 Thread Mike Klaas
On 11-Oct-07, at 2:37 PM, Yonik Seeley wrote: On 10/11/07, Mike Klaas <[EMAIL PROTECTED]> wrote: I'm seeing some interesting behaviour when doing benchmarks of query and facet performance. Note that the query cache is disabled, and the index is entirely in the OS disk cache. fil

Re: Instant deletes without committing

2007-10-11 Thread Mike Klaas
ly: it doesn't actually make the deletes visible until -Mike

Re: solr, snippets and stored field in nutch...

2007-10-11 Thread Mike Klaas
t there was a way of providing summaries without storing doc contents, I would pee my pants with happiness and it would be in Solr faster than you can say "diaper". cheers, -Mike On 11-Oct-07, at 3:48 PM, Ravish Bhagdev wrote: Hey guys, Checkout this thread I opened on nutch mail

Re: solr, snippets and stored field in nutch...

2007-10-11 Thread Mike Klaas
On 11-Oct-07, at 4:34 PM, Ravish Bhagdev wrote: Hi Mike, Thanks for your reply :) I am not an expert of either! But, I understand that Nutch stores contents albeit in a separate data structure (they call segment as discussed in the thread), but what I meant was that this seems like much more

Re: autowarm static queries

2007-10-11 Thread Mike Klaas
y reason why you are faceting on a field that you are restricting? Clearly, the answer will be '1001644' --> , (all other categories) -> 0. Just use numFound. Also, if there can only be one category per doc, make sure you are using the fieldCache method for category_id. -Mike

Re: autowarm static queries

2007-10-11 Thread Mike Klaas
e as I'm not the one who originally wrote the code. Nevermind. I was thinking of category as being single-valued. For multi-valued category, it is still necessary to do faceting to find sub-categories. Sorry! -Mike

Re: Will turning off the stored setting on a field remove it from the index?

2007-10-12 Thread Mike Klaas
ing documents due to config changes. -Mike

Re: Solr, operating systems and globalization

2007-10-18 Thread Mike Klaas
easy is changing Solr's interpretation of NOW in DateMath to be UTC. What is the correct way to go about this? -Mike

Re: FunctionQuery, DisMax and Highlighting

2007-10-18 Thread Mike Klaas
I'm pleased to inform you that DisMax already provides highlighting, in exactly the same way as does StandardRequestHandler. -Mike

Re: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Mike Klaas
I'm not sure that many people are dynamically taking down/starting up Solr webapps in servlet containers. I certainly prefer process-level management of my (many) Solr instances. -Mike On 18-Oct-07, at 10:40 AM, Stu Hood wrote: Any ideas? Has anyone had experienced this problem

Re: Solr, operating systems and globalization

2007-10-18 Thread Mike Klaas
f your users are all over the world, you'd ideally want to round to _their_ timezone, but I don't see how this is realistic. thanks, -Mike

Re: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Mike Klaas
On 18-Oct-07, at 1:01 PM, Stu Hood wrote: I'm running SVN r583865 (1.3-dev). Mike: when you say 'process level management', do you mean starting them statically? Or starting them dynamically, but using a different container for each instance? I have a large number o

Re: Solr + Tomcat Undeploy Leaks

2007-10-19 Thread Mike Klaas
On 19-Oct-07, at 7:19 AM, Ed Summers wrote: On 10/18/07, Mike Klaas <[EMAIL PROTECTED]> wrote: I realize this is a bit off-topic -- but I'm curious what the rationale was behind having that many solr instances on that many machines and how they are coordinated. Is it a master/sla

Re: Forced Top Document

2007-10-24 Thread Mike Klaas
what you want. If you use dismax, you can add the boost to the 'bq' parameter to affect scoring only (will not match the doc if it wouldn't have been matched anyway). -Mike

Re: Payloads for multiValued fields?

2007-10-24 Thread Mike Klaas
that point, the information about the structure of the document is not available. It is computable given sufficient effort, but certainly not something Solr should provide by default. Have you considered storing each section as a separate Solr Document? -Mike

Re: Payloads for multiValued fields?

2007-10-25 Thread Mike Klaas
On 24-Oct-07, at 12:39 PM, Alf Eaton wrote: Mike Klaas wrote: On 24-Oct-07, at 7:10 AM, Alf Eaton wrote: Yes, I was just trying that this morning and it's an improvement, though not ideal if the field contains a lot of text (in other words it's still a suboptimal workaround).

Re: SOLR 1.3 Release?

2007-10-25 Thread Mike Klaas
we are already close to finishing. If you mean that there are a lot of small tweaks that the community doesn't have access to because we haven't done a release, I'm inclined to agree that that would be ideal. It is more work to do maintain that kind of release schedule (requires work on multiple branches at once). -Mike

Re: Phrase Query Performance Question

2007-10-30 Thread Mike Klaas
deed--phrase matching uses a completely different part of the index, so that needs to be warmed too. One thing to try is solr trunk: it contains some speedups for phrase queries (though perhaps not as substantial as you hope for). -Mike

Re: Phrase Query Performance Question

2007-10-31 Thread Mike Klaas
ave you tried other queries? 937ms seems a little high, even for phrase queries. Anyway I will collect the statistic on linux first and try out other options. Have you tried using the performance enhancements present in solr-trunk? -Mike

Re: Phrase Query Performance Question

2007-11-01 Thread Mike Klaas
ll be better. Anyone has experience on that? Unlikely, though it might help you slightly at a high query rate with high cache hit ratios. -Mike

Re: Phrase Query Performance Question

2007-11-02 Thread Mike Klaas
25%. It still feels to me that you are trying to do something unique with your phrase queries. Unfortunately, you still haven't said what you are trying to do in general terms, which makes it very difficult for people to help you. -Mike

Re: Solr and Lucene Indexing Performance

2007-11-02 Thread Mike Klaas
scenes if you aren't using multiple threads. Some possible differences: 1. Solr has more aggressive default buffering settings (maxBufferedDocs, mergeFactor) 2. solr trunk (if that is what you are using) is using a more recent version of Lucene than the released 2.2 -Mike

Re: sorting on dynamic fields - good, bad, neither?

2007-11-05 Thread Mike Klaas
ev (lucene)? I think someone once implemented a solution using hashmaps for sorting, but I can't recall the issue #. -Mike

Re: Score of exact matches

2007-11-05 Thread Mike Klaas
when stemming, you'd store (account accountant) (account accounts), etc., when filtering, (epee épée) (fantome fantôme), etc. Now when querying, transform your query into ^10: épée -> epee épée^10 accountant -> account accountant^10 A bit of work to do in general, though. -Mike
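
The query-side transformation for the accent-filtering case can be sketched like this. It is an illustration under assumptions: folding is done by stripping combining marks, the boost of 10 mirrors the example above, and leaving already-unaccented terms untouched is a choice made here, not something stated in the message. The stemming case would need a stemmer in place of fold_accents.

```python
import unicodedata


def fold_accents(term):
    """Strip combining marks so that accented letters become plain ones."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_term(term, boost=10):
    """Query both the folded form and the boosted exact form of a term."""
    folded = fold_accents(term)
    if folded == term:
        # Nothing to fold; assumed to need no expansion.
        return term
    return "%s %s^%d" % (folded, term, boost)


expanded = expand_term("\u00e9p\u00e9e")
```

The folded form keeps recall, while the boosted original pushes exact matches toward the top of the results.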

Re: Facets queries are not caching

2007-11-06 Thread Mike Klaas
to house the # of unique values you are faceting on? Check the cache statistics on the admin gui. Are there large numbers of evictions? Alternatively, is company_facet multi- or single-valued? If the latter, the filter cache is not used at all. -Mike More generally, does anyone have a

Re: Using Embedded and HTTP Post alternatively

2007-11-06 Thread Mike Klaas
ven't been discovered yet. I'm using it in production. More important than any claims we make is running it against your own application's test suite, of course. -Mike

Re: uniqueKey type

2007-11-06 Thread Mike Klaas
ments that have the same resulting token will be considered "the same"). If this is violated, the behaviour is undefined (but I wouldn't be surprised if the first token was used). -Mike

Re: start.jar -Djetty.port= not working

2007-11-07 Thread Mike Davies
Hi Brian, Found the SVN location, will download from there and give it a try. Thanks for the help. On 07/11/2007, Mike Davies <[EMAIL PROTECTED]> wrote: > > I'm using 1.2, downloaded from > > http://apache.rediris.es/lucene/solr/ > > Where can i get the trunk ver

start.jar -Djetty.port= not working

2007-11-07 Thread Mike Davies
8983. Any suggestions? Also, I'd really like to get hold of the source code to the start.jar but I can't seem to find it anywhere. Again, any suggestions? Thanks Mike

Re: start.jar -Djetty.port= not working

2007-11-07 Thread Mike Davies
I'm using 1.2, downloaded from http://apache.rediris.es/lucene/solr/ Where can I get the trunk version? On 07/11/2007, Brian Whitman <[EMAIL PROTECTED]> wrote: > > > On Nov 7, 2007, at 10:00 AM, Mike Davies wrote: > > java -Djetty.port=8521 -jar start.jar > >

Re: restricting search to a set of documents

2007-11-07 Thread Mike Klaas
On 7-Nov-07, at 2:27 PM, briand wrote: I need to perform a search against a limited set of documents. I have the set of document ids, but was wondering what is the best way to formulate the query to SOLR? add fq=docId:(id1 id2 id3 id4 id5...) cheers, -Mike

Re: Delte all docs in a SOLR index?

2007-11-09 Thread Mike Klaas
ld only be possible in the event of cold process termination (like power loss). -Mike -Original Message- From: David Neubert [mailto:[EMAIL PROTECTED] Sent: Friday, November 09, 2007 10:56 AM To: solr-user@lucene.apache.org Subject: Re: Delte all docs in a SOLR index? Thanks!

Re: LuceneInAction.zip?

2007-11-13 Thread Mike Klaas
deprecated methods in external libraries as well. I don't think so, but I suggest asking this question on java- [EMAIL PROTECTED], which has a much broader lucene-related audience. -Mike

Re: Faceting over limited result set

2007-11-13 Thread Mike Klaas
hen doing lots of faceting on huge indices, if N is low (say, 500-1000). One problem with the implementation above is that it stymies the query caching in SolrIndexSearcher (since the generated DocList is > the cache upper bound). -Mike

Re: Solr java tutorial

2007-11-13 Thread Mike Klaas
Not really--there have been a few threads on this topic recently. Perhaps in a couple of months? It may depend on the timing of the lucene release. -Mike On 13-Nov-07, at 3:41 PM, Dave C. wrote: Ah... :( Is there a timeline for the 1.3 release? - david Date: Tue, 13 Nov 2007 18:33:01

Re: Query and heap Size

2007-11-13 Thread Mike Klaas
web.xml ...etc... Perhaps check your cache statistics on the admin gui. Is it possible that you have set the capacity high and they are just filling up? Another thing to look out for is if you tend to sort on many different fields, but rarely. -Mike
