Re: Specifying explicit FacetQuery w/ a normal query?

2008-07-22 Thread Mike Klaas
I'm somewhat perplexed: under what circumstances would you be able to send one query to Solr but not two? -Mike On 21-Jul-08, at 8:37 PM, Jon Baer wrote: Well that's my problem ... I can't :-) When you put a fq=doctype:news in there you can't get an explicit facet

Re: pf nixes fl

2008-07-22 Thread Mike Klaas
s behavior is? I'm using solr 1.2. What exact url did you send to Solr? I bet there is a missing '&'. -Mike

Re: Seeking Anecdotes: Solr Plugins

2008-07-22 Thread Mike Klaas
ther features like query-injected filter queries. This type of extension is largely obsolete with QueryComponents Let me know if you want more detail--most of this is relative to a somewhat older version of Solr, so it might not all apply. cheers, -Mike

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
same amount of ram. The situation you are experiencing is one-seek-per-doc, which is performance death. -Mike On 28-Jul-08, at 1:34 PM, Yonik Seeley wrote: That's a bit too tight to have *all* of the index cached...your best bet is to go to 4GB+, or figure out a way not to have to retrie

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
doc data should probably be all in cache. One way to mitigate this is to partition the fields like I suggested in the other reply. -Mike

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-29 Thread Mike Klaas
like these extra fields should just be stored in a separate file/database. I also wonder if solving the underlying problem really requires storing 10k values per doc (you haven't given us many clues in this regard)? -Mike

Re: Solr Logo thought

2008-08-08 Thread Mike Klaas
To me, the release timing doesn't much affect what logo we decided to use or when to adopt it. Surely the most visible, important location for the logo is on the website, that we can replace at any time? -Mike On 8-Aug-08, at 7:30 AM, Otis Gospodnetic wrote: I think you are right

Re: adds / delete within same 'transaction'..

2008-08-12 Thread Mike Klaas
e add happens before delete, in which case i end up with no more doc id=1 ? As long as you are sending these requests on the same thread, they will occur in order. -Mike

Re: shards and performance

2008-08-19 Thread Mike Klaas
ly. If 1.3, is the nightly build the best one to grab bearing in mind that we would want any protocols around distributed search to be as stable as possible? Or just wait for the 1.3 release? Go for the nightly build. The release will look very similar to it. -Mike

Re: Clarification on facets

2008-08-19 Thread Mike Klaas
(rawText:python)=27) 2.581456 = idf(docFreq=16017) 0.03125 = fieldNorm(field=rawText, doc=950285) The =27 is the number of times 'python' appears in this document. You could also write a custom component that included in this information in the response. -Mike On 18-Aug-08,

Re: shards and performance

2008-08-19 Thread Mike Klaas
On 19-Aug-08, at 12:58 PM, Phillip Farber wrote: So your experience differs from Mike's. Obviously it's an important decision as to whether to buy more machines. Can you (or Mike) weigh in on what factors led to your different take on local shards vs. shards distributed acros

Re: Solr Logo thought

2008-08-20 Thread Mike Klaas
Nice job Lukas; the professionalism and quality of work is evident. I like aspects of the logo, but too am having trouble getting past the eye-looking O. Is it intentional (eye:look:search, etc)? -Mike On 20-Aug-08, at 5:25 AM, Mark Miller wrote: I went through the same thought process

Re: Solr Logo thought

2008-08-21 Thread Mike Klaas
I thought the plan was to run more of a logo contest? -Mike On 21-Aug-08, at 9:29 AM, Otis Gospodnetic wrote: One more +1 for the eye/sun O. I don't think I thought "eye" when i saw it, but I think having an eye there is actually a cool little detail. I think Shalin sh

Re: Buffer overflow attack on solr seen in the wild

2008-08-21 Thread Mike Klaas
Hi Jim, Looks like a sql injection attack that is automatically entered into search forms. Solr should not be affected, but it could affect you if you insert the raw/unescaped query into a sql database (for logging, etc.). -Mike On 21-Aug-08, at 3:30 PM, Jim Hurst wrote: Hey folks

Re: Solr FAQ entry about "Dynamically calculated range facet" topic

2008-08-22 Thread Mike Klaas
sure someone would find that useful. -Mike 2008/8/22 Chris Hostetter <[EMAIL PROTECTED]> : I would like to know if I can add a FAQ entry about this topic, the : motivation, ideas and workarounds used. If yes, I would like do it with help : from all guys that faced this problem. Anyone

Re: Querying Greater Than and Less Than

2008-08-26 Thread mike topper
you can also use queries like field:[* TO Z] or field:[Z TO *] -Mike Jake Conk wrote: Hello, I was trying to figure out how to query ranges greater than and less than. The closest solution I could find was using the range format: field:[x TO z] While this solution works for querying

Re: Highlighting Unindexed Fields

2008-09-03 Thread Mike Klaas
ormance benefit from indexing the field, is there? I guess if you have indexed and termVectors and termPositions then you'll see a highlighting speedup, but not from indexed alone. True. -Mike

Re: Custom scoring example

2008-09-10 Thread Mike Klaas
stead of max. Any custom scoring example will help. (On one hand, DisjunctionMaxQuery itself is an example :-). It is too professional :-) DisjunctionMaxQuery takes the max plus (tiebreak)*sum(others). So, if you set tie=1.0, dismax becomes exactly what you are seeking. -Mike
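The scoring rule described above can be sketched in a few lines. This is only an illustration of the arithmetic (max plus tiebreak times the sum of the others), not Lucene's actual code; the function name and the sample scores are made up for the example.

```python
def dismax_score(clause_scores, tie=0.0):
    """Combine per-clause scores as: max + tie * (sum of the other scores)."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    # sum(clause_scores) - best is the sum of every score except the maximum
    return best + tie * (sum(clause_scores) - best)


# Hypothetical per-field scores for one document
scores = [0.5, 0.25, 0.125]
pure_max = dismax_score(scores, tie=0.0)   # only the best clause counts
plain_sum = dismax_score(scores, tie=1.0)  # every clause counts fully
```

With tie=0.0 only the best-matching clause contributes; with tie=1.0 the result equals the plain sum of all clause scores, which is the behaviour asked for in the thread.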

Re: What's the bottleneck?

2008-09-11 Thread Mike Klaas
restrictive fq, you might try an approach similar to the one in https://issues.apache.org/jira/browse/SOLR-407 . -Mike

Re: dismax and long phrases

2008-10-06 Thread Mike Klaas
ou can "fake" it by only using fieldsets (qf) that have a consistent set of stopwords. -Mike

Re: dismax and long phrases

2008-10-09 Thread Mike Klaas
On 7-Oct-08, at 9:27 AM, Jon Drukman wrote: Mike Klaas wrote: On 6-Oct-08, at 11:20 AM, Jon Drukman wrote: is there any way i could 'fake' it by adding a second field without stopwords, or something like that? Yep, you can "fake" it by only using fieldsets (qf) that

Re: NOT NULL Query

2008-10-15 Thread mike topper
I think you can do field:["" TO *] to grab everything that is not null. -Mike John E. McBride wrote: Hello All, I need to run a query which asks: field = NOT NULL should this perhaps be done with a filter? I can't find out how to do NOT NULL from the documentation, would

Re: Lucene 2.4 released

2008-10-17 Thread Mike Klaas
I don't think that there is any outstanding work to do on this issue. 2.4.0 should be compatible with the Solr 1.3 release; simply drop the lucene jars in solr's lib directory if you want to use the (slightly newer) version of lucene. -Mike On 15-Oct-08, at 10:00 AM, Feak,

Re: DocSet: BitDocSet or HashDocSet ?

2008-11-03 Thread Mike Klaas
gher than this is a net loss. -Mike

Re: SOLR Performance

2008-11-03 Thread Mike Klaas
If you never execute any queries, a gig should be more than enough. Of course, I've never played around with a .8 billion doc corpus on one machine. -Mike On 3-Nov-08, at 2:16 PM, Alok Dhir wrote: in terms of RAM -- how to size that on the indexer? --- Alok K. Dhir Symplicity Corpor

Re: Solr 1.3 stack overflow when accessing solr/admin page

2008-11-12 Thread Mike Robins
n but http://localhost:7001/solr/admin/luke works fine. Regards, Mike btw I don't have a solr.xml

Re: Solr 1.3 stack overflow when accessing solr/admin page

2008-11-13 Thread Mike Robins
hossman wrote: > > > i don't have time to really dig into the code right now, but out of > curiosity what happens when you hit http://localhost:7001/solr/admin/ > and/or http://localhost:7001/solr/admin/index.jsp ? > > I get the same exception when going to both of those.

Re: Deadlock with DirectUpdateHandler2

2008-11-18 Thread Mike Klaas
this needs to be fixed. It isn't as easy as synchronizing didCommit/didRollback, though--this would introduce definite deadlock scenarios. Mark, is there any chance you could post the thread dump for the deadlocked process? Do you issue manual commits during insertion? -Mike

Re: Deadlock with DirectUpdateHandler2

2008-11-18 Thread Mike Klaas
On 18-Nov-08, at 12:18 PM, Mark Miller wrote: Mike Klaas wrote: autoCommitCount is written in a CommitTracker.synchronized block only. It is read to print stats in an unsynchronized fashion, which perhaps could be fixed, though I can't see how it could cause a problem lastAdde

Re: Software Announcement: LuSql: Database to Lucene indexing

2008-11-18 Thread Mike Klaas
& as performant as it can be. Is there a test SQL database that is used to test Solr, so I might try to do some comparisons? Actually, I think that Solr's multithreaded indexing could be improved. It is really only analysis that is parallelizable ATM. -Mike

Re: Filtering on blank fields

2008-11-20 Thread Mike Klaas
field, you are enumerating every possible value of that field and excluding the docs containing it). The solution is to store a token indicating that the field is empty, such as "" (I think that "" works too). Then change your fq to fq=-comments:"" It should be much faster. -Mike

Re: solr.WordDelimiterFilterFactory

2008-11-20 Thread Mike Klaas
d list of tokens not to tokenize like EnglishPorterFilter ? That's a possibility. Another is to add code to filter out short tokens from being generated, and use catenateAll=true -Mike

Re: No search result behavior (a la Amazon)

2008-11-20 Thread Mike Klaas
ay to do it is to issue multiple queries to Solr. -Mike

Re: Highlighting wildcards

2008-11-21 Thread Mike Klaas
On 21-Nov-08, at 3:45 AM, Mark Miller wrote: To do it now, you'd have to switch the query parser to using the old style wildcard (and/or prefix) query, which is slower on large indexes and has max clause issues. An alternative is to query for q=tele?*, which forces wildcardquery

Re: [VOTE] Community Logo Preferences

2008-11-26 Thread Mike Klaas
https://issues.apache.org/jira/secure/attachment/12394350/solr.s4.jpg https://issues.apache.org/jira/secure/attachment/12394268/apache_solr_c_red.jpg https://issues.apache.org/jira/secure/attachment/12393995/sslogo-solr-70s.png https://issues.apache.org/jira/secure/attachment/12393936/logo_remake

Re: Deleting indices

2008-11-27 Thread Mike Klaas
the old docs are gone? Try wiping the index completely: deleteByQuery *:* (it is also more efficient to do this first if you are going to re-index everything). -Mike

Re: Smaller filterCache giving better performance

2008-12-05 Thread Mike Klaas
iltercache code--have you tried the concurrent filter cache impl? -Mike I have posted my setup here: http://www.nabble.com/Throughput-Optimization-td20335132.html. My original filterCache was 700,000. Reducing it to 20,000, I found: - Average response time decreased by 85% - Average throughpu

Re: Get All terms from all documents

2008-12-18 Thread Mike Klaas
particular word, how can i do that? Sounds like you want query autocomplete. The best way to do this (including if you want the box filled with some queries), is to use the query logs, not the documents. -Mike

Re: debugging long commits

2009-01-07 Thread Mike Klaas
y the index until the merge is complete. But I am not familiar enough with this code in lucene to be sure. -Mike On 2-Jan-09, at 10:17 AM, Brian Whitman wrote: I think I'm getting close with this (sorry for the self-replies) I tried an optimize (which we never do) and it took 30m and s

Re: Subscribe Me

2009-01-07 Thread Mike Klaas
Kalidoss, You can subscribe here: http://lucene.apache.org/solr/mailing_lists.html regards, -Mike On 5-Jan-09, at 4:19 AM, kalidoss wrote: Thanks, kalidoss.m, ** DISCLAIMER ** Information contained and transmitted by this E-MAIL is proprietary to Sify Limited and is

Re: Solr on a multiprocessor machine

2009-01-08 Thread Mike Klaas
y case, Jetty). Note that if these instances are sharing a single disk, and your RAM is low, then they will be competing over the slowest resource on your machine and the query could be IO bound, in which case sharding is useless. -Mike

Database permissions integration and Sub documents

2009-01-11 Thread Mike Shredder
all sub-docs in a result set. Which interface do I need to implement to achieve this? 3) If I do duping, my total result count will be off; what is the right way to return an estimated total doc count ... Thanks Mike

Re: non fix highlight snippet size

2009-01-13 Thread Mike Klaas
e of the highlighter (which first generates fragments and only then determine whether they are snippets that contain the keyword(s)) -Mike

Re: OOME diagnosis - possible to disable caching?

2009-01-19 Thread Mike Klaas
ate of the filtercache mem usage by looking at its size. -Mike

Re: Highlighting does not work?

2009-01-27 Thread Mike Klaas
They are documented in http://wiki.apache.org/solr/FieldOptionsByUseCase and in the FAQ, but I agree that it could be more readily accessible. -Mike On 27-Jan-09, at 5:26 AM, Jarek Zgoda wrote: Finally found that the fields have to have an analyzer to be highlighted. Neat. Can I ask

Re: Highlighting does not work?

2009-01-28 Thread Mike Klaas
Well, both pages I listed are in the search results :). But I agree that it isn't obvious to find, and that it should be improved. (The Wiki is a community-created site which anyone can contribute to, incidentally.) cheers, -Mike On 28-Jan-09, at 1:11 AM, Jarek Zgoda wrote: I sw

Re: Highlighting does not work?

2009-01-29 Thread Mike Klaas
Thanks, Jarek. -Mike On 29-Jan-09, at 12:20 AM, Jarek Zgoda wrote: Added appropriate amendment to FAQ, but I'd consider reorganizing information in the whole wiki, like creating a section titled "Common Tasks". Bit of redundancy does not hurt if it comes to documentati

Re: Searching on field A gives spurious highlights in field B

2009-02-06 Thread Mike Klaas
'm using solr 1.3. Try hl.requireFieldMatch=true http://wiki.apache.org/solr/HighlightingParameters -Mike

Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas
solr-user using the web, try nabble: http://www.nabble.com/Solr---User-f14480.html -Mike

Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas
to justify splitting the list into sub-lists (or sub-fora) Fora have the same problems as do mailinglists in terms of people asking the same questions. -Mike

indexing entire text but only storing first N characters?

2009-02-19 Thread Mike Topper
, one that is indexed and not stored and one that is stored and not indexed and only send the first N characters to the stored field? -Mike

Re: indexing entire text but only storing first N characters?

2009-02-19 Thread Mike Topper
Cool, we are actually still on 1.2 but were planning on upgrading to 1.3 is this a feature of 1.3 or just on the nightly builds? -Mike Koji Sekiguchi wrote: > Mike Topper wrote: >> Hello, >> >> In one of the fields in my schema I am sending somewhat large texts. I >&g

Re: public apology for company spam

2009-03-05 Thread Mike Klaas
that I'll make all efforts to prevent it from happening again. It would be forgivable if only the email didn't contain the misspelling "Lucen" :) -Mike

1.4 Release Date?

2009-04-22 Thread Mike Hayes
cally release by then, or should I stick with 1.3? Thanks, Mike

Re: Solr vs Sphinx

2009-05-14 Thread Mike Klaas
the central problems of computer science. -Mike

Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-17 Thread Mike Klaas
Hi Jayson, It is on my list of things to do. I've been having a very busy week and am also working all weekend. I hope to get to it next week sometime, if no-one else has taken it. cheers, -mike On 8-May-09, at 10:15 PM, jayson.minard wrote: First cut of updated handler n

Re: Lock problems: Lock obtain timed out

2010-01-25 Thread mike anderson
uld I consider changing the lock timeout settings (currently set to defaults)? If so, I'm not sure what to base these values on. Thanks in advance, mike On Wed, Nov 4, 2009 at 8:27 PM, Lance Norskog wrote: > This will not ever work reliably. You should have 2x total disk space > for the

Re: solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread mike anderson
I think you might be looking for Apache Tika. On Mon, Jan 25, 2010 at 3:55 PM, Frank van Lingen wrote: > I recently started working with solr and find it easy to setup and tinker > with. > > I now want to scale up my setup and was wondering if there is an > application/component that can do the

implementing profanity detector

2010-01-28 Thread Mike Perham
pipeline again. That's a lot of overhead AFAIK. - A TokenFilter would allow me to tap into the existing analysis pipeline so I get the tokens for free but I can't access the document. Any suggestions on how to best implement this? Thanks in advance, mike

Re: Best OCR API for solr

2010-02-04 Thread mike anderson
There might be an OCR plugin for Apache Tika (which does exactly this out of the box except for OCR capability, I believe). http://lucene.apache.org/tika/ -mike 2010/2/4 Kranti™ K K Parisa > Hi, > > Can anyone list the best OCR APIs available to use in combination with > SOLR.

Bigram term vectors and weights possible with Solr?

2010-02-09 Thread Mike Hughes
Hello, One of the commercial search platforms I work with has the concept of 'document vectors', which are 1-gram and 2-gram phrases and their associated tf/idf weights on a 0-1 scale, i.e. ["banana pie", 0.99] means banana pie is very relevant for this document. During the ingest/indexing proces

Re: Bigram term vectors and weights possible with Solr?

2010-02-09 Thread Mike Hughes
Thank you Ahmet, this is exactly what I was looking for. Looks like the shingle filter can produce 3+-gram terms as well, that's great. I'm going to try this with both western and CJK language tokenizers and see how it turns out. On Tue, Feb 9, 2010 at 5:07 PM, Ahmet Arslan wrote: >> I've been l

implementing profanity detector

2010-02-10 Thread Mike Perham
on how to implement this efficiently with Lucene/Solr. mike On Thu, Jan 28, 2010 at 4:31 PM, Otis Gospodnetic wrote: > > How about this crazy idea - a custom TokenFilter that stores the safe flag in > ThreadLocal? > > > > - Original Message > > From: M

term frequency vector access?

2010-02-11 Thread Mike Perham
In an UpdateRequestProcessor (processing an AddUpdateCommand), I have a SolrInputDocument with a field 'content' that has termVectors="true" in schema.xml. Is it possible to get access to that field's term vector in the URP?

Re: implementing profanity detector

2010-02-12 Thread Mike Perham
false; } } termAtt.setTermBuffer("n", 0, 1); return false; } mike

Re: Solr Performance Issues

2010-03-11 Thread Mike Malloy
disclosure I work at New Relic.) Mike Siddhant Goel wrote: > > Hi everyone, > > I have an index corresponding to ~2.5 million documents. The index size is > 43GB. The configuration of the machine which is running Solr is - Dual > Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2

solr-ruby with clustering

2010-03-22 Thread mike anderson
y the clustering output is nowhere to be found. I noticed that the clustering output has a data type of "Arr", where the response and other components have output of type "Lst", could this be the problem? If anyone can think of some other debugging I could try I'd love to hear it. Thanks in advance, Mike

Re: solr-ruby with clustering

2010-03-22 Thread mike anderson
false alarm, on the client side I was specifically setting a shard, and this was causing my query/solr-ruby/solr to think it was a distributed request, which isn't supported by the clustering component. cheers, mike On Mon, Mar 22, 2010 at 8:53 PM, mike anderson wrote: > Has anybody

DIH: Indexing multiple datasources with the same schema

2010-06-07 Thread Mike Copenhafer
Hi, I don't think my problem is unique, but I couldn't find any answers after an hour of searching... I have two databases with identical schemas and different data. I want to use DIH to index both into a single Solr index (right now, I have them in separate indexes, but I find this cumbersome).

Re: solr index problem

2007-07-18 Thread Mike Klaas
f docs, it's probably not maxBufferedDocs, but when a big Lucene merge is triggered. Could happen when doDeleting the pending docs too. James: try sending commit every 500k docs or so. -Mike

Re: solr index problem

2007-07-18 Thread Mike Klaas
On 18-Jul-07, at 2:58 PM, Yonik Seeley wrote: On 7/18/07, Mike Klaas <[EMAIL PROTECTED]> wrote: Could happen when doDeleting the pending docs too. James: try sending commit every 500k docs or so. Hmmm, right... some of the memory usage will be related to the treemap keeping tr

Re: Filter Search!

2007-07-22 Thread Mike Klaas
bedded mode, but the main problem is likely the values of your parameters. you probably want: Subject:(Math English) Grade:(junior senior) -Mike

Re: Filter Search!

2007-07-23 Thread Mike Klaas
The problem is that your query/filter syntax is incorrect. subject:Math English does _not_ search for subject=Math OR subject=English. It searches for subject=Math OR 'English' in the default search field. You need to use subject:Math subject:English or subject:(Math English) reg

Re: example solr configurartion file

2007-07-27 Thread Mike Klaas
after indexing is done, especially if an is done. If we change the value, do I have to reindex it? 1 This is the only setting that affects search, and it is the maximum length of searchable documents. You will have to reindex to see the changes here. -Mike

Re: Please help! Solr 1.1 HTTP server stops responding

2007-07-30 Thread Mike Klaas
E warnings too--you may have more than two Searchers open at once. -Mike

Re: searching multiple fields

2007-07-30 Thread Mike Klaas
to express requirements. This also plays better with caching. NOT clauses -> fqs required clauses (either OR or AND) -> q + mm purely optional clauses -> bq/bf If you want complicated (read: parenthesized) boolean logic, it's best to develop your own solution. -Mike

Re: Highlighting question

2007-07-31 Thread Mike Klaas
ontains the word in the query? This is because the example Solr distribution is configured to do stemming (see the definition for "text" fieldtype in schema.xml). Remove PorterStemmerFilterFactory to do exact(er) searching/highlighting only. -Mike

Re: Facet on certain characters

2007-08-03 Thread Mike Klaas
s the most performant, but you can also issue a series of facet queries: ?q=...&facet.query=title:a*&facet.query=title:b*&facet.query=title:c*&... Now, if this doesn't have to be query-specific (you want the global counts), you can use TermDocs to get the answer quickly. -Mike
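As a rough sketch, the series of facet queries could be assembled programmatically like this. The helper name, the example query "solr" and the field "title" are placeholders, and facet=true is added on the assumption that faceting is not already switched on in the request handler defaults.

```python
import string
from urllib.parse import parse_qs, urlencode


def prefix_facet_params(query, field, prefixes):
    """Build a query string with one facet.query per leading character."""
    params = [("q", query), ("facet", "true")]
    # A repeated key is allowed: each facet.query is counted separately
    params += [("facet.query", f"{field}:{prefix}*") for prefix in prefixes]
    return urlencode(params)


# Hypothetical request: count titles starting with a, b and c
qs = prefix_facet_params("solr", "title", string.ascii_lowercase[:3])
decoded = parse_qs(qs)["facet.query"]
```

urlencode percent-escapes the ':' and '*' characters, so the string is safe to append after '?' on the select URL; decoded shows the facet queries as Solr would see them.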

Re: embedded solr: write lock issue

2007-08-06 Thread Mike Klaas
pieces to solrconfig.xml: false 10 1000 2147483647 1 1000 Have you tried upping this? The problem might be that you are committing every 1.0s, and a single commit eventually might take longer than this (and you're only waiting 1.0s to acquire the wri

Re: Optimize index

2007-08-08 Thread Mike Klaas
On 8-Aug-07, at 2:09 PM, Jae Joo wrote: How about standformat optimizion? Jae Optimized indexes are always faster at query time than their non-optimized counterparts. Sometimes significantly so. -Mike

Re: retrieving range of fields for the results

2007-08-08 Thread Mike Klaas
e for faceting, we should only append to the original query without the pagination params in order to get the correct faceting results. Right? Faceting ignores pagination/startat/maxresults/etc. regards, -Mike

Re: Too many open files

2007-08-09 Thread Mike Klaas
On 9-Aug-07, at 7:52 AM, Ard Schrijvers wrote: ulimit -n 8192 Unless you have an old, creaky box, I highly recommend simply upping your filedesc cap. -Mike

Re: Creating a document blurb when nothing is returned from highlight feature

2007-08-09 Thread Mike Klaas
y has no way of knowing) what parts of a doc matched, so it would still have to try highlighting first. Note that you can control the cpu usage for long fields by setting hl.maxAnalyzedChars (will be in the next release). best, -Mike

Re: is snapshot/backup consistent (reliable) on the index server?

2007-08-10 Thread Mike Klaas
No. Hard links are alternative names for an inode: when lucene replaces a file, it is creating a new (underlying) inode/file, and the backup "link" points to the old one. Don't think of hard links as "links", but additional logical names for the same physical dat
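The point about hard links can be checked with a small experiment. This is a sketch under stated assumptions: the file names are invented, "replacing a file" is modelled as writing a new file and renaming it over the old name, and the filesystem must support hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    live = os.path.join(d, "segment")        # hypothetical index file
    backup = os.path.join(d, "segment.bak")  # hard link taken as a snapshot
    tmp = os.path.join(d, "segment.tmp")

    with open(live, "w") as f:
        f.write("old")
    os.link(live, backup)  # second name for the same inode

    # Replace the live file: write a new file, then rename it over the old name
    with open(tmp, "w") as f:
        f.write("new")
    os.replace(tmp, live)

    with open(backup) as f:
        backup_content = f.read()
    with open(live) as f:
        live_content = f.read()
    same_inode = os.stat(live).st_ino == os.stat(backup).st_ino
```

After the replacement the two names point at different inodes: the backup name still reads the old data, while the live name reads the new data.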

Re: trying to break up highlighted text on line boundaries

2007-08-14 Thread Mike Klaas
On 13-Aug-07, at 6:18 PM, Benjamin Higgins wrote: (using last night's Solr build) Can't seem to get this to work. I am trying to use the regex highlighter fragment type. The docs suggest looking at the example solrconfig.xml for a demonstration of a fragmentor that splits on sentences. It

Re: solr + carrot2

2007-08-20 Thread Mike Klaas
admin ui that ships with Solr? -Mike

Re: index size

2007-08-20 Thread Mike Klaas
s -sh" will tell you roughly where the space is being occupied. There is something strange going on: 2.5kB * 2.7m is only 6GB, and I have trouble imagining where the 30-fold index size expansion is coming from. -Mike

almost realtime updates with replication

2007-08-22 Thread mike topper
ster than every couple of minutes (say, every 10 seconds)? what if I take out the postcommit hook on the master and just have the snapshooter run on a cron every 5 minutes? -Mike

Re: How to get more documents as response in xml?

2007-08-23 Thread Mike Klaas
In the xml response its displaying the numDocs as 22 but giving only first 10 records. I am unable to get the remaining 12 records. Whether i have to do any configuration in solrconfig.xml? See the 'start' and 'rows' parameters: http://wiki.apache.org/solr/CommonQueryParameters -Mike
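A minimal sketch of how 'start' and 'rows' walk through a result set. The helper name is made up, and the page size of 10 is taken from the number of records returned in the question above.

```python
def page_params(num_found, rows=10):
    """Return the start/rows pairs needed to fetch every matching document."""
    return [{"start": start, "rows": rows} for start in range(0, num_found, rows)]


# 22 matching documents, fetched 10 at a time
pages = page_params(22)
```

For 22 documents this gives three requests, with start=0, start=10 and start=20; the last one returns the final two records. Alternatively, a single request with rows=22 fetches everything at once.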

Re: Embedded about 50% faster for indexing

2007-08-24 Thread Mike Klaas
c per http request, using persistent connections, and threading. -Mike

Re: Embedded about 50% faster for indexing

2007-08-27 Thread Mike Klaas
n handling strategy: are you using persistent http connections? Are you threadedly indexing? cheers, -Mike Paul Sundling -Original Message- From: climbingrose [mailto:[EMAIL PROTECTED] Sent: Monday, August 27, 2007 12:22 AM To: solr-user@lucene.apache.org Subject: Re: Embedded about 50%

Re: Solr and KStem

2007-08-28 Thread Mike Klaas
t;100"> best, -Mike

Re: Multiple indexes

2007-08-29 Thread Mike Klaas
2 billion docs (signed int). On 29-Aug-07, at 6:24 PM, James liu wrote: what is the limits for Lucene and Solr. 100m, 1000m, 5000m or other number docs? 2007/8/24, Walter Underwood <[EMAIL PROTECTED]>: It should work fine to index them and search them. 13 million docs is not even close to t

Re: performance questions

2007-08-30 Thread Mike Klaas
very slow. Any suggestions as to how to improve this? Maybe a problem with HashSets? Try reducing this value to zero: -Mike

Re: Multiple indexes

2007-08-30 Thread Mike Klaas
you will hit physical limits of your machine long before you can achieve your hypothetical situation: that's 20,000 Tb, which is many, many times the size of a complete internet crawl. -Mike 2007/8/30, Mike Klaas <[EMAIL PROTECTED]>: 2 billion docs (signed int). On 29-Aug-07,

Re: Multiple indexes

2007-08-30 Thread Mike Klaas
functionality. Not currently developed. See http://wiki.apache.org/solr/FederatedSearch and http://issues.apache.org/jira/browse/SOLR-303 -Mike -Nathan -Original Message- From: Mike Klaas [mailto:[EMAIL PROTECTED] Sent: Thursday, August 30, 2007 11:44 AM To: solr-user@lucene.apache.org

Re: minimum occurances of term in document

2007-08-30 Thread Mike Klaas
n the appropriate place (eg. as a filter). best, -Mike

Re: Facet for multiple values field

2007-08-30 Thread Mike Klaas
" I can send multiple values? Yes. The one-term-per-field restriction applies to: a) sorting b) _optimization_ of facets. -Mike

Re: Index corruption checker?

2007-08-30 Thread Mike Klaas
On 30-Aug-07, at 12:09 PM, Lance Norskog wrote: Is there an app that walks a Lucene index and checks for corruption? How would we know if our index had become corrupted? Try asking on [EMAIL PROTECTED] -Mike

Re: Embedded Solr w/ multiple indexes

2007-08-30 Thread Mike Klaas
(for Embedded version of Solr)? Probably not until the multiple solr core support patch gets committed. SolrCore is currently a singleton. -Mike
