How to avoid case sensitive search?

2009-03-25 Thread con

Re: Deleting documents

2009-03-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
how are you posting the xml ? missing content stream means that the POST data is missing On Wed, Mar 25, 2009 at 7:03 PM, Rui Pereira wrote: > I'm trying to delete documents based on the following type of update > requests: > topologyid:3140topologyid:3142 > > This doesn't cause any changes on i
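The "missing content stream" point can be illustrated with a short Python sketch: a delete-by-query needs an XML body in the POST data plus a matching Content-Type header. This only builds the request and does not send it. The update URL and the helper name `build_delete_request` are assumptions, and the two `topologyid` values come from the quoted question.

```python
from urllib.request import Request
from xml.sax.saxutils import escape

# Assumed update endpoint of a default local Solr install; adjust as needed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_delete_request(queries, url=SOLR_UPDATE_URL):
    """Build (but do not send) a delete-by-query POST request.

    The queries are OR-ed into a single <query> element so that one
    <delete> message covers all of them.
    """
    combined = " OR ".join(queries)
    # Escape &, < and > so the query text cannot break the XML body.
    body = "<delete><query>%s</query></delete>" % escape(combined)
    return Request(
        url,
        data=body.encode("utf-8"),  # a non-empty body is the "content stream"
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = build_delete_request(["topologyid:3140", "topologyid:3142"])
print(req.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen(req)`) and then posting a `<commit/>` message in the same way would make the deletion visible to searchers.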

Re: Scheduling DIH

2009-03-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
right now a cron job is the only option. building this into DIH has been a common request? What do others think about this? On Thu, Mar 26, 2009 at 10:11 AM, Tricia Williams wrote: > Hello, > >   Is there a best way to schedule the DataImportHandler?  The idea being to > schedule a delta-import

Scheduling DIH

2009-03-25 Thread Tricia Williams
Hello, Is there a best way to schedule the DataImportHandler? The idea being to schedule a delta-import every Sunday morning at 7am or perhaps every hour without human intervention. Writing a cron job to do this wouldn't be difficult. I'm just wondering is this a built in feature? Tric
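A cron-based schedule like the one described could be sketched as follows. This Python snippet only builds the delta-import URL and a matching crontab line. The handler path `/solr/dataimport` and both helper names are assumptions, since the path depends on how the DataImportHandler is registered in solrconfig.xml.

```python
from urllib.parse import urlencode

# Assumed DIH handler path; it depends on the solrconfig.xml registration.
DIH_URL = "http://localhost:8983/solr/dataimport"


def delta_import_url(base=DIH_URL, commit=True):
    """Return the URL that triggers a DataImportHandler delta-import."""
    params = {"command": "delta-import", "commit": "true" if commit else "false"}
    return base + "?" + urlencode(params)


def crontab_line(url, schedule="0 7 * * 0"):
    """Return a crontab entry that requests the URL with curl.

    "0 7 * * 0" means 07:00 every Sunday; "0 * * * *" would be hourly.
    """
    return '%s curl -s "%s" > /dev/null' % (schedule, url)


print(crontab_line(delta_import_url()))
```

The printed line can be pasted into a crontab; swapping the schedule string for `"0 * * * *"` gives the hourly variant mentioned in the question.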

Re: delta-import commit=false doesn't seems to work

2009-03-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
take a look at this http://wiki.apache.org/solr/SolrPerformanceFactors#head-4ea89b13099bdaf11d82e54303d2408220c12f22 On Wed, Mar 25, 2009 at 4:07 AM, sunnyfr wrote: > > Hi, > Sorry I still don't know what should I do ??? > I can see in my log which clearly optimize somewhere even if my command i

Re: Not able to configure multicore

2009-03-25 Thread mitulpatel
Actually, solr2 is an application other than the default one (example), on which I have configured my application. Let me explain things in more detail: my application path is http://localhost:8983/solr2/admin and I would like to configure it for multi-cores, so I have placed solr.xml in config dir

Re: Delta import

2009-03-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Alex , you may be able to use CachedSqlEntityprocessor. you can do delta-import using full-import http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta the inner entity can use a CachedSqlEntityProcessor On Thu, Mar 26, 2009 at 1:45 AM, AlexxelA wrote: > > Yes my database is remot

Re: Solr OpenBitSet OutofMemory Error

2009-03-25 Thread Otis Gospodnetic
Hi, I'm not sure if anyone will be able to help without more detail. First suggestion would be to look at Solr with a debugger/profiler to see where memory is used up. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: smock > To: solr-us

Re: Partition index by time using Solr

2009-03-25 Thread Otis Gospodnetic
Hi, Yes, you can use Solr for this, but index partitioning should be done outside of Solr. That is, your app will need to know where to send each doc based on its timestamp, when and where to create new index (new Solr core), and so on. Similarly, deleting older than N days is done by you, u
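The application-side routing described here might look like the Python sketch below: one core per day, chosen from the document timestamp, plus a helper that lists cores older than N days. The `events_` core-name prefix and both function names are hypothetical. Creating and unloading the cores themselves is left to whatever core administration mechanism is in use.

```python
from datetime import date, datetime, timedelta

CORE_PREFIX = "events_"  # hypothetical naming scheme: events_YYYYMMDD


def core_for(timestamp):
    """Return the name of the daily core a document belongs to."""
    return CORE_PREFIX + timestamp.strftime("%Y%m%d")


def expired_cores(core_names, today, keep_days):
    """Return the daily cores whose day is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in core_names:
        if not name.startswith(CORE_PREFIX):
            continue  # not one of our daily cores, leave it alone
        day = datetime.strptime(name[len(CORE_PREFIX):], "%Y%m%d").date()
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


print(core_for(datetime(2009, 3, 25, 15, 53)))
print(expired_cores(["events_20090301", "events_20090320"], date(2009, 3, 25), 7))
```

Dropping a whole core per expired day avoids delete-by-query over a large index, which is the main attraction of partitioning by time.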

Re: large index vs multicore

2009-03-25 Thread Otis Gospodnetic
Hi, Without knowing the details, I'd say keep it in the same index if the additional information shares some/enough fields with the main product data and separately if it's sufficiently distinct (this also means 2 queries and manual merging/joining). Otis -- Sematext -- http://sematext.com/

Re: large index vs multicore

2009-03-25 Thread Ryan McKinley
My question is - From design and query speed point of - should I add new core to handle the additional data or should I add the data to the existing core. Do you ever need to get results from both sets of data in the same query? If so, putting them in the same index will be faster. If

Re: get all facets

2009-03-25 Thread Ashish P
Actually what I meant was if there are 100 indexed fields. So there are 100 facet fields right.. So whenever I create solrQuery, I have to do addFacetField("fieldName") can I avoid this and just get all facet fields. Sorry for the confusion. Thanks again, Ashish Shalin Shekhar Mangar wrote: >

solr_hostname in scripts.conf

2009-03-25 Thread Garafola Timothy
I've a question. Is it safe to use 'localhost' as solr_hostname in scripts.conf? -- -Tim

large index vs multicore

2009-03-25 Thread Manepalli, Kalyan
Hi All, In my project, I have one primary core containing all the basic information for a product. Now I need to add additional information which will be searched and displayed in conjunction with the product results. My question is - From design and query speed point of - should I ad

Re: SRW/U and OAI-PMH servers over solr

2009-03-25 Thread Ryan McKinley
I implemented OAI-PMH for solr a few years back for the Massachusetts library system... it appears not to be running right now, but check... http://www.digitalcommonwealth.org/ It would be great to get that code revived and live open source somewhere. As is, it uses a pre 1.3 release tha

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Cloude Porteus
I set the autowarm to 2000, which only takes about two minutes and resolves my issues. Thanks for your help! best, cloude On Wed, Mar 25, 2009 at 9:34 AM, Ryan McKinley wrote: > It looks like the cache is configured big enough, but the autowarm count is > too big to have good performance. > >

Re: Delta import

2009-03-25 Thread AlexxelA
Yes my database is remote, mysql 5 and i'm using connector/J 5.1.7. My index has 2 documents. When i try to do lets say 14 updates it takes about 18 sec total. Here's the resulting log of the operation : 2009-03-25 15:53:57 org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Ti

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
Try using a DB for permission management; when you want to make a repository public, you just have to add its id or name to every user's permissions field. I think you don't need to add any "is_public" field to the index, just an id or name field for the repository the indexed doc is in. So you can pre-filter the repositories querying the

Partition index by time using Solr

2009-03-25 Thread vivek sar
Hi, I've used Lucene before, but new to Solr. I've gone through the mailing list, but unable to find any clear idea on how to partition Solr indexes. Here is what we want, 1) Be able to partition indexes by timestamp - basically partition per day (create a new index directory every day) 2)

SRW/U and OAI-PMH servers over solr

2009-03-25 Thread Miguel Coxo
Hello there, I'm looking for a way to implement SRW/U and a OAI-PMH servers over solr, similar to what i have found here: http://marc.info/?l=solr-dev&m=116405019011211&w=2 . Well actually if it is decoupled (not a plugin) would be ok, if not better =). I wanted to know if anyone knows if there i

Re: Realtime Searching..

2009-03-25 Thread Otis Gospodnetic
Would it not make more sense to wait for the Lucene's IW+IR marriage and other things happening in core Lucene that will make near-real-time search possible? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: John Wang > To: solr-user@lucen

Re: Can TermIndexInterval be set in Solr?

2009-03-25 Thread Otis Gospodnetic
I think it's the latter. I don't think the term interval is exposed anywhere. If you expose it through the config and provide a patch, I think we can add this to the core quickly. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: "Burton-

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
OK, we're getting closer. I just have two final questions regarding this then: 1. This would also include all the public repositories, right? If so, how would such a query look? Some kind of is_public:true AND ...? 2. When a repository is made public, the is_public property in the Solr index need

Solr OpenBitSet OutofMemory Error

2009-03-25 Thread smock
Hello, After running a nightly release from around January of Solr for about 4 weeks without any problems, I'm starting to see OutofMemory errors: Mar 24, 2009 1:35:36 AM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.util

Re: getting started

2009-03-25 Thread Erick Erickson
OK, now I'll turn it over to the folks who actually maintain that site. Meanwhile, here's the link to the 2.4.1 query syntax. http://lucene.apache.org/java/2_4_1/queryparsersyntax.html Best Erick On Wed, Mar 25, 2009 at 2:00 PM, nga pham wrote: > http://lucene.apache.org/solr/tutorial.html#G

Re: Realtime Searching..

2009-03-25 Thread John Wang
Hi Jon: We are running various LinkedIn search systems on Zoie in production. -John On Thu, Feb 19, 2009 at 9:11 AM, Jon Baer wrote: > This part: > > The part of Zoie that enables real-time searchability is the fact that > ZoieSystem contains three IndexDataLoader objects: > >* a RAMLuc

Re: getting started

2009-03-25 Thread nga pham
http://lucene.apache.org/solr/tutorial.html#Getting+Started link - lucene QueryParser syntax is not working On Wed, Mar 25, 2009 at 10:48 AM, nga pham wrote: > Oops my mistake. Sorry for the trouble > > On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson < > erickerick...@gmail.com> wrote: > >>

Can TermIndexInterval be set in Solr?

2009-03-25 Thread Burton-West, Tom
Hello all, We are experimenting with the ShingleFilter with a very large document set (1 million full-text books). Because the ShingleFilter indexes every word pair as a token, the number of unique terms increases tremendously. In our experiments so far the tii and tis files are getting very l

Re: getting started

2009-03-25 Thread nga pham
Oops my mistake. Sorry for the trouble On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson wrote: > Which links? Please be as specific as possible. > > Erick > > On Wed, Mar 25, 2009 at 1:20 PM, nga pham wrote: > > > Hi > > > > Some of the getting started link dont work. Can you please enable it?

Re: getting started

2009-03-25 Thread Erick Erickson
Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham wrote: > Hi > > Some of the getting started link dont work. Can you please enable it? >

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Kurt Nordstrom
Otis, Absolutely. Here are the tokenizers and filters for the "text" fieldtype in the schema. http://pastebin.com/f2bb249f3 Thanks! That's what I suspected. Want to paste the relevant tokenizer+filters sections of your schema? The index-time and query-time analysis has to be the same or c

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
OK, so you can create a table in a DB with a row for each user and a field listing the repositories he/she can access. Then you just have to look in the DB and include the repository name in the index. So you just have to check (using query parameters) that the query is run against the right repositories fo

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
Hm, I must be missing something, then. Consider this. There are three repositories, A and B, C. There are two users, U1 and U2. Repository A is public, while B and C are private. Only U1 can access B. No one can access C. I index this data, such that Is_Private is true for B. Now, when U2 sear

getting started

2009-03-25 Thread nga pham
Hi, Some of the getting started links don't work. Can you please enable them?

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Otis Gospodnetic
That's what I suspected. Want to paste the relevant tokenizer+filters sections of your schema? The index-time and query-time analysis has to be the same or compatible enough, and that's not the case here. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Mess

RE: REST interface for Query

2009-03-25 Thread Olson, Curtis B
Otis, that very much looks like what I'm after. Curtis > -Original Message- > From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] > Sent: Wednesday, March 25, 2009 12:53 PM > To: solr-user@lucene.apache.org > Subject: Re: REST interface for Query > > > Curtis, > > Like this? >

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
i can't see the problem about that. you can manage your users using a DB and keep there the permissions they could have, and create or erase users without problems. you just have to manage a "working index" field for each user with repositories' ids he can access. or u can create several indexes an

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Kurt Nordstrom
Otis: Okay, I'm not sure whether I should be including the quotes in the query when using the analyzer, so I've run it both ways (no quotes on the index value). I'll try to approximate the final "tables" returned for each term: The field is dc_subject in both cases, being of type "text" *** V

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
you can even create separated indexes for private or public access if u need (and place them in separated machines), but i think Eric's suggestion is the best and easier On Wed, Mar 25, 2009 at 5:52 PM, Jesper Nøhr wrote: > Hi list, > > I've finally settled on Solr, seeing as it has almost every

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
On Wed, Mar 25, 2009 at 5:57 PM, Eric Pugh wrote: > You could index the user name or ID, and then in your application add as > filter the username as you pass the query back to Solr.  Maybe have a > access_type that is Public or Private, and then for public searches only > include the ones that me

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Eric Pugh
You could index the user name or ID, and then in your application add as filter the username as you pass the query back to Solr. Maybe have a access_type that is Public or Private, and then for public searches only include the ones that meet the access_type of Public. Eric On Mar 25, 200
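The filtering idea above can be sketched in Python as a function that builds the query parameters the application would pass to Solr. The `access_type` field comes from the message; the `owner` field name and the helper `search_params` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def search_params(user_query, username=None):
    """Build Solr parameters that restrict results to what a user may see.

    Anonymous searches see only public documents; a logged-in user also
    sees documents indexed with their own name in the (assumed) owner field.
    """
    public = "access_type:Public"
    if username is None:
        fq = public
    else:
        # Quote the name so spaces or colons are not parsed as query syntax.
        safe = username.replace("\\", "\\\\").replace('"', '\\"')
        fq = '%s OR owner:"%s"' % (public, safe)
    return {"q": user_query, "fq": fq}


print(urlencode(search_params("readme", "U1")))
```

Putting the restriction in `fq` rather than in `q` keeps it out of relevance scoring and lets Solr cache the filter separately from the user's query.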

Re: REST interface for Query

2009-03-25 Thread Otis Gospodnetic
Curtis, Like this? https://issues.apache.org/jira/browse/SOLR-839 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: "Olson, Curtis B" > To: solr-user@lucene.apache.org > Sent: Wednesday, March 25, 2009 12:28:35 PM > Subject: REST interfac

How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
Hi list, I've finally settled on Solr, seeing as it has almost everything I could want out of the box. My setup is a complicated one. It will serve as the search backend on Bitbucket.org, a mercurial hosting site. We have literally thousands of code repositories, as well as users and other data.

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Ryan McKinley
It looks like the cache is configured big enough, but the autowarm count is too big to have good performance. Try something smaller and see if that fixes both problems. I imagine even just warming the most recent 100 queries would precache the most important ones, but try some higher numbe

REST interface for Query

2009-03-25 Thread Olson, Curtis B
Greetings, I am a new subscriber. I'm Curtis Olson and I work for CACI under contract at the U.S. Department of State, where we deal with massive quantities of documents, so Solr is ideal for us. We have a good sized index that we are starting to build up in development. Some of the filter

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Otis Gospodnetic
Hi, If you want to fill up the new cache set the autowarmCount to something high (e.g. same number as the cache size), but be prepared to pay the price in warmupTime and thus hit those onDeckSearchers warming again. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Ori

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Otis Gospodnetic
Hi, Take the whole string to your Solr Admin -> Analysis page and analyze it. Does it get analyzed the way you'd expect it to be analyzed? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Kurt Nordstrom > To: solr-user@lucene.apache.org

Re: speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
Thanks for the quick reply. the box has 8 real cpu's. Perhaps a good idea then to reduce the nr of cores to 8 as well. I'm testing out a different scenario with multiple boxes as well, where clients persist docs to multiple cores on multiple boxes. (which is what multicore was invented for after

Strange anomaly(?) with string matching in query

2009-03-25 Thread Kurt Nordstrom
Hello, We've encountered a strange issue in our Solr install regarding a particular string that just doesn't seem to want to return results, despite the exact same string being in the index. What makes it even stranger is that we had the same data in a previous install of Solr, and it worked the

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Cloude Porteus
Yes, I guess I'm running 40k queries when it starts :) I didn't know that each count was equal to a query. I thought it was just copying the cache entries from the previous searcher, but I guess that wouldn't include new entries. I set it to the size of our filterCache. What should I set the au

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Ryan McKinley
I don't understand why this sometimes takes two minutes between the start commit & /update and sometimes takes 20 minutes? One of our caches has about ~40,000 items, but I can't imagine it taking 20 minutes to autowarm a searcher. What do your cache configs look like? How big is the auto

Re: Hardware Questions...

2009-03-25 Thread Otis Gospodnetic
Ah, it's hard to tell. I look at index size on disk, number of docs, query rate, types of queries, etc. Are you actually seeing problems with your existing servers? Or see specific performance movement in one of the aspects? (e.g. increasing latency, increased GC or memory usage, increased

Re: Not able to configure multicore

2009-03-25 Thread Otis Gospodnetic
Hm, where does that /solr2 come from? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: mitulpatel > To: solr-user@lucene.apache.org > Sent: Wednesday, March 25, 2009 12:30:11 AM > Subject: Re: Not able to configure multicore > > > > >

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Otis Gospodnetic
Hm, I can't quite tell from here, but that is just a warning, so it's not super problematic at this point. Could it be that one of your other caches (query cache) is large and lots of items are copied on searcher flip? Could it be that your JVM doesn't have a large enough heap, or enough free heap? Ca

Re: Copy solr indexes from 2 solr instance

2009-03-25 Thread Otis Gospodnetic
Prerna, You could create an index snapshot with snapshooter script and then copy the index. You should do that while the source index is not getting modified. Re issue #2: run optimize Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: pre

Re: speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Otis Gospodnetic
Britske, Here are a few quick ones: - Does that machine really have 10 CPU cores? If it has significantly fewer, you may be beyond the "indexing sweet spot" in terms of indexer threads vs. CPU cores - Your maxBufferedDocs is super small. Comment that out anyway. Use ramBufferSizeMB and s

speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
Hi, I'm having difficulty indexing a collection of documents in a reasonable time. It's now going at 20 docs / sec on a c1.xlarge instance of Amazon EC2, which just isn't enough. This box has 8GB RAM and the equivalent of 20 Xeon processors. These documents have a couple of stored, indexed, m

Copy solr indexes from 2 solr instance

2009-03-25 Thread prerna07
Hi, Issue 1: I have 2 solr instances, i need to copy indexes from solr1 instance to solr2 without restarting the solr. Please suggest how will this work. Both solr are on multicore setup. Issue2: I deleted all indexes from solr and reloaded my core, solr admin return 0 results. The size of ind

Deleting documents

2009-03-25 Thread Rui Pereira
I'm trying to delete documents based on the following type of update requests: topologyid:3140topologyid:3142 This doesn't cause any changes to the index and if I try to read the response, the following error occurs: 13:32:35,196 ERROR [STDERR] 25/Mar/2009 13:32:35 org.apache.solr.update.processor.Log

Re: Status of an update request

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 12:42 PM, Pierre-Yves LANDRON wrote: > > Hello, > > When I send an update or a commit to solr via curl, the response I get is > formated in HTML ; I can't find a way to have a machine readable response > file. > Here what is said on the subject in the solr config file : > "

Re: Anyone use solr admin and Opera?

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 1:33 PM, ristretto.rb wrote: > Hello, I'm a happy Solr user. Thanks for the excellent software!! > Hopefully this is a good question, I have indeed looked around the FAQ > and google and such first. > I have just switched from Firefox to Opera for web browsing. (Another

Re: numeric range facets

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 3:26 PM, Ashish P wrote: > > Similar to getting range facets for date where we specify start, end and > gap. > Can we do the same thing for numeric facets where we specify start, end and > gap. No. But you can do this with multiple queries by using facet.field with fq pa
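The reply above is cut off, so the following is not necessarily what was proposed. One common workaround is to emulate start/end/gap by sending one `facet.query` range per bucket, sketched here in Python. The helper name, the `price` field, and the assumption of an integer-valued field are all illustrative. Because `[a TO b]` is inclusive at both ends, each bucket stops at `upper - 1` so that buckets do not overlap.

```python
def range_facet_queries(field, start, end, gap):
    """Return one facet.query string per bucket covering [start, end).

    Assumes an integer-valued field; every bucket is an inclusive range.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    queries = []
    lower = start
    while lower < end:
        upper = min(lower + gap, end)
        queries.append("%s:[%d TO %d]" % (field, lower, upper - 1))
        lower = upper
    return queries


# Each bucket becomes its own facet.query parameter on the request.
params = [("q", "*:*"), ("facet", "true")]
params += [("facet.query", q) for q in range_facet_queries("price", 0, 300, 100)]
print(params)
```

The counts come back keyed by the query string, so the application can map each one to its bucket. Range queries only behave numerically if the field type sorts numerically in the index.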

Re: get all facets

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 7:30 AM, Ashish P wrote: > > Can I get all the facets in QueryResponse?? You can get all the facets that are returned by the server. Set facet.limit to the number of facets you want to retrieve. See http://lucene.apache.org/solr/api/solrj/org/apache/solr/client/solrj/So

numeric range facets

2009-03-25 Thread Ashish P
Similar to getting range facets for date where we specify start, end and gap. Can we do the same thing for numeric facets where we specify start, end and gap.

Anyone use solr admin and Opera?

2009-03-25 Thread ristretto.rb
Hello, I'm a happy Solr user. Thanks for the excellent software!! Hopefully this is a good question, I have indeed looked around the FAQ and google and such first. I have just switched from Firefox to Opera for web browsing. (Another story) When I use the solr/admin the home page and stats works

Re: lucene-java version mismatches

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 12:30 PM, Paul Libbrecht wrote: > could I suggest that the maven repositories are populated next-time a >>> release of "solr-specific-lucenes" are made? >>> >> But they are? It is inside the org.apache.solr group since those lucene >> jars >> are released by Solr -- http://

Status of an update request

2009-03-25 Thread Pierre-Yves LANDRON
Hello, When I send an update or a commit to solr via curl, the response I get is formatted in HTML; I can't find a way to have a machine readable response file. Here is what is said on the subject in the solr config file: "The response format differs from solr1.1 formatting and returns a standard

Re: lucene-java version mismatches

2009-03-25 Thread Paul Libbrecht
could I suggest that the maven repositories are populated next-time a release of "solr-specific-lucenes" are made? But they are? It is inside the org.apache.solr group since those lucene jars are released by Solr -- http://repo2.maven.org/maven2/org/apache/solr/ Nope, http://repo1.maven.org/