Re: Indexing from DB connection issue

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess it is better to copy the ClobTransformer.class alone and use the old Solr 1.3 DIH. On Tue, May 26, 2009 at 11:50 PM, ahammad wrote: > > I have an update: > > I played around with it some more and it seems like it's being caused by the > ClobTransformer. If I remove the 'clob="true"' fr

Re: How to deal with hyphens in PDF documents?

2009-05-27 Thread Peter Kiraly
Hi, My solution to this problem in Lucene was to modify Lucene's parser. There is a file in Lucene that is not Java (StandardTokenizer.jj), which defines what a token is and the types of tokens. My rule was that a soft or hard hyphen at the end of a line denotes a word which continues
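
A minimal sketch of the same rule applied as a preprocessing step on the extracted PDF text, before it reaches the analyzer. This is not the StandardTokenizer.jj change described above; it is a swapped-in regex approach, and the function name `join_hyphenated_lines` is hypothetical.

```python
import re

# A soft hyphen (U+00AD) or a hard hyphen-minus sitting at the end of a line,
# between two word characters, is treated as a word that continues on the
# next line. Lookarounds keep the neighbouring letters out of the match.
_LINE_END_HYPHEN = re.compile(r"(?<=\w)[\u00ad-][ \t]*\r?\n[ \t]*(?=\w)")


def join_hyphenated_lines(text: str) -> str:
    """Join words that were split by a hyphen at a line break."""
    return _LINE_END_HYPHEN.sub("", text)


if __name__ == "__main__":
    sample = "The docu-\nment mentions a well-known in\u00ad\ndex."
    print(join_hyphenated_lines(sample))
```

One known limitation of this sketch: a genuine compound that happens to break at the hyphen (for example "well-" at the end of a line followed by "known") is also joined, so the rule trades some precision for recall.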

using field search in morelikethis

2009-05-27 Thread Renz Daluz
Hi, When I'm doing normal search I'm using q=test +field:somevalue&otherparams... How can I implement this using morelikethis by posting the text and q is empty? I tried to use fq= but this is not what I want. Thanks, Renz

Index replication without HTTP

2009-05-27 Thread Ashish P
Hi, I have two instances of embedded server (no HTTP) running on a network with two separate indexes. I want to replicate changes from one index to the other. Is there any way? Thanks, Ashish

Re: Index replication without HTTP

2009-05-27 Thread Shalin Shekhar Mangar
On Wed, May 27, 2009 at 3:06 PM, Ashish P wrote: > > Hi, > I have two instances of embedded server (no http) running on a network with > two separate indexes.. > I want to replicate changes from one index to other. > Is there any way?? > EmbeddedSolrServer is meant for small scale usage -- like

Re: Recover crashed solr index

2009-05-27 Thread Michael McCandless
Hmm... so in fact it looks like Solr had done a number of commits already (especially, given how large your generation is -- the "cje" in segments_cje means there were a number of commits). Were there any other exceptions leading up to this? Disk full? Anything unusual in your Solr configuration?
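
To make the point about the generation concrete: as far as I know, Lucene writes the commit generation in the `segments_N` file name in base 36, so the suffix can be decoded directly. The helper name `segments_generation` below is hypothetical, and the result is only a rough indication of how many commits have happened.

```python
def segments_generation(filename: str) -> int:
    """Decode the base-36 generation suffix of a Lucene segments_N file name."""
    prefix = "segments_"
    if not filename.startswith(prefix):
        raise ValueError(f"not a segments_N file name: {filename!r}")
    # Assumption: the suffix is the generation in radix 36 (digits 0-9, a-z).
    return int(filename[len(prefix):], 36)


print(segments_generation("segments_cje"))  # 16250
```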

Re: [Solr Wiki] Update of "FrontPage" by OscarBernal

2009-05-27 Thread Fergus McMenemie
>Oscar - are you on either of these lists? The front page does not >seem like an appropriate place to list a blog entry. And the >technique used in that blog entry doesn't seem like a best practice to >me anyway, though I'd be curious to hear it debated a bit on solr-user >on ways to offe

Re: [Solr Wiki] Update of "FrontPage" by OscarBernal

2009-05-27 Thread Grant Ingersoll
I would just move it to the blog page. On May 26, 2009, at 9:17 PM, Erik Hatcher wrote: Oscar - are you on either of these lists? The front page does not seem like an appropriate place to list a blog entry. And the technique used in that blog entry doesn't seem like a best practice to me

creating new fields at index time - is it possible?

2009-05-27 Thread Kir4
Hi guys!! I have started studying Solr, and I was wondering if anyone would be so kind as to help me understand a couple of things. I have XML data that looks like this: ID-0 Cheap hotels in Paris? I want to search the data based on a location hier

Re: creating new fields at index time - is it possible?

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, May 27, 2009 at 5:41 PM, Kir4 wrote: > > Hi guys!! > I have started studying Solr, and I was wondering if anyone would be so kind > as to help me understand a couple of things. > > I have XML data that looks like this: >     >        ID-0 >         >            Cheap hotels in Paris? >    

Re: Indexing from DB connection issue

2009-05-27 Thread ahammad
Hmmm, that's probably a good idea...although it does not explain how my current local setup works. Can you please explain how this is done? I am assuming that I need to add the class itself to the source of solr 1.3, and then compile the code, and take the new .war file and put it in Tomcat? If t

Re: Index replication without HTTP

2009-05-27 Thread Bill Au
If you are running on Unix/Linux, you should be able to use the scripts-based replication with some minor modification. You will need to change the scripts where they try to use HTTP to trigger a commit in Solr. Bill On Wed, May 27, 2009 at 5:36 AM, Ashish P wrote: > > Hi, > I have two instances

Re: Solr distributed, questions about upgrading

2009-05-27 Thread Bill Au
I agree. It is always a good idea to start with the example config/schema in the version that you are upgrading to and work your specific settings back into them. Newer versions of Solr will probably have new or changed settings. Even though sometimes the config/schema is backward compatible, I thi

Re: Indexing from DB connection issue

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
Take the trunk dih.jar. Use WinZip/WinRAR or any other tool and delete all the files other than ClobTransformer.class. Put that jar into solr.home/lib. On Wed, May 27, 2009 at 6:10 PM, ahammad wrote: > > Hmmm, that's probably a good idea...although it does not explain how my > current local setup
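
For anyone who prefers a script over WinZip/WinRAR, here is a rough sketch of the same step using Python's standard `zipfile` module. The function name `keep_only_class` and the file paths in the usage comment are hypothetical. It matches on the file name only, so it does not assume a particular package path inside the jar, and it would not keep any inner classes.

```python
import zipfile


def keep_only_class(src_jar, dst_jar, class_name="ClobTransformer.class"):
    """Copy src_jar to dst_jar, keeping only entries whose file name is class_name.

    Returns the list of entry paths that were kept.
    """
    kept = []
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # Compare only the last path component of each entry.
            if info.filename.rsplit("/", 1)[-1] == class_name:
                dst.writestr(info, src.read(info.filename))
                kept.append(info.filename)
    return kept


# Example (hypothetical paths):
# keep_only_class("apache-solr-dataimporthandler-1.4-dev.jar",
#                 "solr-home/lib/apache-solr-dataimporthandler-1.4-dev.jar")
```

Writing to a new file rather than editing in place keeps the original trunk jar around in case the stripped one does not load.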

1.4 Replication

2009-05-27 Thread Matthew Gregg
Does replication in 1.4 support passing credentials/basic auth? If not what is the best option to protect replication?

Re: Indexing from DB connection issue

2009-05-27 Thread ahammad
Would I need to rename it or refer to it somewhere? Or can I keep the existing name (apache-solr-dataimporthandler-1.4-dev.jar)? Cheers Noble Paul നോബിള്‍ नोब्ळ्-2 wrote: > > take the trunk dih.jar. use winzip/winrar or any tool and just delete > all the files other than ClobTransformer.clas

Re: Indexing from DB connection issue

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
no need to rename . On Wed, May 27, 2009 at 6:50 PM, ahammad wrote: > > Would I need to rename it or refer to it somewhere? Or can I keep the > existing name (apache-solr-dataimporthandler-1.4-dev.jar)? > > Cheers > > > Noble Paul നോബിള്‍  नोब्ळ्-2 wrote: >> >> take the trunk dih.jar. use  winzip

Re: 1.4 Replication

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, May 27, 2009 at 6:48 PM, Matthew Gregg wrote: > Does replication in 1.4 support passing credentials/basic auth?  If not > what is the best option to protect replication? do you mean protecting the url /replication ? ideally Solr is expected to run in an unprotected environment. if you wis

Re: Indexing from DB connection issue

2009-05-27 Thread ahammad
Hello, I tried your suggestion, and it still gives me the same error. I'd like to point out again that the same folder/config setup is running on my machine with no issues, but it gives me that stack trace in the logs on the server. When I do the full data import request through the browser, I

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
On Wed, 2009-05-27 at 19:06 +0530, Noble Paul നോബിള്‍ नोब्ळ् wrote: > On Wed, May 27, 2009 at 6:48 PM, Matthew Gregg > wrote: > > Does replication in 1.4 support passing credentials/basic auth? If not > > what is the best option to protect replication? > do you mean protecting the url /replicati

Re: 1.4 Replication

2009-05-27 Thread Toby Cole
I've not figured out a way to use basic auth with replication. We ended up using IP-based auth. It shouldn't be too tricky to add basic auth support as, IIRC, the replication is based on the Commons HttpClient library. On 27 May 2009, at 15:17, Matthew Gregg wrote: On Wed, 2009-05-27 at 19

Re: 1.4 Replication

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
The question is what all do you wish to protect. There are 'read' as well as 'write' attributes. The reads are the ones which will not cause any harm other than consuming some CPU cycles. The writes are the ones which can change the state of the system. The slave uses the 'read' APIs which i f

Re: Indexing from DB connection issue

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
All I can suggest is to write a simple JDBC program and see if it works from that machine (any privilege issue, etc.?) On Wed, May 27, 2009 at 7:15 PM, ahammad wrote: > > Hello, > > I tried your suggestion, and it still gives me the same error. > > I'd like to point out again that the same folder/config s

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
I would like to protect both reads and writes. Reads could have a significant impact. I guess the answer is no, replication has no built-in security? On Wed, 2009-05-27 at 20:11 +0530, Noble Paul നോബിള്‍ नोब्ळ् wrote: > The question is what all do you wish to protect. > There are 'read' as we

Re: 1.4 Replication

2009-05-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
replication has no builtin security On Wed, May 27, 2009 at 8:37 PM, Matthew Gregg wrote: > I would like the to protect both reads and writes. Reads could have a > significant impact.  I guess the answer is no, replication has no built > in security? > > On Wed, 2009-05-27 at 20:11 +0530, Noble

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
That is disappointing then. Restricting by IP may be doable, but much more work than basic auth. On Wed, 2009-05-27 at 20:41 +0530, Noble Paul നോബിള്‍ नोब्ळ् wrote: > replication has no builtin security > > > > On Wed, May 27, 2009 at 8:37 PM, Matthew Gregg > wrote: > > I would like the to p

Re: 1.4 Replication

2009-05-27 Thread Shalin Shekhar Mangar
On Wed, May 27, 2009 at 9:01 PM, Matthew Gregg wrote: > That is disappointing then. Restricting by IP may be doable, but much > more work than basic auth. > > The beauty of open source is that this can be changed :) Please open an issue, we can have basic http authentication made configurable.

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
Bug filed. Thank you. On Wed, 2009-05-27 at 22:40 +0530, Shalin Shekhar Mangar wrote: > On Wed, May 27, 2009 at 9:01 PM, Matthew Gregg wrote: > > > That is disappointing then. Restricting by IP may be doable, but much > > more work than basic auth. > > > > > The beauty of open source is that this

index time boosting on multivalued fields

2009-05-27 Thread Brian Whitman
I can set the boost of a field or doc at index time using the boost attr in the update message, e.g. pet But that won't work for multivalued fields according to the RelevancyFAQ pet animal ( I assume it applies the last boost parsed to all terms? ) Now, say I'd like to do index-time boosting of

Re: term vectors

2009-05-27 Thread Yosvanys Aponte
I understand what you say, but the problem I have is that a user can make a query like this: //tei.2//p"[quijote"] The user wants to find all paragraphs that belong to tei.2 and contain the word "quijote", so I have to search structure and content, because i have the and index format to save the structure and the

Re: term vectors

2009-05-27 Thread Erik Hatcher
On May 27, 2009, at 4:56 PM, Yosvanys Aponte wrote: i undestand what you say but the problem i have is user can make query like this: //tei.2//p"[quijote"] A couple of problems with this... for one, there's no query parser that'll interpret that syntax as you mean it in Solr. And also,

Re: term vectors

2009-05-27 Thread Matt Mitchell
I've been experimenting with the XML + Solr combo too. What I've found to be a good working solution is to: pick out the nodes you want as solr documents (every div1 or div2 etc.) index the text only (with lots of metadata fields) add a field for either the xpath to that node, or save the indivi
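
A rough sketch of the steps listed above, using only Python's standard library: pick out the chosen nodes, keep just their text, and record a path back to each node. The function name `nodes_to_docs` and the field names `id`, `xpath`, and `text` are assumptions for illustration, not a schema from this thread. The metadata fields are left out.

```python
import xml.etree.ElementTree as ET


def nodes_to_docs(xml_text, tags=("div1", "div2")):
    """Turn every element whose tag is in `tags` into a flat dict for indexing."""
    root = ET.fromstring(xml_text)
    docs = []

    def walk(elem, path):
        if elem.tag in tags:
            # Text only: concatenate all descendant text and collapse whitespace.
            text = " ".join("".join(elem.itertext()).split())
            docs.append({"id": path, "xpath": path, "text": text})
        counts = {}
        for child in elem:
            # Positional index among same-named siblings, XPath style (1-based).
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")

    walk(root, f"/{root.tag}")
    return docs


if __name__ == "__main__":
    sample = "<text><body><div1><p>Hello <hi>world</hi></p></div1></body></text>"
    for doc in nodes_to_docs(sample):
        print(doc)
```

Storing the path as a plain string field means the original node can be fetched from the XML source after a hit, while Solr itself only ever sees flat text.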

Re: term vectors

2009-05-27 Thread Walter Underwood
If you really, really need to do XML-smart queries, go ahead and buy MarkLogic. I've worked with the principal folk there and they are really sharp. Their engine is awesome. XML search is hard, and you can't take a regular search engine, even a really good one, and make it do full XML without tons

Re: term vectors

2009-05-27 Thread Otis Gospodnetic
Nice and timely topic for me. You may find this interesting: http://www.jroller.com/otis/entry/xml_dbs_vs_search_engines Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Walter Underwood > To: solr-user@lucene.apache.org > Sent: Wed