Hi,
Product: Solr (Embedded), Version: 1.2
Problem Description :
While trying to add and search over the index, we are stumbling on this
error again and again.
Do note that the SolrCore is committed and closed suitably in our Embedded
Solr.
Error (StackTrace) :
Sep 19, 2007 9:41:41 AM org.
Attachment: TestJettyLargeVolume.java (binary data)
We were doing some performance testing for the updating aspects of Solr and ran into what seems to be a large problem. We're creating small documents with an id and one field of one term only, submitting them in batches of 200 with commits every 50.
I am using Tomcat 6 and Solr 1.2 on a Windows 2003 server using the
following java code. I am trying to index pdf files, and I'm
constantly getting errors on larger files (the same ones).
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
SolrInputDocument addDoc = new SolrInputDocument();
What files are there in your /data/pub/index directory?
Bill
On 9/19/07, Venkatraman S <[EMAIL PROTECTED]> wrote:
>
> Hi ,
>
> Product : Solr (Embedded)Version : 1.2
>
> Problem Description :
> While trying to add and search over the index, we are stumbling on this
> error again and again.
>
Quite interesting actually (this is for 5 documents that were indexed):
_0.fdt _0.prx _1.fnm _1.tis _2.nrm _3.fdx _3.tii _4.frq segments.gen
_0.fdx _0.tii _1.frq _2.fdt _2.prx _3.fnm _3.tis _4.nrm segments_6
_0.fnm _0.tis _1.nrm _2.fdx _2.tii _3.frq _4.fdt _4.prx
_0.frq _1
one other note. the errors pop up when running against the 1.3 trunk
but do not appear to happen when run against 1.2.
- will
On 9/19/07, Will Johnson <[EMAIL PROTECTED]> wrote:
>
> we were doing some performance testing for the updating aspects of solr and
> ran into what seems to be a
Can you start a JIRA issue and attach the patch?
I have not seen this happen, but I bet it is caused by something from:
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.ext.subversion:subversion-commits-tabpanel
Can we add that test to trunk? By default it does not
On Wed, 19 Sep 2007 01:46:53 -0400
Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Stu is referring to Federated Search - where each index has some of the
> data and results are combined before they are returned. This is not yet
> supported out of the "box"
Maybe this is related. How does this comp
On 9/19/07, Norberto Meijome <[EMAIL PROTECTED]> wrote:
> On Wed, 19 Sep 2007 01:46:53 -0400
> Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
> > Stu is referring to Federated Search - where each index has some of the
It really should be Distributed Search I think (my mistake... I
started out calling
I have had this and other files index correctly using a different
combination of Tomcat/Solr versions without any problem (using similar
code; I re-wrote it because I thought it would be better to use Solrj).
I get the same error whether I use a simple StringBuilder to create the
add manually or
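For reference, a hand-rolled add command of the kind described might be built like this (a minimal sketch; the class, field names, and values are illustrative, not from the original post):

```java
// Minimal sketch of building a Solr <add> XML message with StringBuilder,
// as an alternative to constructing it via Solrj. Field names are illustrative.
public class AddXmlBuilder {

    // Escape the XML special characters that matter inside element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String addDoc(String id, String title) {
        StringBuilder sb = new StringBuilder();
        sb.append("<add><doc>");
        sb.append("<field name=\"id\">").append(escape(id)).append("</field>");
        sb.append("<field name=\"title\">").append(escape(title)).append("</field>");
        sb.append("</doc></add>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // The resulting string would be POSTed to Solr's /update handler.
        System.out.println(addDoc("doc1", "T-Shirt"));
    }
}
```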
Nutch implements federated search separately from their index generation.
My understanding is that MapReduce jobs generate the indexes (Nutch calls them
segments) from raw data that has been downloaded, and then makes them available
to be searched via remote procedure calls. Queries never pass t
Jarvis wrote:
> Thanks for your reply,
> I need the Federated Search. You mean this is not yet
> supported out of the "box". So I have a question that
> in this situation what can Collection Distribution used for?
>
The collection distribution scripts help you get duplicate copies of the
same i
I have tried changing those settings, for example, as:
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
((CommonsHttpSolrServer)server).setConnectionTimeout(60);
((CommonsHttpSolrServer)server).setDefaultMaxConnectionsPerHost(100);
((CommonsHttpSolrServer)server).setMaxTotalConnections(
Daley, Kristopher M. wrote:
I have tried changing those settings, for example, as:
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
((CommonsHttpSolrServer)server).setConnectionTimeout(60);
((CommonsHttpSolrServer)server).setDefaultMaxConnectionsPerHost(100);
((CommonsHttpSolrServer)s
I tried 1 and 6, same result.
-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 19, 2007 11:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Index/Update Problems with Solrj/Tomcat and Larger Files
Daley, Kristopher M. wrote:
> I have t
I'm stabbing in the dark here, but try fiddling with some of the other
connection settings:
getConnectionManager().getParams().setSendBufferSize( big );
getConnectionManager().getParams().setReceiveBufferSize( big );
http://jakarta.apache.org/httpcomponents/httpclient-3.x/apidocs/org/apache/c
Ok, I'll try to play with those. Any suggestion on the size?
Something else that is very interesting is that I just tried to do an
aggregate add of a bunch of docs, including the one that always returned
the error.
I called a function to create a SolrInputDocument and return it. I then
did the
Hi
We want to (mis)use facet search to get the number of (unique) field
values appearing in a document resultset.
I thought facet search perfect for this, because it already gives me
all the (unique) field values.
But for us to be used for this special problem, we don't want all the
values listed in the response as there might be over 1 and we don't need
the values at all, just the count of how many!
check the LukeRequestHandler
http://wiki.apache.org/solr/LukeRequestHandler
It gives you lots of fi
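For example, a request like the following (a sketch assuming the default example port; the field name is illustrative):

```text
http://localhost:8983/solr/admin/luke?fl=myfield&numTerms=10
```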
Is there a ticket for this yet? I have a bug report and request: I just did
a snapshot while indexing 700 records/sec. and got an inconsistency. I was
tarring off the snapshot and tar reported that a file changed while it was
being copied. The error rolled off my screen, so I cannot report the file
I believe I saw in the Javadocs for Lucene that there is the ability to
return the unique values for one field for a search, rather than each
record. Is it possible to add this feature to Solr? It is the equivalent of
'select distinct' in SQL.
Thanks,
Lance Norskog
Lance Norskog wrote:
I believe I saw in the Javadocs for Lucene that there is the ability to
return the unique values for one field for a search, rather than each
record. Is it possible to add this feature to Solr? It is the equivalent of
'select distinct' in SQL.
Look into faceting:
http://
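A faceting request approximating 'select distinct' might look like this (a sketch; the field name is an assumption, and facet.limit=-1 asks for all values):

```text
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=myfield&facet.limit=-1&facet.mincount=1
```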
Anyone else using this, and finding it not working in Solr 1.2? Since
we've got an automated release process, I really need to be able to have
the appserver not see itself as done warming up until the firstSearcher
is ready to go... but with 1.2 this no longer seems to be the case.
adam
On 9/19/07, Adam Goldband <[EMAIL PROTECTED]> wrote:
> Anyone else using this, and finding it not working in Solr 1.2? Since
> we've got an automated release process, I really need to be able to have
> the appserver not see itself as done warming up until the firstSearcher
> is ready to go... but
On 9/19/07, Laurent Hoss <[EMAIL PROTECTED]> wrote:
> We want to (mis)use facet search to get the number of (unique) field
> values appearing in a document resultset.
We have paging of facets, so just like normal search results, it does
make sense to list the total number of facets matching.
The
Hi out there,
I just walked through the mailing list archive, but I did not find an
appropriate answer for phrase highlighting.
I do not have any highlighting section (and no dismax handler
definition) in solrconfig.xml. This way (AFAIK :-)), the standard lucene
query syntax should be sup
: Product : Solr (Embedded)Version : 1.2
: java.io.FileNotFoundException: no segments* file found in
: org.apache.lucene.store.FSDirectory@/data/pub/index: files:
According to that, the FSDirectory was empty when it was opened (a file
list is supposed to come after that "files:" part)
you
On 19-Sep-07, at 1:12 PM, Marc Bechler wrote:
Hi out there,
I just walked through the mailing list archive, but I did not find
an appropriate answer for phrase highlighting.
I do not have any highlighting section (and no dismax handler
definition) in solrconfig.xml. This way (AFAIK :-)
: I noticed that the "field list" (fl) parameter ignores field names that it
: cannot locate, while the "query fields" (qf) parameter throws an exception
: when fields cannot be located. Is there any way to override this behavior and
: have qf also ignore fields it cannot find?
Those parameters
Hello,
I have an issue: "T-Shirt" is not found, even though there
are documents with the title "T-Shirt".
The analysis page shows that both the index analyzer and the
query analyzer produce "t" and "shirt" from this.
However, when I search for "t-shirt", I don't find anything.
The product title i
Hi Mike,
thanks for the quick response.
> It would make a great project to get one's hands dirty contributing,
though :)
... sounds like giving a broad hint ;-) Sounds challenging...
Regards from Germany
marc
On 19-Sep-07, at 2:39 PM, Marc Bechler wrote:
Hi Mike,
thanks for the quick response.
> It would make a great project to get one's hands dirty
contributing, though :)
... sounds like giving a broad hint ;-) Sounds challenging...
I'm not sure about that--it is supposed to be a drop-in rep
: The main problem with implementing this is trying to figure out where
: to put the info in a backward compatible manner. Here is how the info
1) this seems like the kind of thing that would only be returned if
requested -- so we probably don't have to be overly concerned about
backwards comp
lance: since the topic you are describing is not directly related to
triggering a snapshot from the web interface, can you please start a new
thread with a unique subject describing in more detail exactly what it
was you were doing and the problem you encountered?
this will make it easier for
Hi, there,
So we are using Tomcat's JNDI method to set up multiple Solr instances
within a Tomcat server. Each instance has a Solr home directory.
Now we want to set up collection distribution for all these solr home
indexes. My understanding is:
1. we only need to run rsync-start once use
However, if I go to the tomcat server and restart it after I have issued
the process command, the program returns and the documents are all
posted correctly!
Very strange behavior... am I somehow not closing the connection
properly?
What version is the Solr you are connecting to? 1.2 or 1.3?
Hi, there,
I used an absolute path for the "dir" param in the solrconfig.xml as below:
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/var/SolrHome/solr/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
  <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>
However, I got "snapshooter: not found" exception thrown in catalina.out.
I don't see why this doesn't
Nutch has two ways to make a distributed query: through HDFS (the Hadoop file
system) or the RPC call that is in the
"org.apache.nutch.searcher.DistributedSearch" class.
But I think neither of these is good enough.
If we use HDFS to service the user's query, stability is a problem. We must
all do the crawl ,
Hey all,
Let's say I have an index of one hundred documents, and these
documents are grouped into 4 groups A, B, C, and D. The groups do in
fact overlap. What would people recommend as the best way to apply a
search query and return only the documents that are in group A? Also,
how about
On Wed, 19 Sep 2007 10:29:54 -0400
"Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> > Maybe this is related. How does this compare to the map-reduce
> > functionality in Nutch/Hadoop ?
>
> map-reduce is more for batch jobs. Nutch only uses map-reduce for
> parallel indexing, not searching.
I see.
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents. I've been experimenting with
MoreLikeThis and values returned by the "mlt.interestingTerms" parameter and
so far this approach has worked well. However, I'd like to be able to
analyze docu
I think the index data stored in HDFS and generated by the map-reduce function
is used for searching in Nutch 0.9.
You can see the code in the "org.apache.nutch.searcher.NutchBean" class. :)
Jarvis
-Original Message-
From: Norberto Meijome [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 2
See this recent thread for some helpful info:
http://www.nabble.com/solr-doesn%27t-find-exe-in-postCommit-event-tf4264879.html#a12167792
You'll probably want to configure your exe with an absolute path rather than
the dir:
/var/SolrHome/solr/bin/snapshooter
In order to get the snap
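Putting that together, the listener entry might look like this (a sketch following the stock solrconfig.xml example, with the absolute exe path substituted):

```xml
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">/var/SolrHome/solr/bin/snapshooter</str>
  <bool name="wait">true</bool>
</listener>
```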
On Thu, 20 Sep 2007 09:37:51 +0800
"Jarvis" <[EMAIL PROTECTED]> wrote:
> If we use the RPC call in nutch .
Hi,
I wasn't suggesting using Nutch in Solr... I'm only a young grasshopper in this
league to be suggesting architecture stuff :) but I imagine there's nothing
wrong with using what they've b
On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents.
We do it manually (not in solr, but we put the results in solr.) We
do it the usual way - chunk (into n-grams, named entities & nou
Sounds like you're on the right track; if your groups overlap (i.e. a
document can be in groups A and B), then you should ensure your "groups"
field is multivalued.
If you are searching for "foo" in documents contained in group "A", then it
might be more efficient to use a filter query (fq) like:
q
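In other words, something along these lines (the query term and group value are illustrative):

```text
q=foo&fq=groups:A
```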
Hi,
What you say is done by Hadoop, which supports hardware failure handling, data
replication, and more.
If we want to implement such a good system by ourselves with Solr but
without HDFS, it's very, very complex work, I think. :)
I just want to know whether there is a component exis
Thanks Brian, I think the "smart" approaches you refer to might be outside
the scope of my current project. The documents I am indexing already have
manually-generated keyword data; moving forward I'd like to have these
keywords automatically generated, selected from a pre-defined list of
keywords
On 19-Sep-07, at 7:21 PM, Jarvis wrote:
Hi,
What you say is done by Hadoop, which supports hardware failure handling, data
replication, and more.
If we want to implement such a good system by ourselves with Solr but
without HDFS, it's very, very complex work, I think. :)
I just want
Hi, Pieter,
Thanks! Now the exception is gone. However, there's no snapshot file
created in the data directory. Strangely, the snapshooter.log seems to
complete successfully. Any idea what else I'm missing?
$ cat var/SolrHome/solr/logs/snapshooter.log
2007/09/19 20:16:17 started by solruser
200
If you don't need to pass any command line arguments to snapshooter, remove
(or comment out) this line from solrconfig.xml:
<arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
By the same token, if you're not setting environment variables either,
remove the following line as well:
<arr name="env"> <str>MYVAR=val1</str> </arr>
Once you alter / remove those two lines,
On Thu, 20 Sep 2007 10:02:08 +0800
"Jarvis" <[EMAIL PROTECTED]> wrote:
> You can see the code in "org.apache.nutch.searcher.NutchBean" class . :)
thx for the pointer.
_
{Beto|Norberto|Numard} Meijome
"In order to avoid being called a flirt, she always yielded easily."
On Thu, 20 Sep 2007 10:21:39 +0800
"Jarvis" <[EMAIL PROTECTED]> wrote:
> What you say is done by hadoop that support Hardware Failure, Data
> Replication and some else .
> If we want to implement such a good system by ourselves without HDFS
> but Solr , it's a very very complex work I
Along similar lines:
assuming that I have 2 indexes on the same box, say at
/home/abc/data/index1 and /home/abc/data/index2,
and I want the results from both the indexes when I do a search - then how
should this be 'optimally' designed - basically these are different Solr
homes and I want th
On 9/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> you imply that you are building your index using embedded solr, but based
> on your stack trace it seems you are using Solr in a servlet container ...
> i assume to search the index you've already built?
I have a jsp that routes the
Thanks, it works now.
regards,
-Hui
On 9/19/07, Pieter Berkel <[EMAIL PROTECTED] > wrote:
>
> If you don't need to pass any command line arguments to snapshooter,
> remove
> (or comment out) this line from solrconfig.xml:
>
> arg1 arg2
>
> By the same token, if you're not setting environment