Hi,
Product: Solr (Embedded), Version: 1.2
Problem Description :
While trying to add and search over the index, we are stumbling on this
error again and again.
Do note that the SolrCore is committed and closed suitably in our Embedded
Solr.
Error (StackTrace) :
Sep 19, 2007 9:41:41 AM org.
Attachment: TestJettyLargeVolume.java (binary data)
We were doing some performance testing for the updating aspects of Solr and ran into what seems to be a large problem. We're creating small documents with an id and one field of one term only, submitting them in batches of 200 with commits every 50.
I am using Tomcat 6 and Solr 1.2 on a Windows 2003 server using the
following java code. I am trying to index pdf files, and I'm
constantly getting errors on larger files (the same ones).
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
SolrInputDocument addDoc = new SolrInputDocument();
What files are there in your /data/pub/index directory?
Bill
On 9/19/07, Venkatraman S <[EMAIL PROTECTED]> wrote:
>
> Hi ,
>
> Product : Solr (Embedded)Version : 1.2
>
> Problem Description :
> While trying to add and search over the index, we are stumbling on this
> error again and again.
>
Quite interesting actually (this is for 5 documents that were indexed):
_0.fdt _0.prx _1.fnm _1.tis _2.nrm _3.fdx _3.tii _4.frq segments.gen
_0.fdx _0.tii _1.frq _2.fdt _2.prx _3.fnm _3.tis _4.nrm segments_6
_0.fnm _0.tis _1.nrm _2.fdx _2.tii _3.frq _4.fdt _4.prx
_0.frq _1
one other note. the errors pop up when running against the 1.3 trunk
but do not appear to happen when run against 1.2.
- will
On 9/19/07, Will Johnson <[EMAIL PROTECTED]> wrote:
>
> we were doing some performance testing for the updating aspects of solr and
> ran into what seems to be a
Can you start a JIRA issue and attach the patch?
I have not seen this happen, but I bet it is caused by something from:
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.ext.subversion:subversion-commits-tabpanel
Can we add that test to trunk? By default it does not
On Wed, 19 Sep 2007 01:46:53 -0400
Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Stu is referring to Federated Search - where each index has some of the
> data and results are combined before they are returned. This is not yet
> supported out of the "box"
Maybe this is related. How does this comp
On 9/19/07, Norberto Meijome <[EMAIL PROTECTED]> wrote:
> On Wed, 19 Sep 2007 01:46:53 -0400
> Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
> > Stu is referring to Federated Search - where each index has some of the
It really should be Distributed Search I think (my mistake... I
started out calling
I have had this and other files index correctly using a different
combination of Tomcat/Solr versions without any problem (using similar
code; I re-wrote it because I thought it would be better to use Solrj).
I get the same error whether I use a simple StringBuilder to create the
add manually or
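For reference, a hand-rolled add command of the kind described might be built like this (a minimal sketch; the class, field names, and values are illustrative, not from the original post):

```java
// Minimal sketch of building a Solr <add> XML message with StringBuilder,
// as an alternative to constructing it via Solrj. Field names are illustrative.
public class AddXmlBuilder {

    // Escape the XML special characters that matter inside element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String addDoc(String id, String title) {
        StringBuilder sb = new StringBuilder();
        sb.append("<add><doc>");
        sb.append("<field name=\"id\">").append(escape(id)).append("</field>");
        sb.append("<field name=\"title\">").append(escape(title)).append("</field>");
        sb.append("</doc></add>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // The resulting string would be POSTed to Solr's /update handler.
        System.out.println(addDoc("doc1", "T-Shirt"));
    }
}
```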
Nutch implements federated search separately from their index generation.
My understanding is that MapReduce jobs generate the indexes (Nutch calls them
segments) from raw data that has been downloaded, and then makes them available
to be searched via remote procedure calls. Queries never pass t
Jarvis wrote:
> Thanks for your reply,
> I need the Federated Search. You mean this is not yet
> supported out of the "box". So I have a question that
> in this situation what can Collection Distribution used for?
>
The collection distribution scripts help you get duplicate copies of the
same i
I have tried changing those settings, for example, as:
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
((CommonsHttpSolrServer)server).setConnectionTimeout(60);
((CommonsHttpSolrServer)server).setDefaultMaxConnectionsPerHost(100);
((CommonsHttpSolrServer)server).setMaxTotalConnections(
Daley, Kristopher M. wrote:
I have tried changing those settings, for example, as:
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
((CommonsHttpSolrServer)server).setConnectionTimeout(60);
((CommonsHttpSolrServer)server).setDefaultMaxConnectionsPerHost(100);
((CommonsHttpSolrServer)s
I tried 1 and 6, same result.
-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 19, 2007 11:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Index/Update Problems with Solrj/Tomcat and Larger Files
Daley, Kristopher M. wrote:
> I have t
I'm stabbing in the dark here, but try fiddling with some of the other
connection settings:
getConnectionManager().getParams().setSendBufferSize( big );
getConnectionManager().getParams().setReceiveBufferSize( big );
http://jakarta.apache.org/httpcomponents/httpclient-3.x/apidocs/org/apache/c
Ok, I'll try to play with those. Any suggestion on the size?
Something else that is very interesting is that I just tried to do an
aggregate add of a bunch of docs, including the one that always returned
the error.
I called a function to create a SolrInputDocument and return it. I then
did the
Hi
We want to (mis)use facet search to get the number of (unique) field
values appearing in a document resultset.
I thought facet search perfect for this, because it already gives me
all the (unique) field values.
But for us to be used for this special problem, we don't want all the
values listed in the response as there might be over 1 and we don't need
the values at all, just the count of how many!
check the LukeRequestHandler
http://wiki.apache.org/solr/LukeRequestHandler
It gives you lots of fi
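For example, a request like the following (a sketch assuming the default example port; the field name is illustrative):

```text
http://localhost:8983/solr/admin/luke?fl=myfield&numTerms=10
```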
Is there a ticket for this yet? I have a bug report and request: I just did
a snapshot while indexing 700 records/sec. and got an inconsistency. I was
tarring off the snapshot and tar reported that a file changed while it was
being copied. The error rolled off my screen, so I cannot report the file
I believe I saw in the Javadocs for Lucene that there is the ability to
return the unique values for one field for a search, rather than each
record. Is it possible to add this feature to Solr? It is the equivalent of
'select distinct' in SQL.
Thanks,
Lance Norskog
Lance Norskog wrote:
I believe I saw in the Javadocs for Lucene that there is the ability to
return the unique values for one field for a search, rather than each
record. Is it possible to add this feature to Solr? It is the equivalent of
'select distinct' in SQL.
Look into faceting:
http://
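A faceting request approximating 'select distinct' might look like this (a sketch; the field name is an assumption, and facet.limit=-1 asks for all values):

```text
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=myfield&facet.limit=-1&facet.mincount=1
```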
Anyone else using this, and finding it not working in Solr 1.2? Since
we've got an automated release process, I really need to be able to have
the appserver not see itself as done warming up until the firstSearcher
is ready to go... but with 1.2 this no longer seems to be the case.
adam
On 9/19/07, Adam Goldband <[EMAIL PROTECTED]> wrote:
> Anyone else using this, and finding it not working in Solr 1.2? Since
> we've got an automated release process, I really need to be able to have
> the appserver not see itself as done warming up until the firstSearcher
> is ready to go... but
On 9/19/07, Laurent Hoss <[EMAIL PROTECTED]> wrote:
> We want to (mis)use facet search to get the number of (unique) field
> values appearing in a document resultset.
We have paging of facets, so just like normal search results, it does
make sense to list the total number of facets matching.
The
Hi out there,
I just walked through the mailing list archive, but I did not find an
appropriate answer for phrase highlighting.
I do not have any highlighting section (and no dismax handler
definition) in solrconfig.xml. This way (AFAIK :-)), the standard lucene
query syntax should be sup
: Product : Solr (Embedded)Version : 1.2
: java.io.FileNotFoundException: no segments* file found in
: org.apache.lucene.store.FSDirectory@/data/pub/index: files:
According to that, the FSDirectory was empty when it was opened (a file
list is supposed to come after that "files:" part)
you
On 19-Sep-07, at 1:12 PM, Marc Bechler wrote:
Hi out there,
I just walked through the mailing list archive, but I did not find
an appropriate answer for phrase highlighting.
I do not have any highlighting section (and no dismax handler
definition) in solrconfig.xml. This way (AFAIK :-)
: I noticed that the "field list" (fl) parameter ignores field names that it
: cannot locate, while the "query fields" (qf) parameter throws an exception
: when fields cannot be located. Is there any way to override this behavior and
: have qf also ignore fields it cannot find?
Those parameters
Hello,
I have an issue: "T-Shirt" is not found, even though there
are documents with the title "T-Shirt".
The analysis page shows that both the index analyzer and the
query analyzer produce "t" and "shirt" from this.
However, when I search for "t-shirt", I don't find anything.
The product title i
Hi Mike,
thanks for the quick response.
> It would make a great project to get one's hands dirty contributing,
though :)
... sounds like giving a broad hint ;-) Sounds challenging...
Regards from Germany
marc
On 19-Sep-07, at 2:39 PM, Marc Bechler wrote:
Hi Mike,
thanks for the quick response.
> It would make a great project to get one's hands dirty
contributing, though :)
... sounds like giving a broad hint ;-) Sounds challenging...
I'm not sure about that--it is supposed to be a drop-in rep
: The main problem with implementing this is trying to figure out where
: to put the info in a backward compatible manner. Here is how the info
1) this seems like the kind of thing that would only be returned if
requested -- so we probably don't have to be overly concerned about
backwards comp
lance: since the topic you are describing is not directly related to
triggering a snapshot from the web interface, can you please start a new
thread with a unique subject describing in more detail exactly what it
was you were doing and the problem you encountered?
this will make it easier for
Hi, there,
So we are using Tomcat's JNDI method to set up multiple Solr instances
within a Tomcat server. Each instance has a Solr home directory.
Now we want to set up collection distribution for all these solr home
indexes. My understanding is:
1. we only need to run rsync-start once use
However, if I go to the tomcat server and restart it after I have issued
the process command, the program returns and the documents are all
posted correctly!
Very strange behavior... am I somehow not closing the connection
properly?
What version is the Solr you are connecting to? 1.2 or 1.3?
Hi, there,
I used an absolute path for the "dir" param in the solrconfig.xml as below:
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/var/SolrHome/solr/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
  <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>
However, I got "snapshooter: not found" exception thrown in catalina.out.
I don't see why this doesn't
Nutch has two ways to make a distributed query: through HDFS (the Hadoop file
system) or the RPC call that is in the
"org.apache.nutch.searcher.DistributedSearch" class.
But I think neither of these is good enough.
If we use HDFS to service the user's query, stability is a problem. We must
all do the crawl ,
Hey all,
Let's say I have an index of one hundred documents, and these
documents are grouped into 4 groups A, B, C, and D. The groups do in
fact overlap. What would people recommend as the best way to apply a
search query and return only the documents that are in group A? Also,
how about
On Wed, 19 Sep 2007 10:29:54 -0400
"Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> > Maybe this is related. How does this compare to the map-reduce
> > functionality in Nutch/Hadoop ?
>
> map-reduce is more for batch jobs. Nutch only uses map-reduce for
> parallel indexing, not searching.
I see.
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents. I've been experimenting with
MoreLikeThis and values returned by the "mlt.interestingTerms" parameter and
so far this approach has worked well. However, I'd like to be able to
analyze docu
I think the index data stored in HDFS and generated by the map-reduce function
is used for searching in Nutch 0.9.
You can see the code in the "org.apache.nutch.searcher.NutchBean" class. :)
Jarvis
-Original Message-
From: Norberto Meijome [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 2
See this recent thread for some helpful info:
http://www.nabble.com/solr-doesn%27t-find-exe-in-postCommit-event-tf4264879.html#a12167792
You'll probably want to configure your exe with an absolute path rather than
the dir:
/var/SolrHome/solr/bin/snapshooter
In order to get the snap
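Putting that together, the listener entry might look like this (a sketch following the stock solrconfig.xml example, with the absolute exe path substituted):

```xml
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">/var/SolrHome/solr/bin/snapshooter</str>
  <bool name="wait">true</bool>
</listener>
```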
On Thu, 20 Sep 2007 09:37:51 +0800
"Jarvis" <[EMAIL PROTECTED]> wrote:
> If we use the RPC call in nutch .
Hi,
I wasn't suggesting using Nutch in Solr... I'm only a young grasshopper in this
league to be suggesting architecture stuff :) but I imagine there's nothing
wrong with using what they've b
On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents.
We do it manually (not in solr, but we put the results in solr.) We
do it the usual way - chunk (into n-grams, named entities & nou
Sounds like you're on the right track; if your groups overlap (i.e. a
document can be in groups A and B), then you should ensure your "groups"
field is multivalued.
If you are searching for "foo" in documents contained in group "A", then it
might be more efficient to use a filter query (fq) like:
q
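In other words, something along these lines (the query term and group value are illustrative):

```text
q=foo&fq=groups:A
```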
Hi,
What you say is done by Hadoop, which supports hardware failure handling, data
replication, and more.
If we want to implement such a good system by ourselves with Solr but
without HDFS, it's very, very complex work, I think. :)
I just want to know whether there is a component exis
Thanks Brian, I think the "smart" approaches you refer to might be outside
the scope of my current project. The documents I am indexing already have
manually-generated keyword data; moving forward I'd like to have these
keywords automatically generated, selected from a pre-defined list of
keywords
On 19-Sep-07, at 7:21 PM, Jarvis wrote:
Hi,
What you say is done by Hadoop, which supports hardware failure handling, data
replication, and more.
If we want to implement such a good system by ourselves with Solr but
without HDFS, it's very, very complex work, I think. :)
I just want
Hi, Pieter,
Thanks! Now the exception is gone. However, there's no snapshot file
created in the data directory. Strangely, the snapshooter.log seems to
complete successfully. Any idea what else I'm missing?
$ cat var/SolrHome/solr/logs/snapshooter.log
2007/09/19 20:16:17 started by solruser
200
If you don't need to pass any command line arguments to snapshooter, remove
(or comment out) this line from solrconfig.xml:
<arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
By the same token, if you're not setting environment variables either,
remove the following line as well:
<arr name="env"> <str>MYVAR=val1</str> </arr>
Once you alter / remove those two lines,
On Thu, 20 Sep 2007 10:02:08 +0800
"Jarvis" <[EMAIL PROTECTED]> wrote:
> You can see the code in "org.apache.nutch.searcher.NutchBean" class . :)
thx for the pointer.
_
{Beto|Norberto|Numard} Meijome
"In order to avoid being called a flirt, she always yielded easily."
On Thu, 20 Sep 2007 10:21:39 +0800
"Jarvis" <[EMAIL PROTECTED]> wrote:
> What you say is done by hadoop that support Hardware Failure, Data
> Replication and some else .
> If we want to implement such a good system by ourselves without HDFS
> but Solr , it's a very very complex work I
Along similar lines:
assuming that I have 2 indexes on the same box, say at
/home/abc/data/index1 and /home/abc/data/index2,
and I want the results from both the indexes when I do a search - then how
should this be 'optimally' designed - basically these are different Solr
homes and I want th
On 9/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> you imply that you are building your index using embedded solr, but based
> on your stack trace it seems you are using Solr in a servlet container ...
> i assume to search the index you've already built?
I have a jsp that routes the
Thanks, it works now.
regards,
-Hui
On 9/19/07, Pieter Berkel <[EMAIL PROTECTED] > wrote:
>
> If you don't need to pass any command line arguments to snapshooter,
> remove
> (or comment out) this line from solrconfig.xml:
>
> arg1 arg2
>
> By the same token, if you're not setting environment