After carefully reading the mlt parameters here
https://wiki.apache.org/solr/MoreLikeThis
I found that I can specify the following parameters to return "bbb" when
searching for documents similar to "aaa":
mlt.mintf=1
mlt.mindf=2
Details:
mlt.mintf: Minimum Term Frequency - the frequency below which
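[Editorial note: as a minimal SolrJ sketch of the request described above —
the 4.x-era HttpSolrServer, core name "test", and field "title" come from this
thread or are assumptions, not verified config:]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MltParams {
    public static void main(String[] args) throws Exception {
        // Core name "test" and field "title" are taken from this thread.
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/test");

        SolrQuery q = new SolrQuery("id:aaa"); // the source document
        q.set("mlt", true);          // enable the MoreLikeThis component
        q.set("mlt.fl", "title");    // field(s) to mine for interesting terms
        q.set("mlt.mintf", 1);       // ignore terms occurring fewer than 1 time in the source doc
        q.set("mlt.mindf", 2);       // ignore terms appearing in fewer than 2 docs overall

        QueryResponse rsp = server.query(q);
        // The component's matches come back in a separate "moreLikeThis" section.
        System.out.println(rsp.getResponse().get("moreLikeThis"));
    }
}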
Hello,
I am trying to implement a Rollup Search component on a version of Solr that
predates the parent/child additions, so I am rolling my own. The searches will
be executed exclusively against the child documents, and I want to "rollup"
those child documents into the p
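[Editorial note: since grouping and block joins may both be unavailable on a
pre-parent/child Solr, one hedged, version-agnostic approach is to roll the
children up client-side; the field names doc_type and parent_id below are
hypothetical:]

import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class ClientSideRollup {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/core");

        // Query only child documents (doc_type and parent_id are assumed field names).
        SolrQuery q = new SolrQuery("doc_type:child AND text:foo");
        q.setRows(1000);
        SolrDocumentList children = server.query(q).getResults();

        // Roll the matching children up under their parent key, preserving rank order.
        Map<Object, SolrDocumentList> rollup = new LinkedHashMap<Object, SolrDocumentList>();
        for (SolrDocument child : children) {
            Object parentId = child.getFieldValue("parent_id");
            SolrDocumentList group = rollup.get(parentId);
            if (group == null) {
                group = new SolrDocumentList();
                rollup.put(parentId, group);
            }
            group.add(child);
        }
        System.out.println(rollup.keySet());
    }
}

[On Solr 3.3 or later, result grouping (group=true&group.field=parent_id)
would push the same rollup to the server.]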
Hi Nishant,
Thank you for the reply.
I believe that solr removes the first document from the mlt list because a
document is most similar to "itself" and thus should be removed. In my
case, "aaa" and "bbb" are two different documents. When search for
documents similar to "aaa", the document "a
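[Editorial note: one hedged way to see why "bbb" is or is not returned is to
ask the dedicated MoreLikeThisHandler which terms it actually extracted; this
assumes a /mlt handler is registered in solrconfig.xml:]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class MltDebug {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/test");

        SolrQuery q = new SolrQuery("id:aaa");
        q.setRequestHandler("/mlt");              // dedicated MoreLikeThisHandler
        q.set("mlt.fl", "title");
        q.set("mlt.mintf", 1);
        q.set("mlt.mindf", 2);
        q.set("mlt.interestingTerms", "details"); // list each extracted term with its weight

        // If "bbb" is missing from the results, the extracted-terms list usually
        // shows why (e.g. every shared term fell below mintf/mindf).
        System.out.println(server.query(q).getResponse());
    }
}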
Is that the complete stack trace? There are multiple indexDoc methods in
that class. Some of them assert that the response from control collection
and the default collection are the same. However, in this case, it seems
that an AssertionError is being sent from the server itself as a
RemoteSolrExce
Yes, my understanding and concern as well is that Solr queries continue to
run on the server even after the connection is broken.
I was hoping I had overlooked or missed something in the Solr or Tomcat
documentation that might do the job.
It is unfortunate.
If anyone else can think of som
solr-user [solr-u...@hotmail.com] wrote:
> while we have optimized our queries for an average 50ms response time,
> we do occasionally see some that can run between 10 and 100 seconds.
That sounds suspicious. Response times so far from your average indicate that
there is special processing going
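[Editorial note: a hedged first step for such outliers is to capture the
timing breakdown with debugQuery; the core URL and query below are
placeholders:]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SlowQueryProbe {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/core");

        SolrQuery q = new SolrQuery("your slow query here");
        q.set("debugQuery", true);   // ask Solr to report per-component debug info

        QueryResponse rsp = server.query(q);
        System.out.println("QTime: " + rsp.getQTime() + "ms");
        // The "timing" section breaks QTime down by component (query, facet, mlt, ...).
        System.out.println(rsp.getDebugMap().get("timing"));
    }
}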
Hey hhc,
I am new to Solr, so pardon me if this throws you off. But I think the
following piece of code is relevant to your problem from
MoreLikeThisHandler#handleRequestBody():
// Find documents MoreLikeThis - either with a reader or a query
//
--
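[Editorial note: for reference, a sketch of the two source modes that comment
refers to — finding the source document with a query, or supplying raw text as
a content stream. The /mlt handler path is assumed, and stream.body only works
when remote streaming is enabled in solrconfig.xml:]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class MltSourceModes {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/test");

        // Query path: the handler looks up the first document matching q
        // and mines that document's terms.
        SolrQuery byQuery = new SolrQuery("id:aaa");
        byQuery.setRequestHandler("/mlt");
        byQuery.set("mlt.fl", "title");
        System.out.println(server.query(byQuery).getResponse());

        // Content-stream ("reader") path: raw text is supplied instead of a query.
        SolrQuery byText = new SolrQuery();
        byText.setRequestHandler("/mlt");
        byText.set("stream.body", "a black fox jumps over a red flower");
        byText.set("mlt.fl", "title");
        System.out.println(server.query(byText).getResponse());
    }
}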
millions of documents per shard, with a number of shards
~40gb index folder size
12gb of heap on a 16gb machine (this old Solr doesn't use O/S memory the way
4.x does)
servers are hosted internally, and are powerful
Understood. As mentioned, we tuned the bulk of our queries to run very
quickly (50
How big is the index (document count, gigabytes)?
How much RAM is on the servers?
How big is your Java heap?
How are the servers hosted? AWS?
Long queries are often caused by long-tail queries fetched from disk. There are
several ways to speed these up, but they all use RAM or SSD.
wunder
Wal
I inherited a set of some old 1.4.x Solr instances running under Tomcat 6/Java 6.
While I will eventually upgrade them to a more recent Solr/Tomcat/Java, I am
unable to do so in the near term.
One of my priority fixes, though, is to implement some sort of timeout for Solr
queries that exceed 1000ms (or so); i.e., if the quer
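[Editorial note: if these instances are on Solr 1.3 or later, the timeAllowed
common parameter is probably the closest built-in; note it only bounds the
document-collection phase of the search, so it is a partial answer at best.
A hedged SolrJ sketch, core URL assumed:]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TimeBoundedQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/core");

        SolrQuery q = new SolrQuery("some expensive query");
        q.setTimeAllowed(1000);  // stop collecting results after ~1000ms

        QueryResponse rsp = server.query(q);
        // When the limit is hit, Solr returns whatever it collected so far;
        // recent versions flag this in the response header.
        System.out.println("partial results: "
                + rsp.getHeader().get("partialResults"));
    }
}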
Thanks for closing this off _and_ providing info to others!
Best,
Erick
On Thu, Nov 27, 2014 at 1:15 AM, Nishant Kelkar wrote:
> Seems like I've resolved these issues:
> 1. A text search for "rs_A_count_gte300k.txt" throughout my IntelliJ
> project revealed that a file by that name was being exp
On Wed, Nov 26, 2014 at 10:38 PM, Alexandre Rafalovitch wrote:
> Looks like one of these:
> http://stackoverflow.com/questions/1379934/large-numbers-erroneously-rounded-in-javascript
Yeah, that's what Brendan pointed to earlier in this thread.
> In the UI code, we just seem to be using JSON obje
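[Editorial note: for readers unfamiliar with the issue — JavaScript stores
every number as an IEEE-754 double, so integers above 2^53 silently lose
precision. Java doubles behave identically, which makes a one-file
demonstration easy:]

public class LongPrecision {
    public static void main(String[] args) {
        // 2^53 + 1 is the smallest positive integer a double cannot represent,
        // which is exactly why large Solr longs (e.g. _version_ values) get
        // rounded once a browser parses them from JSON.
        long id = 9007199254740993L;
        double asDouble = (double) id;          // what JSON parsing gives JavaScript
        System.out.println(id);                 // 9007199254740993
        System.out.printf("%.0f%n", asDouble);  // 9007199254740992 - precision lost
    }
}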
Thanks, I'll learn about facets.
Actually, we want to use Mahout, but it needs term vectors, so we faced the
problem of obtaining a term vector for an author from a set of documents.
Anyway, the main reason for my question was the desire to learn whether I'm
missing some simple solution or not.
So, thank u
Presumably requesting pivot facets returns what you are asking for.
However, it takes time. Overall, the problem seems more suitable for
Mahout, or (really sorry for mentioning it) Hadoop.
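[Editorial note: a hedged sketch of that pivot-facet request in SolrJ; the
core name "posts" and the fields author_id/text are hypothetical, and faceting
a tokenized full-text field this way can be very expensive:]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class AuthorTermCounts {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/posts");

        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        q.set("facet.pivot", "author_id,text"); // nested counts: author, then term
        q.setFacetLimit(-1);  // no cap on facet values (can be expensive)
        q.setRows(0);         // only the facet counts are needed, not documents

        QueryResponse rsp = server.query(q);
        System.out.println(rsp.getFacetPivot());
    }
}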
On Thu, Nov 27, 2014 at 3:01 PM, Norgorn wrote:
> I'm working with social media data.
> We have blog post
I'm working with social media data.
We have blog posts in our index - text + authors_id.
Now we need to cluster authors by their texts. We need to get a term vector
not per document, but one vector per author (across all of an author's
documents).
We can't fetch all the documents and then merge them, because it
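[Editorial note: one hedged alternative to fetching whole documents is the
TermVectorComponent — pull per-document term vectors for one author and sum
them client-side. The /tvrh handler, core name, and field names below are
assumptions, and the text field must be indexed with termVectors="true":]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.util.NamedList;

public class AuthorTermVector {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/posts");

        // Fetch term vectors for one author's documents.
        SolrQuery q = new SolrQuery("author_id:42");
        q.setRequestHandler("/tvrh");   // TermVectorComponent's usual handler
        q.set("tv.fl", "text");
        q.set("tv.tf", true);           // include per-document term frequencies
        q.setRows(1000);

        NamedList<Object> termVectors = (NamedList<Object>)
                server.query(q).getResponse().get("termVectors");
        System.out.println(termVectors);
        // Summing each term's "tf" across these documents yields the single
        // per-author vector Mahout needs; the exact nesting of the NamedList
        // varies a little between Solr versions, so inspect the output first.
    }
}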
I have two documents with ids "aaa" and "bbb", and the titles of both
documents are "a black fox jumps over a red flower". I imported both
documents, along with several other testing documents, two a core "test".
I want solr to return documents similar to document "aaa", so I submited the
followi
Seems like I've resolved these issues:
1. A text search for "rs_A_count_gte300k.txt" throughout my IntelliJ
project revealed that a file by that name was being expected by my
schema.xml (thank you, blind copy/pasting). After removing the conflicting
fields and a few other fields for which I didn't
As an additional issue related to the one above, I sometimes also get this
error (and it's pretty random when I get it):
*java.lang.AssertionError: fix your classpath to have tests-framework.jar
before lucene-core.jar*
at __randomizedtesting.SeedInfo.seed([50225DA1F52F32BB]:0)
at
org.ap
Hi All,
I'm trying to run a simple piece of code, to get SolrTestCaseJ4 to work.
Here's my code:
public class MyTest extends SolrTestCaseJ4 {
  @BeforeClass
  public static void init() throws Exception {
    // Stand up an embedded core from the test config before any test runs.
    initCore("solrconfig.xml", "schema.xml");
    lrf = h.getRequestFactory("st