Hi guys,
We observed a strange bug in Solr 4.10.2, whereby a sloppy query hits
words it should not. The query the "e commerce" is parsed as:
SpanNearQuery(spanNear([Contents:the,
spanNear([Contents:e, Contents:commerce], 0, true)], 300, false))
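For reference, a minimal Java sketch of what that parsed query amounts to
when built directly against the Lucene 4.x span API (field and terms taken
from the output above; an illustration only, not our actual code):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spans.*;

    public class SloppySpanExample {
        public static void main(String[] args) {
            // inner span: "e" immediately followed by "commerce" (slop 0, ordered)
            SpanQuery phrase = new SpanNearQuery(new SpanQuery[] {
                    new SpanTermQuery(new Term("Contents", "e")),
                    new SpanTermQuery(new Term("Contents", "commerce"))
            }, 0, true);
            // outer span: "the" within 300 positions of that phrase, unordered
            SpanQuery sloppy = new SpanNearQuery(new SpanQuery[] {
                    new SpanTermQuery(new Term("Contents", "the")), phrase
            }, 300, false);
            System.out.println(sloppy);
        }
    }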
Hi,
I encountered a peculiar case with Solr 4.10.2 where the parsed query
doesn't seem logical:
PHRASE23("reduce workforce") ==>
SpanNearQuery(spanNear([spanNear([Contents:reduce,
Contents:workforce], 1, true)], 23, true))
The question is why the PHRASE("quoted string") gets converted
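One note, hedged since the message is cut off above: the outer spanNear has
only a single clause, so its slop of 23 has nothing to act on, and it should
match exactly the same spans as the inner query alone. In Java terms
(Lucene 4.x span API, field name from the output above):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spans.*;

    public class RedundantWrapperExample {
        public static void main(String[] args) {
            // inner query: "reduce" then "workforce" within slop 1, ordered
            SpanQuery inner = new SpanNearQuery(new SpanQuery[] {
                    new SpanTermQuery(new Term("Contents", "reduce")),
                    new SpanTermQuery(new Term("Contents", "workforce"))
            }, 1, true);
            // outer wrapper with one clause; the slop of 23 is effectively unused
            SpanQuery outer = new SpanNearQuery(new SpanQuery[] { inner }, 23, true);
            System.out.println(outer);
        }
    }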
Re,
Thanks for your reply.
I mocked my parser like this:

    @Override
    public Query parse() {
        SpanQuery[] clauses = new SpanQuery[2];
        clauses[0] = new SpanTermQuery(new Term("details", "london"));
        clauses[1] = new SpanTermQuery(new Term("details", "city"));
        // the original message is truncated here; slop and inOrder are assumed
        return new SpanNearQuery(clauses, 1, true);
    }
We're running some tests on Solr and would like to have a deeper
understanding of its limitations.
Specifically, we have tens of millions of documents (say 50M) and are
comparing several "#collections X #docs_per_collection" configurations.
For example, we could have a single collection with 50M
Hi,
I am very new to the Nutch and Solr platforms. I have been trying a
lot to integrate Solr 5.2.0 with Nutch 1.10 but have not been able to do so. I have
followed all the steps mentioned on the Nutch 1.x tutorial page, but when I
execute the following command,
bin/nutch solrindex http://localhost:8983/so
As a general rule, there are only two ways that Solr scales to large
numbers: a large number of documents and a moderate number of nodes (shards and
replicas). All other parameters should be kept relatively small, like
dozens or low hundreds. Even shards and replicas should probably be kept down
to that s
Why don't you take a step back and tell us what you are really trying to do?
Try using a normal Solr query parser first, to verify that the data is
analyzed as expected.
Did you try using the surround query parser? It supports span queries.
Your span query appears to require that the two terms a
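If you do try surround, a rough equivalent of the PHRASE23 query from
earlier might look like this (syntax written from memory, so treat it as a
sketch; nW is the ordered within-n operator, and df is assumed to apply here):

    q={!surround df=Contents}23W(reduce, workforce)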
Thanks Jack for your response. But I think Arnon's question was different.
If you need to index 10,000 different collections of documents in Solr (say
a collection denotes someone's Dropbox files), then you have two options:
index all collections in one Solr collection, and add a field like
collect
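A quick SolrJ sketch of the first option, with a hypothetical collection_id
field and collection URL (the message is cut off before the exact field name):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;

    public class TenantFilterExample {
        public static void main(String[] args) throws Exception {
            HttpSolrClient client =
                    new HttpSolrClient("http://localhost:8983/solr/alltenants"); // URL assumed
            SolrQuery q = new SolrQuery("laptop");
            q.addFilterQuery("collection_id:user42"); // field name hypothetical
            System.out.println(client.query(q).getResults().getNumFound());
            client.close();
        }
    }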
My guess is that you have WordDelimiterFilterFactory in your
analysis chain with parameters that break up "E-Tail" into both "e" and "tail" _and_
put them in the same position. This assumes that the result fragment
you pasted is incomplete and "commerce" is in it, e.g.
From E-Tail commerce
or some such. T
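For reference, a schema fragment along these lines would produce that kind
of splitting (a hypothetical example, not necessarily the poster's actual
config):

    <fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- generateWordParts splits "E-Tail" into "E" and "Tail";
             preserveOriginal keeps "E-Tail" at the same position as the first part -->
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>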
No clue, you'd probably have better luck on the Nutch user's list
unless there are _Solr_ errors. Does your Solr log show any errors?
Best,
Erick
On Sun, Jun 14, 2015 at 6:49 AM, kunal chakma wrote:
> Hi,
> I am very new to the Nutch and Solr platforms. I have been trying a
> lot to integr
To my knowledge there's nothing built in to Solr to limit the number
of collections. There's nothing explicitly in place to handle
many hundreds of collections either, so you're really in uncharted,
certainly untested waters. Anecdotally we've heard of the problem
you're describing.
You say you sta
My answer remains the same - a large number of collections (cores) in a
single Solr instance is not one of the ways in which Solr is designed to
scale. To repeat, there are only two ways to scale Solr: number of
documents and number of nodes.
-- Jack Krupansky
On Sun, Jun 14, 2015 at 11:00 AM,
Yes, there are some known problems when scaling to a large number of
collections, say 1000 or more. See
https://issues.apache.org/jira/browse/SOLR-7191
On Sun, Jun 14, 2015 at 8:30 PM, Shai Erera wrote:
> Thanks Jack for your response. But I think Arnon's question was different.
>
> If you need
>
> My answer remains the same - a large number of collections (cores) in a
> single Solr instance is not one of the ways in which Solr is designed to
> scale. To repeat, there are only two ways to scale Solr: number of
> documents and number of nodes.
>
Jack, I understand that, but I still feel y
Hi,
I face the same problem when trying to index DITA XML files. These are XML
files but have the file extension .dita, which Solr ignores.
According to java -jar post.jar -h, only the following file extensions are
supported:
-Dfiletypes=<type>[,<type>,...]
(default=xml,json,csv,pdf,doc,docx,ppt,pptx,xls
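One possible workaround, since .dita files are plain XML underneath: copy or
rename them to a supported extension before posting (the paths and collection
URL below are placeholders):

    for f in *.dita; do cp "$f" "${f%.dita}.xml"; done
    java -Durl=http://localhost:8983/solr/techdocs/update -jar post.jar *.xml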
I'm having trouble getting Solr to pay attention to the defaultField value
when I send a document to Solr Cell or Tika. Here is the POST I'm sending
using SolrJ:
POST
/solr/collection1/update/extract?extractOnly=true&defaultField=text&wt=javabin&version=2
HTTP/1.1
When I get the response back, the
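For comparison, the same request issued through SolrJ's
ContentStreamUpdateRequest looks roughly like this (the file name and core
URL are placeholders):

    import java.io.File;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
    import org.apache.solr.common.util.NamedList;

    public class ExtractOnlyExample {
        public static void main(String[] args) throws Exception {
            HttpSolrClient client =
                    new HttpSolrClient("http://localhost:8983/solr/collection1");
            ContentStreamUpdateRequest req =
                    new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("sample.pdf"), "application/pdf"); // placeholder file
            req.setParam("extractOnly", "true");
            req.setParam("defaultField", "text");
            NamedList<Object> response = client.request(req);
            System.out.println(response);
            client.close();
        }
    }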
Looks like this has been solved recently in the current dev branch:
"SimplePostTool (and thus bin/post) cannot index files with unknown
extensions"
https://issues.apache.org/jira/browse/SOLR-7546
re: hybrid approach.
Hmmm, _assuming_ that no single user has a really huge number of
documents, you might be able to use a single collection (or a much
smaller group of collections) by using custom routing. That allows
you to send all the docs for a particular user to a particular shard.
There are
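A sketch of how that works with the default compositeId router, where the
prefix before '!' in the document id controls shard placement (the ZooKeeper
host, collection name, and ids below are made up):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class CustomRoutingExample {
        public static void main(String[] args) throws Exception {
            CloudSolrClient client = new CloudSolrClient("localhost:2181"); // ZK host assumed
            client.setDefaultCollection("alltenants"); // collection name assumed
            SolrInputDocument doc = new SolrInputDocument();
            // everything before '!' hashes to pick the shard, so all of
            // user42's docs land on the same shard
            doc.addField("id", "user42!doc-0001");
            doc.addField("owner", "user42");
            client.add(doc);
            client.commit();
            client.close();
        }
    }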
This issue has also been discussed in the Tika issue queue:
"Add method get file extension from MimeTypes"
https://issues.apache.org/jira/browse/TIKA-538
And
http://svn.apache.org/repos/asf/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
does support DITA X
I think I have this just about working with the analytics component. It seems to
fill in all the gaps that the stats component and the JSON facet API don't
support.
It solved the following problems for me:
- I am able to perform math on stats to form other stats. Then I can sort
on those as needed.
- When
Why it isn't in core Solr... Because it doesn't (and probably can't)
support distributed mode.
The Streaming aggregation stuff, and the (in trunk Real Soon Now)
Parallel SQL support
are where the effort is going to support this kind of stuff.
https://issues.apache.org/jira/browse/SOLR-7560
https:
And anyone who, you know, really likes working with UI code, please
help make it better!
As of Solr 5.2, there is a new version of the Admin UI available, and
several improvements are already in 5.2.1 (release imminent). The old
admin UI is still the default; the new one is available at
/admin/i
But I think Solr 3.6 is too far back to fall back to, as I'm already using
Solr 5.1.
Regards,
Edwin
On 14 June 2015 at 14:49, Upayavira wrote:
> When in 2012? I'd give it a go with Solr 3.6 if you don't want to modify
> the library.
>
> Upayavira
>
> On Sun, Jun 14, 2015, at 04:14 AM, Zheng Lin
Hi all,
Every time I optimize my index with maxSegments=2, after some time the
replication fails to get the filelist for a given generation. It looks like the index
version and generation count get messed up.
(If maxSegments=1, this never happens. I am able to successfully reproduce
this by optimizin
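For context, a typical way to issue that optimize against the update handler
(the core name is a placeholder; note the parameter is spelled maxSegments):

    curl "http://localhost:8983/solr/core1/update?optimize=true&maxSegments=2"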
Hi chillra,
I have changed the index and query field configuration as suggested, but my
problem is still not solved; the change did not resolve it.