You must have this field in your schema with a default value assigned to
it (most likely the default is NOW). This field is typically used to
record the timestamp at which a document was last indexed.
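For reference, a minimal sketch of what such a field definition can look
like in schema.xml; the field name comes from this thread, while the tdate
type name assumes the stock example schema:

<field name="_indexed_at_tdt" type="tdate" indexed="true" stored="true"
       default="NOW"/>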
On 17 December 2015 at 04:51, Guillermo Ortiz wrote:
> I'm indexing documents in solr
:
: Indeed, it's a doc problem. A long time ago in a Solr far away, there
: was a bunch of effort to use the "default" collection (collection1).
: When that was changed, this documentation didn't get updated.
:
: We'll update it in a few, thanks for reporting!
Fixed on Erick's behalf because he
Andrej:
Indeed, it's a doc problem. A long time ago in a Solr far away, there
was a bunch of effort to use the "default" collection (collection1).
When that was changed, this documentation didn't get updated.
We'll update it in a few, thanks for reporting!
Erick
On Thu, Dec 17, 2015 at 1:39 AM,
Looks like your Spark job is not connecting to the same Zookeeper
as your Solr nodes.
Or, I suppose, the Solr nodes aren't started.
You might get more information on the Cloudera help boards
Best,
Erick
On Wed, Dec 16, 2015 at 11:58 PM, Guillermo Ortiz wrote:
> I'm getting some errors when I t
Yeah, if your warmup times are that long, then you likely have lots of
disk I/O contention or something similar. That said, you've
mentioned that after a while the queries are fine.
That indicates to me that you aren't autowarming _enough_ and
that your slow queries are not pre-loading parts of your
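As a hedged illustration of where autowarming is configured, a
solrconfig.xml sketch; the cache sizes and the warming query are
assumptions, not recommendations:

<filterCache class="solr.FastLRUCache" size="512" initialSize="512"
             autowarmCount="128"/>

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- a query that pre-loads the caches and index structures
         your real traffic depends on -->
    <lst><str name="q">some common query</str><str name="sort">price asc</str></lst>
  </arr>
</listener>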
Glad to hear it's solved! The suggester stuff is
way cool, but can surprise you!
Erick
On Thu, Dec 17, 2015 at 2:54 AM, Vincenzo D'Amore wrote:
> Great!!! Great Erick! It was a buildOnCommit.
>
> Many thanks for your help.
>
>
>
> On Wed, Dec 16, 2015 at 6:30 PM, Erick Erickson
> wrote:
>
>
On 12/14/2015 10:47 PM, ig01 wrote:
> We installed Solr with the solr.cmd -e cloud utility that comes with the
> installation.
> The names of the shards are odd because in this case, after the installation,
> we migrated an old index from our other environment (which is a single-node
> Solr) and split it
I just tried it (admittedly using just a simple input, obviously not a
PDF file) and
it works perfectly, as I'd expect.
So a couple of things:
1> what happens if you highlight the content field? The text field
should be fine.
2> Did you completely blow away your index whenever you changed the
schema?
Noble Paul നോബിള് नोब्ळ् wrote
> It works as designed.
>
> Protect the read path [...]
Works as described in 5.4.0; it didn't work in 5.3.1, see:
https://issues.apache.org/jira/browse/SOLR-8408
Not sure how much help I can be; I have no clue what DSpace is
doing with Solr.
If you're willing to try indexing straight to Solr, you can always use
SolrJ to parse the files; it's actually not very hard. Here's an example:
https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
some databas
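In the spirit of that article, a minimal SolrJ-plus-Tika sketch; the Solr
URL, the file path, and the field names are assumptions:

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class TikaSolrJExample {
  public static void main(String[] args) throws Exception {
    File file = new File("/path/to/file.pdf");               // assumption
    BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no size limit
    Metadata metadata = new Metadata();
    try (InputStream in = new FileInputStream(file)) {
      new AutoDetectParser().parse(in, handler, metadata);   // extract plain text
    }
    try (HttpSolrClient client =
             new HttpSolrClient("http://localhost:8983/solr/collection1")) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", file.getName());
      doc.addField("text", handler.toString());
      client.add(doc);
      client.commit();
    }
  }
}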
On 12/16/2015 9:08 PM, Erick Erickson wrote:
> Hmmm, take a look at the individual queries on a shard, i.e. peek at
> the Solr logs and see if the fq clause comes through cleanly when you
> see &distrib=false. I suspect this is just a glitch in assembling the
> debug response. If it is, it probably
On 12/17/2015 8:00 AM, Moll, Dr. Andreas wrote:
> we have been using Solr for some years now and are currently switching from
> Solr 3.6 to 5.3.1.
> Solr 5.3.1 deletes all index files when it shuts down and there were external
> changes to the index files
> (in our case from a second Solr server which
Hi,
I have a quick question about ZooKeeper: how can I run ZooKeeper as a service
on Linux so that it autostarts if the instance is rebooted? The only
information I've found on the internet is at this link
http://positivealex.github.io/blog/posts/how-to-install-zookeeper-as-service-on-centos
and it seems t
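For what it's worth, on a systemd-based distro a unit file is one common
approach; the paths and the user below are assumptions to adjust to your
install:

# /etc/systemd/system/zookeeper.service
[Unit]
Description=Apache ZooKeeper
After=network.target

[Service]
Type=forking
User=zookeeper
ExecStart=/opt/zookeeper/bin/zkServer.sh start
ExecStop=/opt/zookeeper/bin/zkServer.sh stop
Restart=on-failure

[Install]
WantedBy=multi-user.target

Then enable it with: systemctl enable zookeeper && systemctl start zookeeper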
PDF isn't really text. For example, it doesn't have spaces; it just moves the
next letter over farther. Letters might not be in reading order: two-column
text could be printed as horizontal scans. Custom fonts might not use an
encoding that matches Unicode, which makes them encrypted (badly). A
One thing to note about the hashJoin is that it requires the search results
from the hashed query to fit entirely in memory.
The innerJoin does not have this requirement as it performs a streaming
merge join.
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Dec 17, 2015 at 10:33 AM, Joel Ber
Below is an example of nested joins where the innerJoin is done in parallel
using the parallel function. The partitionKeys parameter needs to be added
to the searches when the parallel function is used to partition the results
across worker nodes.
hashJoin(
parallel(workerCollectio
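Since the example above is cut off, here is a hedged reconstruction of the
general shape; the collection names, fields, and parameter values are
illustrative assumptions, not the original expression:

hashJoin(
  parallel(workerCollection,
    innerJoin(
      search(people, q="*:*", fl="personId,name",
             sort="personId asc", partitionKeys="personId"),
      search(pets, q="*:*", fl="personId,petName",
             sort="personId asc", partitionKeys="personId"),
      on="personId"),
    workers="4",
    zkHost="localhost:9983",
    sort="personId asc"),
  hashed=search(owners, q="*:*", fl="personId,ownerName",
                sort="personId asc"),
  on="personId")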
The innerJoin joins two streams sorted by the same join keys (merge join).
If a third stream has the same join keys, you can nest innerJoins. But all
three tables need to be sorted by the same join keys to nest innerJoins
(merge joins).
innerJoin(innerJoin(...),
search(...),
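Filled out, a nested innerJoin can look like this hedged sketch; collection
and field names are assumptions:

innerJoin(
  innerJoin(
    search(people, q="*:*", fl="personId,name", sort="personId asc"),
    search(pets, q="*:*", fl="personId,petName", sort="personId asc"),
    on="personId"),
  search(addresses, q="*:*", fl="personId,street", sort="personId asc"),
  on="personId")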
A single query with tens of thousands of terms is very clearly a misuse of
Solr. If it happens to work at all, consider yourself lucky. Are you using
a standard Solr query parser, or the terms query parser that lets you pass
a raw list of terms to OR together?
Are your nodes CPU-bound or I/O-bound during t
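For illustration, a hedged example of the terms query parser against the
field f from this thread; the host, collection, and values are assumptions:

curl "http://localhost:8983/solr/mycollection/select" \
  --data-urlencode "q={!terms f=f}val1,val2,val3" \
  --data-urlencode "wt=json"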
Hi,
we have been using Solr for some years now and are currently switching from
Solr 3.6 to 5.3.1.
Solr 5.3.1 deletes all index files when it shuts down and there were external
changes to the index files
(in our case from a second Solr server which produces the index).
Is this behaviour intentional?
Hi,
I am using Solr as part of the DSpace 5.4 software application.
I have a problem when running the DSpace indexing command
(index-discovery). Most of the files are not being added to the index, and
an exception is raised.
It seems that Solr does not process the PDF files that are the result of
scanning w
Hi again,
I got the join to work. A teammate pointed out that one of the search
functions in the innerJoin query was missing a field in the join: adding
the e1 field to the fl parameter of the second search function gave the
result I expected:
http://localhost:8983/solr/gettingstarted/stream
On Wed, Dec 16, 2015 at 4:57 AM, Vincenzo D'Amore wrote:
> Hi all,
>
> given that Solr 5.4 is finally released, is this the most stable and
> efficient version of SolrCloud?
>
> I have a website which receives many search requests. It normally serves
> about 2000 concurrent requests, but someti
Generally, I'd recommend opening an issue on PDFBox's Jira with the file that
you shared. Tika uses PDFBox...if a fix can be made there, it will propagate
back through Tika to Solr.
That said, PDFBox 2.0-RC2 extracts no text and warns: WARNING: No Unicode
mapping for CID+71 (71) in font 505Edd
Hi Markus,
This is indeed related to LUCENE-6590: query boosts are now applied with
BoostQuery, and if Query.setBoost is called on a query, its rewrite
implementation needs to rewrite to a BoostQuery. You can do that by
prepending the following to your rewrite(IndexReader) implementation:
if (getBoost() != 1f) return super.rewrite(reader);
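Spelled out, the override can look like this minimal sketch; the guard is
the line suggested above, while the surrounding method body is an
assumption about your class:

@Override
public Query rewrite(IndexReader reader) throws IOException {
  if (getBoost() != 1f) {
    // Query.rewrite() in 5.4 strips the boost and wraps the query in a BoostQuery.
    return super.rewrite(reader);
  }
  // ... your existing rewrite logic continues here ...
  return this;
}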
Hi,
Apologies for the cross post. We have a class extending
SpanPositionRangeQuery. It is similar to a SpanFirst query but is capable of
adjusting the boost value with regard to distance. With the 5.4 upgrade, the
unit tests suddenly threw the following exception:
Query class org.GrSpanFir
Hi,
I have a field f which is defined as follows.
Solr-5.2.1 is used. The index is spread across 12 shards (no replica) and
the index size on each node is around 100 GB.
When I search for 50 thousand values (ORed together) in the field f, it
takes around 45 to 55 seconds.
Per my understanding, it i
You can't just go "INSERT foo INTO bar" on a random MySQL server in the
data room, so why should Solr be less secure once Auth is enabled?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> 16. des. 2015 kl. 17.02 skrev Noble Paul :
>
> I don't think this behavior is int
Hello Teague,
Thanks for your reply and tip! I think Solr will give me a better result
than just using Tika to read my files and send them to a fulltext index in
MySQL, which has precisely the drawback of not highlighting the text
snippets... So, I will keep on trying to fit Solr to my needs, and sur
Hello Erick,
Sorry for my mistakes. Here is everything I got so far:
1. It brings the result perfectly, but the highlight field is empty, as below:
{
  "responseHeader":{
    "status":0,
    "QTime":15,
    "params":{
      "q":"text:nietava",
      "debug":"query",
      "hl":"true",
      "hl.sim
Great!!! Great Erick! It was a buildOnCommit.
Many thanks for your help.
On Wed, Dec 16, 2015 at 6:30 PM, Erick Erickson
wrote:
> Quick scan, but probably this:
> INFO
> o.a.solr.spelling.suggest.Suggester - build()
>
> The suggester build process can easily take many minutes, there's some
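For context, a hedged sketch of the relevant solrconfig.xml piece; the
component, dictionary, and field names are assumptions. Turning
buildOnCommit off avoids rebuilding the suggester on every commit:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="field">suggest_field</str>
    <!-- true would rebuild the dictionary on every commit,
         which can take minutes on a large index -->
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>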
On 17/12/2015 08:45, Zheng Lin Edwin Yeo wrote:
> Hi Alexandre,
> Thanks for your reply.
> So the only way to solve this issue is to explore with PDF specific tools
> and change the encoding of the file?
> Is there any way to configure it in Solr?
Solr uses Tika to extract plain text from PDFs. If the
I'm indexing documents in Solr with Spark, and it complains about a missing
field, _indexed_at_tdt, which doesn't exist in my documents.
Do I have to add this field to my schema? Why is this field being added? Any
solution?
Hmm, it's interesting: according to the code, you can create a transformer
which does what is described at http://yonik.com/solr/atomic-updates/
in *Atomic Updates with SolrJ*.
It should/might work, but I've never tried.
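If you go the SolrJ route directly, a minimal hedged sketch of an atomic
update; the URL and field names are assumptions:

import java.util.Collections;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdateExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
             new HttpSolrClient("http://localhost:8983/solr/collection1")) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc1");
      // A map with a "set" key updates just this field and leaves
      // the rest of the stored document intact.
      doc.addField("price", Collections.singletonMap("set", 99.0));
      client.add(doc);
      client.commit();
    }
  }
}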
On Thu, Dec 17, 2015 at 12:26 PM, Midas A wrote:
> Hi,
> can be do partia
It turns out that the documentation is not correct. If I specify the
collection name after shards=, it does work as expected. So this works:
curl "
http://54.93.121.54:8986/solr/connects/select?q=*%3A*&wt=json&indent=true&rows=1000&shards=54.93.121.54:8986/solr/connects
"
This does not work:
curl
Hi - looks like Solr did not start up correctly, got some errors and kept Jetty
running. You should find information in that node's logs.
M.
-Original message-
> From:Andrej van der Zee
> Sent: Thursday 17th December 2015 10:32
> To: solr-user@lucene.apache.org
> Subject: Expected mim
On Thu, Dec 17, 2015 at 8:00 AM, Midas A wrote:
>
> org.apache.solr.update.CommitTracker._scheduleCommitWithinIfNeeded(CommitTracker.java:118)
>
It seems like you specified commitWithin; that's legal, but it seems unusual
and doubtful with DIH.
> > rejected from java.util.concurrent.ScheduledThreadPo
Hi,
I am having trouble getting data from a particular shard, even though I
follow the documentation:
https://cwiki.apache.org/confluence/display/solr/Distributed+Requests
This is OK:
curl "
http://54.93.121.54:8986/solr/connects/select?q=*%3A*&wt=json&indent=true";
{
// returns correct re
Hi,
Can we do partial updates through the Data Import Handler?
Regards,
Abhishek
This fix definitely helps facet.field over a docValues field on a
multi-segment index since 5.4.
I suppose it's irrelevant to JSON Facets, non-docValues fields, and pre-5.4.
I can't comment on comparing the performance of docValues and non-docValues
fields, because "it depends" (c); benchmarking and a profiler are the only a
You can always write an update handler plugin to convert your PDFs to UTF-8
and then push them to Solr.
On Thu, 17 Dec 2015, 14:16 Zheng Lin Edwin Yeo wrote:
> Hi Alexandre,
>
> Thanks for your reply.
>
> So the only way to solve this issue is to explore with PDF specific tools
> and change the e
Why don't you post the entire stack trace from the logs? That might give us
a better idea of how to help you.
On Thu, 17 Dec 2015, 13:59 sara hajili wrote:
> Hi,
> I want to change a Solr analyzer, e.g. normalization,
> because Solr's default normalization for the Persian language doesn't satisfy me,
> so I s
For this case of inversion in particular, a slop of 1 won't cause any issues,
since such a reverse match would require a slop of 2.
On Thu, 17 Dec 2015, 14:20 elisabeth benoit
wrote:
> Inversion (paris charonne or charonne paris) cannot be scored the same.
>
> 2015-12-16 11:08 GMT+01:00 Binoy D
Inversion (paris charonne or charonne paris) cannot be scored the same.
2015-12-16 11:08 GMT+01:00 Binoy Dalal :
> What is your exact use case?
>
> On Wed, 16 Dec 2015, 13:40 elisabeth benoit
> wrote:
>
> > Thanks for your answer.
> >
> > Actually, using a slop of 1 is something I can't do (beca
Hi Alexandre,
Thanks for your reply.
So the only way to solve this issue is to explore with PDF specific tools
and change the encoding of the file?
Is there any way to configure it in Solr?
Regards,
Edwin
On 17 December 2015 at 15:42, Alexandre Rafalovitch
wrote:
> They could be using custom
Hi,
I want to change a Solr analyzer, e.g. normalization,
because Solr's default normalization for the Persian language doesn't
satisfy me, so I started reading about Solr plugins,
and I tried to implement my PersianNormalization.
Now I have 2 classes, like this:
class persianNormalizer extends TokenFilter,
and an
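For reference, the usual pattern is a TokenFilter plus a TokenFilterFactory
that Solr instantiates from the schema; this is a hedged skeleton with
assumed package and class names, not the poster's actual code:

package com.example;

import java.io.IOException;
import java.util.Map;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.TokenFilterFactory;

class PersianNormalizer extends TokenFilter {
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

  PersianNormalizer(TokenStream input) { super(input); }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) return false;
    // Apply custom Persian normalization to the term in
    // termAtt.buffer()[0 .. termAtt.length()) here.
    return true;
  }
}

public class PersianNormalizerFilterFactory extends TokenFilterFactory {
  public PersianNormalizerFilterFactory(Map<String, String> args) {
    super(args);
  }

  @Override
  public TokenStream create(TokenStream input) {
    return new PersianNormalizer(input);
  }
}

The factory is then referenced in the field type's analyzer chain with
<filter class="com.example.PersianNormalizerFilterFactory"/>.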
Erik's comments notwithstanding, there are some gaps in my understanding
of your precise situation. Here are a few things that weren't necessarily
obvious to me when I took my first try with Solr.
Highlighting is the end result of a good hit. It is essentially formatting
applied to your hit. It is