The RefGuide gives this for adding; I would hope that Replace would be similar:
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field-type":{
"name":"myNewTextField",
"class":"solr.TextField",
"indexAnalyzer":{
"tokenizer":{
"class":"solr.PathHi
I admit to not fully understanding the examples, but ComplexQueryParser
looks like something worth at least reviewing:
https://lucene.apache.org/solr/guide/8_8/other-parsers.html#complex-phrase-query-parser
Also I did not see any references to trying to copyField and process same
content in diffe
Most likely issue is that your core configuration (solrconfig.xml)
does not have the request handler for that. The same config may have
had that in 7.x, but changed since.
More details:
https://lucene.apache.org/solr/guide/8_8/uploading-data-with-solr-cell-using-apache-tika.html
Regards,
Alex
Also, investigate if you have repeating conditions and push those into
defaults in custom request handler endpoints (in solrconfig.xml).
Also, Solr supports parameter substitutions, if you have repeated
subconditions.
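A minimal sketch of that parameter substitution (dereferencing) idea, with a made-up parameter name and fields: the shared subcondition is defined once and referenced with v=$name.

```python
from urllib.parse import urlencode

# Sketch of Solr parameter dereferencing: the repeated subcondition is
# defined once ("common" is a made-up parameter name) and referenced
# from the main query with v=$common.
params = {
    "q": "{!query v=$common}",
    "common": "category:books AND inStock:true",
}
qs = urlencode(params)
print(qs)  # ready to append to a /solr/<core>/select? request
```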
Regards,
Alex
On Thu., Feb. 18, 2021, 7:08 a.m. Thomas Corthals,
wrote:
better to start new question threads for new questions. More
people will pay attention.
On Thu., Feb. 18, 2021, 1:31 a.m. elivis, wrote:
> Alexandre Rafalovitch wrote
> > What about copyField with the target being index only (docValue only?)
> and
> > no lowercase on the
I wonder if looking more directly at the indexes would allow you to
get closer to the problem source.
Have you tried comparing/exploring the indexes with Luke? It is in the
Lucene distribution (not Solr), and there is a small explanation here:
https://mocobeta.medium.com/luke-become-an-apache-luce
I answered quite a bunch a while ago, as part of the book-writing process.
I think a lot of them were missing core information, like the version of Solr,
so they were not very timeless.
The list allows a conversation and multiple perspectives, which is better
than a one shot answer.
Regards,
Alex
On
But... this query, as well as all the examples, are in json query format, in
> a request body. The actual query will be sent using a custom API that only
> accepts a regular URL query, with parameters. Any idea how I can rewrite the
> json query above into a URL query?
>
> Also, it
This feels like basic faceting on category, but you are trying to use
the latest record, rather than a count, as the sorting/grouping principle.
How about using JSON Facets?
https://lucene.apache.org/solr/guide/8_8/json-facet-api.html
I would do the first level as range facet and do your dates at
whatev
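A sketch of what such a JSON Facets request might look like, assuming made-up field names (category, last_modified_dt); the nested max() surfaces the latest record per bucket.

```python
import json

# Hypothetical JSON Facet API body: a terms facet on "category" with a
# nested max() aggregation to get the latest date per bucket. Field
# names are assumptions, not taken from the original question.
request_body = {
    "query": "*:*",
    "limit": 0,
    "facet": {
        "categories": {
            "type": "terms",
            "field": "category",
            "facet": {"latest": "max(last_modified_dt)"},
        }
    }
}
print(json.dumps(request_body, indent=2))
```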
at
>
> https://lucene.apache.org/solr/guide/8_8/uploading-data-with-solr-cell-using-apache-tika.html#configuring-the-extractingrequesthandler-in-solrconfig-xml
>
> everything worked fine again.
>
>
> What can I do to help updating the docs?
>
>
> Best regards,
>
>
I think the extract handler is not defined in schemaless. This may be
a change from before and the documentation is out of sync.
Can you try 'techproducts' example instead of schemaless:
bin/solr stop (if you are still running it)
bin/solr start -e techproducts
Then the import command.
The Tika
. elivis, wrote:
> Alexandre Rafalovitch wrote
> > It is documented in the reference guide:
> > https://lucene.apache.org/solr/guide/8_8/analysis-screen.html
> >
> > Hope it helps,
> >Alex.
> >
> > On Tue, 2 Feb 2021 at 00:57, elivis <
>
> >
It is documented in the reference guide:
https://lucene.apache.org/solr/guide/8_8/analysis-screen.html
Hope it helps,
Alex.
On Tue, 2 Feb 2021 at 00:57, elivis wrote:
>
> Alexandre Rafalovitch wrote
> > Admin UI also allows you to run text string against a field definition to
And if you need something more recent while this is being fixed, you
can look right at the source on GitHub, though navigation, etc. is
missing:
https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc
Open Source :-)
Regards,
Alex.
On Mon, 1 Feb 2021 at 15:04
Check the field type and associated indexing chain in managed-schema of
your core. It probably has the lowercase filter in it.
Find a better type or make one yourself. Remember to reload the schema and
reindex the content.
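As an illustration, a field type without the lowercase filter might look like the following; the type name is made up, and this is essentially text_general minus LowerCaseFilterFactory:

```xml
<!-- Hypothetical case-preserving type: no LowerCaseFilterFactory,
     so tokens keep their original case at index and query time -->
<fieldType name="text_case_sensitive" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```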
Admin UI also allows you to run text string against a field definition to
I don't have an answer, but I feel that maybe explaining the situation
in more details would help a bit more. Specifically, you explain your
data structure well, but not your actual presentation requirement in
enough details.
How would you like the multi-select to work, how it is working for you
n
If, during index time, your "information" and "informed" are tokenized
into the same root (inform?), then you will not be able to distinguish
them without storing original forms somewhere, usually with copyField.
Same with information vs INFORMATION. The search happens based on
indexed tokens. Whic
>possible analysis error: cannot change field "tizh" from
You have content indexed against old incompatible definition. Deleted but
not purged records count.
Delete your index data or change field name during testing.
Regards,
Alex
On Sun., Jan. 10, 2021, 9:19 a.m. Bruno Mannina, wrote:
>
Try with the explicit URP chain too. It may work as well.
Regards,
Alex.
On Thu, 17 Dec 2020 at 16:51, Dmitri Maziuk wrote:
>
> On 12/12/2020 4:36 PM, Shawn Heisey wrote:
> > On 12/12/2020 2:30 PM, Dmitri Maziuk wrote:
> >> Right, ```Every update request received by Solr is run through a chai
Why not? You should be able to put an URP chain after DIH, the usual way.
Is that something about UUID that is special?
Regards,
Alex
On Sat., Dec. 12, 2020, 2:55 p.m. Dmitri Maziuk,
wrote:
> Hi everyone,
>
> is there an easy way to use the stock UUID generator with DIH? We have a
> hand-w
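For illustration, a stock UUID URP chain in solrconfig.xml might look like this (chain and field names are assumptions); the DIH handler would then reference it via an update.chain default:

```xml
<!-- Sketch: generate a UUID into the "id" field for documents that
     arrive without one; reference this chain from the DIH handler
     with <str name="update.chain">uuid</str> in its defaults -->
<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```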
Maybe a postCommit listener?
https://lucene.apache.org/solr/guide/8_4/updatehandlers-in-solrconfig.html
Regards,
Alex.
On Mon, 7 Dec 2020 at 08:03, Pushkar Mishra wrote:
>
> Hi All,
>
> Is there a way to trigger a notification when a document is deleted in
> solr? Or may be when auto purge ge
Did you reload the core for it to notice the new schema? Or try creating a
new core from the same schema?
If it is a SolrCloud, you also have to upload the schema to the Zookeeper.
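For standalone Solr, the reload is a single Core Admin API call; a sketch, with host and core name as placeholders (SolrCloud would use the Collections API instead):

```python
from urllib.parse import urlencode

# Sketch of a Core Admin API reload request; the host and core name are
# placeholders. Fetching the resulting URL reloads the core.
base = "http://localhost:8983/solr/admin/cores"
url = base + "?" + urlencode({"action": "RELOAD", "core": "mycore"})
print(url)
```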
Regards,
Alex.
On Wed, 2 Dec 2020 at 09:19, Arturas Mazeika wrote:
> Hi Solr-Team,
>
> The manual of charfilte
Are you sure you have the request handler for /update/extract defined
in your solrconfig.xml?
Not all the update request handlers are defined explicitly (you can
check with Config API - /solr/hadoopDocs/config/requestHandler), but I
am 99% sure that the /update/extract would be explicit because it
Why not have a custom handler endpoint for your online queries? You
will be modifying them anyway to remove fq.
Or even create individual endpoints for every significant use-case.
You can share the configuration between them with initParams or
useParams, but have more flexibility going forward.
A
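A sketch of sharing defaults across such endpoints with initParams; the handler paths and parameter values are made up:

```xml
<!-- Hypothetical solrconfig.xml fragment: two use-case-specific
     endpoints sharing defaults through a single initParams block -->
<initParams path="/select_online,/select_batch">
  <lst name="defaults">
    <str name="df">text</str>
    <int name="rows">20</int>
  </lst>
</initParams>
<requestHandler name="/select_online" class="solr.SearchHandler"/>
<requestHandler name="/select_batch" class="solr.SearchHandler"/>
```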
I think this is all explained quite well in the Ref Guide:
https://lucene.apache.org/solr/guide/8_6/docvalues.html
DocValues is a different way to index/store values. Faceting is a
primary use case where docValues are better than what 'indexed=true'
gives you.
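For example, a facet-only field might be declared like this (the field name is an assumption):

```xml
<!-- Sketch: a field used only for faceting/sorting; docValues without
     indexed or stored keeps the index lean for that use case -->
<field name="category" type="string" indexed="false" stored="false" docValues="true"/>
```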
Regards,
Alex.
On Mon, 19 Oct 20
Just as a side note,
> indexed="true"
If you are storing 32K message, you probably are not searching it as a
whole string. So, don't index it. You may also want to mark the field
as 'large' (and lazy):
https://lucene.apache.org/solr/guide/8_2/field-type-definitions-and-properties.html#field-defaul
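A sketch of such a declaration (the field name is made up); note that large requires stored=true and a single-valued field:

```xml
<!-- Hypothetical large stored-only message field: not indexed, and
     marked large="true" so it is loaded lazily rather than cached -->
<field name="message" type="string" indexed="false" stored="true" large="true"/>
```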
Why not do an XSLT transformation on it before it hits Solr?
Or during, if it really has to be in-Solr for some reason:
https://lucene.apache.org/solr/guide/8_6/uploading-data-with-index-handlers.html#using-xslt-to-transform-xml-index-updates
But you have more options outside as you could use XQuer
> >> addition, I do Solr start/stop with an /etc/init.d script (the Solr
> >> distribution has the basic one which we can embellish) in which there is
> >> control line RUNAS="solr". The RUNAS variable is used to properly start
> >> Solr.
> >>
Solr now has package managers, and DIH is one of the packages, to reflect the
fact that its development cycle is not locked to Solr's and to reduce the core
download size. Tika may be heading the same way, as running Tika inside the Solr
process could cause memory issues with complex PDFs.
In terms of other
It sounds like maybe you have started the Solr in a different way than
you are restarting it. E.g. maybe you started it manually (bin/solr
start, probably as a root) but are trying to restart it via service
script. Who owned the .pid file? I am guessing 'root', while the
service script probably run
The tool was introduced in Solr 8.5 and it is at the bin/postlogs
location. It is quite new.
Regards,
Alex.
On Tue, 13 Oct 2020 at 12:39, Zisis T. wrote:
>
> I've stumbled upon
> https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/logs.adoc
> which looks very interesti
Are there that many of those words? Because even if you deal with
, there is still yas!
Maybe you just have regexp synonyms? (ye+s+)
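A sketch of the regexp idea as an analysis-chain filter rather than a synonym file (the pattern is illustrative only):

```xml
<!-- Sketch: collapse repeated-letter variants (yes, yess, yeees...)
     to a canonical token during analysis -->
<filter class="solr.PatternReplaceFilterFactory"
        pattern="^ye+s+$" replacement="yes" replace="all"/>
```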
Good luck,
413x
On Thu., Oct. 8, 2020, 6:02 p.m. Mike Drob, wrote:
> I'm looking for a way to transform words with repeated letters into the
> sam
er of defence
on top of everything. Respawn it every hour, if needed.
On Thu, 8 Oct 2020 at 15:05, David Hastings wrote:
>
> Welp. Never mind I refer back to point #1 this is a bad idea
>
> > On Oct 8, 2020, at 3:01 PM, Alexandre Rafalovitch
> > wrote:
> >
> >
t; >
> > https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/ (my blog)
> >
> >> On Oct 8, 2020, at 11:49 AM, Alexandre Rafalovitc
I think there were past discussions about people doing this, but they
really, really knew what they were doing from a security perspective,
not just a Solr one.
You are increasing your risk factor a lot, so you need to think
through this. What are you protecting and what are you exposing. Are
you trying to
How do you know it does not apply?
My Doh moment is often forgetting that stored version of the field is not
affected by analyzers. One has to look in schema Admin UI to check indexed
values.
Regards,
Alex
On Mon., Oct. 5, 2020, 6:01 a.m. Lukas Brune,
wrote:
> Hello!
>
> I'm having some tro
You may also want to look at something like: https://docs.querqy.org/index.html
ApacheCon had (is having..) a presentation on it that seemed quite
relevant to your needs. The videos should be live in a week or so.
Regards,
Alex.
On Tue, 29 Sep 2020 at 22:56, Alexandre Rafalovitch wrote
I am not sure why you think stop words are your first choice. Maybe I
misunderstand the question. I read it as that you need to exclude
completely a set of documents that include specific keywords when
called from specific module.
If I wanted to differentiate the searches from specific module, I
w
What do the debug versions of the query show between two versions?
One thing that changed is sow (split on whitespace) parameter among
many. It is unlikely to be the cause, but I am mentioning just in
case.
https://lucene.apache.org/solr/guide/8_6/the-standard-query-parser.html#standard-query-pars
Hello,
Does anybody know (or has even experimented with) what the minimum set
of jars needed to run EmbeddedSolrServer is?
If I just include solr-core, that pulls in a huge number of Jars. I
don't need - for example - Lucene analyzers for Korean and Japanese
for this application.
But what else do I not
Sounds strange. If you had Solr installed previously, it could be
cached Javascript. Force-reload or try doing it in an anonymous
window.
Also try starting with an example (bin/solr start -e techproducts).
Finally, if you are up to it, see if there are any serious errors in
the Browser's developer co
e same
>
>
>
>
>
>
> Regards,
>
> Anuj
>
> On Thu, 24 Sep 2020 at 18:58, Alexandre Rafalovitch
> wrote:
>
> > These are field definitions for _text_ and text, your original
> > question was about the fields named "country"/"currency" and wha
It is yes to both questions, but I am not sure if they play well
together for historical reasons.
For storing/parsing original JSON in any (custom) format:
https://lucene.apache.org/solr/guide/8_6/transforming-and-indexing-custom-json.html
(srcField parameter)
For indexing nested children (with na
r 8.6.2
> multiValued="true"/>
>
> On Thu, 24 Sep 2020 at 18:33, Alexandre Rafalovitch
> wrote:
>
> > I think that means your field went from multiValued to singleValued.
> > Double check your schema. Remember that multiValued flag can be set
> >
I think that means your field went from multiValued to singleValued.
Double check your schema. Remember that multiValued flag can be set
both on the field itself and on its fieldType.
Regards,
Alex
P.s. However if your field is supposed to be single-valued, maybe you
should treat it as a featur
Your builder parameter should be up to the collection, so only
"http://testserver-dtv:8984/solr/cpsearch".
Then, on your Query object, you set
query.setRequestHandler("/select_cpsearch") as per
https://lucene.apache.org/solr/8_6_2/solr-solrj/org/apache/solr/client/solrj/SolrQuery.html#setRequestHa
t;nest_path". *
>
> Is this intentional? or should it be as follows?
>
> name="_nest_path_" type="* _nest_path_ *" />
>
> Also, should we explicitly set index=true and store=true on _nest_path_
> and _nest_parent_ fields?
>
>
>
> On T
This is not quite enough information.
There is
https://lucene.apache.org/solr/guide/8_6/filter-descriptions.html#remove-duplicates-token-filter
but it has specific limitations.
What is the problem that you are trying to solve that you feel is due
to duplicate tokens? Why are they duplicates? Is i
Solr has a whole pipeline that you can run during document ingesting before
the actual indexing happens. It is called Update Request Processor (URP)
and is defined in solrconfig.xml or in an override file. Obviously, since
you are indexing from SolrJ client, you have even more flexibility, but it
i
{id=c1_child1, conceptid=c1, storeid=s1, fieldName=c1_child1_field_value1,
> startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
> booleanField_bDF=true, _root_=abcd, _version_=1678099970090074112}
> {id=c1_child1, conceptid=c1, storeid=foo, fieldName=bar, startTime=Mon Sep
>
Can you double-check your schema to see if you have all the fields
required to support nested documents. You are supposed to get away
with just _root_, but really you should also include _nest_path_ and
_nest_parent_. Your particular exception seems to be triggering
something (maybe a bug) related t
There are a lot of different use cases and the separate analyzers for
indexing and query is part of the Solr power. For example, you could
apply ngram during indexing time to generate multiple substrings. But
you don't want to do that during the query, because otherwise you are
matching on 'shared
If you are uploading a PDF, then you must be doing it via Tika or via
an extract handler (which uses Tika under the covers).
Try getting a standalone Tika of the same version and see what it
outputs. Perhaps there is something in those specific PDF pages that
confuse Tika. Like, if it used differe
> Doc in Arabic with some English - English text is inverted (for example,
"gro.echapa.www"), what makes search by key words impossible.
What very specifically do you mean by that. How do you see the inversion?
If that's within some sort of web ui, then you are probably seeing the HTML
bidi (bidi
That's a really hard way to get introduced to Solr. What about
downloading Solr and running one of the built-in examples? Because you
are figuring out so many variables at once.
Either way, your specific issue is not in schema.xml (which should be
converted to managed-schema on first run, btw, don
Is this a Solr-side message? Looks like dovecot doing proactive
trimming of some crazy long header.
You can lookup the record by UID in the Admin UI (UID=153535 instead
of *:*) to check what is being indexed. Check that dovecot does not do
any prefixing of field names (any record from first generi
If you are indexing from Drupal into Solr, that's the question for
Drupal's solr module. If you are doing it some other way, which way
are you doing it? bin/post command?
Most likely this is not the Solr question, but whatever you have
feeding data into Solr.
Regards,
Alex.
On Thu, 27 Aug 2020
>
>
> Error 404 Not Found
>
> HTTP ERROR 404
> Problem accessing /solr/dovecot. Reason:
> Not Found
>
>
>
> Anything else I could try?
>
> Best,
>
> Francis
>
> On 2020-08-27 20:46, Alexandre Rafalovitch wrote:
> > Uhm right. I may
t
> > org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:495)
> > at
> > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1594)
> > at
> > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1586)
>
Have you tried blowing the index directory away (usually 'data'
directory next to 'conf'). Because:
cannot change field "box" from index
options=DOCS_AND_FREQS_AND_POSITIONS to inconsistent index
options=DOCS
This implies that your field box had different definitions, you
updated it but the index
/e1392c74400d74366982ccb796063ffdcef08047/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java#L201-L209
> but
> not sure
>
> Regards,
> Munendra S N
>
>
>
> On Sun, Aug 23, 2020 at 7:53 PM Alexandre Rafalovitch
> wrote:
>
> > Thank you N
The issue seems to be more with a specific file and at the level way
below Solr's or possibly even Tika's:
Caused by: java.io.IOException: expected='>' actual='
' at offset 2383
at
org.apache.pdfbox.pdfparser.BaseParser.readExpectedChar(BaseParser.java:1045)
Are you indexing the sa
quot;parent1");
> > parent1.addField("class", "foo.bar.parent1");
> >
> > SolrInputDocument child1 = new SolrInputDocument();
> >
> > parent1.addField("sometag", Arrays.asList(child1));
> > child1.addField("id", "c1
Hello,
I am trying to get up to date with both SolrJ and Nested Document
implementation and not sure where I am failing with a basic test
(https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java).
I am using Solr 8.6.1 with a core created with bin/solr create -c
solrj
If this is reproducible, I would run Wireshark on the network and see what
happens at packet level.
Leaning towards firewall timing out and just starting to drop all packets.
Regards,
Alex
On Mon., Aug. 17, 2020, 6:22 p.m. Susheel Kumar,
wrote:
> Thanks for the all responses.
>
> Shawn - to
I can't remember if field aliasing works with df but it may be worth a try:
https://lucene.apache.org/solr/guide/8_1/the-extended-dismax-query-parser.html#field-aliasing-using-per-field-qf-overrides
Another example:
https://github.com/arafalov/solr-indexing-book/blob/master/published/languages/co
You have echoParams set to all. What does that return?
Regards,
Alex
On Fri., Aug. 7, 2020, 11:31 a.m. yaswanth kumar,
wrote:
> Thanks for looking into this Erick,
>
>
> solr/PROXIMITY_DATA_V2/select?q=pkey:223_*&group=true&group.field=country_en&fl=country_en
>
> that's what the url I am hi
I _think_ it will run all 3 and then do index hopping. But if you know one
fq is super expensive, you could assign it a cost.
A value over 100 will make Solr try to use a PostFilter and apply that
query on top of the results from the other queries.
https://lucene.apache.org/solr/guide/8_4/common-query-parameters.ht
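A sketch of marking one fq as expensive (the field names are placeholders); cache=false plus a cost over 100 asks Solr to run it as a PostFilter:

```python
from urllib.parse import urlencode

# Sketch: two cheap filters plus one marked with cache=false and
# cost=200 so Solr tries to apply it as a PostFilter, after the
# cheaper queries have narrowed the result set. Field names are made up.
params = {
    "q": "*:*",
    "fq": [
        "cheap_field:A",
        "also_cheap:B",
        "{!cache=false cost=200}expensive_field:C",
    ],
}
qs = urlencode(params, doseq=True)
print(qs)
```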
Did you try it with the 'sow' parameter both ways? I am not sure I fully
understand the question, especially with shingling on both passes
rather than just at indexing time. But at least it is something to try,
and it is one of the areas where Solr and ES differ.
Regards,
Alex.
On Tue, 19 May 2020 a
If you use sort, you are basically ignoring relevancy (unless you put
that into sort). Which you seem to know as your example uses FQ.
Do you see performance drop on non-clustered or clustered Solr?
Because, I would not be surprised if, for clustered node, all the
results need to be brought into o
Does this actually work? This individual ID matching feels like a very
fragile attempt at enforcing the sort order and maybe represents an
architectural issue. Maybe you need to do some joins or graph walking
instead. Or, more likely, you would benefit from over-fetching and
just sorting on the ids on the
If you are using the API (which the Admin UI does), the regenerated file will
lose comments and sort everything in a particular order. That's just
the implementation at the moment.
If you don't like that, you can always modify the schema file by hand
and reload the core to notice the changes. You can even s
You can only have one chain at a time.
You can, however, create your custom URP chain to contain
configuration from all three.
Or, if you do use multiple chains that are configured similarly, you
can pull each URP into its own definition and then mix and match them
either in the chain or even p
Check for popup and other tracker blockers. It is possible one of the
resources has a similar name and triggers blocking. There was a thread
in early October with a similar discussion, but apart from the
blockers idea nothing else was discovered at the time.
An easy way would be to create a new Ch
You can enable debug which will show you what matches and why. Check
the reference guide for parameters:
https://lucene.apache.org/solr/guide/8_1/common-query-parameters.html#debug-parameter
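For instance, adding the debug parameter to any search request (the query itself is a placeholder):

```python
from urllib.parse import urlencode

# Sketch: the same query with debug=true added; the response will then
# include an "explain" section showing what matched and why. The query
# itself is a placeholder.
params = {"q": "name:solr", "debug": "true"}
qs = urlencode(params)
print(qs)
```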
Regards,
Alex.
On Fri, 6 Dec 2019 at 11:00, rhys J wrote:
>
> I have a search box that is just searchi
What about XMLQueryParser:
https://lucene.apache.org/solr/guide/8_2/other-parsers.html#xml-query-parser
Regards,
Alex.
On Wed, 27 Nov 2019 at 22:43, wrote:
>
> I am trying to simulate the following query(Lucene query builder) using Solr
>
>
>
>
> BooleanQuery.Builder main = new BooleanQuery.B
Oops. And the link...
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency
On Wed, Nov 27, 2019, 6:24 PM Alexandre Rafalovitch,
wrote:
> How about Optimistic Concurrency with _version_ set to negative value?
>
>
How about Optimistic Concurrency with _version_ set to negative value?
You could inject that extra value in URP chain if need be.
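A minimal sketch of the document side of this, assuming placeholder field names; _version_ set to a negative value makes the update fail if the document already exists:

```python
import json

# Sketch: with Optimistic Concurrency, _version_ = -1 means "this
# document must NOT already exist"; Solr rejects the update with a
# conflict if the id is already in the index. Field names are made up.
doc = {"id": "doc1", "title_s": "first version", "_version_": -1}
body = json.dumps([doc])
print(body)  # POST to /solr/<core>/update with Content-type: application/json
```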
Regards,
Alex
On Wed, Nov 27, 2019, 5:41 PM Aaron Hoffer, wrote:
> We want to prevent Solr from overwriting an existing document if document's
> ID already exis
I think the main question here is the compound word "credit card"
always the same? If yes, you can preprocess it during indexing to
something unique and discard (see Vincenzo's reply). You could even
copyfield and process the copy to only leave standalone word "credit"
in it, so it basically serves
Try: site:lucene.apache.org inurl:8_2 luceneMatchVersion
(8.3 does not work; it seems not to be fully indexed by Google yet)
https://github.com/apache/lucene-solr/search?l=AsciiDoc&q=luceneMatchVersion
(latest development version only).
You can read the rendered documents (without extra processing
Grep on the source of the manual (which ships with Solr source).
Google search with domain or keywords limitations.
Online copy searching is not powered by Solr yet. Yes, we are aware of the
irony and are discussing it.
Regards,
Alex
On Tue, Nov 12, 2019, 1:25 AM Luke Miller, wrote:
> Hi,
, Nov 11, 2019, 2:30 PM Sthitaprajna,
wrote:
>
> https://stackoverflow.com/questions/58763657/solr-missing-mandatory-uniquekey-field-id-or-unknown-field?noredirect=1#comment103816164_58763657
>
> May be this will help ? I added screenshots.
>
> On Fri, 8 Nov 2019, 22:57 Ale
Weird. Did you try echoParams=all just to see what other defaults are
picked up.
It feels like it picks up default parser and maybe default "df" value that
points to not existing text field.
Maybe enable debug too to see what it expands to.
Regards,
Alex
On Sun, Nov 10, 2019, 9:26 PM Kaminsk
Something does not make sense, because your schema defines "title" as
the uniqueKey field, but your message talks about "id". Are you
absolutely sure that the Solr/collection you get an error for is the
same Solr where you are checking the schema?
Also, do you have a bit more of the error and stac
For what purpose?
Because, for example, Solr is not designed to serve direct to the browser,
just like Mysql is not. So, usually, there is a custom middleware.
On the other hand, Solr can serve as JDBC engine so you could use JDBC
frontends to explore data. Or as an engine for visualisations. Etc
It mentions it in the start paragraph "Prefix, Wildcard, Regex, etc."
So, if you search for "abc*" it expands to all terms that start from
"abc", but then not everything can handle this situation as it is a
lot of terms in the same position. So, not all analyzers can handle
that and normally it i
I've done some experiments about indexing RefGuide (from source) into
Solr at: https://github.com/arafalov/solr-refguide-indexing . But the
problem was creating UI, hosting, etc.
There was also a thought (mine) of either shipping RefGuide in Solr
with pre-built index as an example or even just shi
Hi Alex,
> Thanks for your reply. How do we integrate tesseract with Solr? Do we have
> to implement Custom update processor or extend the
> ExtractingRequestProcessor?
>
> Regards
> Suresh
>
> On Wed, Oct 23, 2019 at 11:21 AM Alexandre Rafalovitch >
> wrote:
>
>
I believe Tika that powers this can do so with extra libraries (tesseract?)
But Solr does not bundle those extras.
In any case, you may want to run Tika externally to avoid the
conversion/extraction process be a burden to Solr itself.
Regards,
Alex
On Wed, Oct 23, 2019, 1:58 PM suresh penda
What command do you use to get the file into Solr? My guess that you
are somehow not hitting the correct handler. Perhaps you are sending
it to extract handler (designed for PDF, MSWord, etc) rather than the
correct CSV handler.
Solr comes with examples of how to index CSV files.
See for exa
I remember several years ago a discussion/blog post about a similar
problem. The author went through a lot of thinking and decided that
the best way to deal with a similar problem was to have Solr documents
represent different level of abstraction, more granular.
IIRC, the equivalent for your exam
oth.
> > Also, you may want to note which normalized fields were truncated or
> > were simply too small. This would give some guidance as to the bias of
> > the normalization. If 95% of the fields were not truncated, there is
> > a chance you are not doing good a
small. This would give some guidance as to the bias of the
> normalization. If 95% of the fields were not truncated, there is a chance
> you are not doing good at normalizing because you have a set of
> particularly short messages. So I would expect a small set of side fields
> re
Is the 100 words a hard boundary or a soft one?
If it is a hard one (always 100 words), the easiest is probably copy
field and in the (unstored) copy, trim off whatever you don't want to
search. Possibly using regular expressions. Of course, "what's a word"
is an important question here.
Similarl
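If the boundary is tokens rather than characters, one sketch is a copyField whose target type caps the token count; the names and exact filter placement are assumptions:

```xml
<!-- Sketch: search only the first 100 tokens by copying into an
     unstored field whose analyzer drops everything past token 100 -->
<copyField source="body" dest="body_first100"/>
<field name="body_first100" type="text_first100" indexed="true" stored="false"/>
<fieldType name="text_first100" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LimitTokenCountFilterFactory"
            maxTokenCount="100" consumeAllTokens="false"/>
  </analyzer>
</fieldType>
```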
n and the
> > capitalization (otherwise “it” would be taken out as a stopword).
> >
> stopwords are a thing of the past at this point. there is no benefit to
> using them now with hardware being so cheap.
>
> On Tue, Oct 8, 2019 at 12:43 PM Alexandre Rafalovitch
&
Try referencing the jar directly (by absolute path) with a statement
in the solrconfig.xml (and reloading the core).
The DIH example shipped with Solr shows how it works.
This will help to see if the problem with not finding the jar or something else.
Regards,
Alex.
On Wed, 9 Oct 2019 at 09:14
If you don't want it to be touched by a tokenizer, how would the
protection step know that the sequence of characters you want to
protect is "IT:ibm" and not "this is an IT:ibm term I want to
protect"?
What it sounds to me is that you may want to:
1) copyField to a second field
2) Apply a much lig
Can you give a more detailed example, please? Including the schema bits.
There is a bunch of assumptions in here that are hard to really make
sense of. Solr works with tokens, but you are talking about letter
repetitions. Also, if you want to sort by the string, why not just use
sort parameter? It
Your system is under attack, something trying to hack into it via
Solr. Possibly a cryptominer or similar. And it is using DIH endpoint
for it.
Shawn explain the most likely cause for Solr actually deleting the
records. I would also suggest:
1) Figure out where the request is coming from and treat
I don't think you can rename it in the index.
However, you may be able to rename it during the query:
https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html#CommonQueryParameters-FieldNameAliases
Or, if you use eDisMax, during query rewriting:
https://lucene.apache.org/solr/guide/6
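A sketch of the query-time rename via an fl alias (both field names are placeholders):

```python
from urllib.parse import urlencode

# Sketch: rename a field only in the response via an fl alias;
# "title" is the name the client sees, "headline_s" the indexed field.
# Both names are made up for illustration.
params = {"q": "*:*", "fl": "title:headline_s,score"}
qs = urlencode(params)
print(qs)
```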