Yes, congratulations and a big thank you, Jan!
On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote:
>
> Hi everyone,
>
> I’d like to inform everyone that the newly formed Apache Solr PMC nominated
> and elected Jan Høydahl for the position of the Solr PMC Chair and Vice
> President. This decision
There is also PostingsHighlighter -- I recommend it, if only for the
performance improvement, which is substantial, but I'm not completely
sure how it handles this issue. The one drawback I *am* aware of is
that it is insensitive to positions (so words from phrases get
highlighted even in isolation).
On 02/17/2015 03:46 AM, Volkan Altan wrote:
First of all thank you for your answer.
You're welcome - thanks for sending a more complete example of your
problem and expected behavior.
I don’t want to use KeywordTokenizer, because as long as the compound words
written by the user are available
StandardTokenizer splits your text into tokens, and the suggester
suggests tokens independently. It sounds as if you want the suggestions
to be based on the entire text (not just the current word), and that
only adjacent words in the original should appear as suggestions.
Assuming that's what you want
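A minimal sketch of that idea, assuming a shingle-based suggestion field (the analyzer below is illustrative, not from this thread, and targets a recent Lucene API): 2- and 3-word shingles mean only words that were adjacent in the original text are offered together.

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

Analyzer suggestAnalyzer = new Analyzer() {
    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        StandardTokenizer source = new StandardTokenizer();
        TokenStream sink = new LowerCaseFilter(source);
        // Emit 2- and 3-word shingles so suggestions preserve adjacency.
        ShingleFilter shingles = new ShingleFilter(sink, 2, 3);
        return new TokenStreamComponents(source, shingles);
    }
};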
ent insert (and
the content has the entities) and it will be difficult to add the DTD to the
content...
Thanks
- Raul
On 02/03/15 at 17:15, Michael Sokolov wrote:
If the entities are in the content, you would need to add the DTD to
the content, not to the stylesheet. Or you could transform the content converting the entities.
If the entities are in the content, you would need to add the DTD to the
content, not to the stylesheet. Or you could transform the content
converting the entities.
-Mike
On 02/03/2015 10:41 AM, Raul wrote:
Hi all!
I'm trying to use Solr with the DIH and xslt processing. All is fine
till i
Have a look here:
https://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams
It might answer your question. Typically what I recommend is to keep
the selected facet in view, but without any limitation on its counts.
However, if you want to hide it altogether, I think
Please go ahead and play with autocomplete on safaribooksonline.com/home
- if you are not a subscriber you will have to sign up for a free
trial. We use the AnalyzingInfixSuggester. From your description, it
sounds as if you are building completions from a field that you also use
for searching
I was tempted to suggest rehab -- but seriously it wasn't clear if Nitin
meant the log files Michael is referring to, or the transaction log
(tlog). If it's the transaction log, the solution is more frequent hard
commits.
-Mike
On 2/2/2015 11:48 AM, Michael Della Bitta wrote:
If you'd like
If you have a finite known set of hosts, you could do something truly awful:
create a field for each distinct host and set all of them to have
value={id of the document} except for the host to which the document
belongs: assign that hostname field some constant value, like "true".
Then query
On 1/31/2015 2:47 PM, Mikhail Khludnev wrote:
Michael,
Please check two questions inlined below
Hi Mikhail,
On Sat, Jan 31, 2015 at 10:14 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
You can only handle a single relation this way since you have to
restructure your in
We were using grouping (no DocValues, though) and recently switched to
using block-indexing and joins (see
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers).
We got a nice speedup on average (perhaps 2x faster) and an even better
improvement in t
commits to the Solr-transaction.log?
-Clemens
-Original Message-
From: Michael Sokolov [mailto:msoko...@safaribooksonline.com]
Sent: Tuesday, January 20, 2015 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: transactions@Solr(J)
On 1/20/2015 5:18 AM, Clemens Wyss DEV wrote:
On 1/20/2015 5:18 AM, Clemens Wyss DEV wrote:
http://stackoverflow.com/questions/10805117/solr-transaction-management-using-solrj
Is it true, that a SolrServer-instance denotes a "transaction context"?
Say I have two concurrent threads, each having a SolrServer-instance "pointing" to the
same c
You can also implement your own cursor easily enough if you have a
unique sortkey (not relevance score). Say you can sort by id, then you
select batch 1 (50k docs, say) and record the last (maximum) id in the
batch. For the next batch, limit it to id > last_id and get the first
50k docs (don't
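A rough SolrJ sketch of that cursor pattern (endpoint, core, and field names are hypothetical, and this targets a recent SolrJ API):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

void dumpAll() throws Exception {
    HttpSolrClient solr =
        new HttpSolrClient.Builder("http://localhost:8983/solr/core1").build();
    String lastId = null;
    while (true) {
        SolrQuery q = new SolrQuery("*:*");
        q.setRows(50000);
        q.setSort("id", SolrQuery.ORDER.asc);
        if (lastId != null) {
            // Each batch sees only ids strictly greater than the last one.
            q.addFilterQuery("id:{" + lastId + " TO *]");
        }
        SolrDocumentList docs = solr.query(q).getResults();
        if (docs.isEmpty()) {
            break;
        }
        for (SolrDocument doc : docs) {
            lastId = (String) doc.getFieldValue("id");
            // ... process doc ...
        }
    }
    solr.close();
}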
I've seen the same thing, poked around a bit and eventually decided to
ignore it. I think there may be a ticket related to that saying it's a
logging bug (ie not a real issue), but I couldn't swear to it.
-Mike
On 01/16/2015 12:36 PM, Tom Burton-West wrote:
Hello,
I'm running Solr 4.10.2 ou
can avoid the rebuilt index on every commit or
optimize.
Is this the right way ?? or any that I missed ???
Regards
dhanesh s.r
On Thu, Jan 15, 2015 at 3:20 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
did you build the spellcheck index using spellcheck.build as described
On Wed, Jan 14, 2015 at 12:47 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
I think you are probably getting bitten by one of the issues addressed in
LUCENE-5889
I would recommend against using buildOnCommit=true - with a large index
this can be a performance-killer. Instead, build the index yourself using the Solr spellchecker support (spellcheck.build=true)
As a foolish dev (not malicious I hope!), I did mess around with
something like this once; I was writing my own Codec. I found I had to
create a file called META-INF/services/org.apache.lucene.codecs.Codec in
my solr plugin jar that contained the fully-qualified class name of my
codec: I guess
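For illustration, the skeleton looks something like this (package and class names are hypothetical; Lucene410Codec matches the 4.10-era threads here). The services file then contains the single line com.example.MyCodec:

package com.example;

import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.lucene410.Lucene410Codec;

// The SPI name passed to super() is the name configuration refers to;
// META-INF/services/org.apache.lucene.codecs.Codec in the plugin jar
// must list this class so the SPI loader can instantiate it.
public class MyCodec extends FilterCodec {
    public MyCodec() {
        super("MyCodec", new Lucene410Codec());
    }
}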
I think you are probably getting bitten by one of the issues addressed
in LUCENE-5889
I would recommend against using buildOnCommit=true - with a large index
this can be a performance-killer. Instead, build the index yourself
using the Solr spellchecker support (spellcheck.build=true)
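For example, a one-off build request looks something like this (handler path hypothetical):
http://localhost:8983/solr/core1/spell?q=test&spellcheck=true&spellcheck.build=true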
-Mike
It looks like this is a good starting point:
http://wiki.apache.org/solr/SolrConfigXml#codecFactory
-Mike
On 01/12/2015 03:37 PM, Tom Burton-West wrote:
Hello all,
Our indexes have around 3 billion unique terms, so for Solr 3, we set
TermIndexInterval to about 8 times the default. The net effect
On 12/30/14 12:42 PM, Jonathan Rochkind wrote:
On 12/30/14 12:35 PM, Walter Underwood wrote:
You want preserveOriginal="1".
You should only do this processing at index time.
If I only do this processing at index time, then "mixedCase" at query
time will no longer match "mixed Case" in the index.
I noticed that your suggester analyzers include
which seems like a bad idea -- this will strip all those Arabic, Russian
and Japanese characters entirely, leaving you with probably only
whitespace in your tokens. Try just removing that?
-Mike
On 12/24/14 6:09 PM, alaa.abuzaghleh wrote:
I
t 12:33 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
Have other people tried migrating an index that was created without block
(parent/child) indexing to one that *does* have it? Did you find that you
got duplicate documents - ie multiple documents with the same uniqueField
value?
Have other people tried migrating an index that was created without
block (parent/child) indexing to one that *does* have it? Did you find
that you got duplicate documents - ie multiple documents with the same
uniqueField value? That's what I found, and I don't see how that's
possible.
What
Thanks Andrey! I voted for your patch
-Mike
On 12/17/2014 4:01 AM, Kydryavtsev Andrey wrote:
To support the scoreMode parameter in BlockJoinParentQParser we have this JIRA
with attached patch https://issues.apache.org/jira/browse/SOLR-5882
17.12.2014, 06:54, "Michael Sokolov" :
I
I'm trying to use BJPQP and ran into a few little gotchas that I'd like
to share with y'all in case you have any advice.
First I ran into an NPE that probably should be handled better - maybe
just an exception with a better message. The framework I'm working in
makes it slightly annoying to u
Well I think your first step should be finding a reproducible test case
and encoding it as a unit test. But I suspect ultimately the fix will
be something to do with positionIncrement ...
-Mike
On 12/15/2014 09:08 AM, Erlend Garåsen wrote:
On 15.12.14 14:11, Michael Sokolov wrote:
I'
I'm not sure, but is it necessary to set positionIncAttr to 1 when there
are *not* any lemmas found? I think the usual pattern is to call
clearAttributes() at the start of incrementToken
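A sketch of the usual shape, assuming a filter that stacks a lemma on the original token (the class and the dictionary lookup are illustrative, not Erlend's code):

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public final class LemmaFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final PositionIncrementAttribute posIncAtt =
        addAttribute(PositionIncrementAttribute.class);
    private String pendingLemma;
    private State savedState;

    public LemmaFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (pendingLemma != null) {
            // Emit the lemma stacked on the previous token: same state,
            // position increment zero.
            restoreState(savedState);
            termAtt.setEmpty().append(pendingLemma);
            posIncAtt.setPositionIncrement(0);
            pendingLemma = null;
            return true;
        }
        if (!input.incrementToken()) {
            return false;
        }
        // No lemma found: pass the token through untouched -- in particular,
        // don't overwrite the position increment the tokenizer set.
        pendingLemma = lookupLemma(termAtt.toString());
        if (pendingLemma != null) {
            savedState = captureState();
        }
        return true;
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        pendingLemma = null;
    }

    private String lookupLemma(String term) {
        return null; // stand-in for the dictionary lookup
    }
}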
-Mike
On 12/15/14 7:38 AM, Erlend Garåsen wrote:
I have written a dictionary-based lemmatizer for Universi
…use case?
What do you mean by "controlling the fields used for phrase queries"?
Rgds
AJ
On 12-Dec-2014, at 20:11, Michael Sokolov
wrote:
Doug - I believe pf controls the fields that are used for the phrase
queries *generated by the parser*.
What I am after is controlling the fields used
I want terms to be stemmed, unless they are quoted, using dismax.
On 12/12/14 8:19 PM, Amit Jha wrote:
Hi Mike,
What is your exact use case?
What do you mean by "controlling the fields used for phrase queries"?
Rgds
AJ
On 12-Dec-2014, at 20:11, Michael Sokolov
wrote:
Doug - I
,
I typically solve this problem by using a copyField and running different
analysis on the destination field. Then you could use this field as pf
instead of qf. If I recall, fields in pf must also be mentioned in qf for
this to work.
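For illustration, the query side might look like this in SolrJ (field names hypothetical; text_exact would be the copyField destination with the tighter analysis):

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery q = new SolrQuery("mixed case query");
q.set("defType", "edismax");
q.set("qf", "text text_exact");  // fields in pf should also appear in qf
q.set("pf", "text_exact");       // phrase matches hit the unstemmed copy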
-Doug
On Fri, Dec 12, 2014 at 8:13 AM, Michael Sokolov <
ms
t.
Ahmet
On Thursday, December 11, 2014 10:50 PM, Michael Sokolov
wrote:
I'd like to supply a different set of fields for phrases than for bare
terms. Specifically, we'd like to treat phrases as more "exact" -
probably turning off stemming and generally having a tighter analysis
So the short answer to your original question is "no." Highlighting is
designed to find matches *within* a tokenized (text) field only. That
is difficult because text gets processed and there are all sorts of
complications, but for integers it should be pretty easy to match the
values in the d
Have you rebooted the machine? (last refuge of the clueless, but often
works) ...
On 12/11/14 2:50 PM, solr-user wrote:
yes, have triple checked the schema and solrconfig XML; various tools have
indicated the XML is valid
no missing types or dupes, and have not disabled the admin handler
as m
I'd like to supply a different set of fields for phrases than for bare
terms. Specifically, we'd like to treat phrases as more "exact" -
probably turning off stemming and generally having a tighter analysis
chain. Note: this is *not* what's done by configuring "pf" which
controls fields for the phrase queries *generated by the parser*.
Alex, I spent some time answering questions there, but ultimately
got turned off by the competitive nature of it. I wanted to increase my
score -- fun! But if you are not watching it all the time, the questions
go by very fast, and you lose your edge. The typical pattern seems to
be: so-so
[schema.xml excerpt: the XML elements were stripped by the archive; only a text_suggest field type and the value 4 remain]
-Original Message-
From: Michael Sokolov [mailto:msoko...@safaribooksonline.com]
Sent: Thursday, December 4, 2014 2:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Keeping capitalization in suggestions?
Have a look at AnalyzingInfixSuggester - it does what you want.
Right - allowing Solr to manage these queries (SOLR-6234) seems like the
way to go
... OP == original poster (I lost track of who started the discussion)
-Mike
On 12/08/2014 10:19 AM, Mikhail Khludnev wrote:
On Mon, Dec 8, 2014 at 5:38 PM, Michael Sokolov <
msoko...@safaribooksonline.
I get the impression there was a concern that the caller could hold on
to the query generated by JoinUtil for too long - eg across requests in
Solr. I'm not sure why the OP thinks that would happen, though.
-Mike
On 12/08/2014 04:57 AM, Mikhail Khludnev wrote:
On Fri, Dec 5, 2014 at 10:44 PM,
How about creating a new core that only holds a single week's documents,
and retrieving all of its terms? Then each week, flush it and start over.
-Mike
On 12/05/2014 07:54 AM, lboutros wrote:
Dear all,
I would like to get the new terms of fields since last update (once a week).
If I retrieve
There's no appreciable RAM cost during querying, faceting, sorting of
search results and so on. Stored fields are separate from the inverted
index. There is some cost in additional disk space required and I/O
during merging, but I think you'll find these are not significant. The
main cost we
Have a look at AnalyzingInfixSuggester - it does what you want.
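A bare-bones sketch (recent Lucene API; the path and dictionary source are placeholders): the suggester lowercases for matching but returns the stored surface form, so "chamä" can yield "Chamäleon".

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.suggest.Lookup;
import org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester;
import org.apache.lucene.store.FSDirectory;

AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester(
    FSDirectory.open(Paths.get("/tmp/suggest-index")), new StandardAnalyzer());
suggester.build(dictionary); // e.g. a DocumentDictionary over the main index
for (Lookup.LookupResult hit : suggester.lookup("chamä", false, 5)) {
    System.out.println(hit.key); // original-case suggestion text
}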
-Mike
On 12/4/14 3:05 AM, Clemens Wyss DEV wrote:
When I index a text such as "Chamäleon" and look for suggestions for "chamä" and/or
"Chamä", I'd expect to get "Chamäleon" (uppercased).
But what happens is
If LowerCaseFilter (see
Stefan I had problems like this -- and the short answer is -- it's a
PITA. Solr is not really designed to be extended in this way. In fact
I believe they are moving towards an architecture where this is even
less possible - folks will be encouraged to run solr using a bundled
exe, perhaps with
On 12/02/2014 03:41 PM, Mikhail Khludnev wrote:
Thanks for suggestions. Do I remember correctly that you ignored last
Lucene Revolution?
I wouldn't say I ignored it, but it's true I wasn't there in DC: I'm
excited to catch up on the presentations as the videos become available,
though.
-Mike
Mikhail - I can imagine a filter that strips out everything but numbers
and then indexes those with a (separate) numeric (trie) field. But I
don't believe you can do phrase or other proximity queries across
multiple fields. As long as an or-query is good enough, I think this
problem is not to
, Michael Sokolov
wrote:
Have you considered using grouping? If I understand your requirements, I think
it does what you want.
https://cwiki.apache.org/confluence/display/solr/Result+Grouping
On 12/02/2014 12:59 PM, Darin Amos wrote:
I would keep trying with the highlighters. Some of them, at least, have
options to provide an external text source, although you will almost
certainly have to write some java code to get this working; extend the
highlighter you choose and supply its text from an external source.
-Mike
On 12
Have you considered using grouping? If I understand your requirements,
I think it does what you want.
https://cwiki.apache.org/confluence/display/solr/Result+Grouping
On 12/02/2014 12:59 PM, Darin Amos wrote:
Thanks!
I will take a look at this. I do have an additional question, since after a
On 11/29/14 1:30 PM, Toke Eskildsen wrote:
Michael Sokolov [msoko...@safaribooksonline.com] wrote:
I wonder if there's any value in providing this metric (total index size
- stored field size - term vector size) as part of the admin panel? Is
it meaningful? It seems like there would be
…popularizers community: https://www.linkedin.com/groups?gid=6713853
On 29 November 2014 at 13:16, Michael Sokolov
wrote:
Of course testing is best, but you can also get an idea of the size of the
non-storage part of your index by looking in the solr index folder and
subtracting the size of the files containing the stored fields from the total size of the index.
Of course testing is best, but you can also get an idea of the size of
the non-storage part of your index by looking in the solr index folder
and subtracting the size of the files containing the stored fields from
the total size of the index. This depends of course on the internal
storage strategy
Yes - here's a working example we have in production (tested in 4.8.1
and 4.10.2, but the underlying lucene stuff hasn't changed since 4.6.1
I'm pretty sure):
https://github.com/safarijv/ifpress-solr-plugin/blob/master/src/main/java/com/ifactory/press/db/solr/processor/UpdateDocValuesProcessor.
Scores are related to total term frequencies *in each shard*, not
globally, and I think they may include term counts from deleted
documents as well, which could account for the discrepancy in scores
across the two shards.
-Mike
On 11/25/14 3:22 AM, rashi gandhi wrote:
Hi,
I have created t
right -- missed Ahmet's answer there in my haste to respond ...
-Mike
On 11/25/14 6:56 AM, Ahmet Arslan wrote:
Hi Apurv,
I wouldn't worry about index size, increase in index size is not linear (2x)
like that.
Please see similar discussion :
https://issues.apache.org/jira/browse/LUCENE-5620
A
The index size will not increase as quickly as you might think, and is
not an issue in most cases. An alternative to two fields, though, is to
index both upper- and lower-case tokens at the same position in a single
field, and then to perform no case folding at query time. There is no
standard
maybe try
description_shingle:(Highest quality)
On 11/24/14 1:46 PM, vit wrote:
I have Solr 4.2.1
I am using the following analyser:
Those SPI classes rely on a configuration file that gets stored in the
META-INF folder. I'm not familiar with how OSGi works, but I'm pretty
sure that failure is because the file
META-INF/services/org.apache.lucene.codecs.Codec (you'll see it in the
lucene-core jar) can't be found
-Mike
On
If you're willing to write some Java you can do something more efficient
by intersecting two terms enumerations: this works with constant memory
for any number of values in two fields, basically like intersecting any
two sorted lists, you leap frog between them. I have an example if
you're interested.
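A minimal sketch of that leapfrog (Lucene 5-7 era API; field names arbitrary):

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Print terms common to two fields by leapfrogging their sorted TermsEnums;
// memory use is constant regardless of how many terms the fields hold.
static void intersect(IndexReader reader, String field1, String field2)
        throws IOException {
    Terms terms1 = MultiFields.getTerms(reader, field1);
    Terms terms2 = MultiFields.getTerms(reader, field2);
    if (terms1 == null || terms2 == null) {
        return;
    }
    TermsEnum e1 = terms1.iterator();
    TermsEnum e2 = terms2.iterator();
    BytesRef t1 = e1.next();
    BytesRef t2 = e2.next();
    while (t1 != null && t2 != null) {
        int cmp = t1.compareTo(t2);
        if (cmp == 0) {
            System.out.println(t1.utf8ToString()); // common term
            t1 = e1.next();
            t2 = e2.next();
        } else if (cmp < 0) {
            // Leap e1 forward to the first term >= t2.
            if (e1.seekCeil(t2) == TermsEnum.SeekStatus.END) break;
            t1 = e1.term();
        } else {
            if (e2.seekCeil(t1) == TermsEnum.SeekStatus.END) break;
            t2 = e2.term();
        }
    }
}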
OK - please disregard; I found a rogue new component in our analyzer
that was messing everything up.
The hunspell behavior was perhaps a little confusing, but I don't
believe it leads to broken queries.
-Mike
On 11/18/2014 02:38 PM, Michael Sokolov wrote:
followup - hunspell has:
f
I find that a query for stemmed terms sometimes fails with the edismax
query parser and hunspell stemmer. Looking at the output of analysis for
the query (text:following) I can see that it generates two different
terms at the same position: "follow" and "following". Then edismax seems
to generate
…generating
multiple "stems" causes issues
On 11/18/2014 02:33 PM, Michael Sokolov wrote:
I find that a query for stemmed terms sometimes fails with the edismax
query parser and hunspell stemmer. Looking at the output of analysis
for the query (text:following) I can see that it generates two
Mike
On 11/14/14 2:01 AM, Walter Underwood wrote:
We get no suggestions until we force a build with suggest.build=true. Maybe we
need to define a spellchecker component to get that behavior?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
On Nov 13, 2014, at
On 11/14/2014 01:43 PM, Erick Erickson wrote:
Just skimming, so maybe I misinterpreted.
ExternalFileField and ExternalFileFieldReloader
refer to storing values for each doc in an external file, they have
nothing to do with storing _files_.
The usual pattern is to have Solr store just enough data
…can use a filter query like "fq=terms:a:1"
On 11/13/2014 3:59 AM, "Michael Sokolov" wrote:
We routinely store images and pdfs in Solr. There *is* a benefit, since
you don't need to manage another storage system, you don't have to worry
about Solr getting out of sync with the other system, you can use Solr replication for all your assets, etc.
4/14 2:01 AM, Walter Underwood wrote:
We get no suggestions until we force a build with suggest.build=true. Maybe we
need to define a spellchecker component to get that behavior?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
On Nov 13, 2014, at 10:56 PM, Michael
I believe the spellchecker component persists these indexes now and
reloads them on restart rather than rebuilding.
-Mike
On 11/13/14 7:40 PM, Walter Underwood wrote:
We have to manually rebuild the suggest dictionaries after a restart. This
seems odd, since someone else had a problem because
We routinely store images and pdfs in Solr. There *is* a benefit, since
you don't need to manage another storage system, you don't have to worry
about Solr getting out of sync with the other system, you can use Solr
replication for all your assets, etc.
I don't use DIH, so personally I don't c
The usual approach is to use copyField to copy multiple fields to a
single field.
I posted a solution using an UpdateRequestProcessor to merge fields, but
with different analyzers, here:
https://blog.safaribooksonline.com/2014/04/15/search-suggestions-with-solr-2/
My latest approach is this:
The goal is to ensure that suggestions from autocomplete are actually
terms in the main index, so that the suggestions will actually result in
matches. You've considered expanding the main index by adding the
suggestion n-grams to it, but it would probably be better to alter your
suggester so
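One way to arrange that, assuming a Lucene Lookup-based suggester: feed it a HighFrequencyDictionary over the search field, so only terms that actually occur in the main index (above a frequency threshold) become suggestions. The field name and threshold here are illustrative:

import org.apache.lucene.search.spell.HighFrequencyDictionary;

// reader is an IndexReader over the main index; suggester is any Lookup.
// Terms in "text" occurring in at least ~0.01% of documents are offered,
// so every accepted suggestion is guaranteed to match something.
HighFrequencyDictionary dict =
    new HighFrequencyDictionary(reader, "text", 0.0001f);
suggester.build(dict);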
You didn't describe your analysis chain, but maybe you are using
WordDelimiterFilter to break up hyphenated words? If so, it has a
protwords.txt feature that lets you specify exceptions
-Mike
On 11/5/2014 5:36 PM, Michael Della Bitta wrote:
Pretty sure what you need is called KeywordMarkerFilter
Shawn this is really weird -- we run log4j in lots of installations and
have never seen an issue like this.
I wonder if you might be running some other log rotation software (like
logrotate) that is somehow getting in the way or conflicting?
-Mike
On 11/01/2014 01:45 PM, Shawn Heisey wrote:
Just to get the obvious sledgehammer solution out of the way - upload a
new, edited solrconfig.xml with the default changed, and reload the core.
-Mike
On 11/3/14 6:28 AM, Dmitry Kan wrote:
Hello solr fellows,
I'm working on a project that involves using two update chains. One default
chain
OK, I opened SOLR-6672; not sure how I stumbled into using white space;
I would ordinarily use commas too, I think.
-Mike
On 10/29/14 1:23 PM, Chris Hostetter wrote:
: fl="id field(units_used) archive_id"
I didn't even realize until today that fl was documented to support space
separated field
I noticed that when you include a function as a result field, the
corresponding key in the result markup includes trailing whitespace,
which seems like a bug. I wonder if anyone knows if there is a ticket
for this already?
Example:
fl="id field(units_used) archive_id"
ends up returning resu
really offer a solution to your
problem, but there are some possibly helpful similarities: you will
probably want to write a custom UpdateRequestProcessor, and you will
want to feed the suggester with a custom Dictionary / InputIterator as I
have done in that example.
-Mike
-Clemens
-Original Message-
This project (https://github.com/safarijv/ifpress-solr-plugin/) has some
examples of custom Solr UpdateRequestProcessors that feed a single
suggester from multiple fields, applying different weights to them,
using complete values from some and analyzing others into tokens.
The first thing I did
3.16e-11.0 looks fishy to me
On 10/23/14 5:09 PM, eShard wrote:
Good evening,
I'm using solr 4.0 Final.
I tried using this function
boost=recip(ms(NOW/HOUR,startdatez,3.16e-11.0,0.08,0.05))
but it fails with this error:
org.apache.lucene.queryparser.classic.ParseException: Expected ')' at
position
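For what it's worth, two things look wrong there, assuming the standard date-decay recipe was intended: 3.16e-11.0 is not a valid float (the exponent can't carry a decimal point), and the closing paren of ms() is misplaced, so recip() gets one argument instead of four. The intended form would be roughly:
boost=recip(ms(NOW/HOUR,startdatez),3.16e-11,0.08,0.05)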
That's what I thought; thanks, Markus.
On 10/23/14 2:19 PM, Markus Jelsma wrote:
You either need to upload them and issue the reload command, or download them
from the machine, and then issue the reload command. There is no REST support
for it (yet) like the synonym filter, or was it the stop filter.
Thanks for the links, Ramzi. I had already read the wiki page, which
merely talks about how to reload the file into memory once it has been
updated on disk. It doesn't mention any support for uploading that I can
see. Did I miss it?
-Mike
On 10/23/14 1:36 PM, Ramzi Alqrainy wrote:
Of course
I've been looking at ExternalFileField to handle popularity boosting.
Since Solr updatable docvalues (SOLR-5944) isn't quite there yet. My
question is whether there is any support for uploading the external file
via Solr, or if people do that some other (external, I guess) way?
-Mike
It seems as if 0-hit queries should be pretty fast since they can
terminate very early? Are you seeing a big difference between
first-time and subsequent (cached) no-match queries?
-Mike
On 6/5/2014 8:47 AM, Dmitry Kan wrote:
Hi,
Solr is good at caching: even if first "cold" query takes lo
this shard
Best,
Erick
On Mon, Jun 2, 2014 at 4:27 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
Joe - there shouldn't really be a problem *indexing* these fields:
remember that all the terms are spread across the index, so there is
really
no storage difference between one 180MB document and 180 1 MB documents from an indexing perspective.
Joe - there shouldn't really be a problem *indexing* these fields:
remember that all the terms are spread across the index, so there is
really no storage difference between one 180MB document and 180 1 MB
documents from an indexing perspective.
Making the field "stored" is more likely to lead
Is it possible that all your requests are routed to that single shard?
I.e. you are not using the smart client that round-robins requests? I
think that could cause all of the merging of results to be done on a
single node.
Also - is it possible you have a "bad" document in that shard? Like o
Alex - the query parsers generally accept an analyzer, which they must
apply after they perform their own tokenization. Consider: how would a
capitalized query term match lower-cased terms in the index without
query analysis?
-Mike
On 5/17/2014 4:05 AM, Alexandre Rafalovitch wrote:
Hello,
Thanks Dmitry!
On 05/15/2014 07:54 AM, Dmitry Kan wrote:
Hi Mike,
The core name can be accessed via: ${solr.core.name} in solrconfig.xml
(verified in a solr replication config).
HTH,
Dmitry
On Fri, May 9, 2014 at 4:07 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
It
It seems as if the location of the suggester dictionary directory is not
core-specific, so when the suggester is defined for multiple cores, they
collide: you get exceptions attempting to obtain the lock, and the
suggestions bleed from one core to the other. There is an
(undocumented) "indexP
On 5/11/2014 12:55 PM, Olivier Austina wrote:
Hi All,
Is there a way to know if a website use Solr? Thanks.
Regards
Olivier
Ask the people who run the site?
I don't know what the design was, but your use case seems valid to me: I
think you should submit a ticket and a patch. If you write a test, I
suppose it might be more likely to get accepted.
-Mike
On 5/6/2014 10:59 AM, Cario, Elaine wrote:
I experimented locally with modifying the SolrCore c
I'm pretty sure there's nothing to automate that task, but there are
some tools to help with indexing XML. Lux (http://luxdb.org) is one; it
can index all the element text and attribute values, effectively
creating an index for each tag name -- these are not specifically
Solr/Lucene fields, but
on lucene 4.8?
https://issues.apache.org/jira/browse/LUCENE-5111
Michael Sokolov wrote: For posterity, in case
anybody follows this thread, I tracked the
problem down to WordDelimiterFilter; apparently it creates an offset of
-1 in some case, which PostingsHighlighter rejects.
-Mike
For posterity, in case anybody follows this thread, I tracked the
problem down to WordDelimiterFilter; apparently it creates an offset of
-1 in some case, which PostingsHighlighter rejects.
-Mike
On 5/2/2014 10:20 AM, Michael Sokolov wrote:
I checked using the analysis admin page, and I
I checked using the analysis admin page, and I believe there are offsets
being generated (I assume start/end=offsets). So IDK I am going to try
reindexing again. Maybe I neglected to reload the config before I
indexed last time.
-Mike
On 05/02/2014 09:34 AM, Michael Sokolov wrote:
I
I've been wanting to try out the PostingsHighlighter, so I added
storeOffsetsWithPositions to my field definition, enabled the
highlighter in solrconfig.xml, reindexed and tried it out. When I
issue a query I'm getting this error:
|field 'text' was indexed without offsets, cannot highlight
On 4/27/14 7:02 PM, Michael Sokolov wrote:
On 4/27/2014 6:30 PM, Trey Grainger wrote:
So my question basically is: which restrictions are applied to the
docset
from which (field) facets are computed?
Facets are generated based upon values found within the documents
matching
your "q=" parameter and also all of your "fq=" parameters.
On 4/27/2014 6:30 PM, Trey Grainger wrote:
So my question basically is: which restrictions are applied to the docset
from which (field) facets are computed?
Facets are generated based upon values found within the documents matching
your "q=" parameter and also all of your "fq=" parameters. Basi
I'm trying to understand the facet counts I'm getting back from Solr
when the main query includes a term that restricts on a field that is
being faceted. After reading the docs on the wiki (both wikis) I'm
confused.
In my little test dataset, if I facet on "type" and use q=*:*, I get
facet counts
The ordering at the lowest level in Lucene is controlled based on an
arbitrary weighting factor: I believe the only option you have at the
Solr level is to order by term value (eg alphabetically), or by term
frequency. You could do this by creating a field with all of your
"sales" - if you cre
I believe you could use term vectors to retrieve all the terms in a
document, with their offsets. Retrieving them from the inverted index
would be expensive since the index is term-oriented, not
document-oriented. Without tv, I think you essentially have to scan the
entire term dictionary looking
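A sketch of reading one document's terms and offsets from its term vector (recent Lucene API; reader, docId, and the field name are placeholders, and the field must be indexed with term vectors, positions, and offsets):

import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

Terms vector = reader.getTermVector(docId, "text");
if (vector != null) {
    TermsEnum termsEnum = vector.iterator();
    BytesRef term;
    while ((term = termsEnum.next()) != null) {
        // A term vector is a tiny single-document index, so one nextDoc()
        // call positions the postings on that document.
        PostingsEnum postings = termsEnum.postings(null, PostingsEnum.OFFSETS);
        postings.nextDoc();
        for (int i = 0; i < postings.freq(); i++) {
            postings.nextPosition();
            System.out.println(term.utf8ToString() + " "
                + postings.startOffset() + "-" + postings.endOffset());
        }
    }
}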