Do the failing IDs have any special characters that might need to be
escaped?
Can you find the documents using a normal query on the unique key field?
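For example (assuming the unique key field is named "id" and ABC-123 is one
of the failing values - both made up here), a direct lookup would be:

    q=id:ABC-123

or, via the real-time get handler, /get?id=ABC-123.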
-- Jack Krupansky
On Thu, Nov 19, 2015 at 10:27 AM, Jérémie MONSINJON <
jeremie.monsin...@gmail.com> wrote:
> Hello everyone !
>
IDs in use during a particular
interval of time?
-- Jack Krupansky
On Fri, Nov 20, 2015 at 4:50 PM, jichi wrote:
> Hi,
>
> I am using Solr 4.7.0 to search text with an id filter, like this:
>
> id:(100 OR 2 OR 5 OR 81 OR 10 ...)
>
> The number of IDs in the boolean fi
The primary recommendation is that you flatten nested documents.
That means one Solr document per cpc, not multivalued.
As always, queries should drive your data model, so please specify what a
typical query might be like, in plain English.
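As a rough sketch of what flattening looks like (field names here are
invented), instead of one document with a multivalued cpc field you would
index one document per cpc value:

    {"id": "doc1-1", "doc_id": "doc1", "cpc": "A01B"}
    {"id": "doc1-2", "doc_id": "doc1", "cpc": "B62D"}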
-- Jack Krupansky
On Tue, Nov 24, 2015 at 4:39 AM
I'm not sure how useful it will be.
-- Jack Krupansky
On Tue, Nov 24, 2015 at 4:06 AM, Manohar Sripada
wrote:
> I have a requirement where I need to be able to query on a field (say
> "salary"). This field contains data in Chinese.
>
> Is it possible in Solr to do a ra
Yeah, this stuff is poorly documented, not very intuitive, and the
terminology is poorly designed in the first place, so it's completely
expected to easily get confused by it. Not even a mention of it in the Solr
reference guide.
-- Jack Krupansky
On Wed, Nov 25, 2015 at 4:39 AM, Aless
The mm parameter or default operator logic only applies to the top level of
the query. Once you get nested in parentheses below the top level,
Solr/Lucene reverts to the default of the OR (SHOULD) operator.
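For example, with the hypothetical query

    q=term1 term2 (term3 term4)&mm=2

the mm=2 constraint applies only to the three top-level clauses; inside the
parentheses, term3 and term4 are each SHOULD, so either one alone satisfies
that clause.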
-- Jack Krupansky
On Mon, Nov 30, 2015 at 5:45 AM, Modassar Ather
wrote:
> Hi,
>
recall (even the most remote partial match to avoid missing
any documents) with a much higher boost for exact matches.
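A minimal sketch of that approach (field and type names are invented): copy
the text into a lightly-analyzed "exact" field and weight it much higher in
qf:

    <field name="title" type="text_stemmed" indexed="true" stored="true"/>
    <field name="title_exact" type="text_minimal" indexed="true" stored="false"/>
    <copyField source="title" dest="title_exact"/>

and then query with something like defType=edismax&qf=title_exact^10 title.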
-- Jack Krupansky
On Tue, Dec 1, 2015 at 10:10 AM, Erik Hatcher
wrote:
> One technique that works well is to use copyField to end up with two
> indexed fields, on
Never made it into CHANGES.txt either, nor into any patch.
Appears to have been secretly committed as a part of SOLR-6787 (Blob API) via
Revision 1650448
(http://svn.apache.org/viewvc?view=revision&revision=1650448) in Solr 5.1.
-- Jack Krupansky
On Fri, Dec 4, 2015 a
constantly
re-read portions of the index into memory.
The practical limit for documents is not per core or number of cores but
across all cores on the node since it is mostly a memory limit and the
available CPU resources for accessing that memory.
-- Jack Krupansky
On Tue, Dec 8, 2015 at 8:57 AM
commonly.
And, yes, each app has its own latency requirements. The purpose of a
general rule is to generally avoid unhappiness, but if you have an appetite
and tolerance for unhappiness, then go for it.
Replica vs. shard? They're basically the same - a replica is a copy of a
shard.
-- Jack Krupansky
You can also use Solr Cell to send entire PDF or office documents:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika
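For example (collection name and file path are placeholders), posting a PDF
to the extracting request handler:

    # placeholder collection and file
    curl 'http://localhost:8983/solr/mycollection/update/extract?literal.id=doc1&commit=true' \
      -F 'myfile=@/path/to/document.pdf'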
-- Jack Krupansky
On Wed, Dec 9, 2015 at 3:09 AM, subinalex wrote:
> Hi,
>
> I am a solr newbie,just got a quick
in a separate table (use the same partition key to assure
that the join will be more efficient by being on the same node.)
-- Jack Krupansky
On Fri, Dec 11, 2015 at 6:21 AM, Andrea Gazzarini
wrote:
> Hi Vikram,
> sounds like you're using those "dynamic" fields only for visua
and then index the raw text.
-- Jack Krupansky
On Mon, Dec 14, 2015 at 12:04 PM, Antelmo Aguilar wrote:
> Hello,
>
> I am trying to index a very large file in Solr (around 5GB). However, I
> get out of memory errors using Curl. I tried using the post script and I
> had some
You would need to define an alternate field which copied a base field but
then had the desired alternate similarity, using SchemaSimilarityFactory.
See:
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
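A rough sketch of the schema.xml wiring (names are invented; sweet spot is
just one example of an alternate similarity):

    <similarity class="solr.SchemaSimilarityFactory"/>

    <!-- invented field type with its own similarity -->
    <fieldType name="text_alt_sim" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
      <similarity class="solr.SweetSpotSimilarityFactory"/>
    </fieldType>

    <field name="body_alt_sim" type="text_alt_sim" indexed="true" stored="false"/>
    <copyField source="body" dest="body_alt_sim"/>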
-- Jack Krupansky
On Tue, Dec 15, 2015 at 10:02 AM, Dmitry Kan wrote
same things as well.
-- Jack Krupansky
On Tue, Dec 15, 2015 at 2:42 PM, Chris Hostetter
wrote:
>
> : Sweetspot does require reindexing but is that the only one? I have not
> : investigated some exotic implementations, anyone to confirm sweetspot is
> : the only one? In that case you
think of
the company as being named "Apple Computer" even though they dropped
"Computer" from the name back in 2007. Also, it is "Inc.", not "Company",
so a proper search would be for "Apple Inc." or the old "Apple Computer,
Inc."
-- Jack Krupansky
Solr Cloud provides HA when you configure at least two replicas for each
shard and have at least 3 zookeepers. That's it. No deck or detail document
is needed.
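For example (collection and config names are placeholders), a minimally HA
collection would be created with:

    /admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=myconfig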
-- Jack Krupansky
On Tue, Dec 15, 2015 at 9:07 PM, wrote:
> Hi Team,
>
> Can you help me in understanding in achieving
There is no HA with a single replica for each shard. Replication factor
must be at least 2 for HA.
-- Jack Krupansky
On Wed, Dec 16, 2015 at 12:38 AM, Peter Tan wrote:
> Hi Jack, What happens when there is only one replica setup?
>
> On Tue, Dec 15, 2015 at 9:32 PM, Jack Krupansky
that soft commit waits for background merges! (Hoss??)
-- Jack Krupansky
On Fri, Feb 13, 2015 at 4:47 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Check
> http://search-lucene.com/?q=commit+wait+block&fc_type=mail+_hash_+user
>
> e.g. http://search-
There is no recommendation built into Solr itself, but you might get some
good ideas from this presentation:
http://www.slideshare.net/treygrainger/building-a-real-time-solrpowered-recommendation-engine
-- Jack Krupansky
On Fri, Feb 13, 2015 at 8:33 AM, wrote:
> Sir ,
>I need to kno
across users, so a given query is likely to have been queried
recently by another user.
-- Jack Krupansky
On Sat, Feb 14, 2015 at 3:39 PM, jaime spicciati
wrote:
> All,
> This is my current understanding of how SolrCloud load balancing works...
>
> Within SolrCloud, for a cluster with more
t in invariants, but also in the actual request, which is a
contradiction in terms - what is your actual intent? This isn't the cause
of the exception, but does raise questions of what you are trying to do.
4. Why don't you have a q parameter for the actual query?
-- Jack Krupansky
On
time
when they are not at either end of the query. This way, queries such as "to
be or not to be", "vitamin a", and "the office" can still provide
meaningful and precise matches even as stop words are generally ignored.
-- Jack Krupansky
On Mon, Feb 16, 2015 at 4
queries with operators and the case of a leading or trailing
stopword. The old Lucid query parser did have better support for queries
with stop words, but that's no longer available in their current product.
-- Jack Krupansky
On Mon, Feb 16, 2015 at 8:16 PM, Alexandre Rafalovitch
wrote:
Please provide a few examples that illustrate your requirements.
Specifically, requirements that are not met by the existing Solr stemming
filters. What is your specific goal?
-- Jack Krupansky
On Wed, Feb 18, 2015 at 10:50 AM, dinesh naik
wrote:
> Hi,
> IS there a way to achieve lemmati
The edismax query parser has a few too many parsing heuristics, causing way
too many odd combinations that are not exhaustively tested.
-- Jack Krupansky
On Sat, Feb 21, 2015 at 5:43 PM, Tang, Rebecca
wrote:
> Hi there,
>
> I have a field pg_int which is number of pages stored as intege
It's never helpful when you merely say that it "did not work" - detail the
symptom, please.
Post both the query and the response. As well as the field and type
definitions for the fields for which you expected term vectors - no term
vectors are enabled by default.
-- Jack Krupansky
Is it really a string field - as opposed to a text field? Show us the field
and field type.
Besides, if it really were a "raw" name, wouldn't that be a capital "B"?
-- Jack Krupansky
On Mon, Feb 23, 2015 at 6:52 PM, Arun Rangarajan
wrote:
> I have a string fi
keyword
tokenizer and then filter it for lower case, such as when the user query
might have a capital "B". String field is most appropriate when the field
really is 100% raw.
-- Jack Krupansky
On Mon, Feb 23, 2015 at 7:37 PM, Arun Rangarajan
wrote:
> Yes, it is a string field and not
Please post the info I requested - the exact query, and the Solr response.
-- Jack Krupansky
On Tue, Feb 24, 2015 at 12:45 PM, Arun Rangarajan
wrote:
> In our case, the lower-casing is happening in a custom Java indexer code,
> via Java's String.toLowerCase() method.
>
> I
you provided
in this thread.
-- Jack Krupansky
On Tue, Feb 24, 2015 at 2:35 PM, Arun Rangarajan
wrote:
> Exact query:
> /select?q=raw_name:beyonce*&wt=json&fl=raw_name
>
> Response:
>
> { "responseHeader": {"status": 0,"QTime": 0,
It's a string field, so there shouldn't be any analysis. (read back in the
thread for the field and field type.)
-- Jack Krupansky
On Tue, Feb 24, 2015 at 3:19 PM, Alexandre Rafalovitch
wrote:
> What happens if the query does not have wildcard expansion (*)? If the
> behavior
As a general proposition, your first stop with any query interpretation
question should be to add the debugQuery=true parameter and look at the
parsed_query in the query response, which shows how the query is really
interpreted.
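For example, for the hypothetical request

    /select?q=solr rocks&defType=edismax&qf=title&debugQuery=true

the debug section of the response includes rawquerystring, querystring,
parsedquery, and parsedquery_toString entries showing exactly what Lucene
query was produced.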
-- Jack Krupansky
On Wed, Feb 25, 2015 at 8:21 AM, wrote:
>
Solr also now has a schema API to dynamically edit the schema without the
need to manually edit the schema file:
https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-AddaDynamicFieldRule
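For example (field name and type are placeholders), adding a dynamic field
rule via the API:

    # placeholder collection, field name, and type
    curl -X POST -H 'Content-type:application/json' \
      http://localhost:8983/solr/mycollection/schema \
      -d '{"add-dynamic-field":{"name":"*_txt","type":"text_general","stored":true}}'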
-- Jack Krupansky
On Wed, Feb 25, 2015 at 3:15 PM, Vishal Swaroop
wrote:
> Thanks a
.
Please confirm which doc you were reading for the tutorial steps.
-- Jack Krupansky
On Thu, Feb 26, 2015 at 6:17 AM, rupak wrote:
> Hi,
>
> I am new in Solr and using Solr 5.0.0 search server. After installing when
> I’m going to search any keyword in solr 5.0.0 it does not give any re
use the shards.qt
parameter as suggested, to re-emphasize to people that if they want to use
a custom handler in distributed mode, then they will most likely need this
parameter.
-- Jack Krupansky
On Thu, Feb 26, 2015 at 11:28 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Hel
Please post your field type... or at least confirm a comparison to the
example in the javadoc:
http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
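For reference, the general shape from that javadoc (attribute values here
are illustrative) - the filter goes in the index-time analyzer only:

    <!-- index-time only; attribute values illustrative -->
    <fieldType name="text_rev" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
                maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>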
-- Jack Krupansky
On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati
wrote:
> All,
>
Most of the magic is done internal to the query parser which actually
inspects the index analyzer chain when a leading wildcard is present. Look
at the parsed_query in the debug response, and you should see that special
prefix query.
-- Jack Krupansky
On Thu, Feb 26, 2015 at 3:49 PM, jaime
You could simply hash the value before sending it to Solr and then hash the
user query before sending it to Solr as well. Do you need or want only
exact matches, or do you need keyword search, wildcards, etc?
-- Jack Krupansky
On Fri, Feb 27, 2015 at 4:38 PM, Alexandre Rafalovitch
wrote
just trying to match
the product name and availability.
-- Jack Krupansky
On Tue, Mar 3, 2015 at 4:51 PM, Tom Devel wrote:
> Hi,
>
> I am running Solr 5.0.0 and have a question about proximity search and
> multiValued fields.
>
> I am indexing xml files of the following form
implicitly
registered (refer to SOLR-6792)". IOW, remove the AdminHandlers
<requestHandler> element from your solrconfig.
As far as the document analysis request handler, that should still be fine.
Are you encountering some problem? The first log line you gave is just an
INFO - information only, not a problem.
-- Jack Krupansky
Are people now using Distributed IDF as their default?
I'm not currently using this, but the existing doc and Jira is too minimal
to offer guidance as requested above. Mostly I'm just curious.
Thanks.
-- Jack Krupansky
nested query term with
"\u0020".
-- Jack Krupansky
On Fri, Mar 13, 2015 at 2:37 AM, Rajesh
wrote:
> Hi,
>
> I want to retrieve the parent document which contain "Test Street" in
> street
> field or if any of it's child contain "Test Street" in
Oops... I said "StatsInfo" and that should have been "StatsCache"
(the <statsCache> element in solrconfig.xml).
-- Jack Krupansky
On Fri, Mar 13, 2015 at 6:04 PM, Anshum Gupta
wrote:
> There's no rough formula or performance data that I know of at this point.
> About he guidance, if you wa
There was a Solr release with a bug that required that you put a space
between the left parenthesis and the "*:*". The edismax parsed query here
indicates that the "*:*" has not parsed properly.
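IOW, on the affected release a query like

    q=(*:* -area:foo)

had to be written as

    q=( *:* -area:foo)

(field name here is illustrative).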
You have "area", but in your jira you had a range query.
-- Jack Krupansky
Great, glad to hear it!
One last question: What release of Solr are you using?
-- Jack Krupansky
On Tue, Mar 17, 2015 at 11:43 AM, Arsen wrote:
> Hello Jack,
>
> Jack, you made "my day" for me.
>
> Indeed, when I inserted space between "(" and "*:*
.
I think it's worth a Jira - text types should use language codes, not
country codes.
-- Jack Krupansky
On Tue, Mar 17, 2015 at 1:35 PM, Eduard Moraru wrote:
> Hi,
>
> First of all, a bit of a disclaimer: I am not a Czech language speaker, at
> all.
>
> We are using Sol
Which query parser are you using? The dismax query parser does not support
wild cards or "*:*".
Either way, the error message is unhelpful - worth filing a Jira.
-- Jack Krupansky
On Fri, Mar 20, 2015 at 7:21 AM, Vishnu Mishra wrote:
> Hi, I am using solr 4.10.3 and doing dist
have a
slice of the fields. Then separate Solr clusters could be used for each of
the slices.
-- Jack Krupansky
On Fri, Mar 20, 2015 at 7:12 AM, varun sharma
wrote:
> Requirements of the system that we are trying to build are for each date
> we need to create a SOLR index containing abo
only the native
keys for the matching records, and then you would do a database lookup in
your bulk storage engine directly by those keys to fetch just the records
that match the query results.
What do your queries tend to look like?
-- Jack Krupansky
On Sat, Mar 21, 2015 at 5:36 AM, varun sharma
?
Also be careful to be clear about using the Solr term "shard" (a slice,
across all replica nodes) as distinct from the Elasticsearch term "shard"
(a single slice of an index for a single replica, analogous to a Solr
"core".)
-- Jack Krupansky
On Tue, Mar 24, 2015 at 9
multi-tenant cluster, and if there are a large number of tenants or even a
moderate number of large tenants, you can't expect them to all run
reasonably on a relatively small cluster. Think about scalability.
-- Jack Krupansky
On Tue, Mar 24, 2015 at 1:22 PM, Ian Rose wrote:
> Let me gi
Don't confuse customers and tenants.
-- Jack Krupansky
On Tue, Mar 24, 2015 at 2:24 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> Sorry Jack. That doesn't scale when you have millions of customers. And
> these are good problems to have!
>
> On Tue, Ma
s entity controlling the
configuration of a single (Solr) server is a recipe for disaster. Solr
works well if there is an architect for the system. Ever hear the old
saying "Too many cooks spoil the stew"?
-- Jack Krupansky
On Tue, Mar 24, 2015 at 3:54 PM, Toke Eskildsen
wrote:
s for lazy-loading of cores. That may
work for you with hundreds (thousands?!) of cores/collections for tenants
who are mostly idle or dormant, but if the server is running long enough,
it may build up a lot of memory usage for collections that were active but
have gone idle after days or weeks.
That should work. Check to be sure that you really are running Solr 5.0.
Was it an old version of trunk or the 5x branch before last August when the
terms query parser was added?
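For example (field name is illustrative), the terms query parser takes a
simple comma-separated list:

    fq={!terms f=id}100,2,5,81,10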
-- Jack Krupansky
On Tue, Mar 24, 2015 at 5:15 PM, Shamik Bandopadhyay
wrote:
> Hi,
>
> I'm tryin
?
-- Jack Krupansky
On Fri, Mar 27, 2015 at 12:22 PM, Alex Sylka wrote:
> I am trying to write a custom analyzer , whose execution is determined by
> the value of another field within the document.
>
> For example if the locale field in the document has 'de' as the value, then
>
The first step is to work out the queries that you wish to perform - that
will determine how the data should be organized in the Solr schema.
-- Jack Krupansky
On Sun, Mar 29, 2015 at 4:04 PM, Vijay Bhoomireddy <
vijaya.bhoomire...@whishworks.com> wrote:
> Hi,
>
>
>
>
That's an open issue. See:
https://issues.apache.org/jira/browse/SOLR-6273
-- Jack Krupansky
On Mon, Mar 30, 2015 at 5:45 PM, Timothy Ehlers wrote:
> Can you use /replication ??? How would you do this between datacenters?
>
> --
> Tim Ehlers
>
Use the Solr Admin UI analysis page to see how the text is analyzed at both
index and query time.
My e-book does have more narrative and examples for stop word processing:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
-- Jack Krupansky
field by
field comparison to all existing documents.
-- Jack Krupansky
On Wed, Apr 1, 2015 at 6:35 AM, thakkar.aayush
wrote:
> I'm facing a challenges using de-dupliation of Solr documents.
>
> De-duplicate is done using TextProfileSignature with following parameters:
> field1, fi
actually happened to match the full indexing filtering.
This is a limitation of Solr. You just have to learn to live with it. Or...
don't use the word delimiter filter when you need to be able to do
wildcards of multi-part terms.
-- Jack Krupansky
On Thu, Apr 2, 2015 at 3:43 AM, Palagiri,
distribute to each term within the nested query. They don't magically
distribute to all nested queries.
Let's see your full set of query parameters, both on the request and in
solrconfig.
-- Jack Krupansky
On Thu, Apr 2, 2015 at 7:12 AM, Mahmoud Almokadem
wrote:
> Hello,
>
Position increments were considered problematic, especially for
highlighting. Did you get this for the stop filter? There was a Jira for
this - check CHANGES.TXT and the Jira for details.
For some discussion, see:
https://issues.apache.org/jira/browse/SOLR-6468
-- Jack Krupansky
On Thu, Apr 2
That's my understanding - but use the Solr Admin UI analysis page to
confirm exactly what happens, for both index and query analysis.
-- Jack Krupansky
On Thu, Apr 2, 2015 at 10:04 AM, Aman Tandon
wrote:
> Hi Jack,
>
> I read that jira, i understand the concern of heaven.
>
the committers sort out
whether it is really a bug or simply needs better doc for its expected
behavior on this specific issue.
-- Jack Krupansky
On Thu, Apr 2, 2015 at 1:02 PM, Mahmoud Almokadem
wrote:
> Thanks all for you response,
>
> But the parsed_query and number of results s
You have to tell the filter what types of tokens to generate - words,
numbers. You told it to generate... nothing. You did tell it to preserve
the original, unfiltered token though, which is fine.
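A sketch of a more typical configuration (exact attribute values depend on
your needs):

    <!-- generate word and number parts; keep the original token too -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            preserveOriginal="1"/>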
-- Jack Krupansky
On Sun, Apr 5, 2015 at 3:39 AM, Mike L.
wrote:
> Solr User Group,
>
this punctuation. You can
specify a character type map to treat specific characters as letters. See
the doc. (or the examples in my e-book.)
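A minimal sketch, assuming the mapping file is named wdfftypes.txt and you
want periods, commas, and underscores treated as letters:

    <filter class="solr.WordDelimiterFilterFactory" types="wdfftypes.txt"
            generateWordParts="1" generateNumberParts="1"/>

with wdfftypes.txt containing:

    # treat these as letters so the filter does not split on them
    . => ALPHA
    , => ALPHA
    _ => ALPHA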
-- Jack Krupansky
On Wed, Apr 8, 2015 at 2:50 AM, avinash09 wrote:
> not able to search on special characters like . ,_
>
> my query
> http
How much RAM do you have? Check whether your system is compute-bound or
I/O-bound. If all or most of your index doesn't fit in the system memory
available for file caching, you're asking for trouble.
Is the indexing time also unacceptably slow, or just the query time?
-- Jack Krupansky
, price_*, and fill in the
user-id when doing the query.
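A rough sketch of that pattern (type name is illustrative): define a
dynamic field in the schema,

    <!-- illustrative: one price field per user -->
    <dynamicField name="price_*" type="tfloat" indexed="true" stored="true"/>

index each user's price into price_<userid>, and query with something like
q=price_u42:[100 TO 200].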
-- Jack Krupansky
On Wed, Apr 8, 2015 at 5:21 PM, Brian Usrey
wrote:
> I am extremely new to SOLR and am wondering if it is possible to do
> something like the following. Basically I have been tasked with researching
> SOLR to see
It would be better to implement such logic as a separate process - watching
for events or reading from a stream, and then feeding discrete requests (or
modest-sized batches of documents) to Solr in parallel with such processing.
-- Jack Krupansky
On Sat, Apr 11, 2015 at 1:49 AM, vishal dsouza
get a score of 0.7/2 = 0.35.
IOW, apply an additive boost of 1.0 and then a multiplicative boost of 0.5.
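With edismax that would look something like (field and values illustrative):

    bq=category:premium^1.0&boost=0.5

where bq adds its score to the base score and boost multiplies the result.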
-- Jack Krupansky
On Sat, Apr 11, 2015 at 12:28 PM, Mike L.
wrote:
> Hello -
> I have qf boosting setup and that works well and balanced across
> different fields.
>
What does your main query look like? Normally we don't speak of "searching"
with the fq parameter - it filters the results, but the actual searching is
done via the main query with the q parameter.
-- Jack Krupansky
On Tue, Apr 14, 2015 at 4:17 AM, Ali Nazemian wrote:
> Dea
as a bitmap image, so no
text is extracted.
-- Jack Krupansky
On Tue, Apr 14, 2015 at 10:57 AM, Vijaya Narayana Reddy Bhoomi Reddy <
vijaya.bhoomire...@whishworks.com> wrote:
> Hi,
>
> I am trying to index PDF and Microsoft Office files (.doc, .docx, .ppt,
> .pptx, .xlx, an
Check to see if there are any errors in the Solr log for jpg and zip files.
Solr should do something for them - if not, file a Jira to suggest that it
should, as an improvement. Zip should give a list of the enclosed files.
Images should at least give the metadata.
-- Jack Krupansky
On Wed, Apr
API service
layer could be the way to go. In any case, don't try to load too much work
onto the Solr server itself.
-- Jack Krupansky
On Mon, Apr 20, 2015 at 7:32 AM, Bruno Mannina wrote:
> Hi Alex,
>
> well ok but if I have a big table ? more than 10 000 entries ?
> is it safe to do
A purely negative sub-query is not supported by Lucene - you need to have
at least one positive term, such as "*:*", at each level of sub-query. Try:
((*:* -(field:V1) AND -(field:V2)) AND -(field:V3))
-- Jack Krupansky
On Wed, Apr 22, 2015 at 10:56 AM, Dhutia, Devansh
wrote:
> I
It's not clear if OCR would happen automatically in Solr Cell, or if
changes to Solr would be needed.
For Tika OCR info, see:
https://issues.apache.org/jira/browse/TIKA-93
https://wiki.apache.org/tika/TikaOCR
-- Jack Krupansky
On Thu, Apr 23, 2015 at 9:14 AM, Alexandre Rafalovitch
Or use a Solr update processor to scrub the source values. The regex
pattern replacement processor could do the trick:
http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html
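A minimal sketch of the solrconfig.xml wiring (chain name, field name, and
pattern are placeholders):

    <updateRequestProcessorChain name="scrub">
      <processor class="solr.RegexReplaceProcessorFactory">
        <!-- placeholder: strip non-printable ASCII from "content" -->
        <str name="fieldName">content</str>
        <str name="pattern">[^\x20-\x7E]</str>
        <str name="replacement"> </str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>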
-- Jack Krupansky
On Thu, Apr 30, 2015 at 11:17 AM, Erick
when each
virtual machine gets only a fairly tiny amount of SSD disk storage space?
Just guessing here.
A little clarification is in order.
In any case, if you really only have such a limited amount of storage per
node, that probably simply means that you need more nodes.
-- Jack Krupansky
On Thu, M
number of shards times the replication
factor, divided by the number of shards per node if you place more than one
shard per node.
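For example, 4 shards with a replication factor of 3 is 12 cores, which is
12 nodes at one shard per node, or 6 nodes at two shards per node.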
-- Jack Krupansky
On Thu, May 7, 2015 at 1:29 AM, Jilani Shaik wrote:
> Hi,
>
> Is it possible to restrict number of documents per shard in Solr cloud?
>
> L
Solr does have a rollback command, but it is an expert feature and it is
not so clear how it works in SolrCloud.
See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
and
https://wiki.apache.org/solr/UpdateXmlMessages#A.22rollback.22
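For example (core name is a placeholder), the XML message is simply:

    # placeholder core name
    curl 'http://localhost:8983/solr/mycore/update' -H 'Content-type:text/xml' -d '<rollback/>'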
-- Jack Krupansky
On Tue, May 12, 2015
goes into the q parameter, not the client app that is forming the
overall Solr query request.
-- Jack Krupansky
On Thu, May 14, 2015 at 7:49 AM, Steven White wrote:
> Hi Everyone,
>
> I'm trying to utilize "uf" but it doesn't work. My reading of it per:
>
>
/solr-core/org/apache/solr/update/processor/TruncateFieldUpdateProcessorFactory.html
-- Jack Krupansky
On Fri, May 15, 2015 at 11:38 AM, Charles Sanders
wrote:
> Yes, that is what I am seeing. Looking in the code myself, I see no reason
> for this behavior. That is why I assumed I was
But... if your term is a string anyway, you
could just use the keyword tokenizer.
-- Jack Krupansky
On Fri, May 15, 2015 at 4:06 PM, Charles Sanders
wrote:
> Shawn,
> Thanks a bunch for working with me on this.
>
> I have deleted all records from my index. Stopped solr. Made t
that is causing you
to resort to pattern matching and wildcards? I can't wait to hear!
I mean, if you simply want to match one of a set of numbers that are not in
a consecutive range, try the OR operator.
-- Jack Krupansky
On Mon, May 18, 2015 at 11:20 AM, Todd Long wrote:
> I'
text field, so that Lucene still knows it as an unanalyzed
string field? You need to delete the index and start over if you want to
change the field types like that.
-- Jack Krupansky
On Mon, May 18, 2015 at 8:33 AM, Charles Sanders
wrote:
> Jack,
> Thanks for the information. If I understan
tokenizer was it?
-- Jack Krupansky
On Mon, May 18, 2015 at 12:21 PM, Charles Sanders
wrote:
> No, the field has always been text. And from the error, it's obviously
> passing a very large token to the index, regardless of the tokenizer and
> filter.
>
> So I guess I will have to tok
e the
distribution step, but is that distribution to the leader, or distribution
from leader to replicas for a shard?
-- Jack Krupansky
On Tue, May 19, 2015 at 9:01 AM, Shawn Heisey wrote:
> On 5/19/2015 3:02 AM, Bram Van Dam wrote:
> > I'm looking for a way to have Solr reject doc
"double" (solr.TrieDoubleField) gives more precision
See:
https://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/schema/TrieDoubleField.html
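A sketch of the field type and field definitions (field name is
illustrative):

    <!-- field name illustrative -->
    <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"/>
    <field name="amount" type="double" indexed="true" stored="true"/>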
-- Jack Krupansky
On Tue, May 19, 2015 at 11:27 AM, Vishal Swaroop
wrote:
> Please suggest which numeric field type to use so t
Yes.
tf is both 1 and 2 - tf is per document, which is 1 for the first document
and 2 for the second document.
See:
http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
-- Jack Krupansky
On Wed, May 20, 2015 at 6:13 AM, ariya bala wrote:
>
pushed you to have documents with 1500 fields?
Also, is this 1500 fields that are always populated, or are there really a
larger number of different record types, each with a relatively small
number of fields populated in a particular document?
-- Jack Krupansky
On Wed, May 20, 2015 at 8:27 AM, S
. Again, the distinction is between indexed field values and stored
field values.
-- Jack Krupansky
On Tue, May 26, 2015 at 10:25 AM, Zheng Lin Edwin Yeo
wrote:
> It is showing up in the search results. Just to confirm, does this
> UpdateProcessor method remove the characters during index
horribly wrong.
Focus on designing your app to exploit the capabilities of Solr, not to
misuse them.
In short, to answer the original question, more than a couple dozen fields
in qf is indeed too many. More than a dozen raises a yellow flag for me.
-- Jack Krupansky
On Thu, May 28, 2015 at 8:13 A
.
Solr and Lucene do not merely index a bulk blob of bytes, but
semi-structured data, in the form of documents and fields.
In some cases the indexed data can be smaller than the source data, but it
can sometimes be larger as well.
-- Jack Krupansky
On Wed, May 27, 2015 at 12:33 PM, Sznajder
When in doubt, make your schema as clean and simple as possible. Simplicity
over complexity.
-- Jack Krupansky
On Thu, May 28, 2015 at 12:06 PM, Erick Erickson
wrote:
> Gotta agree with Jack here. This is an insane number of fields, query
> performance on any significant corpus will be
small documents,
not large documents.
-- Jack Krupansky
On Sat, May 30, 2015 at 3:05 PM, Erick Erickson
wrote:
> Nothing's really changed in that area lately. Your co-worker is
> perhaps confusing the statement that "Solr has no a-priori limit on
> the number of distinct
other than the Lucene limit
of unique terms per segment, which is in the billions. Yeah, that should be
more clearly documented.
-- Jack Krupansky
On Tue, Jun 2, 2015 at 10:29 AM, Steven White wrote:
> Hi everyone
>
> I cannot find much useful info on LimitTokenCountFilterFactory other t
https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
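For example, an atomic update that sets one field and increments another
(field names are invented):

    # placeholder core and field names
    curl 'http://localhost:8983/solr/core1/update?commit=true' -H 'Content-type:application/json' \
      -d '[{"id":"doc1","price":{"set":99.0},"views":{"inc":1}}]'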
-- Jack Krupansky
On Tue, Jun 2, 2015 at 10:15 AM, Ксения Баталова
wrote:
> Hi!
>
> I'm using *SOLR 4.4.0* for searching in my project.
> Now I am facing a problem of atomic updates in multiple cores.
> From
Explain a little about why you have separate cores, and how you decide
which core a new document should reside in. Your scenario still seems a bit
odd, so help us understand.
-- Jack Krupansky
On Wed, Jun 3, 2015 at 3:15 AM, Ксения Баталова
wrote:
> Hi!
>
> Thanks for your qu