mean there will be
a reduction in the amount of system memory needed for file caching of the
Lucene index. 100 / 4 * 2.8GB = 70 GB of RAM needed on each server.
-- Jack Krupansky
On Thu, Jan 8, 2015 at 10:57 AM, Andrew Butkus <
andrew.but...@c6-intelligence.com> wrote:
> Hi Shawn,
>
table
performance for both indexing and a full range of queries, and then use 10x
that RAM for the 100% load. That's the OS system memory for
file caching, not the total system RAM.
-- Jack Krupansky
On Thu, Jan 8, 2015 at 4:55 PM, Nishanth S wrote:
> Thanks guys for your inpu
Consider an update processor - it can take any input, break it up any way
you want, and then output multiple field values.
You can even use the stateless script update processor to write the logic in
JavaScript.
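As a sketch of the splitting step such a processor performs (the delimiter and values are hypothetical; a real update processor would operate on the incoming SolrInputDocument rather than a bare string):

```java
// Sketch of the kind of value-splitting an update processor could apply:
// one incoming value becomes several field values.
public class SplitField {
    static String[] split(String incoming) {
        return incoming.split(";");           // assumed ';' delimiter
    }
    public static void main(String[] args) {
        for (String v : split("red;green;blue")) {
            System.out.println(v);
        }
    }
}
```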
-- Jack Krupansky
On Fri, Jan 9, 2015 at 6:47 AM, tomas.kalas wrote:
> Hello
that the field type
uses the reversed wildcard filter, and then it generates a wildcard query
using the reversed query token and wildcard pattern so that the
leading wildcard becomes a trailing wildcard or prefix query
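A rough illustration of the rewriting (Solr's actual ReversedWildcardFilter marks reversed tokens with a special character; this only shows the idea):

```java
// Illustrative only: a leading-wildcard query like "*ython" is rewritten
// into the prefix query "nohty*", which can run efficiently against tokens
// that were indexed in reversed form.
public class ReverseWildcard {
    static String rewrite(String leadingWildcardQuery) {
        String body = leadingWildcardQuery.substring(1);   // drop leading '*'
        return new StringBuilder(body).reverse().toString() + "*";
    }
    public static void main(String[] args) {
        System.out.println(rewrite("*ython"));   // prints "nohty*"
    }
}
```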
-- Jack Krupansky
On Fri, Jan 9, 2015 at 3:15 PM, Alexandre Rafalovitch
"expert" feature. And there should be doc
on how to use it.
I do have some doc in my e-book, with some examples, but even that does not
show the complete end-to-end config and schema.
-- Jack Krupansky
On Sat, Jan 10, 2015 at 1:13 AM, Alexandre Rafalovitch
wrote:
> So, Query Parser does
Correct, Solr clearly needs improvement in this area. Feel free to comment
on the Jira about what options you would like to see supported.
-- Jack Krupansky
On Sat, Jan 10, 2015 at 5:49 AM, SolrUser1543 wrote:
> From reading this (https://issues.apache.org/jira/browse/SOLR-445) I see
>
the
server rather than optimize performance.
-- Jack Krupansky
On Sat, Jan 10, 2015 at 6:02 AM, SolrUser1543 wrote:
> Would it be a good solution to index single document instead of bulk ?
> In this case I will know about the status of each message .
>
> What is recommendation
"required".) So, please
explain in plain English what effect you are trying to achieve. mm is not
for newbies!
Also, please point us to whatever doc or other material you were reading
that gave you the impression that mm was appropriate for your use case, so
that we can correct any bad documen
ities/TFIDFSimilarity.html
And to use your custom similarity class in Solr:
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements#OtherSchemaElements-Similarity
-- Jack Krupansky
On Sun, Jan 11, 2015 at 9:04 AM, Ali Nazemian wrote:
> Hi everybody,
>
> I am going to add some analy
than this optimize operation?
-- Jack Krupansky
On Sun, Jan 11, 2015 at 1:46 AM, ig01 wrote:
> Thank you all for your response,
> The thing is that we have 180G index while half of it are deleted
> documents.
> We tried to run an optimization in order to shrink index size but it
client or app layer code, then maybe
you just need to put more intelligence into that query-generation code in
the client.
-- Jack Krupansky
On Sun, Jan 11, 2015 at 12:08 PM, Michael Lackhoff
wrote:
> Hi Ahmet,
>
> > You might find this useful :
> > https://lucidworks.com/blog/
detect some common use cases and handle them
specially in your client. Such as the example you gave - you could extract
the terms and generate separate bq parameters.
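A hypothetical client-side sketch of that idea (the field name and boost are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Pull the interesting terms out of the user query client-side and emit
// one bq (boost query) parameter per term.
public class BoostParams {
    static List<String> toBq(List<String> terms) {
        List<String> bq = new ArrayList<>();
        for (String t : terms) {
            bq.add("title:" + t + "^2.0");   // assumed field and boost
        }
        return bq;
    }
    public static void main(String[] args) {
        System.out.println(toBq(List.of("solr", "lucene")));
    }
}
```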
-- Jack Krupansky
On Sun, Jan 11, 2015 at 1:28 PM, Michael Lackhoff
wrote:
> Am 11.01.2015 um 18:30 schrieb Jack Krupan
Won't function queries do the job at query time? You can add or multiply
the tf*idf score by a function of the term frequency of arbitrary terms,
using the tf, mul, and add functions.
See:
https://cwiki.apache.org/confluence/display/solr/Function+Queries
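A sketch of what such a request might look like (the field name "text" and the term "solr" are assumptions):

```
fl=id,score,bonus:mul(tf(text,solr),0.5)
sort=mul(tf(text,solr),0.5) desc
```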
-- Jack Krupansky
On Sun, Jan 11,
Could you clarify what you mean by "Lucene reverse index"? That's not a
term I am familiar with.
-- Jack Krupansky
On Mon, Jan 12, 2015 at 1:01 AM, Ali Nazemian wrote:
> Dear Jack,
> Thank you very much.
> Yeah I was thinking of function query for sorting, but I have to
That's your job. The easiest way is to do a copyField to a "string" field.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 7:33 AM, Naresh Yadav wrote:
> *Schema :*
>
>
> *Code :*
> SolrQuery q = new SolrQuery().setQuery("*:*");
> q.set(GroupParams.GR
A function query or an update processor to create a separate field are
still your best options.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 4:18 AM, Ali Nazemian wrote:
> Dear Markus,
>
> Unfortunately I can not use payload since I want to retrieve this score to
> each user as a
script update processors, see my Solr e-book:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
-- Jack Krupansky
On Tue, Jan 13, 2015 at 9:21 AM, tomas.kalas wrote:
> Thanks Jack for your advice. Can you please explain me little
s only. You can use a second pattern char filter to remove
the "<[/]d[12>" markers as well, probably changing them to a space in both
cases.
See:
http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilterFactory.html
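For instance, a second char filter along these lines (the exact pattern must match your real markers; the one below is only a guess at them):

```xml
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="&lt;/?d[12]&gt;"
            replacement=" "/>
```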
-- Jack Krupansky
number of unique row sets.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 4:29 PM, tedsolr wrote:
> I have a complicated problem to solve, and I don't know enough about
> lucene/solr to phrase the question properly. This is kind of a shot in the
> dark. My requirement is to return searc
It should replace all occurrences of the pattern. Post your specific filter
XML. Patterns can be very tricky.
Use the Solr Admin UI analysis page to see how the filtering is occurring.
-- Jack Krupansky
On Wed, Jan 14, 2015 at 7:16 AM, tomas.kalas wrote:
> Jack, thanks for help, but if i u
I was suspecting it might do that - the pattern is "greedy" and takes the
longest matching text. Add a question mark after the asterisk to use
reluctant (non-greedy) mode, which matches the shortest text.
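A quick demonstration of the difference using Java's own regex engine, which is the engine Solr's pattern filters use (the sample input is invented):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Greedy ".*" runs to the last closing bracket; reluctant ".*?" stops
// at the first one.
public class GreedyVsReluctant {
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }
    public static void main(String[] args) {
        String input = "<a> keep <b>";
        System.out.println(firstMatch("<.*>", input));   // greedy: "<a> keep <b>"
        System.out.println(firstMatch("<.*?>", input));  // reluctant: "<a>"
    }
}
```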
-- Jack Krupansky
On Wed, Jan 14, 2015 at 8:37 AM, tomas.kalas wrote:
> I just used Sol
It's what Java has, whatever that is:
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
So, maybe the correct answer is neither, but similar to both.
-- Jack Krupansky
On Wed, Jan 14, 2015 at 9:06 AM, tomas.kalas wrote:
> Oh yeah, that is it. Thank you very much
ow the new analytics component doesn't support distributed mode, but my
question is about the old "stats" component.
-- Jack Krupansky
admittedly, it's moot if
stats is eventually to be superseded by the analytics component.
-- Jack Krupansky
On Wed, Jan 14, 2015 at 12:26 PM, Chris Hostetter
wrote:
>
> : Does anybody know for sure whether the stats component fully supports
> : distributed mode? It is listed in
to do customization, entity extraction, boiler-plate removal, etc. in
app-friendly code, before transport to the Solr server.
The extraction request handler is a really cool feature and quite
sufficient for a lot of scenarios, but additional architectural flexibility
would be a big win.
-- Jack
It sounds like your app needs a lot more RAM so that it is not doing so
much I/O.
-- Jack Krupansky
On Tue, Jan 20, 2015 at 9:24 AM, Nimrod Cohen wrote:
> Hi
>
> I done some performance test, and I wanted to know if any one saw the same
> behavior.
>
>
>
> We need to
The problem is that the presence of a wildcard causes Solr to skip the
usual token analysis. But... you could add a "multiterm" analyzer, and then
the wildcard would just get treated as punctuation.
-- Jack Krupansky
On Thu, Jan 22, 2015 at 4:33 PM, Jorge Luis Betancourt González &
Solr
tried to find the remaining terms in the default query field.
-- Jack Krupansky
On Thu, Jan 22, 2015 at 5:47 PM, Carl Roberts wrote:
> Hi,
>
> How do you query a sentence composed of multiple words in a description
> field?
>
> I want to search for sentence "Oracle Fusi
The dismax query parser does not support wildcards. It is designed to be
simpler.
-- Jack Krupansky
On Thu, Jan 22, 2015 at 5:57 PM, Jorge Luis Betancourt González <
jlbetanco...@uci.cu> wrote:
> I was also suspecting something like that, the odd thing was that the with
> the dismax
Presence of a wildcard in a query term is detected by the traditional Solr
and edismax query parsers and causes normal term analysis to be bypassed.
As I said, wildcards are a specific feature that dismax specifically
doesn't support - this has nothing to do with edismax.
-- Jack Krupansky
/org/apache/solr/handler/FieldAnalysisRequestHandler.html
and in solrconfig.xml
-- Jack Krupansky
On Thu, Jan 22, 2015 at 8:42 AM, Amit Jha wrote:
> Hi,
>
> I need to know how can I retrieve phonetic codes. Does solr provide it as
> part of result? I need codes for record matching.
That's what the filter is doing - transforming text into phonetic codes at
index time. And at query time as well to do the phonetic matching in the
query. The actual phonetic codes are stored in the index for the purposes
of query matching.
-- Jack Krupansky
On Fri, Jan 23, 2015 at 12:
or maybe use a Solr update processor to pull the
string apart and store the individual pieces as separate fields.
As always, the first question is not how to store your data, but how your
users intend to access your data. Post some sample queries. I imagine that
any sane user would like to refere
which treated the colons as token separators.
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:28 PM, Alexandre Rafalovitch
wrote:
> You are using keywords here that seem to contradict with each other.
> Or your use case is not clear.
>
> Specifically, you are saying you are getting s
How are you currently importing data?
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts wrote:
> Sorry if I was not clear. What I am asking is this:
>
> How can I parse the data during import to tokenize it by (:) and strip the
> cpe:/o?
>
>
>
> On 1/2
Take a look at the RegexTransformer. Or, in some cases, you may need to use
the raw ScriptTransformer.
See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler
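A sketch of the tokenization being asked about: strip the "cpe:/o" prefix and split the remainder on ':' (the sample value is hypothetical):

```java
import java.util.Arrays;

// Strip the "cpe:/o:" prefix and split the rest on ':'.
public class CpeSplit {
    static String[] parts(String cpe) {
        return cpe.replaceFirst("^cpe:/o:", "").split(":");
    }
    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("cpe:/o:redhat:enterprise_linux:7")));
    }
}
```

Within the Data Import Handler itself, RegexTransformer's `splitBy` and `regex`/`groupNames` attributes can express the same thing declaratively.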
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:49 PM, Carl Roberts wrote
need to be able to handle.
-- Jack Krupansky
On Wed, Jan 28, 2015 at 5:56 AM, thakkar.aayush
wrote:
> I have around 1 million job titles which are indexed on Solr and am looking
> to improve the faceted search results on job title matches.
>
> For example: a job search for *Resear
Sorry, that feature is not available in Solr at this time. You could
implement an update processor which copied only the desired input field
values. This can be done in JavaScript using the script update processor.
-- Jack Krupansky
On Mon, Feb 2, 2015 at 2:53 AM, danny teichthal wrote:
>
The Solr properties can also be defined in solrcore.properties and
core.properties files:
https://cwiki.apache.org/confluence/display/solr/Configuring+solrconfig.xml
-- Jack Krupansky
On Tue, Feb 3, 2015 at 3:31 PM, O. Olson wrote:
> Thank you Jim. I was hoping if there is an alternative
It will not be a matter of how many documents you can load, but
whether the query response latency for those documents is sufficient.
-- Jack Krupansky
On Wed, Feb 4, 2015 at 4:54 PM, Arumugam, Suresh
wrote:
> Hi All,
>
>
>
> We are trying to load 14+ Billion documents into Solr. But we a
this front?
-- Jack Krupansky
On Wed, Feb 11, 2015 at 8:05 AM, Erick Erickson
wrote:
> bq: Are there any such structures?
>
> Well, I thought there were, but I've got to admit I can't call any to mind
> immediately.
>
> bq: 2b is just the hard limit
>
> Yeah,
tenant has their own app and the service provider controls the Solr
server but has no control over the app or load.
The first is supported by Solr. The second is not, other than the service
provider spinning up separate instances of Solr on separate physical
servers.
-- Jack Krupansky
On Thu
Please report any comments or issues to my email address or comment on my
blog. Comments on the blog will benefit other readers, but the choice is
yours.
Thanks!
-- Jack Krupansky
-Original Message-
From: Bernd Fehling
Sent: Tuesday, June 25, 2013 2:06 AM
To: solr-user
Solr does not have any integrated Hadoop/HDFS crawling or indexing support
today. Sorry.
LucidWorks Search does have HDFS crawling support:
http://docs.lucidworks.com/display/lweug/Using+the+High+Volume+HDFS+Crawler
Cloudera Search has HDFS support as well.
-- Jack Krupansky
-Original
No, facet.pivot takes a comma-separated list of "fields", with no support
for "ranges".
But, you can have a combination of field and range facets without pivoting.
-- Jack Krupansky
-Original Message-
From: Jakob Frank
Sent: Tuesday, June 25, 2013 6
There are examples in my book:
http://www.lulu.com/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-1/ebook/product-21079719.html
But... I still think you should use a tokenized text field as well - use all
three: raw string, tokenized text, and URL classification fields.
-- Jack
-sequences that occur in the URL without the need
for wildcards or regular expressions.
-- Jack Krupansky
-Original Message-
From: Jan Høydahl
Sent: Tuesday, June 25, 2013 6:28 AM
To: solr-user@lucene.apache.org
Subject: Re: URL search and indexing
Probably a good match for the RegExp
???
Hadoop=HDFS
If the data is not in Hadoop/HDFS, just use the normal Solr indexing tools,
including SolrCell and Data Import Handler, and possibly ManifoldCF.
-- Jack Krupansky
-Original Message-
From: engy.morsy
Sent: Tuesday, June 25, 2013 8:10 AM
To: solr-user
),
you automatically get most of that. The user can query by a URL fragment,
such as "apache.org", ".org", "lucene.apache.org", etc. and the tokenization
will strip out the punctuation.
I'll add this script to my list of examples to add in the next rev of my
collection - add all the fields to one schema - there is no time or space
penalty if most of the fields are empty for most documents.
-- Jack Krupansky
-Original Message-
From: Chris Toomey
Sent: Tuesday, June 25, 2013 6:08 PM
To: solr-user@lucene.apache.org
Subject: Querying multiple col
/tomcat-5.5-doc/config/http.html)
---
If you're not using Tomcat, your container may have a similar limit.
-- Jack Krupansky
-Original Message-
From: yang, gang
Sent: Tuesday, June 25, 2013 5:47 PM
To: solr-user@lucene.apache.org
Cc: Meng, Fan
Subject: RE: Is it possible to searh
Guide mislead people with examples that clearly can never run as expected
with real data.
-- Jack Krupansky
-Original Message-
From: eShard
Sent: Tuesday, June 25, 2013 1:17 PM
To: solr-user@lucene.apache.org
Subject: Is there a way to capture div tag by id?
let's say I have a div
You could use an update processor to turn the text string into multiple
string values. A short snippet of JavaScript in a
StatelessScriptUpdateProcessor could do the trick. The field could then be a
multivalued string field.
-- Jack Krupansky
-Original Message-
From: Elran Dvir
/4_3_1/analyzers-common/org/apache/lucene/analysis/miscellaneous/LimitTokenCountFilterFactory.html
The new Apache Solr Reference? No mention of the filter.
-- Jack Krupansky
-Original Message-
From: Daniel Collins
Sent: Wednesday, June 26, 2013 3:38 AM
To: solr-user@lucene.apache.org
If there is a bug... we should identify it. What's a sample post command
that you issued?
-- Jack Krupansky
-Original Message-
From: Flavio Pompermaier
Sent: Wednesday, June 26, 2013 10:53 AM
To: solr-user@lucene.apache.org
Subject: Re: URL search and indexing
I was doing ex
o
4.4. If not in 4.4, 4.5 is probably a slam-dunk.
-- Jack Krupansky
-Original Message-
From: David Larochelle
Sent: Wednesday, June 26, 2013 11:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr indexer and Hadoop
Pardon, my unfamiliarity with the Solr development process.
Now
Reference Guide nor current
release from Lucid, but see the detailed examples in my book.
-- Jack Krupansky
-Original Message-
From: Furkan KAMACI
Sent: Wednesday, June 26, 2013 10:51 AM
To: solr-user@lucene.apache.org
Subject: Dynamic Type For Solr Schema
I use Solr 4.3.1 as SolrCloud. I k
You need to do occasional hard commits, otherwise the update log just grows
and grows and gets replayed on each server start.
-- Jack Krupansky
-Original Message-
From: Arun Rangarajan
Sent: Wednesday, June 26, 2013 1:18 PM
To: solr-user@lucene.apache.org
Subject: Solr 4.2.1 - master
directly implemented in Solr
-- Jack Krupansky
-Original Message-
From: aspielman
Sent: Wednesday, June 26, 2013 2:16 PM
To: solr-user@lucene.apache.org
Subject: Solr document auto-upload?
Is it possible to configure Solr to automatically grab documents in a
specified directory, with
No, you cannot use wildcards within a quoted term.
Tell us a little more about what your strings look like. You might want to
consider tokenizing or using ngrams to avoid the need for wildcards.
-- Jack Krupansky
-Original Message-
From: Amit Sela
Sent: Thursday, June 27, 2013 3:33
Just from the string field to a "text" field and use standard
tokenization, then you can search the text field for "youtube" or even
"something" that is a component of the URL path. No wildcard required.
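A rough approximation of what tokenizing a URL as text yields (a simplification, not the actual StandardTokenizer rules): punctuation becomes token breaks, so "youtube" is searchable on its own.

```java
import java.util.Arrays;
import java.util.List;

// Crude stand-in for text tokenization of a URL: lowercase, then break
// on anything that is not a letter or digit.
public class UrlTokens {
    static List<String> tokens(String url) {
        return Arrays.asList(url.toLowerCase().split("[^a-z0-9]+"));
    }
    public static void main(String[] args) {
        System.out.println(tokens("http://www.youtube.com/watch?v=abc123"));
    }
}
```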
-- Jack Krupansky
-Original Message-
From: Amit
me, and then you can update with atomic update.
You may want to rethink your data model.
-- Jack Krupansky
-Original Message-
From: anurag.jain
Sent: Thursday, June 27, 2013 8:28 AM
To: solr-user@lucene.apache.org
Subject: how to delete on column of a doc in solr
In my solr sche
in the book.
You can also use a regular expression tokenfilter to extract the host name
as well.
And you can use standard Solr "grouping" to group by the field containing
host name.
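An illustrative host-name extraction using the same regex style (in Solr this logic would live in a pattern tokenizer or token filter rather than client code; the URL is a sample):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture the host portion between "scheme://" and the first '/' or ':'.
public class HostName {
    static String host(String url) {
        Matcher m = Pattern.compile("^[a-zA-Z]+://([^/:]+)").matcher(url);
        return m.find() ? m.group(1) : null;
    }
    public static void main(String[] args) {
        System.out.println(host("http://lucene.apache.org/solr/"));
    }
}
```

The extracted value could then be stored in its own field (say `host_s`, a hypothetical name) and grouped with `group=true&group.field=host_s`.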
-- Jack Krupansky
-Original Message-
From: Wojciech Kapelinski
Sent: Thursday, June 27, 20
harvestServer.getHttpClient().getParams().setParameter("update.chain",
"harvest");
In short, the original exception was based on a gross
misinterpretation of how one goes about equating solrconfig.xml with
configurations of SolrJ.
Hope that helps more than it confuses!
Cheers
Jack
On
. Sure, people don't
like seeing the mis-matched results in the list and a larger number of
results, but it's all a tradeoff to assure that the most relevant results
are higher and exact matching is a little looser.
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent:
Show us your directive. Maybe there is some subtle error in the
file name.
-- Jack Krupansky
-Original Message-
From: Arun Rangarajan
Sent: Friday, June 28, 2013 1:06 PM
To: solr-user@lucene.apache.org
Subject: Re: Replicating files containing external file fields
Erick,
Thx for
Well, it is known to me and documented in my book. BTW, that field value is
simply ignored.
There are tons of places in Solr where undefined values or outright garbage
are simply ignored, silently.
Go ahead and file a Jira though.
-- Jack Krupansky
-Original Message-
From: Sam
How could you not have ssh access to the Solr host machine? I mean, how are
you managing that server, without ssh access?
And if you are not managing the server, what business do you have trying to
change the Solr configuration?!?!?
Something fishy here!
-- Jack Krupansky
-Original
Ah, yes, good old multi-tenant - I should have known.
Yeah, the Solr API is evolving, albeit too slowly for the needs of some.
-- Jack Krupansky
-Original Message-
From: Wu, James C.
Sent: Friday, June 28, 2013 7:06 PM
To: solr-user@lucene.apache.org
Subject: RE: change solr core
to.)
Sorry, I don't have the answer to the reload question at the tip of my
tongue.
-- Jack Krupansky
-Original Message-
From: Arun Rangarajan
Sent: Friday, June 28, 2013 7:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Replicating files containing external file fields
Ja
to simulate
the effect of a simple join in a single clean query. But you can do a
separate query to get parent record details.
-- Jack Krupansky
-Original Message-
From: Sperrink
Sent: Saturday, June 29, 2013 5:08 AM
To: solr-user@lucene.apache.org
Subject: Schema design for parent child
is good
for keyword search. Use the text variant in qf.
-- Jack Krupansky
-Original Message-
From: winsu
Sent: Friday, June 28, 2013 9:26 PM
To: solr-user@lucene.apache.org
Subject: increase search score of certain category only for certain keyword
Hi,
Currently i've certain sample
s correspond to your date gap.
You can do that with an update processor, or do it before you send the data
to Solr.
In the next release of my book I have a script for a
StatelessScriptUpdateProccessor (with examples) that supports truncation of
dates to a desired resolution, copying or modifyi
It all depends on your data model - tell us more about your data model.
For example, how will users or applications query these documents and what
will they expect to be able to do with the ID/key for the documents?
How are you expecting to identify documents in your data model?
-- Jack
data model - which includes what expectations you have about the unique
ID/key for each document.
So, for that first PDF file, what expectation (according to your data model)
do you have for what its ID/key should be?
-- Jack Krupansky
-Original Message-
From: archit2112
Sent
g" is inappropriate for this email list (or any
email list.)
-- Jack Krupansky
-Original Message-
From: tuedel
Sent: Monday, July 01, 2013 8:15 AM
To: solr-user@lucene.apache.org
Subject: Re: RemoveDuplicatesTokenFilterFactory to avoid import duplicate
values in multivalued field
H
to get parent or child IDs and
then do a second query filtered by those IDs.
And, yes, this only approximates the full power of an SQL join - but at a
tiny fraction of the cost.
-- Jack Krupansky
-Original Message-
From: adfel70
Sent: Monday, July 01, 2013 9:56 AM
To: solr-user
Unfortunately, update processors only "see" the new, fresh, incoming data,
not any existing document data.
This is a case where your best bet may be to read the document first and
then merge your new value into the existing list of values.
-- Jack Krupansky
-Original Message-
You can write any function query in the field list of the "fl" parameter.
Sounds like you want "termfreq":
termfreq(field_arg,term)
fl=id,a,b,c,termfreq(a,xyz)
-- Jack Krupansky
-Original Message-
From: Tony Mullins
Sent: Monday, July 01, 2013 10
"stored" and "indexed" both default to "true".
This is legal:
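For example, a field that relies on those defaults (field and type names hypothetical):

```xml
<field name="title" type="text_general"/>
```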
This detail will be in Early Access Release #2 of my book on Friday.
-- Jack Krupansky
-Original Message-
From: Otis Gospodnetic
Sent: Monday, July 01, 2013 2:21 PM
To: solr-user@lucen
Correct - the field definitions inherit the attributes of the field type,
and it is the field type that has the actual default values for indexed and
stored (and other attributes.)
-- Jack Krupansky
-Original Message-
From: Yonik Seeley
Sent: Monday, July 01, 2013 3:56 PM
To: solr
sources.
But, yeah, as Otis says, "re-index" is really just a euphemism for deleting
your Solr data directory and indexing from scratch from the original data
sources.
-- Jack Krupansky
-Original Message-
From: Otis Gospodnetic
Sent: Monday, July 01, 2013 2:26 PM
To:
What is the nature of your degradation?
-- Jack Krupansky
-Original Message-
From: solrUserJM
Sent: Tuesday, July 02, 2013 4:22 AM
To: solr-user@lucene.apache.org
Subject: Solr 4.3 Pivot Performance Issue
Hi There,
I notice with the upgrade from solr 4.0 to solr 4.3 that we had a
Simply multiply by the number of miles per kilometer, 0.621371:
fl=_dist_:mul(geodist(),0.621371)
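As a quick sanity check on the constant (a sketch independent of Solr; geodist() returns kilometers, and mul() above applies exactly this multiplication):

```java
// Convert kilometers to miles using the same constant as the function query.
public class KmToMiles {
    static double miles(double km) {
        return km * 0.621371;
    }
    public static void main(String[] args) {
        System.out.println(miles(10.0));
    }
}
```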
-- Jack Krupansky
-Original Message-
From: irshad siddiqui
Sent: Tuesday, July 02, 2013 5:19 AM
To: solr-user@lucene.apache.org
Subject: need distance in miles not in kilometers
Hi,
I
It sounds like 4.4 will have an RC next week, so the prospects for block
join in 4.4 are kind of dim. I mean, such a significant feature should have
more than a few days to bake before getting released. But... who knows what
Yonik has planned!
-- Jack Krupansky
-Original Message
Start with the Solr Tutorial.
http://lucene.apache.org/solr/tutorial.html
-- Jack Krupansky
-Original Message-
From: fabio1605
Sent: Tuesday, July 02, 2013 11:16 AM
To: solr-user@lucene.apache.org
Subject: Newbie SolR - Need advice
Hi
we have a MSSQL Server which is just getting
Consider DataStax Enterprise - it combines Cassandra for NoSql data storage
with Solr for indexing - fully integrated.
http://www.datastax.com/
-- Jack Krupansky
-Original Message-
From: fabio1605
Sent: Tuesday, July 02, 2013 12:44 PM
To: solr-user@lucene.apache.org
Subject: Re
*&fq=((*:* -color.not_null:[* TO *]) OR color:blue)
-- Jack Krupansky
-Original Message-
From: Van Tassell, Kristian
Sent: Tuesday, July 02, 2013 3:47 PM
To: solr-user@lucene.apache.org
Subject: How to query Solr for empty field or specific value
Hello,
I'm using Solr 4.2 and am trying to get a s
custom script with the Stateless Script update processor.
My book has examples for URL Classify.
-- Jack Krupansky
-Original Message-
From: A Geek
Sent: Tuesday, July 02, 2013 1:47 PM
To: solr user
Subject: How to show just the parent domains from results in Solr
hi All, I've indexed
You will need to set q.op to "OR", and... use a field type that has the
autoGeneratePhraseQueries attribute set to "false".
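A field type carrying that attribute might look like this (the name and analyzer chain are assumptions):

```xml
<fieldType name="text_noauto" class="solr.TextField"
           autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```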
-- Jack Krupansky
-Original Message-
From: James Bathgate
Sent: Tuesday, July 02, 2013 5:10 PM
To: solr-user@lucene.apache.org
Subject: Part
Ahhh... you put autoGeneratePhraseQueries="false" on the field - but it
needs to be on the field type.
You can see from the parsed query that it generated the phrase.
-- Jack Krupansky
-Original Message-
From: James Bathgate
Sent: Tuesday, July 02, 2013 5:35 PM
To:
Design your own application layer for both indexing and query that knows
about both SQL and Solr. Give it a REST API and then your client
applications can talk to your REST API and not have to care about the
details of Solr or SQL. That's the best starting point.
-- Jack Krupansky
to undefined
fields. In other words, you are telling Solr that it is okay to have inputs
for these fields - simply ignore them.
But... you could still have update processors that look at the values of
"ignored" fields and maybe assigns them to other, non-ignored fields.
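The usual schema idiom for this is an "ignored" field type plus a catch-all dynamic field:

```xml
<fieldType name="ignored" class="solr.StrField"
           indexed="false" stored="false" multiValued="true"/>
<dynamicField name="*" type="ignored" multiValued="true"/>
```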
-- Jack
nce it is a wildcard character.
Yes, string_field:*\? should match any string field that ends with a "?".
-- Jack Krupansky
-Original Message-
From: JZ
Sent: Wednesday, July 03, 2013 10:59 AM
To: solr-user@lucene.apache.org
Subject: Search for string ending with question mark
view differences.
-- Jack Krupansky
-Original Message-
From: Ali, Saqib
Sent: Wednesday, July 03, 2013 11:55 AM
To: solr-user@lucene.apache.org
Subject: unused fields in Solr schema.xml increase the index size
Hello all,
Do unused fields in Solr Schem.xml increase the size of the
phrases, and there is no scoring difference whether a term occurs once or a
thousand times in that field for each document. A lot less information needs
to be stored in the index.
-- Jack Krupansky
-Original Message-
From: Ali, Saqib
Sent: Wednesday, July 03, 2013 10:31 PM
To: solr-user
Yes, but it is simply doing an AND or OR of the individual terms - no
phrases or implied ordering of the terms.
-- Jack Krupansky
-Original Message-
From: Ali, Saqib
Sent: Thursday, July 04, 2013 12:52 AM
To: solr-user@lucene.apache.org
Subject: Re: omitTermFreqAndPositions="tru
Oops... I wasn't reading carefully enough - frequencies and positions only
relate to tokenized fields (text) - not string fields.
That doesn't impact your ability to do AND and OR of discrete string terms
of a multivalued string field.
-- Jack Krupansky
-Original Message-
ew feature/improvement.
-- Jack Krupansky
-Original Message-
From: Tony Mullins
Sent: Thursday, July 04, 2013 9:45 AM
To: solr-user@lucene.apache.org
Subject: Total Term Frequency per ResultSet in Solr 4.3 ?
Hi ,
I have lots of crawled data, indexed in my Solr (4.3.0) and lets say user
You can take a look at the MoreLikeThis/Find Similar feature. That gives you
an approximation, but using documents rather than discrete terms. You would
have to write a custom component of your own based on logic from MLT.
-- Jack Krupansky
-Original Message-
From: Dotan Cohen
Sent