Re: external indexer for Solr Cloud

2014-09-01 Thread Jack Krupansky
ud index? It would be great to have a "standalone DIH" that runs as a separate server and then sends standard Solr update requests to a Solr cluster. -- Jack Krupansky -Original Message- From: Lee Chunki Sent: Sunday, August 31, 2014 8:55 PM To: solr-user@lucene.apache.org

Re: external indexer for Solr Cloud

2014-09-01 Thread Jack Krupansky
could use. -- Jack Krupansky -Original Message- From: Shawn Heisey Sent: Monday, September 1, 2014 11:42 AM To: solr-user@lucene.apache.org Subject: Re: external indexer for Solr Cloud On 9/1/2014 7:19 AM, Jack Krupansky wrote: It would be great to have a "standalone DIH"

Re: Indexing & search list of Key/Value pairs

2014-09-01 Thread Jack Krupansky
en you can simply query: php_skill:[5 TO *] AND ruby_skill:[2 TO *] -- Jack Krupansky -Original Message- From: amid Sent: Monday, September 1, 2014 12:24 PM To: solr-user@lucene.apache.org Subject: Indexing & search list of Key/Value pairs Hi, I'm using solr and trying to
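As a rough illustration of the per-skill field approach in the reply above, the sketch below builds that kind of range query from a dictionary of minimum skill levels. The `_skill` suffix follows the `php_skill` and `ruby_skill` field names in the reply; the helper name `skill_range_query` and the dictionary input are hypothetical.

```python
def skill_range_query(min_levels):
    """Build a Solr query requiring each skill field to be at least a given level.

    Assumes one numeric field per skill named <skill>_skill, as in the reply;
    the function name and its dict argument are illustrative only.
    """
    clauses = [f"{skill}_skill:[{level} TO *]" for skill, level in min_levels.items()]
    return " AND ".join(clauses)


print(skill_range_query({"php": 5, "ruby": 2}))
```

Each clause is an open-ended range, so a document matches only if every listed skill meets its minimum.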

Re: Indexing & search list of Key/Value pairs

2014-09-01 Thread Jack Krupansky
evelopment:[10 TO *] That would match somebody with "Agile Methodology" or "Agile Development" AND 10 or more years of "Software Development". -- Jack Krupansky -Original Message- From: amid Sent: Monday, September 1, 2014 12:50 PM To: solr-user@lucene.apa

Re: looking for a solr/search expert in Paris

2014-09-03 Thread Jack Krupansky
e sure to keep your listing up to date, including regional availability and any specialties. -- Jack Krupansky -Original Message- From: elisabeth benoit Sent: Wednesday, September 3, 2014 4:02 AM To: solr-user@lucene.apache.org Subject: looking for a solr/search expert in Paris Hello,

Re: FAST-like document vector data structures in Solr?

2014-09-05 Thread Jack Krupansky
he highest relevance. The similarity vector is created during item processing and indicates the most important terms or concepts in the item and the corresponding weight.” See: http://msdn.microsoft.com/en-us/library/office/ff521597(v=office.14).aspx -- Jack Krupansky From: "Jürgen Wagn

Re: How to implement multilingual word components fields schema?

2014-09-05 Thread Jack Krupansky
the same source text in multiple fields, one for each language. You can then do a dismax query on that set of fields. -- Jack Krupansky -Original Message- From: Ilia Sretenskii Sent: Friday, September 5, 2014 10:06 AM To: solr-user@lucene.apache.org Subject: How to implement

Re: FAST-like document vector data structures in Solr?

2014-09-05 Thread Jack Krupansky
Sounds like a great feature to add to Solr, especially if it would facilitate more automatic relevancy enhancement. LucidWorks Search has a feature called "unsupervised feedback" that does that but something like a docvector might make it a more realistic default. -- Jack

Re: How to solve?

2014-09-06 Thread Jack Krupansky
Payloads really don't have first class support in Solr. It's a solid feature of Lucene, but never expressed well in Solr. Any thoughts or proposals are welcome! (Hmmm... I wonder what the good folks at Heliosearch have up their sleeves in this area?!) -- Jack Krupansky -Origin

Re: Is there any sentence tokenizers in Solr 4.9.0?

2014-09-08 Thread Jack Krupansky
Out of curiosity, what would be an example query for your application that would depend on sentence tokenization, as opposed to simple term tokenization? I mean, there are no sentence-based query operators in the Solr query parsers. -- Jack Krupansky -Original Message- From: Sandeep

Re: How to implement multilingual word components fields schema?

2014-09-08 Thread Jack Krupansky
re very sensitive to short queries. Keep in mind that auto-detection for indexing full documents is a different problem than auto-detection for very short queries. -- Jack Krupansky -Original Message- From: Ilia Sretenskii Sent: Sunday, September 7, 2014 10:33 PM To: solr-user@lucen

Re: Solr multiple sources configuration

2014-09-09 Thread Jack Krupansky
It is mostly a matter of how you expect to query that data - do you need different queries for different sources, or do you have a common conceptual model that covers all sources with a common set of queries? -- Jack Krupansky -Original Message- From: vineet yadav Sent: Tuesday

Re: Tricky exact match, unwanted search results

2014-09-14 Thread Jack Krupansky
ng the term "exact match" ONLY for string field queries, and that don't use wildcard, fuzzy, or range queries. And maybe also keyword tokenizer text fields that don't have any filters, which might as well be string fields. -- Jack Krupansky -Original Message- From:

Re: Solr Exceptions -- "immense terms"

2014-09-15 Thread Jack Krupansky
You can use an update request processor to filter the input for large values. You could write a script with the stateless script processor which ignores or trims large input values. -- Jack Krupansky -Original Message- From: Christopher Gross Sent: Monday, September 15, 2014 7:58 AM

Re: Solr Exceptions -- "immense terms"

2014-09-15 Thread Jack Krupansky
full wiki page as a string field. -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Monday, September 15, 2014 8:39 AM To: solr-user Subject: Re: Solr Exceptions -- "immense terms" May not need a script for that: http://www.solr-start.com/javadoc/solr-lu

Re: Mongo DB Users

2014-09-15 Thread Jack Krupansky
>Waiting for a positive response! -1 -- Jack Krupansky -Original Message- From: Rakesh Varna Sent: Monday, September 15, 2014 10:18 AM To: solr-user@lucene.apache.org Subject: Re: Mongo DB Users Remove Regards, Rakesh Varna On Mon, Sep 15, 2014 at 9:29 AM, Ed Smiley wr

Re: How to summarize a String Field ?

2014-09-18 Thread Jack Krupansky
Do a copyField to a numeric field. -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Thursday, September 18, 2014 11:35 AM To: solr-user@lucene.apache.org Subject: Re: How to summarize a String Field ? You cannot do this as far as I know, it must be a numeric field (float

Re: [ANN] Lucidworks Fusion 1.0.0

2014-09-23 Thread Jack Krupansky
You simply download it yourself and give yourself a demo!! http://lucidworks.com/product/fusion/ -- Jack Krupansky -Original Message- From: Thomas Egense Sent: Tuesday, September 23, 2014 2:00 AM To: solr-user@lucene.apache.org Subject: Re: [ANN] Lucidworks Fusion 1.0.0 Hi Grant

Re: query for space character in text field ...

2014-09-23 Thread Jack Krupansky
Or simply enclose the full term in quotes: q=path:"my path" Which is more properly encoded as: q=path:%22my+path%22 -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Tuesday, September 23, 2014 11:02 PM To: solr-user@lucene.apache.org Subject: Re: query
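A minimal sketch of the encoding step mentioned in the reply above, using only the Python standard library. It assumes the colon is left unencoded, as in the reply's example; the variable names are illustrative.

```python
from urllib.parse import quote_plus

raw = 'path:"my path"'
# Keep the colon readable (safe=":"); the quotes become %22 and the space becomes +.
encoded = quote_plus(raw, safe=":")
print("q=" + encoded)
```

Without `safe=":"` the colon would also be percent-encoded as `%3A`, which a server decodes to the same query string.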

Re: Changed behavior in solr 4 ??

2014-09-23 Thread Jack Krupansky
You set the defaults on the "search handler", not the "search component". See solrconfig.xml: <lst name="defaults"> <str name="echoParams">explicit</str> <int name="rows">10</int> <str name="df">text</str> </lst> ... -- Jack Krupansky -Original Message- From: Jorge Luis Betancourt Gonzalez Sent: Tuesday, September 23, 2014 11

Re: Scoring with wildcards

2014-09-25 Thread Jack Krupansky
The wildcard query is “constant score” to make it faster, so unfortunately that means there is no score differentiation between the wildcard matches. You can simply add the wildcard prefix as a separate query term and boost it: q=text:carre* text:carre^1.5 -- Jack Krupansky From: Pigeyre
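The workaround in the reply above can be sketched as a small query builder: one constant-score wildcard clause plus a boosted clause on the bare prefix. The helper name `prefix_with_boost` is hypothetical, and the sketch assumes the default operator is OR so that the boosted clause is optional.

```python
def prefix_with_boost(field, prefix, boost=1.5):
    """Return a wildcard clause plus a boosted clause on the bare prefix.

    Hypothetical helper; assumes the prefix is already a single analyzed term
    and that clauses are ORed by default.
    """
    return f"{field}:{prefix}* {field}:{prefix}^{boost}"


print("q=" + prefix_with_boost("text", "carre"))
```

Documents containing the exact term then pick up an extra scored clause, while the other wildcard matches keep the constant score.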

Re: Changed behavior in solr 4 ??

2014-09-25 Thread Jack Krupansky
I am not aware of any such feature! That doesn't mean it doesn't exist, but I don't recall seeing it in the Solr source code. -- Jack Krupansky -Original Message- From: Jorge Luis Betancourt Gonzalez Sent: Wednesday, September 24, 2014 1:31 AM To: solr-user@lucene.apa

Re: java.lang.NumberFormatException: For input string: "string;#-6.872515521, 53.28853084"

2014-09-27 Thread Jack Krupansky
And how is the schema field declared? Seems like it's a TrieDoubleField, which should be a simple floating point value. You should be using the spatial field types. -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Friday, September 26, 2014 12:20 PM To: solr

Re: demo app explaining solr features

2014-09-28 Thread Jack Krupansky
And you can also check out the tutorials in any of the Solr books, including my Solr Deep Dive e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message- From: Mikhail Khludnev Sent

Re: multiple terms order in query - eDismax

2014-09-28 Thread Jack Krupansky
pf and ps merely control boosting of documents, not selection of documents. mm controls selection of documents. So, hopefully at least doc3 is returned before doc2. -- Jack Krupansky From: Tomer Levi Sent: Sunday, September 28, 2014 5:39 AM To: solr-user@lucene.apache.org Subject: multiple

Re: multiple terms order in query - eDismax

2014-09-29 Thread Jack Krupansky
That's called a phrase query - selecting documents based on the order of the terms. Just enclose the terms in quotes. -- Jack Krupansky -Original Message- From: Tomer Levi Sent: Monday, September 29, 2014 2:41 AM To: solr-user@lucene.apache.org Subject: RE: multiple terms ord

Re: How to query certain fields filtered by a condition

2014-09-29 Thread Jack Krupansky
You can perform boolean operations using parentheses. So you can OR a sequence of sub-queries, and each sub-query can be an AND of the desired search term and the constraining values for other fields. -- Jack Krupansky -Original Message- From: Shamik Bandopadhyay Sent: Monday
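The structure described in the reply above (an OR of sub-queries, each an AND of the search term and its field constraints) might be assembled as follows. The field names `title`, `body`, `source`, and `lang` are made up for illustration, as is the helper `or_of_ands`.

```python
def or_of_ands(term, sub_queries):
    """Build (field:term AND k:v ...) OR (...) from a list of sub-query specs.

    Each spec is a (search_field, constraints) pair; all names here are
    hypothetical and values are assumed to need no escaping.
    """
    groups = []
    for search_field, constraints in sub_queries:
        parts = [f"{search_field}:{term}"]
        parts += [f"{field}:{value}" for field, value in constraints.items()]
        groups.append("(" + " AND ".join(parts) + ")")
    return " OR ".join(groups)


print(or_of_ands("solr", [("title", {"source": "wiki"}),
                          ("body", {"source": "blog", "lang": "en"})]))
```

The parentheses keep each AND group intact, so the OR applies between whole sub-queries rather than between individual clauses.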

Re: Search multiple values with wildcards

2014-09-30 Thread Jack Krupansky
The special characters (colon) are treated as term delimiters for text fields. How do you really intend to query this "string"? You could make it simply a "string" field. -- Jack Krupansky -Original Message- From: J'roo Sent: Tuesday, September 30, 20

Re: Boost Query (bq) syntax/usage

2014-09-30 Thread Jack Krupansky
nts that contain all three of the terms rather than any of the three terms. -- Jack Krupansky -Original Message- From: shamik Sent: Tuesday, September 30, 2014 5:38 PM To: solr-user@lucene.apache.org Subject: Boost Query (bq) syntax/usage Hi, I'm little confused with the right

Re: Boost Query (bq) syntax/usage

2014-09-30 Thread Jack Krupansky
The "+" signs in the parsed boost query indicated the terms were ANDed together, but maybe you can use the q.op and mm parameters to change the default operator (I forget!). -- Jack Krupansky -Original Message- From: shamik Sent: Tuesday, September 30, 2014 7:19 PM To:

Re: Boost Query (bq) syntax/usage

2014-09-30 Thread Jack Krupansky
dismax and then specify edismax for bq using the localParam notation. -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Tuesday, September 30, 2014 8:19 PM To: solr-user@lucene.apache.org Subject: Re: Boost Query (bq) syntax/usage The "+" signs in the parsed b

Re: Wildcard search makes no sense!!

2014-10-01 Thread Jack Krupansky
token gets analyzed into - that's what your wildcard prefix must match. Sometimes (usually!) you will be surprised. -- Jack Krupansky -Original Message- From: Wayne W Sent: Wednesday, October 1, 2014 7:16 AM To: solr-user@lucene.apache.org Subject: Wildcard search makes no sense!

Re: Adding filter in custom query parser

2014-10-01 Thread Jack Krupansky
Unless you consider yourself to be a "Solr expert", it would be best to implement such query translation in an application layer. -- Jack Krupansky -Original Message- From: sagarprasad Sent: Wednesday, October 1, 2014 3:27 AM To: solr-user@lucene.apache.org Subject: Adding

Re: Solr + Federated Search Question

2014-10-01 Thread Jack Krupansky
ata into Solr and then simply directly search the data within Solr. -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent: Wednesday, October 1, 2014 9:35 AM To: solr-user@lucene.apache.org Subject: Re: Solr + Federated Search Question Hi, Federation is possible. Solr has distribut

Re: Regarding Default Scoring For Solr

2014-10-03 Thread Jack Krupansky
That's a reasonable description for Solr/Lucene scoring, but use the latest release: http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html -- Jack Krupansky -Original Message- From: mdemarco123 Sent: Thursday, October 2, 2014 6:06

Re: Solr + Federated Search Question

2014-10-03 Thread Jack Krupansky
Yes, either term can be used to confuse people equally well! -- Jack Krupansky -Original Message- From: Alejandro Calbazana Sent: Thursday, October 2, 2014 3:28 PM To: solr-user@lucene.apache.org ; Ahmet Arslan Subject: Re: Solr + Federated Search Question Thanks Ahmet. Yay! New

Re: Flexible search field analyser/tokenizer configuration

2014-10-04 Thread Jack Krupansky
What exactly do you think that filter query is doing? Explain it in plain English. My guess is that it eliminates all your document matches. -- Jack Krupansky -Original Message- From: PeterKerk Sent: Saturday, October 4, 2014 12:34 AM To: solr-user@lucene.apache.org Subject: Re

Re: Flexible search field analyser/tokenizer configuration

2014-10-04 Thread Jack Krupansky
max can be used to apply a boost to all un-fielded terms for a field, you otherwise need to apply any boost on a term, not a field. -- Jack Krupansky -Original Message- From: PeterKerk Sent: Saturday, October 4, 2014 10:43 AM To: solr-user@lucene.apache.org Subject: Re: Flexible search

Re: Advise on an architecture with lot of cores

2014-10-07 Thread Jack Krupansky
have separate clusters for larger groups of customers, maybe with a smaller cluster with a collection that maps the customer ID to a Solr cluster, and then the application layer can direct requests to the Solr cluster that owns that customer. -- Jack Krupansky -Original Message-

Re: dismax query does not match with additional field in qf

2014-10-07 Thread Jack Krupansky
" term, so it required the string term to match, which won't happen since only the full string is indexed. Generally, you need to escape all special characters in a query. Then hopefully your string field will match. -- Jack Krupansky -Original Message- From: Andreas Hubold Sen

Re: dismax query does not match with additional field in qf

2014-10-07 Thread Jack Krupansky
tch on the string field, but a tokenized phrase match on the text field, and support partial matches on the text field as a phrase of contiguous terms. -- Jack Krupansky -Original Message- From: Andreas Hubold Sent: Tuesday, October 7, 2014 12:08 PM To: solr-user@lucene.apache.org S

Re: WhitespaceTokenizer to consider incorrectly encoded c2a0?

2014-10-08 Thread Jack Krupansky
aking white space as white space here. And update the Lucene Javadoc contract to be more explicit. -- Jack Krupansky -Original Message- From: Markus Jelsma Sent: Wednesday, October 8, 2014 10:16 AM To: solr-user@lucene.apache.org ; solr-user Subject: RE: WhitespaceTokenizer to con

Re: eDisMax parser and special characters

2014-10-08 Thread Jack Krupansky
again, so the hyphen gets quoted, and then analyzed to nothing for text fields but is still a string for string fields. -- Jack Krupansky -Original Message- From: Lanke,Aniruddha Sent: Wednesday, October 8, 2014 4:38 PM To: solr-user@lucene.apache.org Subject: Re: eDisMax parser an

Re: Edismax parser and boosts

2014-10-08 Thread Jack Krupansky
Definitely sounds like a bug! File a Jira. Thanks for reporting this. What release of Solr? -- Jack Krupansky -Original Message- From: Pawel Rog Sent: Wednesday, October 8, 2014 3:57 PM To: solr-user@lucene.apache.org Subject: Edismax parser and boosts Hi, I use edismax query with

Re: Best way to index wordpress blogs in solr

2014-10-08 Thread Jack Krupansky
The LucidWorks product has builtin crawler support so you could crawl one or more web sites. http://lucidworks.com/product/fusion/ -- Jack Krupansky -Original Message- From: Vishal Sharma Sent: Tuesday, October 7, 2014 2:08 PM To: solr-user@lucene.apache.org Subject: Best way to

Re: does one need to reindex when changing similarity class

2014-10-09 Thread Jack Krupansky
The similarity class is only invoked at query time, so it doesn't participate in indexing. -- Jack Krupansky -Original Message- From: Markus Jelsma Sent: Thursday, October 9, 2014 6:59 AM To: solr-user@lucene.apache.org Subject: RE: does one need to reindex when changing simil

Re: DateMathParser question

2014-10-10 Thread Jack Krupansky
Sounds reasonable. File a Jira! -- Jack Krupansky -Original Message- From: Jamie Johnson Sent: Friday, October 10, 2014 11:45 AM To: solr-user@lucene.apache.org Subject: DateMathParser question I have found that DateMathParser is extremely useful in providing nice labels back to

Re: What happens if you don't set positionIncrementGap

2014-10-12 Thread Jack Krupansky
detail. See: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Sunday, October 12, 2014 7:40 PM To: solr-user Subject: What happens if you don't

Re: What happens if you don't set positionIncrementGap

2014-10-13 Thread Jack Krupansky
epend on the analyzer to "tokenize" the quoted string into individual terms, which the numeric field types would not do. I'm too lazy to check to see what Lucene classes call the getPositionIncrementGap method and what they do with its return value. -- Jack Krupansky -

Re: eDisMax parser and special characters

2014-10-13 Thread Jack Krupansky
Simply escape it with a backslash or enclose the term in quotes. Sure, it would be nice to be able to configure various operators to be disabled, but that's not feasible with query parsers designed around static grammars and inflexible tools such as JFlex. -- Jack Krupansky -Ori
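A minimal sketch of the backslash-escaping approach from the reply above, done on the client before the query is sent. The character set is an assumption based on the classic Lucene query syntax, with whitespace added so multi-word values stay a single term; it is similar in spirit to SolrJ's `ClientUtils.escapeQueryChars`, but the helper `escape_term` itself is hypothetical.

```python
import re

# Characters treated as special by the classic Lucene/Solr query syntax,
# plus whitespace (assumed set; adjust for the query parser in use).
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_term(term):
    """Prefix every special character in a single query term with a backslash."""
    return _SPECIAL.sub(r"\\\1", term)


print(escape_term("wi-fi"))
print(escape_term("my path"))
```

Escaping applies to user-supplied terms only; operators that are meant to be operators must be added after escaping.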

Re: does one need to reindex when changing similarity class

2014-10-13 Thread Jack Krupansky
different. -- Jack Krupansky -Original Message- From: Markus Jelsma Sent: Monday, October 13, 2014 5:06 PM To: solr-user@lucene.apache.org Subject: RE: does one need to reindex when changing similarity class Yes, if the replacing similarity has a different implementation on norms,

Re: numfound in solr

2014-10-14 Thread Jack Krupansky
It would be nice to have a logging option to log updates vs. inserts, to help make it more obvious what is happening. And maybe even a way for a Solr update request to get back a summary of how many documents were inserted, updated, and deleted. -- Jack Krupansky -Original Message

Re: How should one search on all fields? *:XX does not work

2014-10-16 Thread Jack Krupansky
in their new Fusion product is unclear - I couldn't find any reference in the doc. -- Jack Krupansky -Original Message- From: Aaron Lewis Sent: Thursday, October 16, 2014 1:47 AM To: solr-user@lucene.apache.org Subject: How should one search on all fields? *:XX does not work Hi,

Re: CopyField from text to multi value

2014-10-19 Thread Jack Krupansky
As always, you need to first examine how you intend to query the fields before you dive into data modeling. In this case, is there any particular reason that you need the individual terms as separate values, as opposed to simply using a tokenized text field? -- Jack Krupansky From: Tomer Levi

Re: prefix length in fuzzy search solr 4.10.1

2014-10-31 Thread Jack Krupansky
No, but it is a reasonable request, as a global default, a collection-specific default, a request-specific default, and on an individual fuzzy term. -- Jack Krupansky -Original Message- From: elisabeth benoit Sent: Thursday, October 30, 2014 6:07 AM To: solr-user@lucene.apache.org

Re: How to update SOLR schema from continuous integration environment

2014-11-01 Thread Jack Krupansky
ng parallel systems with an atomic swap/redirection is probably simplest, while for larger clusters an incremental rolling update with thorough testing on a pre-production test cluster is the way to go. -- Jack Krupansky -Original Message- From: Faisal Mansoor Sent: Saturday, November 1

Re: How to update SOLR schema from continuous integration environment

2014-11-02 Thread Jack Krupansky
urther, the "crash" would more likely have occurred on your "dev" cluster first, well before even making it to your pre-production test system. -- Jack Krupansky -Original Message- From: Will Martin Sent: Sunday, November 2, 2014 6:37 AM To: solr-user@lucene.apac

Re: Question about StandardTokenizer in Solr 4.9

2014-11-02 Thread Jack Krupansky
Yeah, that behavior is consistent with what I documented in my e-book for Solr. The dot is kept only if between two digits or two letters. -- Jack Krupansky -Original Message- From: Jorge Luis BetancourtGonzález Sent: Sunday, November 2, 2014 4:34 PM To: solr-user@lucene.apache.org

Re: Ignoring Duplicates in Multivalue Field

2014-11-03 Thread Jack Krupansky
cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents -- Jack Krupansky -Original Message- From: Tomer Levi Sent: Monday, November 3, 2014 4:19 AM To: solr-user@lucene.apache.org ; Ahmet Arslan Subject: RE: Ignoring Duplicates in Multivalue Field Hi Ahmet, When I add the RunU

Re: A bad idea to store core data directory over NAS?

2014-11-04 Thread Jack Krupansky
Think of Solr/SolrCloud itself as a SAN - smart networked machines that intensely manage local storage. Having two levels of "SAN" is counterproductive. -- Jack Krupansky -Original Message- From: Gili Nachum Sent: Tuesday, November 4, 2014 4:57 PM To: solr-user@lucene.

Re: Best way to map holidays to corresponding date

2014-11-05 Thread Jack Krupansky
ipt update processor. -- Jack Krupansky -Original Message- From: Patrick Kirsch Sent: Wednesday, November 5, 2014 6:12 AM To: solr-user@lucene.apache.org Subject: Best way to map holidays to corresponding date Hey, maybe someone already faced the situation and could give me a hi

Re: add and then delete same document before commit,

2014-11-05 Thread Jack Krupansky
Document x doesn't exist - in terms of visibility - until the commit, so the delete will no-op since a query of Lucene will not "see" the uncommitted new document. -- Jack Krupansky -Original Message- From: Matteo Grolla Sent: Wednesday, November 5, 2014 4:47 A

Re: on regards to Solr and NoSQL storages integration

2014-11-05 Thread Jack Krupansky
. (Disclosure: I am a contractor for DataStax. I'm their "Domain Expert for Search/Solr".) -- Jack Krupansky -Original Message- From: andrey prokopenko Sent: Wednesday, November 5, 2014 8:52 AM To: solr-user@lucene.apache.org Subject: on regards to Solr and NoSQL storages integra

Re: Delete data from stored documents

2014-11-07 Thread Jack Krupansky
Could you clarify exactly what you are trying to do, like with an example? I mean, how exactly are you determining what fields are "unwanted"? Are you simply asking whether fields can be deleted from the index (and schema)? -- Jack Krupansky -Original Message- From: yriv

Re: Delete data from stored documents

2014-11-08 Thread Jack Krupansky
Agreed, but I think it would be great if Lucene and Solr provided an API to delete a single field for the entire index. We could file a Jira, but can Lucene accommodate it? Maybe we'll just have to wait for Elasticsearch to implement this feature! -- Jack Krupansky -Original Me

Re: Synonym for Numbers

2014-11-08 Thread Jack Krupansky
Are you using the synonyms for both indexing and query? It sounds like you want to use these synonyms only at query time. Otherwise, "10" in the index becomes "2010" in the index. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automoti

Re: on regards to Solr and NoSQL storages integration

2014-11-08 Thread Jack Krupansky
es can occur at full speed, with indexing in a background thread, maximizing ingestion performance. -- Jack Krupansky -Original Message- From: andrey prokopenko Sent: Friday, November 7, 2014 5:00 AM To: solr-user@lucene.apache.org Subject: Re: on regards to Solr and NoSQL storages int

Re: Best way to map holidays to corresponding date

2014-11-09 Thread Jack Krupansky
Try writing a few examples. Try christmas, easter, and memorial day. -- Jack Krupansky -Original Message- From: Anurag Sharma Sent: Sunday, November 9, 2014 12:54 PM To: solr-user@lucene.apache.org Subject: Re: Best way to map holidays to corresponding date Not sure this is the

Re: Search for partial name in Solr 4.x

2014-11-09 Thread Jack Krupansky
Please post some examples of titles and queries that you expect should match. How "partial" can the title be? How "full" does it need to be? When there are multiple partial matches how are you expecting them to be ranked? -- Jack Krupansky -Original Message- From:

Re: How to return single value from multi valued field

2014-11-15 Thread Jack Krupansky
You could implement a custom highlighter for that field. Otherwise, you are requesting a feature that does not exist in Solr today. The fl parameter specifies fields to return, not portions of fields. -- Jack Krupansky -Original Message- From: kumar Sent: Thursday, November 13, 2014

Re: Indexing problems with BBoxField

2014-11-23 Thread Jack Krupansky
static field rather than dynamic field, although the latter should work anyway. Please file a Jira to request that Solr give a user-sensible error, not a Lucene-level error. I mean, the Solr user has no ability to directly invoke the "createFields" method. And now... let

Re: Reindex Issues

2014-11-25 Thread Jack Krupansky
egments or rewriting them. The constant score should normally be 1.0. If it is not, maybe you have query boost terms, and they are using the df of the boost terms. -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent: Tuesday, November 25, 2014 4:50 AM To: solr-user@lucene.apache.o

Re: TrieLongField not store large longs correctly

2014-11-26 Thread Jack Krupansky
Your query has a space in it after the colon, which is not valid. Could you post the actual, full query request, as well as the full query response? -- Jack Krupansky -Original Message- From: Thomas L. Redman Sent: Wednesday, November 26, 2014 2:45 PM To: solr-user@lucene.apache.org

Re: Disappearance of post.jar from the new tutorial

2014-11-30 Thread Jack Krupansky
of adopting for Solr. I mean, are we trying to reinvent the wheel here, or what?! Note: This is the Solr USER list, which isn't the best forum for development discussions. -- Jack Krupansky -Original Message- From: Erik Hatcher Sent: Sunday, November 30, 2014 10

Re: Large fields storage

2014-12-01 Thread Jack Krupansky
In particular, if they are image-intensive, all the images go away. And the formatting as well. -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent: Monday, December 1, 2014 6:02 PM To: solr-user@lucene.apache.org Subject: Re: Large fields storage Hi Avi, I assume your

Re: How to stop Solr tokenising search terms with spaces

2014-12-06 Thread Jack Krupansky
to providing us with more specific requirements. My guess, from your mention of LDAP, is that the field would contain only a name, but... that's me guessing when you need to be specific. Once this distinction is cleared up, we can then focus on solutions that work either for arbitrary text or

Re: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Jack Krupansky
combined with the NGramFilterFactory and lower case filter, but only use the ngram filter at index time. See: http://lucene.apache.org/core/4_10_2/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html But be aware that use of the ngram filter dramatically increases the index

Re: How to stop Solr tokenising search terms with spaces

2014-12-10 Thread Jack Krupansky
If possible, please post your field type for others to see the final solution. Thanks! -- Jack Krupansky -Original Message- From: Dinesh Babu Sent: Wednesday, December 10, 2014 9:54 AM To: solr-user@lucene.apache.org ; Ahmet Arslan Subject: RE: How to stop Solr tokenising search

Re: different fields for user-supplied phrases in edismax

2014-12-13 Thread Jack Krupansky
boost as do less-precise phrases. But it does need to be optional since it has an added cost at query time. -- Jack Krupansky -Original Message- From: Michael Sokolov Sent: Saturday, December 13, 2014 8:43 AM To: solr-user@lucene.apache.org Subject: Re: different fields for user-supplied

Re: first time user

2014-12-16 Thread Jack Krupansky
My Solr Deep Dive e-book has full details and lots of examples for CSV indexing: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Tuesday, December

Re: first time user

2014-12-16 Thread Jack Krupansky
thing, but the real problem is further upstream and hasn't been fully expressed. My model is to give you a lot of examples and you can decide for yourself which best exemplifies what you are trying to do. And to give more detail on the features of Solr. -- Jack Krupansky -Origina

Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-24 Thread Jack Krupansky
ther it is Tomcat or Solr that gives the error, the main point is that the raw circumflex shouldn't be sent to either. -- Jack Krupansky On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson wrote: > OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are > con

Re: solr export get wrong results

2014-12-26 Thread Jack Krupansky
/solr/Exporting+Result+Sets -- Jack Krupansky On Fri, Dec 26, 2014 at 3:58 AM, Sandy Ding wrote: > Hi, all > > I've recently set up a solr cluster and found that "export" returns > different results from "select". > And I confirmed that the "expor

Re: Solr server becomes non-responsive.

2014-12-26 Thread Jack Krupansky
are no longer I/O bound. If compute bound, shard more heavily until the query latency becomes acceptable. -- Jack Krupansky On Fri, Dec 26, 2014 at 1:02 AM, Modassar Ather wrote: > Thanks for your suggestions Erick. > > This may be one of those situations where you really have to

Re: How to implement multi-set in a Solr schema.

2014-12-28 Thread Jack Krupansky
You can also use group.query or group.func to group documents matching a query or unique values of a function query. For the latter you could implement an NLP algorithm. -- Jack Krupansky On Sun, Dec 28, 2014 at 5:56 PM, Meraj A. Khan wrote: > Thanks Aman, the thing is the bookName fi

Re: How large is your solr index?

2014-12-29 Thread Jack Krupansky
. -- Jack Krupansky On Mon, Dec 29, 2014 at 12:54 PM, Erick Erickson wrote: > When you say 2B docs on a single Solr instance, are you talking only one > shard? > Because if you are, you're very close to the absolute upper limit of a > shard, internally > the doc

Re: WordDelimiter filter, expanding to multiple words, unexpected results

2014-12-29 Thread Jack Krupansky
term and the multi-term phrase, while the query analyzer would NOT do the split on case, so that the query could be a unitary term (possibly with mixed case, but that would not split the term) or could be a two-word phrase. -- Jack Krupansky On Mon, Dec 29, 2014 at 5:12 PM

Re: Solr server becomes non-responsive.

2014-12-30 Thread Jack Krupansky
e absolute precision. Sometimes you just want to know whether "something" exists matching the pattern, or "generally" what the values look like. I think it would be worth a Jira. -- Jack Krupansky On Tue, Dec 30, 2014 at 6:16 AM, Modassar Ather wrote: > Hi, > >

Re: How large is your solr index?

2014-12-30 Thread Jack Krupansky
a proof of concept implementation to validate whether the sweet spot for your particular data, data model, and application access patterns may be well above or even below that. Yes, indeed, sing praises for heroes, but don't kill yourself and drag down others trying to be one yourself. --

Re: WordDelimiter filter, expanding to multiple words, unexpected results

2014-12-30 Thread Jack Krupansky
Right, that's what I meant by WDF not being "magic" - you can configure it to match any three out of four use cases as you choose, but there is no choice that matches all of the use cases. To be clear, this is not a "bug" in WDF, but simply a limitation. -- Jack Krupan

Re: WordDelimiter filter, expanding to multiple words, unexpected results

2014-12-30 Thread Jack Krupansky
I do have a more thorough discussion of WDF in my Solr Deep Dive e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html You're not "wrong" about anything here... you just need to accept that WDF is not magic a

Re: Join in SOLR

2014-12-31 Thread Jack Krupansky
You would have to do your own build since the patch has not been committed. -- Jack Krupansky On Wed, Dec 31, 2014 at 12:27 AM, Rajesh wrote: > Mikhail, > > How can I get a nightly build with fix for SOLR-5147 included. I've > searched and found that nightly build will not be

Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread Jack Krupansky
R-839 -- Jack Krupansky On Thu, Jan 1, 2015 at 4:08 AM, Leonid Bolshinsky wrote: > Hello, > > Are we always limited by the query parser syntax when passing a query > string to Solr? > What about the query elements which are not supported by the syntax? > For example, BooleanQuery.setM

Re: De Duplication using Solr

2015-01-03 Thread Jack Krupansky
First, see if you can get your requirements to align to the de-dupe feature that Solr already has: https://cwiki.apache.org/confluence/display/solr/De-Duplication -- Jack Krupansky On Sat, Jan 3, 2015 at 2:54 AM, Amit Jha wrote: > I am trying to find out duplicate records based on dista

Re: How large is your solr index?

2015-01-03 Thread Jack Krupansky
ere. So the race is on between when Lucene will relax the 2G limit and when hardware gets fast enough that 2G documents can be indexed within a small number of hours. -- Jack Krupansky On Sat, Jan 3, 2015 at 4:00 PM, Toke Eskildsen wrote: > Erick Erickson [erickerick...@gmail.com] wrote:

Re: How large is your solr index?

2015-01-03 Thread Jack Krupansky
t I agree that it would be highly desirable to push that 100 million number up to 350 million or even 500 million ASAP since the pain of unnecessarily sharding is unnecessarily excessive. I wonder what changes will have to occur in Lucene, or... what evolution in commodity hardware will be necessary t

Re: edismax with multiple words for keyword tokenizer splitting on space

2015-01-06 Thread Jack Krupansky
You need to escape the space in your query (using backslash or quotes around the term) - the query parser doesn't parse based on the analyzer/tokenizer for each field. -- Jack Krupansky On Tue, Jan 6, 2015 at 4:05 AM, Sankalp Gupta wrote: > Hi > I come across this weird behaviour i

Re: Vertical search Engine

2015-01-06 Thread Jack Krupansky
queries are expressed and the results being returned. -- Jack Krupansky On Tue, Jan 6, 2015 at 3:39 AM, klunwebale wrote: > hello > > i want to create a vertical search engine like trovit.com. > > I have installed solr and solarium. > > What else to i need can you recomme

Re: Solr support for multi-tenant applications

2015-01-07 Thread Jack Krupansky
cores/tenants. Will tenants be directly accessing Solr, or will you provide them with a REST API for an application layer that intermediates access to Solr? -- Jack Krupansky On Wed, Jan 7, 2015 at 4:31 AM, Bram Van Dam wrote: > One possibility is to have separate core for each tenant domain.

Re: Determining the Number of Solr Shards

2015-01-07 Thread Jack Krupansky
number of CPU cores? -- Jack Krupansky On Wed, Jan 7, 2015 at 9:14 PM, Nishanth S wrote: > Thanks Shawn and Walter.Yes those are 12,000 writes/second.Reads for the > moment would be in the 1000 reads/second. Guess finding out the right > number of shards would be my starting point.
