Re: Solr middle-ware?

2014-01-30 Thread Jack Krupansky
even if these examples are not formally released, at least people can view and copy them. -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Tuesday, January 21, 2014 8:00 AM To: solr-user@lucene.apache.org Subject: Solr middle-ware? Hello, All the Solr

Re: KeywordTokenizerFactory - trouble with "exact" matches

2014-01-30 Thread Jack Krupansky
" remains for now. -- Jack Krupansky -Original Message- From: Aleksander Akerø Sent: Thursday, January 30, 2014 9:31 AM To: solr-user@lucene.apache.org Subject: Re: KeywordTokenizerFactory - trouble with "exact" matches Yes, I actually noted that about the filter vs. tokeni

Re: Realtimeget SolrCloud

2014-01-31 Thread Jack Krupansky
g what the original handler name was. -- Jack Krupansky -Original Message- From: StrW_dev Sent: Friday, January 31, 2014 4:56 AM To: solr-user@lucene.apache.org Subject: Re: Realtimeget SolrCloud That seemed to be the issue. I had several other request handlers as I wasn't using the s

Re: Storing ranges on documents and searching all document with specific value included

2014-01-31 Thread Jack Krupansky
What does your actual query look like? Is it two range queries and an AND? Also, you have spaces in your field names, so that makes it more difficult to write queries since they need to be escaped. -- Jack Krupansky -Original Message- From: Avner Levy Sent: Saturday, January 18

Re: Special character search in Solr and boosting without altering the resultset

2014-02-01 Thread Jack Krupansky
q=+term1 term2^0.6 Will require term1 but term2 is optional. -- Jack Krupansky -Original Message- From: abhishek jain Sent: Saturday, February 1, 2014 10:27 AM To: solr-user@lucene.apache.org ; 'Ahmet Arslan' Subject: RE: Special character search in Solr and boosting withou
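
The required/optional syntax above can be sketched as a small helper that builds the encoded `q` parameter. This is only an illustration: the function name is hypothetical, and the one detail worth noting is that a literal `+` has to be percent-encoded, since an unencoded `+` in a URL is read as a space.

```python
from urllib.parse import urlencode


def build_boosted_query(required, optional, boost):
    """Build a URL-encoded q parameter (hypothetical helper).

    "+" marks the first term as required; the second term is optional
    and only influences ranking through its boost.
    """
    q = f"+{required} {optional}^{boost}"
    # urlencode turns "+" into %2B, the space into "+", and "^" into %5E
    return urlencode({"q": q})


print(build_boosted_query("term1", "term2", 0.6))
```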

Re: Solr and SDL Tridion Integration

2014-02-03 Thread Jack Krupansky
If SDL Tridion can export to CSV format, Solr can then import from CSV format. Otherwise, you may have to write a custom script or even maybe Java code to read from SDL Tridion and output a supported Solr format, such as Solr XML, Solr JSON, or CSV. -- Jack Krupansky -Original Message

Re: Apache Solr.

2014-02-03 Thread Jack Krupansky
PDF files can be directly imported into Solr using Solr Cell (AKA ExtractingRequestHandler). See: https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika Internally, Solr Cell uses Tika, which in turn uses PDFBox. -- Jack Krupansky -Original

Re: Score of Search Term for every character remove

2014-02-03 Thread Jack Krupansky
I think he wants to do a bunch of separate queries and return separate result sets for each. Hmmm... maybe it would be nice to allow multiple "q" parameters in one query request, each returning a separate set of results. -- Jack Krupansky -Original Message- From: Eric

Re: Solr Searching Issue

2014-02-04 Thread Jack Krupansky
Maybe you need a larger Java heap. -- Jack Krupansky -Original Message- From: Sathya Sent: Tuesday, February 4, 2014 6:11 AM To: solr-user@lucene.apache.org Subject: Solr Searching Issue Hi Friends, I am working in Solr 4.6.0 from last 2 months. i have indexed the data in solr

Re: How to index the data from db using Solandra

2014-02-04 Thread Jack Krupansky
nd queried using the Cassandra API(s) as well. To be clear, DSE is a "database", not a "search platform". The idea is that DSE can be the system of record, with data stored in Cassandra and can easily be reindexed for Solr at any time from that Cassandra data. -- Jack Kr

Re: Lowering query time

2014-02-04 Thread Jack Krupansky
nt of data processed can help. Any multivalued fields with lots of values? -- Jack Krupansky -Original Message- From: Joel Cohen Sent: Tuesday, February 4, 2014 1:43 PM To: solr-user@lucene.apache.org Subject: Re: Lowering query time 1. We are faceting. I'm not a developer so I'

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-04 Thread Jack Krupansky
What will your queries be like? Will it be okay if they are relatively slow? I mean, how many of those 100 fields will you need to use in a typical (95th percentile) query? -- Jack Krupansky -Original Message- From: Mike L. Sent: Tuesday, February 4, 2014 10:00 PM To: solr-user

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Jack Krupansky
to a query. -- Jack Krupansky -Original Message- From: Mauro Gregorio Binetti Sent: Wednesday, February 5, 2014 5:17 AM To: solr-user@lucene.apache.org Subject: Disable searching on ddm tika metadata Hi everybody, I'm a newbie and I'm working on searching performance in

Re: Import data from mysql to sold

2014-02-05 Thread Jack Krupansky
ush" model, as opposed to the Solr DIH "pull" model. See: http://blog.mongodb.org/post/29127828146/introducing-mongo-connector -- Jack Krupansky -Original Message- From: rachun Sent: Wednesday, February 5, 2014 6:25 AM To: solr-user@lucene.apache.org Subject: Re:

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Jack Krupansky
Simply post to this mail list the timing section of the query response for a test query that you feel is too slow, but be sure to add the debug=true parameter (or debug=timing.) -- Jack Krupansky -Original Message- From: Mauro Gregorio Binetti Sent: Wednesday, February 5, 2014 6:44

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Jack Krupansky
I’m not interested in the log (although maybe somebody else can spot something there) – it’s the query response that is returned on your query HTTP request (XML or JSON.) The specific parameter to add to your HTTP query request is “&debug=true”. -- Jack Krupansky From: Mauro Gregorio Bin
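
As a rough sketch of the request being asked for, the debug parameter can be appended like any other query parameter. The host, port, and core name in the usage line are assumptions for illustration only, and `wt=json` is added simply to pick one of the two response formats mentioned.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query, debug="true"):
    """Return a select URL with the debug parameter added.

    base_url is whatever /select endpoint you already query; the one
    used below is a placeholder, not a value from this thread.
    """
    params = {"q": query, "debug": debug, "wt": "json"}
    return f"{base_url}?{urlencode(params)}"


# debug="timing" limits the debug output to the timing section
print(debug_query_url("http://localhost:8983/solr/collection1/select",
                      "title:solr", debug="timing"))
```

The timing section of the response returned for that URL is the text to post to the list.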

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Jack Krupansky
(Gulp!) You could also set the debug parameter (temporarily) in the defaults section of your query request handler. But you still need to dump the text of the query response. -- Jack Krupansky -Original Message- From: Mauro Gregorio Binetti Sent: Wednesday, February 5, 2014 12:47

Re: Partial Word Search

2014-02-05 Thread Jack Krupansky
ce it will query for all the sub-terms, but AND will only work if all the sub-terms occur in the document field. -- Jack Krupansky -Original Message- From: Teague James Sent: Wednesday, February 5, 2014 4:52 PM To: solr-user@lucene.apache.org Subject: Partial Word Search I cannot

Re: Default core for updates in multicore setup

2014-02-05 Thread Jack Krupansky
Tom, I did make an effort to "sort out" both the old and newer solr.xml features in my Solr 4.x Deep Dive e-book. -- Jack Krupansky -Original Message- From: Tom Burton-West Sent: Wednesday, February 5, 2014 5:56 PM To: solr-user@lucene.apache.org Subject: Re: Default core f

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Jack Krupansky
Yes. Look at the example solrconfig.xml for a section labeled "defaults" for the "/select" request handler. You should see "df" as one parameter. Just copy that and change "df" to "debug" and change the field name to "true". -- Jack

Re: Import data from mysql to sold

2014-02-05 Thread Jack Krupansky
It appears that at this moment the best approach would be to write a Java program that reads from MongoDB and writes to Solr (Solr XML update requests.) Or, write a program that reads from MongoDB and outputs a CSV format text file and then import that directly into Solr. -- Jack Krupansky
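
The second option (dump to CSV, then import) might look roughly like the sketch below. Reading from MongoDB is deliberately left out: `records` stands in for documents already fetched as dictionaries, and the function and field names are hypothetical.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize dict records to CSV text with a header row.

    Keys not listed in fieldnames are ignored; missing keys become
    empty cells. The records are assumed to be read elsewhere.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buf.getvalue()


sample = [{"id": "1", "title": "a, b", "internal": "skip me"}]
print(records_to_csv(sample, ["id", "title"]))
```

The header row names the Solr fields, so the column names should match the schema.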

Re: Performance impact using edismax over dismax

2014-02-06 Thread Jack Krupansky
omewhat to query complexity, albeit in the name of better relevancy. -- Jack Krupansky -Original Message- From: Srinivasa7 Sent: Thursday, February 6, 2014 9:30 AM To: solr-user@lucene.apache.org Subject: Performance impact using edismax over dismax Hi All, I have a requirement to sear

Re: Partial Word Search

2014-02-06 Thread Jack Krupansky
work, at least for some simple cases. -- Jack Krupansky -Original Message- From: Teague James Sent: Thursday, February 6, 2014 11:11 AM To: solr-user@lucene.apache.org Subject: RE: Partial Word Search Jack, Thanks for responding! I had tried configuring this asymmetrically before

Re: Performance impact using edismax over dismax

2014-02-06 Thread Jack Krupansky
Use the pf parameter and then you won't have to modify the original query at all! And you can add a boost for the phrase, which is a common practice. pf=search-field^10.0 -- Jack Krupansky -Original Message- From: Srinivasa7 Sent: Thursday, February 6, 2014 11:21 AM To: solr

Re: ExtendedDismax and NOT operator

2014-02-07 Thread Jack Krupansky
I suspect that's a bug. The phrase boost code should have the logic to exclude negated terms. File a Jira. Thanks for reporting this. -- Jack Krupansky -Original Message- From: Geert Van Huychem Sent: Friday, February 7, 2014 9:40 AM To: solr-user@lucene.apache.org Su

Re: Need help for integrating solr-4.5.1 with UIMA

2014-02-07 Thread Jack Krupansky
The UIMA component is not very error-friendly - NPE gets thrown for missing or misspelled parameter names. Basically, you have to look at the source code based on that stack trace to find out which parameter was missing. -- Jack Krupansky -Original Message- From: rashi gandhi Sent

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-08 Thread Jack Krupansky
that more will definitely cause problems, but because you will be beyond common usage and increasingly sensitive to amount of data and Java/JVM performance capabilities. -- Jack Krupansky -Original Message- From: Mike L. Sent: Saturday, February 8, 2014 2:12 PM To: solr-user@lucene

Re: A bit lost in the land of schemaless Solr

2014-02-08 Thread Jack Krupansky
analyzer, etc. More like type as merely inheriting attributes from another field/type. -- Jack Krupansky -Original Message- From: Benson Margulies Sent: Saturday, February 8, 2014 2:37 PM To: solr-user@lucene.apache.org Subject: A bit lost in the land of schemaless Solr Say that I have 10

Re: Deciding how to correctly use Solr multicore

2014-02-09 Thread Jack Krupansky
ction would rarely need to be sharded. You didn't speak at all about HA (High Availability) requirements or replication. Or about query latency requirements or query load - which can impact replication requirements. -- Jack Krupansky -Original Message- From: Pisarev, Vit

Re: Problem querying large StrField?

2014-02-10 Thread Jack Krupansky
. That said, given Lucene/Solr's rich support for large tokenized fields, they might be a better choice for representing large lists of entities - if denormalization is not quite practical. -- Jack Krupansky -Original Message- From: Luis Lebolo Sent: Monday, February 10, 2014

Re: positionIncrementGap in schema.xml - Doesn't seem to work

2014-02-10 Thread Jack Krupansky
"Eric solrUser"~102 would match. -- Jack Krupansky -Original Message- From: Nirali Mehta Sent: Monday, February 10, 2014 3:13 PM To: solr-user@lucene.apache.org Subject: Re: positionIncrementGap in schema.xml - Doesn't seem to work Erick, Here is the example. The

Re: positionIncrementGap in schema.xml - Doesn't seem to work

2014-02-10 Thread Jack Krupansky
Try the complex phrase query parser: https://issues.apache.org/jira/browse/SOLR-1604 Or in LucidWorks Search you can say: J* NEAR:5 K* -- Jack Krupansky -Original Message- From: Kashish Sent: Monday, February 10, 2014 6:12 PM To: solr-user@lucene.apache.org Subject: Re

Re: solr-query with NOT and OR operator

2014-02-11 Thread Jack Krupansky
)))+OR+(field2:value2). -- Jack Krupansky -Original Message- From: Johannes Siegert Sent: Tuesday, February 11, 2014 10:57 AM To: solr-user@lucene.apache.org Subject: solr-query with NOT and OR operator Hi, my solr-request contains the following filter-query: fq=((-(field1:value1

Re: Indexing in the case of entire shard failure

2014-02-12 Thread Jack Krupansky
a fault-tolerant, fully-distributed system. Your application can/should make its own decision as to what it will do if an indexing operation cannot be serviced. -- Jack Krupansky -Original Message- From: elmerfudd Sent: Wednesday, February 12, 2014 7:54 AM To: solr-user

Re: Solr perfromance with commitWithin seesm too good to be true. I am afraid I am missing something

2014-02-12 Thread Jack Krupansky
computing environment, coupled with multi-core processors and parallel threads. -- Jack Krupansky -Original Message- From: Pisarev, Vitaliy Sent: Wednesday, February 12, 2014 10:28 AM To: solr-user@lucene.apache.org Subject: RE: Solr perfromance with commitWithin seesm too good to be true. I

Re: Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Jack Krupansky
x+Handlers#UploadingDatawithIndexHandlers-UsingXSLTtoTransformXMLIndexUpdates -- Jack Krupansky -Original Message- From: Eric_Peng Sent: Wednesday, February 12, 2014 11:42 AM To: solr-user@lucene.apache.org Subject: Re: Question about how to upload XML by using SolrJ Client Java Code Tha

Re: Using numeric ranges in Solr query

2014-02-12 Thread Jack Krupansky
Is price a float/double field? price:[99.5 TO 100.5] -- price near 100 price:[900 TO 1000] or price:[899.5 TO 1000.5] -- Jack Krupansky -Original Message- From: jay67 Sent: Wednesday, February 12, 2014 12:03 PM To: solr-user@lucene.apache.org Subject: Using numeric ranges in Solr
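
A minimal sketch of how those range clauses could be generated, assuming the field really is numeric. The helper name and the `pad` argument are illustrative, not part of Solr.

```python
def range_query(field, low, high, pad=0):
    """Return an inclusive Solr range clause, optionally widened by pad."""
    return f"{field}:[{low - pad} TO {high + pad}]"


print(range_query("price", 100, 100, pad=0.5))   # price near 100
print(range_query("price", 900, 1000))           # plain range
print(range_query("price", 900, 1000, pad=0.5))  # widened range
```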

Re: Weird issue with q.op=AND

2014-02-12 Thread Jack Krupansky
Did you mean to use "||" for the OR operator? A single "|" is not treated as an operator - it will be treated as a term and sent through normal term analysis. -- Jack Krupansky -Original Message- From: Shamik Bandopadhyay Sent: Wednesday, February 12, 2014 5

Re: Multiple Column Condition with Relevance/Rank

2014-02-13 Thread Jack Krupansky
Use the OR operator between the specific clauses. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) Sent: Thursday, February 13, 2014 9:09 AM To: solr-user@lucene.apache.org Subject: Multiple Column Condition with Relevance/Rank

Re: Exact word match

2014-02-14 Thread Jack Krupansky
Set the default query operator (q.op parameter) to AND, or enclose the full phrase in quotes. -- Jack Krupansky -Original Message- From: Sohan Kalsariya Sent: Friday, February 14, 2014 4:29 AM To: solr-user@lucene.apache.org Subject: Exact word match Hello, I want to the exact

Re: Boost Query Example

2014-02-18 Thread Jack Krupansky
Add debugQuery=true to your queries and look at the scoring in the "explain" section. From the intermediate scoring by field, you should be able to do the math to figure out what boost would be required to rank your exact match high enough. -- Jack Krupansky -Original Message-

Re: Additive boost function

2014-02-18 Thread Jack Krupansky
The edismax query parser "bf" parameter gives you an additive boost. See: http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 -- Jack Krupansky -Original Message- From: Zwer Sent: Tuesday, February 18, 2014 12:52 PM To: solr-user@lucene.apache.o

Re: Weird behavior of stopwords in search query

2014-02-18 Thread Jack Krupansky
Without "and", the terms are OR'ed, which is the default query operator. -- Jack Krupansky -Original Message- From: Shamik Bandopadhyay Sent: Tuesday, February 18, 2014 8:53 PM To: solr-user@lucene.apache.org Subject: Weird behavior of stopwords in search query Hi,

Re: Weird behavior of stopwords in search query

2014-02-19 Thread Jack Krupansky
eve the default setting! Rather, it should tell you how to override the default setting. -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent: Wednesday, February 19, 2014 4:16 AM To: solr-user@lucene.apache.org Subject: Re: Weird behavior of stopwords in search query Hi Sami

Re: Getting fields from query

2014-02-19 Thread Jack Krupansky
y of the Lucene query parser. -- Jack Krupansky -Original Message- From: Jamie Johnson Sent: Wednesday, February 19, 2014 8:05 PM To: solr-user@lucene.apache.org ; Ahmet Arslan Subject: Re: Getting fields from query On closer inspection this isn't quite what I'm looking for. The

Re: Fwd: help on edismax_dynamic fields

2014-02-22 Thread Jack Krupansky
e was ever a Jira for it. A quick search did not find one. Feel free to file one. -- Jack Krupansky -Original Message- From: rashi gandhi Sent: Saturday, February 22, 2014 1:34 AM To: solr-user@lucene.apache.org Subject: Fwd: help on edismax_dynamic fields Hello, I am using edismax p

Re: Slow query time on stemmed fields

2014-02-24 Thread Jack Krupansky
Maybe some heap/GC issue from using more of this 20 GB index. Maybe it was running at the edge and just one more field was too much for the heap. The "timing" section of the debug query response should shed a little light. -- Jack Krupansky -Original Message- From: Eric

Re: Format of the spellcheck.q used to get suggestions in current filter

2014-02-26 Thread Jack Krupansky
Could you post the request URL and the XML/JSON Solr response? And the solrconfig for both the query request handler and the spellcheck component. Is your spell check component configured for both fields, field1 and field2? -- Jack Krupansky -Original Message- From: Hakim Benoudjit

Re: Solr cloud: Faceting issue on text field

2014-02-26 Thread Jack Krupansky
Are you sure you want to be faceting on a text field, as opposed to a string field? I mean, each term (word) from the text will be a separate facet value. How many facet values are you typically returning? How many unique terms occur in the facet field? -- Jack Krupansky -Original

Re: Tracing Solr Query Execution and Performance

2014-02-26 Thread Jack Krupansky
I don't recall seeing anything related to passing the debug/debugQuery parameters on for inter-node shard queries and then add that to the aggregated response (if debug/debugQuery was specified.) Sounds worth a Jira. -- Jack Krupansky -Original Message- From: KNitin Sent: Wedn

Re: How does Solr parse schema.xml?

2014-02-26 Thread Jack Krupansky
the next (unpublished) release of my book (that's one of them.) That handler returns all token details, but if you wanted to roll your own, start there. The handler is: org.apache.solr.handler.FieldAnalysisRequestHandler -- Jack Krupansky -Original Message- From: Software Dev Se

Re: Parallel queries to Solr

2014-02-26 Thread Jack Krupansky
Just send the queries to Solr in parallel using multiple threads in your application layer. Solr can handle multiple, parallel queries as separate, parallel requests, but does not have a way to bundle multiple queries on a single request. -- Jack Krupansky -Original Message- From
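
One way to do that in the application layer is a thread pool, sketched below. `run_query` is a hypothetical callable that sends a single request to Solr and returns its response; it is not shown here, and the stand-in in the usage line just echoes the query.

```python
from concurrent.futures import ThreadPoolExecutor


def run_queries_in_parallel(queries, run_query, max_workers=4):
    """Run each query as its own request on a worker thread.

    Results come back in the same order as the input queries.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))


# Stand-in for a real HTTP call, only to show the shape of the result
print(run_queries_in_parallel(["q1", "q2", "q3"], lambda q: f"results for {q}"))
```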

Re: Search score problem using bf edismax

2014-02-26 Thread Jack Krupansky
The bf parameter adds the value of a function query to the document score. Your example did not include a bf parameter. -- Jack Krupansky -Original Message- From: Ing. Andrea Vettori Sent: Wednesday, February 26, 2014 12:26 PM To: solr-user@lucene.apache.org Subject: Search score

Re: Searching with special chars

2014-02-27 Thread Jack Krupansky
Backslashes are used to escape special characters in queries, but the backslash must in turn be encoded in the URL as %5C. -- Jack Krupansky -Original Message- From: deniz Sent: Thursday, February 27, 2014 1:36 AM To: solr-user@lucene.apache.org Subject: Searching with special chars
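
The two layers of escaping can be sketched as follows. The set of special characters is an assumption based on the Lucene classic query parser syntax; check it against the parser you actually use. Both function names are hypothetical.

```python
from urllib.parse import quote

# Assumed set of query-syntax characters that need a backslash
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term):
    """Prefix each special query character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


def encode_for_url(term):
    """Escape for the query parser, then percent-encode for the URL."""
    return quote(escape_term(term), safe="")


print(escape_term("a:b"))     # backslash inserted before the colon
print(encode_for_url("a:b"))  # the backslash itself becomes %5C
```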

Re: Date query not returning results only some time

2014-02-28 Thread Jack Krupansky
requires explicitly referring to all documents before applying the negation. So, AND -tag_id:268702 should be: AND (*:* -tag_id:268702) Or, maybe you actually wanted this: first_publish_date:[NOW/DAY-33DAYS TO NOW/DAY-3DAYS] -tag_id:268702 -- Jack Krupansky -Original Message- From
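
The rewrite described above amounts to wrapping the negated clause so that it subtracts from all documents. A tiny sketch, with a hypothetical helper name:

```python
def standalone_negation(clause):
    """Wrap a clause so it negates against all documents (*:*)."""
    return f"(*:* -{clause})"


date_range = "first_publish_date:[NOW/DAY-33DAYS TO NOW/DAY-3DAYS]"
print(f"{date_range} AND {standalone_negation('tag_id:268702')}")
```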

Re: stopwords issue with edismax

2014-02-28 Thread Jack Krupansky
not ignore stopwords, so the dismax for "of" will not be empty (although the clause for some of the fields will not be present since the stop word filter eliminates them) so that the dismax fails to match anything and since q.op=AND, the whole query matches nothing. -- Jack

Re: Solr is NoSQL database or not?

2014-03-01 Thread Jack Krupansky
o "reindex". Can you imagine database developers being told that they must delete all their existing data and "start over"? -- Jack Krupansky -Original Message- From: nutchsolruser Sent: Friday, February 28, 2014 11:09 PM To: solr-user@lucene.apache.org Subject: Sol

Re: Solr is NoSQL database or not?

2014-03-01 Thread Jack Krupansky
requirements needed for a System of Record or where strict ACID and heavy real-time updates are required. It's up to the individual application project to make all of these suitability judgments. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Saturday, March 1, 20

Re: stopwords issue with edismax

2014-03-02 Thread Jack Krupansky
As I suggested, you have a couple of fields that do not ignore stop words, so the stop word must be present in at least one of those fields: (number:of^3.0 | all_code:of^2.0) The solution would be to remove the "number" and "all_code" fields from qf. -- Jack Krupansky -

Re: Solr is NoSQL database or not?

2014-03-03 Thread Jack Krupansky
up to accommodate them." -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Monday, March 3, 2014 10:58 AM To: solr-user@lucene.apache.org Subject: Re: Solr is NoSQL database or not? Hi; I said that: "What are the main differences between ElasticSearch and

Re: Multiple partial match

2014-03-03 Thread Jack Krupansky
term frequency as well in the function query. boost=product(tf(name,'co'),10) -- Jack Krupansky -Original Message- From: Zwer Sent: Monday, March 3, 2014 10:34 AM To: solr-user@lucene.apache.org Subject: Multiple partial match Hi Guys, Faced with a problem: make query to S

Re: Id As URL for Solrj

2014-03-04 Thread Jack Krupansky
Or, maybe he is and shouldn't since deleteById is a SolrXML update handler feature, not a query parser feature. For example: http://stackoverflow.com/questions/2657409/deleting-index-from-solr-using-solrj-as-a-client -- Jack Krupansky -Original Message- From: Markus Jelsma

Re: stopwords issue with edismax

2014-03-04 Thread Jack Krupansky
Yes, if they are tokenized text fields, but I was assuming that "number" was a strictly numeric field. That said, you could have numeric and non-tokenized string fields, but copyField them to text fields (or a single text field) for purposes of queries. -- Jack Krupansky ---

Re: Indexing huge data

2014-03-05 Thread Jack Krupansky
Make sure you're not doing a commit on each individual document add. Commit every few minutes or every few hundred or few thousand documents is sufficient. You can set up auto commit in solrconfig.xml. -- Jack Krupansky -Original Message- From: Rallavagu Sent: Wednesday, Ma
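
If commits are issued from the client rather than through auto commit, the batching idea could look like this sketch. `add_docs` and `commit` are hypothetical callables wrapping whatever client you use, and the batch size is only an example value.

```python
def index_in_batches(docs, add_docs, commit, batch_size=1000):
    """Send documents in batches and commit once per batch.

    Returns the number of commits issued, which should be far smaller
    than the number of documents.
    """
    pending = []
    commits = 0
    for doc in docs:
        pending.append(doc)
        if len(pending) >= batch_size:
            add_docs(pending)
            commit()
            commits += 1
            pending = []
    if pending:  # flush the final partial batch
        add_docs(pending)
        commit()
        commits += 1
    return commits


sent = []
print(index_in_batches(range(5), sent.append, lambda: None, batch_size=2))
```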

Re: explaination of query processing in SOLR

2014-08-17 Thread Jack Krupansky
In any case, besides the raw code and the similarity Javadoc, Lucene does have Javadoc for "file formats": http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/codecs/lucene49/package-summary.html -- Jack Krupansky -Original Message- From: Aman Tandon Sent: Sunday,

Re: Substring and Case In sensitive Search

2014-08-19 Thread Jack Krupansky
se case to confirm whether you really need to use "string" as opposed to "text" field. -- Jack Krupansky -Original Message- From: Nishanth S Sent: Tuesday, August 19, 2014 12:03 PM To: solr-user@lucene.apache.org Subject: Substring and Case In sensitive Search Hi,

Re: Performance of Boolean query with hundreds of OR clauses.

2014-08-19 Thread Jack Krupansky
ew minutes" CPU-bound or I/O-bound? -- Jack Krupansky -Original Message- From: SolrUser1543 Sent: Tuesday, August 19, 2014 2:57 PM To: solr-user@lucene.apache.org Subject: Performance of Boolean query with hundreds of OR clauses. I am using Solr to perform search for finding s

Re: Help with StopFilterFactory

2014-08-19 Thread Jack Krupansky
What release of Solr? Do you have autoGeneratePhraseQueries="true" on the field? And when you said "But any of these does", did you mean "But NONE of these does"? -- Jack Krupansky -Original Message- From: heaven Sent: Tuesday, August

Re: Help with StopFilterFactory

2014-08-21 Thread Jack Krupansky
For the sake of completeness, please post the parsed query that you get when you add the debug=true parameter. IOW, how Solr/Lucene actually interprets the query itself. -- Jack Krupansky -Original Message- From: Shawn Heisey Sent: Thursday, August 21, 2014 10:03 AM To: solr-user

Re: Substring and Case In sensitive Search

2014-08-21 Thread Jack Krupansky
ver decent performance, as long as the prefix isn't too short (e.g., "cat*"). See PrefixQuery: http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/search/PrefixQuery.html ngram filters can also be used, but... that can make the index rather large. -- Jack Krupansky -Original

Re: Strange Behavior

2014-08-23 Thread Jack Krupansky
ended use case clearly - there may be some better way to try to achieve it. Use the analysis page of the Solr Admin UI to see the detailed query and index analysis of your terms. You'll be surprised. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automoti

Re: Minimum Match with filters that add tokens

2014-08-23 Thread Jack Krupansky
original query, the implementation (BooleanQuery) uses the terms generated by the analysis process, which can break up source terms into multiple terms and generate extra terms as well. Any MM number or percentage will count the terms output by analysis, not the source terms. -- Jack Krupansky

Re: Integrating DictionaryAnnotator and Solr

2014-08-23 Thread Jack Krupansky
Uhhh... UIMA... and parameter checking... NOT. You're probably missing something, but there is so much stuff. I have some examples in my e-book that show various errors you can get for missing/incorrect parameters for UIMA: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-

Re: Exact search with special characters

2014-08-24 Thread Jack Krupansky
erms of how it treats text to be indexed and how you expect to be able to query that text. -- Jack Krupansky -Original Message- From: Shay Sofer Sent: Sunday, August 24, 2014 5:58 AM To: solr-user@lucene.apache.org Subject: Exact search with special characters Hi all,

Re: embedded documents

2014-08-24 Thread Jack Krupansky
problematic. -- Jack Krupansky -Original Message- From: Michael Pitsounis Sent: Wednesday, August 20, 2014 7:14 PM To: solr-user@lucene.apache.org Subject: embedded documents Hello everybody, I had a requirement to store complicated json documents in solr. i have modified the

Re: Help with StopFilterFactory

2014-08-24 Thread Jack Krupansky
y - you've confused the discussion here by failing to do so on at least one occasion, and possibly in this latest response although I can't tell for sure. 5. We'll confirm either any mistakes you've made, recommendations, and whether there are any bugs. Fair enough? -- Jack Kr

Re: Help with StopFilterFactory

2014-08-24 Thread Jack Krupansky
rue query parameter and post the parsed query so that we can see what was really generated for the query. -- Jack Krupansky -Original Message- From: heaven Sent: Sunday, August 24, 2014 12:04 PM To: solr-user@lucene.apache.org Subject: Re: Help with StopFilterFactory I don'

Re: Help with StopFilterFactory

2014-08-24 Thread Jack Krupansky
Just to confirm, the generated phrase query is generated using the analyzed terms, so if the stop filter is removing the terms, they won't appear in the generated query. It will be interesting to see what does get generated. -- Jack Krupansky -Original Message- From: heaven

Re: embedded documents

2014-08-25 Thread Jack Krupansky
That's a completely different concept, I think - the ability to return a single field value as a structured JSON object in the "writer", rather than simply "loading" from a nested JSON object and distributing the key values to normal Solr fields. -- Jack Krupans

Re: Exact search with special characters

2014-08-25 Thread Jack Krupansky
d for whatever degree of "exactness" you require. -- Jack Krupansky -Original Message- From: Shay Sofer Sent: Monday, August 25, 2014 8:02 AM To: solr-user@lucene.apache.org Subject: RE: Exact search with special characters Hi, Thanks for your reply. I thought that google

Re: Help with StopFilterFactory

2014-08-25 Thread Jack Krupansky
lr users is not well highlighted for Solr users. Sorry about that. In any case, try adding enablePositionIncrements="false", reindex, and see what happens. -- Jack Krupansky -Original Message- From: heaven Sent: Monday, August 25, 2014 3:37 AM To: solr-user@lucene.apache.org

Re: embedded documents

2014-08-25 Thread Jack Krupansky
ng to make Solr more automatic and more approachable, not an even more complicated "toolkit". -- Jack Krupansky -Original Message- From: Erik Hatcher Sent: Monday, August 25, 2014 9:32 AM To: solr-user@lucene.apache.org Subject: Re: embedded documents Jack et al - there’s now

Re: embedded documents

2014-08-25 Thread Jack Krupansky
And a comparison to Elasticsearch would be helpful, since ES gets a lot of mileage from their super-easy JSON support. IOW, how much of the ES "advantage" is eliminated. -- Jack Krupansky -Original Message- From: Noble Paul Sent: Monday, August 25, 2014 1:59 PM To:

Re: Help with StopFilterFactory

2014-08-26 Thread Jack Krupansky
this attribute: luceneMatchVersion="4.3" But... the old behavior is now "deprecated", so it most likely will not be in Solr 5.0. I'll think about this some more as to whether there might be some workaround or alternative. -- Jack Krupansky -Original Message- From

Re: Help with StopFilterFactory

2014-08-26 Thread Jack Krupansky
I agree that it's a bad situation, and wasn't handled well by the Lucene guys. They may have had good reasons, but they didn't execute a decent plan for how to migrate existing behavior. -- Jack Krupansky -Original Message- From: heaven Sent: Tuesday, August 26,

Re: Solr range query issue

2014-08-27 Thread Jack Krupansky
or "text" field (which)? If a text field, does it have a lower case filter, in which case you don't need lower case. Worst case, you could use a regex query term, but better to avoid that if at all possible. -- Jack Krupansky -Original Message- From: nutchsolruser Sen

Re: Solr content limits?

2014-08-27 Thread Jack Krupansky
u load a range of documents on your chosen hardware, both a single machine and a small cluster, and measure how much load it can handle and how it performs. And then scale your cluster based on that application-specific performance data. -- Jack Krupansky -Original Message- From: lalitja

Re: Solr CPU Usage

2014-08-27 Thread Jack Krupansky
Is the high usage just suddenly happening after a long period of up-time without it, or is this on a server restart? The latter can happen if you have a large commit log to replay because you haven't done hard commits. -- Jack Krupansky -Original Message- From: Shawn Heisey

Re: Query regarding URL Analysers

2014-08-28 Thread Jack Krupansky
-core/org/apache/solr/update/processor/URLClassifyProcessor.html The official doc is... pitiful, but I have doc and examples in my e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message

Re: external indexer for Solr Cloud

2014-08-29 Thread Jack Krupansky
What exactly are you referring to by the term "external indexer"? -- Jack Krupansky -Original Message- From: Lee Chunki Sent: Friday, August 29, 2014 7:21 AM To: solr-user@lucene.apache.org Subject: external indexer for Solr Cloud Hi, Is there any way to run external i

Re: Specify Analyzer per field

2014-08-29 Thread Jack Krupansky
5098 That said, maybe you could provide a couple of examples of exactly what you want to do. -- Jack Krupansky -Original Message- From: Ankit Jain Sent: Friday, August 29, 2014 8:16 AM To: solr-user@lucene.apache.org Subject: Specify Analyzer per field Hi All, I would like to use s

Re: Specify Analyzer per field

2014-08-29 Thread Jack Krupansky
Different field TYPES, not different fields. -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent: Friday, August 29, 2014 8:49 AM To: solr-user@lucene.apache.org Subject: Re: Specify Analyzer per field Hi, I think he wants to change query analyzer dynamically, where index

Re: Specify Analyzer per field

2014-08-29 Thread Jack Krupansky
But that doesn't let him change or override the analyzer for the field type. -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Friday, August 29, 2014 11:55 AM To: solr-user Subject: Re: Specify Analyzer per field Can't you just use old fashion dyna

Re: external indexer for Solr Cloud

2014-08-29 Thread Jack Krupansky
My other thought was that maybe he wants to do index updates outside of the cluster that is handling queries, and then copy in the completed index. Or... maybe take replicas out of the query rotation while they are updated. Or... maybe this is yet another X-Y problem! -- Jack Krupansky

Re: solr result handler??

2014-08-30 Thread Jack Krupansky
could override that filter. Or, do an application layer that forces that filter to be added. -- Jack Krupansky -Original Message- From: cmd.ares Sent: Saturday, August 30, 2014 2:10 AM To: solr-user@lucene.apache.org Subject: solr result handler?? I have a blacklist save some keywords

Re: Scaling to large Number of Collections

2014-08-31 Thread Jack Krupansky
is not a supported scenario at this time. Certainly suggestions for future enhancement can be made though. -- Jack Krupansky -Original Message- From: Christoph Schmidt Sent: Sunday, August 31, 2014 4:04 AM To: solr-user@lucene.apache.org Subject: Scaling to large Number of Collections we se

Re: AW: Scaling to large Number of Collections

2014-08-31 Thread Jack Krupansky
You close with two great questions for the community! We have a similar issue over in Apache Cassandra database land (thousands of tables). There is no immediate, easy, great answer. Other than the kinds of "workarounds" being suggested. -- Jack Krupansky -Original Message-

Re: Scaling to large Number of Collections

2014-08-31 Thread Jack Krupansky
e sharded for a few shards or even just a single shard, and to instead focus the attention on large number of collections rather than heavily-sharded collections. -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Sunday, August 31, 2014 12:04 PM To: solr-user@lucene.

Re: Specify Analyzer per field

2014-09-01 Thread Jack Krupansky
w you how to define and use custom analyzers as well." No, Solr does not have that feature per se - you have to specify a custom field TYPE to specify the analyzer. -- Jack Krupansky -Original Message- From: Ankit Jain Sent: Monday, September 1, 2014 2:14 AM To:

Re: AW: Scaling to large Number of Collections

2014-09-01 Thread Jack Krupansky
for each tenant's collection(s). This raises the question: How many of your collections need to be simultaneously active? Say, in a one-hour period, how many of them will be updating and serving queries, and what query load per-collection and total query load do you need to design fo
