Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
I have two fields; one is a copy field. I need to get exact-match results first, followed by the full results of the fuzzy search. Its field definition is given below

Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
Hi Alessandro Benedetti, What I meant is this: suppose I have items like these: dell laptop with bag; dell laptop; dell laptop without bag; dell inspiron laptop with bag. If I query for "dell laptop", the result should be in this order: dell laptop; dell laptop with bag; dell laptop without bag; dell inspir

Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
Hi, I have to search on the field product_name. To get exact matches first, I made a copy field named dummy_name with the above field definition, and at query time I boost the copy field. I have done this. To get exact matches I need to put quotes around the search w

Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
The quoted search words will be different and it will be any word or more than one word. In the query it's just example -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Exact-match-boost-Reduce-the-results-tp4211352p4211410.html Sent from the Solr - User mailing list ar

Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
As explained above, I actually have around 10 lakh (1 million) records, not 5 rows. It's not about synonyms. When I checked the FAQ page of the Solr wiki, I found that to get exact-match results first, one should use a copy field with a different configuration. That's why I followed this approach. -- View this mes

RE: Solr Exact match boost Reduce the results

2015-06-14 Thread JACK
Hi chillra, I have changed the index and query field configuration to But my problem is still not solved. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Exact-match-boost-Reduce-the-results-tp4211352p4211788.html Sent from the Solr - U

Re: Solr Exact match boost Reduce the results

2015-06-15 Thread JACK
Hi Alessandro Benedetti, The query is http://localhost:8983/solr/MYDBCORE/select?q=product_name:(laptop+bag)&wt=json&indent=true 1.Dell Inspiron 3542 Laptop (Black) without Laptop Bag 2.Dell 3542 15-inch Laptop with Laptop Bag by Dell 3.Dell Inspiron N3137 11-inch Laptop without Laptop Bag by De

Re: Solr Exact match boost Reduce the results

2015-06-15 Thread JACK
Hi Alessandro Benedetti, Its my Analysis value index WT text raw_bytes start end positionLength type position laptop [6c 61 70 74 6f 70] 0 6 1 word 1 bag [62 61 67] 7 10 1 word 2 SF text raw_bytes start end positionLength type position laptop [6c 61 7

Re: Solr Exact match boost Reduce the results

2015-06-22 Thread JACK
Hi Alessandro Benedetti, I have changed the query like this: /select?q=("dell+laptop"~13+OR+"dell+laptop")&df=product_name&defType=edismax&qf=product_name^0.001+dummy_product_name^2&fl=product_name&wt=json&indent=true&debug=true The corresponding results are also given in the link below. Now am

Re: Solr Exact match boost Reduce the results

2015-06-23 Thread JACK
Hi Alessandro Benedetti, Can you check what happens with the product_name given below? Just check its score: "product_name":"LAPTOP BATTERY DELL Inspiron 6400 1501 E1505 RD859 UD267 XU937" How does this product come up? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Ex

Solr Boost Search word before Specific Content

2015-07-07 Thread JACK
I am using Solr 5.0.0 and have a question about relevance boosting: if I search for words like "laptop table", is there any way to boost results in which the search words appear before words such as "by", "with", or "without"? I used this query: ?defType=dismax&q=foo bar&bq=(*:* -by)^999 But, this will

Re: Solr Boost Search word before Specific Content

2015-07-07 Thread JACK
Hi Ahmet, Can you elaborate it more? Is it possible to solve my problem in Solr 5.0.0? if yes can just explain how? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Boost-Search-word-before-Specific-Content-tp4216072p4216257.html Sent from the Solr - User mailing list a

Re: Ranking based on term position

2015-07-09 Thread JACK
Hi Li Li, I am experiencing the same problem. Can you explain in a little more detail? Where do I change these methods? I am using Solr 5.0.0. How do I query this? Is there any change at query time? -- View this message in context: http://lucene.472066.n3.nabble.com/Ranking-based-on-term-position-tp

Get TF-IDF from index?

2007-11-19 Thread Jack
t from the whole corpus, or, simply to enumerate all indexed terms. Maybe Solr doesn't expose this data, but can I use the Lucene jar to get it? Since I am not familiar with Lucene, either, any pointer is helpful. Thanks, jack

Re: LSA Implementation

2007-11-26 Thread Jack
Interesting. Patents are valid for 20 years so it expires next year? :) PLSA does not seem to have been patented, at least not mentioned in http://en.wikipedia.org/wiki/Probabilistic_latent_semantic_analysis On Nov 26, 2007 6:58 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > LSA (http://en.wikip

Re: the time factor

2008-05-20 Thread Jack
en another query for the previous 24 hours ... but this really isn't very efficient. Maybe OK for news because I may need to serve for up to 7 days. Still, 7 solr queries for a front-end query doesn't sound ideal. So I'm still in search for a better way ... Thanks, Jack On Tue, May 13

Re: How to limit number of pages per domain

2008-05-22 Thread Jack
I think I'll give it a try. I haven't done this before. Are there any instructions regarding how to apply the patch? I see 9 files, some displayed in gray links, some in blue links; some named as .diff, some .patch; one has 1.3 in file name, one has 1.3, I suppose the other files are for both versi

Re: Append fields to a document

2015-12-16 Thread Jack Krupansky
update has various caveats so that it is only useful in a subset of use cases. -- Jack Krupansky On Wed, Dec 16, 2015 at 10:09 AM, Jamie Johnson wrote: > I have a use case where we only need to append some fields to a document. > To retrieve the full representation is very expensive but I can

Re: Slow query response.

2015-12-17 Thread Jack Krupansky
or to return a large bulk of documents? -- Jack Krupansky On Thu, Dec 17, 2015 at 7:01 AM, Modassar Ather wrote: > Hi, > > I have a field f which is defined as follows. > omitNorms="true"/> > > Solr-5.2.1 is used. The index is spread across 12 shards (no replic

Re: While idexing millions of data Getting error

2015-12-18 Thread Jack Krupansky
the exact practical limit depends on your particular hardware and your particular data model and the data itself. How large is each document, roughly? Hundreds, thousands, or millions of bytes? Are some documents extremely large? -- Jack Krupansky On Fri, Dec 18, 2015 at 10:30 AM, Toke Eskild

Re: Schema/Index design for disparate data sources (Federated / Google like search)

2015-12-22 Thread Jack Krupansky
formance is consumed when you have a lot of fields which are not present for a particular data source. -- Jack Krupansky On Tue, Dec 22, 2015 at 11:25 AM, Susheel Kumar wrote: > Hello, > > I am going thru few use cases where we have kind of multiple disparate data > sources which in

Re: How to check when a search exceeds the threshold of timeAllowed parameter

2015-12-22 Thread Jack Krupansky
, making more copies of the index that can each be searched in parallel. How long do queries take when the site is operating normally? Make sure that you have enough system memory to cache the index, otherwise the machine will be thrashing with lots of I/O for competing requests. -- Jack Krupansky On

Re: Best practices on monitoring Solr

2015-12-23 Thread Jack Krupansky
itself (other than raw JMX and ping.) -- Jack Krupansky On Wed, Dec 23, 2015 at 6:27 AM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > Hi Shail, > As William mentioned, our SPM <https://sematext.com/spm/index.html> > allows you to monitor all main Solr/Jvm/Host me

Re: Changing Solr Schema with Data

2015-12-28 Thread Jack Krupansky
abase. Was someone telling you something different? -- Jack Krupansky On Mon, Dec 28, 2015 at 1:48 PM, Salman Ansari wrote: > Hi, > > I am facing an issue where I need to change Solr schema but I have crucial > data that I don't want to delete. Is there a way where I can chan

Re: Adding the same field value question

2015-12-28 Thread Jack Krupansky
Is the field multivalued? -- Jack Krupansky On Sun, Dec 27, 2015 at 11:16 PM, Jamie Johnson wrote: > What is the difference of adding a field with the same value twice or > adding it once and boosting the field on add? Is there a situation where > one approach is preferred? > > Jamie >

Re: Issue with if() statement

2015-12-31 Thread Jack Krupansky
need function queries there as well. -- Jack Krupansky On Thu, Dec 31, 2015 at 6:50 PM, William Bell wrote: > We are getting weird results with if(exists(a),b,c). We are getting b+c!! > > > http://localhost:8983/solr/providersearch/select?q=*:*&wt=json&state=state:%22CO%22&stat

Re: Multiple solr instances on one server

2016-01-04 Thread Jack Krupansky
ctory should contain a solr.xml file, unless solr.xml exists in ZooKeeper. The default value is server/solr. " https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference -- Jack Krupansky On Mon, Jan 4, 2016 at 10:28 AM, Mugeesh Husain wrote: > you could start solr

Re: Many patterns against many sentences, storing all results

2016-01-05 Thread Jack Krupansky
://www.elastic.co/guide/en/elasticsearch/reference/current/search-percolate.html -- Jack Krupansky On Tue, Jan 5, 2016 at 11:05 AM, Allison, Timothy B. wrote: > Might want to look into: > > https://github.com/flaxsearch/luwak > > or > https://github.com/OpenSextant/Solr

Re: Count multivalued field issue

2016-01-06 Thread Jack Krupansky
ork, be sure to provide detail of what the symptom is rather than simply saying that it doesn't work. -- Jack Krupansky On Wed, Jan 6, 2016 at 8:43 AM, marotosg wrote: > Hi, > > I am trying to add a new field to my schema to add the number of items of a > multivalued field.

Re: Query behavior difference.

2016-01-06 Thread Jack Krupansky
ption.*" So that's a second reason - to avoid the max clause count limitation of Boolean Query. See: https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/MultiTermQuery.html#CONSTANT_SCORE_REWRITE https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/WildcardQuery

Re: Dynamically Adding query parameters in my custom Request Handler class

2016-01-09 Thread Jack Krupansky
and invest significant effort in a custom request handler when simpler techniques may suffice. -- Jack Krupansky On Sat, Jan 9, 2016 at 12:08 PM, Ahmet Arslan wrote: > Hi Mark, > > Yes this is possible. Better, you can use a custom SearchComponent for > this task too. > You retri

Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica

2016-01-13 Thread Jack Krupansky
e considered a fresh new distributed Solr deployment with anything other than SolrCloud. (Hmmm... have any of the committers considered deprecating the old non-SolrCloud distributed mode features?) -- Jack Krupansky On Wed, Jan 13, 2016 at 9:02 AM, Shivaji Dutta wrote: > - SolrCloud uses

Re: &fq degrades qtime in a 20million doc collection

2016-01-13 Thread Jack Krupansky
I recall a couple of previous discussions regarding some sort of filter/field cache change in Lucene where they removed what had been an optimization for Solr. -- Jack Krupansky On Wed, Jan 13, 2016 at 8:10 PM, Erick Erickson wrote: > It's quite surprising that you're getting

Re: Monitor backup progress when location parameter is used.

2016-01-14 Thread Jack Krupansky
t" indicates success or "Exception while creating snapshot" indicates failure. If only that first message appears, it means the backup is still in progress. -- Jack Krupansky On Thu, Jan 14, 2016 at 9:23 AM, Gian Maria Ricci - aka Alkampfer < alkamp...@nablasoft.com> wro

Re: &fq degrades qtime in a 20million doc collection

2016-01-14 Thread Jack Krupansky
That sounds like it. Sorry my memory is so hazy. Maybe Yonik can either confirm that that Jira is still outstanding or close it, and confirm if these symptoms are related. -- Jack Krupansky On Thu, Jan 14, 2016 at 10:54 AM, Erick Erickson wrote: > Jack: > > I think that was for facet

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Jack Krupansky
Which release of Solr are you using? Last year (or so) there was a Lucene change that had the effect of keeping all terms for WDF at the same position. There was also some discussion about whether this was either a bug or a bug fix, but I don't recall any resolution. -- Jack Krupansky O

Re: Solr Query Tuning

2016-01-14 Thread Jack Krupansky
although even that should not be a big problem. And make sure the ID field is string or numeric, not tokenized text. -- Jack Krupansky On Thu, Jan 14, 2016 at 7:53 PM, Shawn Heisey wrote: > On 1/14/2016 5:20 PM, Shivaji Dutta wrote: > > I am working with a customer that has abou

Re: Solr Query Tuning

2016-01-14 Thread Jack Krupansky
ng parsing of the query) to send the request to exactly the node (or replica) that owns that token/ID. But if you are really just trying to "query by ID", that should really have a nice clean API so you don't have to build query syntax. -- Jack Krupansky On Thu, Jan 14, 2016 at 8:41 P

Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica

2016-01-15 Thread Jack Krupansky
shard, let alone all shards. Should backups be collection-based as well? -- Jack Krupansky On Fri, Jan 15, 2016 at 3:26 AM, Gian Maria Ricci - aka Alkampfer < alkamp...@nablasoft.com> wrote: > Yes, I've checked that jira some weeks ago and it is the reason why I was > telling

Re: Speculation on Memory needed to efficently run a Solr Instance.

2016-01-15 Thread Jack Krupansky
the entire index. If you actually don't need minimal latency, then of course you can feel free to trade off RAM for lower latency. -- Jack Krupansky On Fri, Jan 15, 2016 at 4:43 AM, Gian Maria Ricci - aka Alkampfer < alkamp...@nablasoft.com> wrote: > Hi, > > > > When it

Re: Issue with stemming and lemmatizing

2016-01-15 Thread Jack Krupansky
. Plenty of doc for you to start reading. Once you get the basics, then you can move on to more specific and advanced details: https://cwiki.apache.org/confluence/display/solr/Understanding+Analyzers%2C+Tokenizers%2C+and+Filters -- Jack Krupansky On Fri, Jan 15, 2016 at 2:58 PM, sara hajili

Re: Solr Block join not working after parent update

2016-01-15 Thread Jack Krupansky
ogether.*" They must also be updated together. -- Jack Krupansky On Fri, Jan 15, 2016 at 3:31 AM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > On Thu, Jan 14, 2016 at 10:01 PM, sairamkumar < > sairam.subraman...@gmail.com> > wrote: > > > This is a

Re: Returning all documents in a collection

2016-01-20 Thread Jack Krupansky
ients that automatically send requests to all the shards in a collection (or multiple collections) and then merge the sorted sets any way they wish." -- Jack Krupansky On Wed, Jan 20, 2016 at 8:41 AM, Susheel Kumar wrote: > Hello Salman, > > Please checkout the export fu

Re: Returning all documents in a collection

2016-01-20 Thread Jack Krupansky
te the doc for this stored field restriction, right?!) -- Jack Krupansky On Wed, Jan 20, 2016 at 9:38 AM, Joel Bernstein wrote: > CloudSolrStream is available in Solr 5. The "search" streaming expression > can used or CloudSolrStream can be used in directly. > > https://cwi

Re: Couple of question about Virtualization and Load Balancer

2016-01-21 Thread Jack Krupansky
issue for Solr. The only issue there is assuring that you have enough Solr shards and replicas to handle the aggregate request load. -- Jack Krupansky On Thu, Jan 21, 2016 at 6:37 AM, Gian Maria Ricci - aka Alkampfer < alkamp...@nablasoft.com> wrote: > Hi, > > > > I’ve

Re: One complex wildcard query lead solr OOM

2016-01-21 Thread Jack Krupansky
complex wildcard is used - should an exception be thrown, or... what? I suppose it might be simplest to have a Solr option to limit the number of wildcard characters used in a term, like to 4 or 8 or something like that. IOW, have Solr check the term before the WildcardQuery is generated. -- Jack

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
Just to be clear, are you talking about a single app that does SolrJ calls to both your CMS and your free text search index? So, one Java app that is simultaneously sending requests to two Solr instances (one 4, one 5)? -- Jack Krupansky On Fri, Jan 22, 2016 at 1:57 AM, wrote: > Hi, >

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
), the app should work fine. So... if you stick with SolrJ 4 and use the Solr 4 doc as your guide, you should be okay. That's the theory. Worst case, you would have to deploy a Solr 4 server. That's not the preferred choice, but is a decent backup plan. -- Jack Krupansky On Fri, Jan 22, 201

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
nts aren't using any new features there would be a reasonable expectation that they should continue to work. -- Jack Krupansky On Fri, Jan 22, 2016 at 10:40 AM, wrote: > Yeah, sort of. Solr isn't bundled in the CMS, it is in a separate Tomcat > instance. But our code is running

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
To be clear, having separate Solr servers on different versions should definitely not be a problem. The only potential difficulty here is the SolrJ vs. server back-compat issue. -- Jack Krupansky On Fri, Jan 22, 2016 at 10:57 AM, wrote: > Shawn wrote: > > > > If you are NOT ru

Re: Taking Solr to production

2016-01-22 Thread Jack Krupansky
ll as HA availability requirements. -- Jack Krupansky On Fri, Jan 22, 2016 at 5:45 PM, Toke Eskildsen wrote: > Aswath Srinivasan (TMS) wrote: > > * Totally about 2.5 million documents to be indexed > > * Documents average size is 512 KB - pdfs and htmls > >

Re: One complex wildcard query lead solr OOM

2016-01-24 Thread Jack Krupansky
Just escape them with a backslash. Or put each term in quotes. -- Jack Krupansky On Sun, Jan 24, 2016 at 5:21 AM, Jian Mou wrote: > Hi Jack, > > Thanks! Do you know how to disable wildcards, What I want is if input is > wildcards, just treat it as a normal char. I other words, >
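The two options suggested in the reply above (backslash-escaping, or quoting the term) can be sketched client-side as follows. This is a minimal sketch only: the helper names `escape_wildcards` and `quote_term` are hypothetical, and it assumes only `*` and `?` need neutralizing, whereas a real client would likely escape the other query-parser special characters too.

```python
def escape_wildcards(term: str) -> str:
    """Backslash-escape * and ? so the query parser sees literal characters."""
    out = []
    for ch in term:
        if ch in "*?":
            out.append("\\")  # a single backslash before the wildcard character
        out.append(ch)
    return "".join(out)


def quote_term(term: str) -> str:
    """Alternative: wrap the term in double quotes, escaping embedded \\ and "."""
    inner = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + inner + '"'


escaped = escape_wildcards("a*b?c")
quoted = quote_term("dell*")
```

Either string would then be placed in the `q` parameter in place of the raw user input.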

Re: unmerged index segments

2016-01-25 Thread Jack Krupansky
What exactly are your merge policy settings in solrconfig? They control when the background merges will be performed. Sometimes they do need to be tweaked. -- Jack Krupansky On Mon, Jan 25, 2016 at 1:50 PM, James Mason wrote: > Hi, > > I have a large index that has been adde

Re: unmerged index segments

2016-01-26 Thread Jack Krupansky
doc, which for Tiered is here: http://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/TieredMergePolicy.html I did doc all of these options (as of Solr 4.4) in my Solr 4.x Deep Dive e-book and I don't think much of that has changed since then: http://www.lulu.com/us/en/shop/jack-krupans

Re: Solr cannot return result when query with # * like title:#7654321*

2016-01-27 Thread Jack Krupansky
Just to be sure, please post the lines of code or command line that you are using to issue the query. -- Jack Krupansky On Wed, Jan 27, 2016 at 10:50 PM, Yonik Seeley wrote: > On Wed, Jan 27, 2016 at 10:47 PM, diyun2008 wrote: > > Hi Yonik > > > >I do actually en

Re: Adding new documents to the search results and rescoring. Is it possible?

2016-01-28 Thread Jack Krupansky
would never be a need to "re" score them. Are you simply looking for a way to shift/boost the scores somehow? Again, tell us more about what you are actually trying to achieve. -- Jack Krupansky On Thu, Jan 28, 2016 at 9:52 AM, vitaly bulgakov wrote: > I have Solr 4.2. Is it p

Re: Solr cannot return result when query with # * like title:#7654321*

2016-01-28 Thread Jack Krupansky
sing curl, please post the full curl command. -- Jack Krupansky On Thu, Jan 28, 2016 at 1:03 AM, diyun2008 wrote: > The query is rather simple: > http://127.0.0.1:8080/solr/collection1/select?q=title:#7654321* > > > > > -- > View this message in context: > http://l

Re: implement exact match for one of the search fields only?

2016-01-28 Thread Jack Krupansky
A simple boost query (bq) might do the trick, using edismax: q=dvd bracket bq=spp_keyword_exact:"dvd bracket"^100 qf=P_VeryShortDescription P_ShortDescription P_CatConcatKeyword -- Jack Krupansky On Thu, Jan 28, 2016 at 12:49 PM, Erick Erickson wrote: > bq: if you are interested
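As a rough illustration of the boost-query suggestion above, the request parameters could be assembled like this. It is a sketch under assumptions: the field names and the boost of 100 come from the message, while the function name `exact_match_boost_params` is hypothetical and the host, core, and handler path are deliberately left out.

```python
from urllib.parse import urlencode, parse_qs


def exact_match_boost_params(user_query, exact_field, query_fields, boost=100):
    """Build edismax parameters that boost documents whose exact field matches the whole phrase."""
    phrase = user_query.replace('"', '\\"')  # keep the phrase a single quoted unit
    return {
        "defType": "edismax",
        "q": user_query,
        "bq": f'{exact_field}:"{phrase}"^{boost}',
        "qf": " ".join(query_fields),
    }


params = exact_match_boost_params(
    "dvd bracket",
    "spp_keyword_exact",
    ["P_VeryShortDescription", "P_ShortDescription", "P_CatConcatKeyword"],
)
query_string = urlencode(params)  # URL-encoded, ready to append to a select URL
```

Because `bq` only adds to the score of matching documents, the main query still decides which documents are returned; the exact-match field affects ordering only.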

Re: Nested documents and many-many relation

2016-01-29 Thread Jack Krupansky
block must be written to a new segment. -- Jack Krupansky On Fri, Jan 29, 2016 at 5:13 AM, Sathyakumar Seshachalam < sathyakumar_seshacha...@trimble.com> wrote: > Hi, > > Am trying to investigate the possibility of using Block Join query parser > in a many-to-many

Re: How much JVM should we allocate

2016-01-29 Thread Jack Krupansky
have room to expand and handle spikes. 8. Run that final config for an extended period (days) with as realistic a load as possible 9. If it too hits OOM or frequent GC, you may have to bump up the heap some more, like another 10%. -- Jack Krupansky On Fri, Jan 29, 2016 at 11:51 AM, Erick Eri

Re: Increasing maxMergedSegmentMB value

2016-01-30 Thread Jack Krupansky
d not be possible with a limit of only 15GB. Maybe you could clue us in as to what effect you are trying to achieve. I mean, why should any app care whether segments are 10GB or 15GB? -- Jack Krupansky On Sat, Jan 30, 2016 at 6:28 PM, Shawn Heisey wrote: > On 1/30/2016 7:31 AM, Zheng Lin Edwin

Re: Increasing maxMergedSegmentMB value

2016-01-31 Thread Jack Krupansky
of 5GB. If you want to get a lot above that, you're in uncharted territory. Besides, if you start pushing your index well above the amount of available system memory your query performance will suffer. I'd watch for the latter before pushing on the former. -- Jack Krupansky On Sun, Jan

Re: URI is too long

2016-01-31 Thread Jack Krupansky
Or try the terms query parser that lets you eliminate all the OR operators: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser -- Jack Krupansky On Sun, Jan 31, 2016 at 9:23 AM, Paul Libbrecht wrote: > How about using POST? > > paul >
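A sketch of how a long OR list might be collapsed into a single terms-parser query, as suggested above. The `{!terms f=...}` local-params form is the one described on the linked reference page; the helper name `terms_query` and the field name `id` are assumptions for illustration, and values containing commas would need a different separator.

```python
def terms_query(field, values):
    """Build a terms query parser expression: {!terms f=<field>}v1,v2,..."""
    return "{!terms f=" + field + "}" + ",".join(str(v) for v in values)


# Replaces something like: id:1 OR id:2 OR id:3
q = terms_query("id", [1, 2, 3])
```

For very long lists the resulting string can still be large, so sending it in the body of a POST request, as the quoted reply proposes, remains a sensible complement.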

Re: Determine if Merge is triggered in SOLR

2016-01-31 Thread Jack Krupansky
://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig -- Jack Krupansky On Sun, Jan 31, 2016 at 1:59 PM, abhi Abhishek wrote: > Hi All, > any suggestions/ ideas? > > Thanks, > Abhishek > > On Tue, Jan 26, 2016 at 9:16 PM, abhi Abhishek > wrote: > > >

Re: Error in UIMA, probably opencalais,

2016-02-01 Thread Jack Krupansky
At the bottom (the fine print!) it says: lineNumber: 15; columnNumber: 7; The element type "meta" must be terminated by the matching end-tag "</meta>". -- Jack Krupansky On Mon, Feb 1, 2016 at 10:45 AM, Gian Maria Ricci - aka Alkampfer < alkamp...@nablasoft.com> wrote: >

Re: alternative forum for SOLR user

2016-02-01 Thread Jack Krupansky
Some people prefer to use Stack Overflow, but this mailing list is still the definitive "forum" for Solr users. See: http://stackoverflow.com/questions/tagged/solr -- Jack Krupansky On Mon, Feb 1, 2016 at 10:58 AM, Shawn Heisey wrote: > On 2/1/2016 1:13 AM, Jean-Jacques MONOT wr

Re: Error configuring UIMA

2016-02-01 Thread Jack Krupansky
does not exist. -- Jack Krupansky On Mon, Feb 1, 2016 at 10:18 AM, alkampfer wrote: > > > From: outlook_288fbf38c031d...@outlook.com > To: solr-user@lucene.apache.org > Cc: > Date: Mon, 1 Feb 2016 15:59:02 +0100 > Subject: Error configuring UIMA > > I've solv

Re: Error configuring UIMA

2016-02-01 Thread Jack Krupansky
Yeah, that's exactly the kind of innocent user error that UIMA simply has no code to detect and reasonably report. -- Jack Krupansky On Mon, Feb 1, 2016 at 12:13 PM, Gian Maria Ricci - aka Alkampfer < alkamp...@nablasoft.com> wrote: > It was a stupid error, I've mi

Re: implement exact match for one of the search fields only?

2016-02-04 Thread Jack Krupansky
. Besides, the general goal is to avoid app clients talking directly to Solr anyway. -- Jack Krupansky On Thu, Feb 4, 2016 at 2:57 AM, Derek Poh wrote: > Hi Erick > > << > The manual way of doing this would be to construct an elaborate query, > like q=spp_keyword_e

Re: large number of fields

2016-02-05 Thread Jack Krupansky
", definitely not "quite long." That said, the starting point for any data modeling effort is to look at the full range of desired queries and that should drive the data model. So, give us more info on queries, in terms of plain English descriptions of what the user is trying to achieve.

Re: indexing pdf binary stored in mongodb?

2016-02-05 Thread Jack Krupansky
://docs.mongodb.org/manual/reference/program/mongofiles/ -- Jack Krupansky On Fri, Feb 5, 2016 at 3:13 PM, Arnett, Gabriel wrote: > Anyone have any experience indexing pdfs stored in binary form in mongodb? > > . > Gabe Arnett > Senior Dir

Re: URI is too long

2016-02-06 Thread Jack Krupansky
And you're sure that you can't use the terms query parser, which was explicitly designed for handling a very long list of terms to be implicitly ORed? -- Jack Krupansky On Sat, Feb 6, 2016 at 2:26 PM, Salman Ansari wrote: > It looked like there was another issue with my query. I

Re: Solr architecture

2016-02-08 Thread Jack Krupansky
can execute them or if they require fanout to other shards and then aggregation of results from those other shards. -- Jack Krupansky On Mon, Feb 8, 2016 at 11:24 AM, Erick Erickson wrote: > Short form: You really have to prototype. Here's the long form: > > > https://lucidwo

Re: Solr architecture

2016-02-08 Thread Jack Krupansky
't require fanout to other shards, which effectively turns each incoming query into n queries where n is the number of shards. -- Jack Krupansky On Mon, Feb 8, 2016 at 12:07 PM, Jack Krupansky wrote: > So is there any aging or TTL (in database terminology) of older docs? > > And do all o

Re: Need to move on SOlr cloud (help required)

2016-02-10 Thread Jack Krupansky
, then that one node should be replaced. There are indeed plenty of good reasons to prefer SolrCloud over traditional master-slave replication, but so far you haven't touched on any of them. How much data (number of documents) do you have? What is your typical query latency? -- Jack Krup

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Jack Krupansky
Is this a scenario that was working fine and suddenly deteriorated, or has it always been slow? -- Jack Krupansky On Thu, Feb 11, 2016 at 4:33 AM, Matteo Grolla wrote: > Hi, > I'm trying to optimize a solr application. > The bottleneck are queries that request 100

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Jack Krupansky
wildcards or function queries, or is it very simple keywords? How many operators? Have you used the debugQuery=true parameter to see which search components are taking the time? -- Jack Krupansky On Thu, Feb 11, 2016 at 9:42 AM, Matteo Grolla wrote: > Hi Yonic, > after the first q

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Jack Krupansky
Good to know. Hmmm... 200ms for 10 rows is not outrageously bad, but still relatively bad. Even 50ms for 10 rows would be considered barely okay. But... again it depends on query complexity - simple queries should be well under 50 ms for decent modern hardware. -- Jack Krupansky On Thu, Feb 11

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Jack Krupansky
Again, first things first... debugQuery=true and see which Solr search components are consuming the bulk of qtime. -- Jack Krupansky On Thu, Feb 11, 2016 at 11:33 AM, Matteo Grolla wrote: > virtual hardware, 200ms is taken on the client until response is written to > disk > qtime o

Re: optimize requests that fetch 1000 rows

2016-02-12 Thread Jack Krupansky
? How many fields are in fl? Any function queries in fl? -- Jack Krupansky On Fri, Feb 12, 2016 at 4:57 AM, Matteo Grolla wrote: > Hi Jack, > tell me if I'm wrong but qtime accounts for search time excluding the > fetch of stored fields (I have a 90ms qtime and a ~30s time

Re: query knowledge graph

2016-02-12 Thread Jack Krupansky
"knowledge graph" is kind of vague - what did you have in mind? An example would help. -- Jack Krupansky On Fri, Feb 12, 2016 at 7:27 AM, Midas A wrote: > Please suggest how to create query knowledge graph for e-commerce > application . > > > please describe in detail

Re: Negating multiple array fileds

2016-02-14 Thread Jack Krupansky
Due to a bug (or poorly designed feature), you need to explicitly include a non-negative query term in a purely negative sub-query. Usually this means using *:* to select all documents. Note that the use of parentheses introduces a sub-query. So, (-persons:*) should be (*:* -persons:*). -- Jack

Re: "pf" not supported by edismax?

2016-02-14 Thread Jack Krupansky
pf stands for phrase boosting, which implies tokenized text... spp_keyword_exact sounds like it is not tokenized. -- Jack Krupansky On Sun, Feb 14, 2016 at 10:08 PM, Derek Poh wrote: > Hi > > Correct me If I am wrong, edismax is an extension of dismax, so it will > support "p

Re: "pf" not supported by edismax?

2016-02-14 Thread Jack Krupansky
term. -- Jack Krupansky On Mon, Feb 15, 2016 at 12:11 AM, Derek Poh wrote: > It is using KeywordTokenizerFactory. Is it still considered tokenized? > > Here's the field definition: > type="gs_keyword_exact" multiValued="true

Re: Negating multiple array fileds

2016-02-15 Thread Jack Krupansky
I should also have noted that your full query: (-persons:*)AND(-places:*)AND(-orgs:*) can be written as: -persons:* -places:* -orgs:* Which may work as is, or can also be written as: *:* -persons:* -places:* -orgs:* -- Jack Krupansky On Mon, Feb 15, 2016 at 1:57 AM, Salman Ansari wrote
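The rewrite described above, anchoring a purely negative query with `*:*`, can be generated mechanically. A minimal sketch; the function name `none_present` is hypothetical and the field names are the ones from the thread.

```python
def none_present(fields):
    """Match documents that have no value in any of the given fields.

    The leading *:* selects all documents so the negative clauses have
    something to subtract from.
    """
    return "*:* " + " ".join(f"-{name}:*" for name in fields)


q = none_present(["persons", "places", "orgs"])
```

The generated string is the last form given in the message, which avoids relying on how a particular parser treats a top-level query made only of negative clauses.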

Re: Near Duplicate Documents, "authorization"? tf/idf implications, spamming the index?

2016-02-15 Thread Jack Krupansky
pothetical similarity schema could both specify any number of document categories. But that's speculation on my part. -- Jack Krupansky On Mon, Feb 15, 2016 at 6:42 PM, Chris Morley wrote: > Hey Solr people: > > Suppose that we did not want to break up our document set into separate

Re: Negating multiple array fileds

2016-02-17 Thread Jack Krupansky
won't give anybody grief for using it as a way of compensating for the brain-damaged way that Lucene and Solr handle single-asterisk and negated single-asterisk queries. -- Jack Krupansky On Tue, Feb 16, 2016 at 8:17 PM, Shawn Heisey wrote: > On 2/15/2016 9:22 AM, Jack Krupansky wrote:

Re: Reverse Eningeer Query For a Given Result Set?

2016-02-18 Thread Jack Krupansky
positive or false negative as new documents are added to the index that are no longer in the same pattern as the old results but still within the pattern of the original Oracle query. The trick may be whether the delta is meaningful for the actual application use case. -- Jack Krupansky On Thu, Feb 18

Re: WhitespaceTokenizerFactory and PathHierarchyTokenizerFactory

2016-02-24 Thread Jack Krupansky
Your statement makes no sense. Please clarify. Express your requirement(s) in plain English first before dragging in possible solutions. Technically, path elements can have embedded spaces. -- Jack Krupansky On Wed, Feb 24, 2016 at 6:53 AM, Anil wrote: > HI, > > i need to

Re: WhitespaceTokenizerFactory and PathHierarchyTokenizerFactory

2016-02-25 Thread Jack Krupansky
enize all of the elements of a path or IP as separate terms. Ditto for a query, so you can effectively do both keyword and phrase queries to match individual terms (e.g., path elements) or phrases or sequences of path elements or IP address components. -- Jack Krupansky On Thu, Feb 25, 2016 at 12:

Re: Query time de-boost

2016-02-25 Thread Jack Krupansky
0.1 is a fractional boost - all intra-query boosts are multiplicative, not additive, so term^0.1 reduces the term by 90%. -- Jack Krupansky On Wed, Feb 24, 2016 at 11:29 AM, shamik wrote: > Binoy, 0.1 is still a positive boost. With title getting the highest > weight, > this won
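The arithmetic behind the "reduces the term by 90%" remark can be written out as a toy calculation. This is only an illustration of multiplicative scaling under the assumption that the boost multiplies a term's score contribution directly; it is not Lucene's actual scoring code, and the function names are hypothetical.

```python
def boosted_contribution(raw_contribution, boost):
    """A multiplicative boost scales a term's contribution to the score."""
    return raw_contribution * boost


def percent_reduction(boost):
    """How much a fractional boost (0 < boost < 1) shrinks the contribution, in percent."""
    return (1.0 - boost) * 100.0


contribution = boosted_contribution(2.0, 0.1)  # a raw contribution of 2.0 shrinks to about 0.2
reduction = percent_reduction(0.1)             # about 90 percent
```

The boosted value stays positive, which is the point made in the quoted reply: a fractional boost lowers a term's weight relative to others without ever subtracting from the score.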

Re: Query time de-boost

2016-02-26 Thread Jack Krupansky
. IOW, it de-boosts occurrences of the term. The point remains that you do not need a "negative boost" to de-boost a term. -- Jack Krupansky On Fri, Feb 26, 2016 at 4:01 AM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > Hi Jack, > I just checked on 5.5 and

Re: Solr regex documenation

2016-02-27 Thread Jack Krupansky
See: https://lucene.apache.org/core/5_5_0/core/org/apache/lucene/search/RegexpQuery.html https://lucene.apache.org/core/5_5_0/core/org/apache/lucene/util/automaton/RegExp.html I vaguely recall a Jira about regex not working at all in Solr. I don't recall reading about a resolution. --

Re: Query time de-boost

2016-02-28 Thread Jack Krupansky
in the bq parameter do you need to use negative boost values - in all the other contexts a fractional boost is sufficient. It's unfortunate that the ref guide isn't more clear about this key distinction. Now hopefully we (and others!) are on the same page. -- Jack Krupansky On Sun, Feb

Re: ExtendedDisMax configuration nowhere to be found

2016-02-28 Thread Jack Krupansky
handler in solrconfig.xml. -- Jack Krupansky On Sun, Feb 28, 2016 at 2:42 PM, wrote: > Hi, > > I want to setup ExtendedDisMax in our solr 4.6 server, but I can't seem to > find any example configuration for this. Ie the configuration needed in > solrconfig.xml. In the wiki page

Re: ExtendedDisMax configuration nowhere to be found

2016-02-28 Thread Jack Krupansky
list of fields to query (qf) and your actual query text (q). I know, I know... some people just can't handle automatic. (Some people hate DisneyLand/World!) -- Jack Krupansky On Sun, Feb 28, 2016 at 5:16 PM, wrote: > I'm sorry, but I am still confused. I'm expecting

Re: ExtendedDisMax configuration nowhere to be found

2016-02-28 Thread Jack Krupansky
So, all this hard work that people have put into Solr to make it more like a Disney theme park is just... wasted... on you? Sigh. Okay, I guess we can't please everyone. -- Jack Krupansky On Sun, Feb 28, 2016 at 5:40 PM, wrote: > I have no problem with automatic. It is "automa

Re: ExtendedDisMax configuration nowhere to be found

2016-02-29 Thread Jack Krupansky
arted with the current Solr Reference Guide. The old edismax wiki should in fact have a tombstone warning that indicates that it is obsolete and redirect people to the new doc. Out of curiosity, how did you get to that old wiki page in the first place? -- Jack Krupansky On Mon, Feb 29, 2016 at 3:20

Re: ExtendedDisMax configuration nowhere to be found

2016-02-29 Thread Jack Krupansky
It is indeed a problem that the old edismax wiki is result #1 from Google. I find that annoying as well since I also use Google search as my first step in accessing doc on everything. -- Jack Krupansky On Mon, Feb 29, 2016 at 10:03 AM, wrote: > Thanks Shawn, > > I had more or less ass

Re: ExtendedDisMax configuration nowhere to be found

2016-02-29 Thread Jack Krupansky
es in old emails. In any case, a proper tombstone is probably the best step at this point. -- Jack Krupansky On Mon, Feb 29, 2016 at 10:39 AM, Jack Krupansky wrote: > It is indeed a problem that the old edismax wiki is result #1 from Google. > I find that annoying as well since I also use
