Re: Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
I don't think you can synonym-ize both the multi-token phrase and each individual token in the multi-token phrase at the same time. But anyone else feel free to chime in! Best, Audrey Lorberfeld On 3/16/20, 12:40 PM, "atin janki" wrote: I aim to achieve an expansion like - Syno

Re: Re: Re: Re: Re: Query Autocomplete Evaluation

2020-02-28 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
are given 0s) / total suggestions displayed ? > > If the above is true, wouldn't Selection to Display be binary? I.e. it's > either 1/# of suggestions displayed (assuming this is a constant) or 0? > > Best, > Audrey > > > __

Re: Re: Re: Re: Query Autocomplete Evaluation

2020-02-28 Thread Paras Lehana
suggestions displayed (assuming this is a constant) or 0? > > Best, > Audrey > > > > From: Paras Lehana > Sent: Thursday, February 27, 2020 2:58:25 AM > To: solr-user@lucene.apache.org > Subject: [EXTERNAL] Re: Re: Re: Query Autocomplete Eval

Re: Re: Re: Re: Query Autocomplete Evaluation

2020-02-27 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
, wouldn't Selection to Display be binary? I.e. it's either 1/# of suggestions displayed (assuming this is a constant) or 0? Best, Audrey From: Paras Lehana Sent: Thursday, February 27, 2020 2:58:25 AM To: solr-user@lucene.apache.org Subject: [EXTERNA

Re: Re: Re: Query Autocomplete Evaluation

2020-02-26 Thread Paras Lehana
Hi Audrey, For MRR, we assume that if a suggestion is selected, it's relevant. It's also assumed that the user will always click the highest relevant suggestion. Thus, we calculate position selection for each selection. If still, I'm not understanding your question correctly, feel free to contact
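As a rough illustration of the MRR calculation described in this message, here is a minimal Python sketch. It assumes each logged autocomplete session records the 1-based position of the suggestion the user selected, or None when nothing was selected. The function name and the log format are hypothetical and are not taken from the thread.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(selected_positions: Sequence[Optional[int]]) -> float:
    """Average of 1/position over all sessions.

    A selected suggestion is treated as the relevant one (the assumption
    stated in the message); sessions without a selection contribute 0.
    """
    if not selected_positions:
        return 0.0
    total = 0.0
    for position in selected_positions:
        if position is not None:
            total += 1.0 / position
    return total / len(selected_positions)


# Hypothetical log: selections at positions 1, 2 and 4, plus one abandoned session.
sessions = [1, 2, None, 4]
print(mean_reciprocal_rank(sessions))  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```

Under this reading, each session's score is either 0 or 1/position, which is the "binary relevance" point raised elsewhere in the thread.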

Re: Re: Re: Query Autocomplete Evaluation

2020-02-25 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
This article http://wwwconference.org/proceedings/www2011/proceedings/p107.pdf also indicates that MRR needs binary relevance labels, p. 114: "To this end, we selected a random sample of 198 (query, context) pairs from the set of 7,311 pairs, and manually tagged each of them as related (i.e., th

Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-31 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hi all, reviving this thread. For those of you who use an external file for your suggestions, how do you decide from your query logs what suggestions to include? Just starting out with some exploratory analysis of clicks, dwell times, etc., and would love to hear any advice from the community.

Re: Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-24 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
David, True! But we are hoping that these are purely seen as suggestions and that people, if they know exactly what they are wanting to type/looking for, will simply ignore the dropdown options. On 1/24/20, 10:03 AM, "David Hastings" wrote: This is a really cool idea! My only concern is

Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-24 Thread Lucky Sharma
Hi Audrey, As suggested by Erik, you can index the data into a separate collection, and instead of adding weights in the document you can also use LTR (Learning to Rank) within Solr to rerank the documents. And also to increase relevance within the Autosuggestion and making positiona

Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-24 Thread David Hastings
This is a really cool idea! My only concern is that the edge case searches, where a user knows exactly what they want to find, would be autocompleted into something that happens to be more "successful" rather than what they were looking for. for example, i want to know the legal implications of ja

Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-24 Thread Lucky Sharma
Hi Audrey, As suggested by Erik, you can index the data into a separate collection, and instead of adding weights in the document you can also use LTR within Solr to rerank on the features. Regards, Lucky Sharma On Fri, 24 Jan, 2020, 8:01 pm Audrey Lorberfeld - audrey.lorberf...@ibm.com,

Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-24 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hi Alessandro, I'm so happy there is someone who's done extensive work with QAC here! Right now, we measure nDCG via a Dynamic Bayesian Network. To break it down, we: - use a DBN model to generate a "score" for each query_url pair. - We then plug that score into a mathematical formula we foun

Re: Re: Re: Anyone have experience with Query Auto-Suggestor?

2020-01-24 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Erik, Thank you! Yes, that's exactly how we were thinking of architecting it. And our ML engineer suggested something else for the suggestion weights, actually -- to build a model that would programmatically update the weights based on those suggestions' live clicks @ position k, etc. Pretty co

Re: Re: Re: Re: Handling overlapping synonyms

2020-01-20 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hm, I'm not sure what you mean, but I am pretty new to Solr. Apologies! On 1/20/20, 12:01 PM, "fiedzia" wrote: >From my understanding, if you want regional sales manager to be indexed as both director of sales and area manager, you >would have to type: > >Regional sales ma

Re: Re: Re: Handling overlapping synonyms

2020-01-20 Thread fiedzia
>From my understanding, if you want regional sales manager to be indexed as both director of sales and area manager, you >would have to type: > >Regional sales manager -> director of sales, area manager that works for searching, but because everything is in the same position, searching for "dir

Re: Re: Re: Handling overlapping synonyms

2020-01-20 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
From my understanding, if you want regional sales manager to be indexed as both director of sales and area manager, you would have to type: Regional sales manager -> director of sales, area manager I do not believe you can chain synonyms. Re: bigrams/trigrams, I was more interested in you want
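To make the mapping rule in this message concrete, here is a small Python sketch that parses one explicit-mapping line and looks up a phrase. Note that Solr's synonyms file writes explicit mappings with `=>` rather than the `->` typed in the message. The parser below is a hypothetical illustration only: it mimics the effect of the rule on a whole phrase and does not reproduce how Solr's synonym filters build token positions.

```python
from typing import Dict, List


def parse_explicit_rule(rule: str) -> Dict[str, List[str]]:
    """Parse 'a, b => x, y' into {'a': ['x', 'y'], 'b': ['x', 'y']}."""
    left, right = rule.split("=>", 1)
    sources = [s.strip().lower() for s in left.split(",") if s.strip()]
    targets = [t.strip().lower() for t in right.split(",") if t.strip()]
    return {source: targets for source in sources}


rule = "regional sales manager => director of sales, area manager"
mapping = parse_explicit_rule(rule)

# The left-hand phrase is replaced by every phrase on the right-hand side.
print(mapping["regional sales manager"])
```

With an explicit mapping the left-hand phrase itself is dropped unless it is repeated on the right-hand side, which is worth keeping in mind when deciding what should remain searchable.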

Re: Re: Re: POS Tagger

2019-10-25 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Oh I see I see -- Audrey Lorberfeld Data Scientist, w3 Search IBM audrey.lorberf...@ibm.com On 10/25/19, 12:21 PM, "David Hastings" wrote: oh i see what you mean, sorry, i explained it incorrectly. those sentences are what would be in the index, and a general search for 'rush

Re: Re: Re: POS Tagger

2019-10-25 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
How can a field itself be tagged with a part of speech? -- Audrey Lorberfeld Data Scientist, w3 Search IBM audrey.lorberf...@ibm.com On 10/25/19, 12:12 PM, "David Hastings" wrote: nope, i boost the fields already tagged at query time against the query On Fri, Oct 25, 2019 at 12

Re: Re: Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread David Hastings
yup. you're going to find solr is WAY more efficient than you think when it comes to complex queries. On Wed, Oct 9, 2019 at 3:17 PM Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote: > True...I guess another rub here is that we're using the edismax parser, so > all of our queries are inherent

Re: Re: Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
True...I guess another rub here is that we're using the edismax parser, so all of our queries are inherently OR queries. So for a query like 'the ibm way', the search engine would have to: 1) retrieve a document list for: --> "ibm" (this list is probably 80% of the documents) --> "the" (th

Re: Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread David Hastings
if you have anything close to a decent server you won't notice it at all. i'm at about 21 million documents, index varies between 450gb to 800gb depending on merges, and about 60k searches a day and stay sub second non stop, and this is on a single core/non cloud environment On Wed, Oct 9, 2019 at 2:5

Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread David Hastings
only in my more like this tools, but they have a very specific purpose, otherwise no On Wed, Oct 9, 2019 at 2:31 PM Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote: > Wow, thank you so much, everyone. This is all incredibly helpful insight. > > So, would it be fair to say that the majority o

Re: Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread David Hastings
oh and by 'non stop' i mean close enough for me :) On Wed, Oct 9, 2019 at 2:59 PM David Hastings wrote: > if you have anything close to a decent server you won't notice it at all. i'm > at about 21 million documents, index varies between 450gb to 800gb > depending on merges, and about 60k searches a

Re: Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Also, in terms of computational cost, it would seem that including most terms/not having a stop list would take a toll on the system. For instance, right now we have "ibm" as a stop word because it appears everywhere in our corpus. If we did not include it in the stop words file, we would have t

Re: Re: Re: Protecting Tokens from Any Analysis

2019-10-09 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Wow, thank you so much, everyone. This is all incredibly helpful insight. So, would it be fair to say that the majority of you all do NOT use stop words? -- Audrey Lorberfeld Data Scientist, w3 Search IBM audrey.lorberf...@ibm.com On 10/9/19, 11:14 AM, "David Hastings" wrote: However,

Re: Re: Re: Multi-lingual Search & Accent Marks

2019-09-04 Thread Walter Underwood
On Sep 3, 2019, at 1:13 PM, Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote: > > The main issue we are anticipating with the above strategy surrounds scoring. > Since we will be increasing the frequency of accented terms, we might bias > our page ranker... You will not be increasing the f

Re: Re: Re: Re: Multi-lingual Search & Accent Marks

2019-09-04 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Thanks, Alex! We'll look into this. -- Audrey Lorberfeld Data Scientist, w3 Search IBM audrey.lorberf...@ibm.com On 9/3/19, 4:27 PM, "Alexandre Rafalovitch" wrote: What about combining: 1) KeywordRepeatFilterFactory 2) An existing folding filter (need to check it ignores Keyword

Re: Re: Re: Multi-lingual Search & Accent Marks

2019-09-03 Thread Alexandre Rafalovitch
What about combining: 1) KeywordRepeatFilterFactory 2) An existing folding filter (need to check it ignores Keyword marked word) 3) RemoveDuplicatesTokenFilterFactory That may give what you are after without custom coding. Regards, Alex. On Tue, 3 Sep 2019 at 16:14, Audrey Lorberfeld - audrey
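The three-filter chain suggested in this message can be simulated in plain Python to see what token stream it would produce. This is only a sketch of the idea, not Lucene code. It assumes, as the message itself says still needs checking, that the folding step leaves keyword-marked tokens untouched. All function names are hypothetical.

```python
import unicodedata
from typing import List, Tuple


def keyword_repeat(tokens: List[Tuple[str, int]]) -> List[Tuple[str, int, bool]]:
    """Emit each token twice at the same position: a keyword-marked copy, then a plain copy."""
    out = []
    for text, pos in tokens:
        out.append((text, pos, True))
        out.append((text, pos, False))
    return out


def fold_accents(tokens: List[Tuple[str, int, bool]]) -> List[Tuple[str, int, bool]]:
    """Strip combining accents from plain tokens only (assumption: keyword copies are skipped)."""
    out = []
    for text, pos, is_keyword in tokens:
        if not is_keyword:
            decomposed = unicodedata.normalize("NFD", text)
            text = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append((text, pos, is_keyword))
    return out


def remove_duplicates(tokens: List[Tuple[str, int, bool]]) -> List[Tuple[str, int]]:
    """Drop a token when the same text was already seen at the same position."""
    seen = set()
    out = []
    for text, pos, _ in tokens:
        if (text, pos) not in seen:
            seen.add((text, pos))
            out.append((text, pos))
    return out


def analyze(words: List[str]) -> List[Tuple[str, int]]:
    positioned = [(word, i) for i, word in enumerate(words)]
    return remove_duplicates(fold_accents(keyword_repeat(positioned)))


# "caf\u00e9" keeps both the accented and folded form; "bar" is emitted once.
print(analyze(["caf\u00e9", "bar"]))
```

In this simulation only words that actually contain accents are doubled, which is the appeal of the approach compared with indexing every token into two fields.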

Re: Re: Re: Multi-lingual Search & Accent Marks

2019-09-03 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Toke, Thank you! That makes a lot of sense. In other news -- we just had a meeting where we decided to try out a hybrid strategy. I'd love to know what you & everyone else thinks... - Since we are concerned with the overhead created by "double-fielding" all tokens per language (because I'm not

Re: Re: Re: Multi-lingual Search & Accent Marks

2019-09-03 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Thank you, Erick! -- Audrey Lorberfeld Data Scientist, w3 Search Digital Workplace Engineering CIO, Finance and Operations IBM audrey.lorberf...@ibm.com On 8/30/19, 3:49 PM, "Erick Erickson" wrote: It Depends (tm). In this case on how sophisticated/precise your users are. If your users

Re: Re: Re: obfuscated password error

2019-03-20 Thread Branham, Jeremy (Experis)
Hard to see in email, particularly because my email server strips urls, but a few things I would suggest – Be sure there aren’t any spaces after your line continuation characters ‘\’. This has bit me before. Check the running process's JVM args and compare `ps -ef | grep solr` Also, I’d recomme

Re: Re: Re: Suppress stack trace in error response

2019-02-22 Thread Branham, Jeremy (Experis)
BTW – Congratulations on joining the PMC! Jeremy Branham jb...@allstate.com On 2/22/19, 9:46 AM, "Branham, Jeremy (Experis)" wrote: Thanks Jason – That’s what I was thinking too. It would require some development. Jeremy Branham jb...@allstate.com On 2/22/1

Re: Re: Re: Page faults

2019-01-09 Thread Erick Erickson
bq: We could create 2 separate collections. - Requires re-indexing - Code changes in our APIs and indexing process - Lost ability to query all the docs at once *** *** Not quite true. You can create an alias that points to multiple collections. HOWEVER, since the scores are computed using differen
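For readers who have not used aliases, the sketch below builds a Collections API request that creates one alias over two collections, using the `CREATEALIAS` action with its `name` and `collections` parameters. The host, alias name, and collection names are placeholders invented for this example, and the code only constructs the URL without sending it.

```python
from typing import List
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collections: List[str]) -> str:
    """Build a CREATEALIAS request URL for an alias spanning several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and names, not taken from the thread.
print(create_alias_url("http://localhost:8983/solr", "all_docs", ["class_a", "class_b"]))
```

Queries sent to the alias then fan out to both collections, subject to the scoring caveat the message goes on to raise.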

Re: Re: Re: Page faults

2019-01-09 Thread Branham, Jeremy (Experis)
Thanks for the information Erick – I’ve learned there are 2 ‘classes’ of documents being stored in this collection. There are about 4x as many documents in class A as class B. When the documents are indexed, the document ID includes the key prefix like ‘A/1!’ or ‘B/1!’, which I understand spreads

Re: Re: Re: Config for massive inserts into Solr master

2016-10-10 Thread Reinhard Budenstecher
> > Just a sanity check. That directory mentioned, what kind of file system is > that on? NFS, NAS, RAID? I'm using Ext4 with options "noatime,nodiratime,barrier=0" on a hardware RAID10 with 4 SSD disks

Re: Re: Re: MoreLikeThis Component - how to get fields of documents

2016-05-09 Thread Dr. Jan Frederik Maas
Hey Alessandro, it seems that the edismax MLThandler is not able to work correctly with a solr cloud/sharding: https://issues.apache.org/jira/browse/SOLR-4414 Using the MLThandler, we randomly got a response for only very few requests, while the MLTcomponent works fine (except for the problem

Re: Re: Re: Some problems when upload data to index in cloud environment

2015-12-21 Thread 周建二
Erick: Thank you so much for your advice. Now we do not index a large number of files, but in future we may. I will pay more attention to ExtractingRequestHandler. Thanks again. Best regards, Jianer > -Original Message- > From: "Erick Erickson" > Sent: Tuesday, December 22, 2015 > To: solr-user > Cc: >

Re: Re: Re: Re: Re: Re: concept and choice: custom sharding or autosharding?

2015-09-03 Thread scott chu
, 01:26:38 Subject: Re: Re: Re: Re: Re: concept and choice: custom sharding or autosharding? scott chu wrote: > No, both. But first I have to face the indexing performance problem. > Where can I see information about concurrent/parallel indexing on Solr? Depends on how you index. If you

Re: Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread Erick Erickson
Ah, that may make my suggestions unworkable re: just reindexing. Still, how much time are we talking about here? I've very often found that indexing performance isn't gated by the Solr processing, but by whatever is feeding Solr. A quick test is to fire up your indexing and see if the CPU utilizat

Re: Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread Toke Eskildsen
scott chu wrote: > No, both. But first I have to face the indexing performance problem. > Where can I see information about concurrent/parallel indexing on Solr? Depends on how you index. If you use a Java program, http://lucene.apache.org/solr/5_2_0/solr-solrj/index.html?org/apache/solr/client/s

Re: Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread scott chu
solr-user, hello No, both. But first I have to face the indexing performance problem. Where can I see information about concurrent/parallel indexing on Solr? Thanks in advance. - Original Message - From: Toke Eskildsen To: solr_user lucene_apache Date: 2015-09-04, 00:57:51 Subject: Re

Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread Toke Eskildsen
scott chu wrote: > I keep forgetting to mention one thing along the discussion session. > Our data is Chinese news articles and we use CJK tokenizer > (i.e. 2-gram) currently. The time spent to indexing is quite slow, > compared to indexing english articles. That's why I am so > worrying about in

Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread scott chu
year using MMSeg algorithm or 1-ngram+query-preprocessor). - Original Message - From: Erick Erickson To: solr-user Date: 2015-09-04, 00:07:43 Subject: Re: Re: Re: concept and choice: custom sharding or auto sharding? bq: If you switch to SolrCloud, will you still keep numShards para

Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread Erick Erickson
bq: If you switch to SolrCloud, will you still keep numShards parameter to 1 yes. Although if you want to add more replicas you might want to specify that. For 10M documents, I wouldn't be very fancy. Indexing them shouldn't take very long, and I think your time would be better spent on other thi

Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-03 Thread scott chu
solr-user, hello If you switch to SolrCloud, will you still keep numShards parameter to 1? If you are migrating to SolrCloud and going to split that single shard into multiple shards, wouldn't you have to reindex the data? Is it possible just put that single shard into SolrCloud and call SPLITSHAR

Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread scott chu
solr-user, hello Sorry, wrong again. Auto sharding is not implicit router. - Original Message - From: scott chu To: solr-user Date: 2015-09-02, 23:50:20 Subject: Re: Re: Re: concept and choice: custom sharding or auto sharding? solr-user, hello Thanks! I'll go back to check m

Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread scott chu
solr-user, hello Thanks! I'll go back to check my old environment and that article is really helpful. BTW, I think I got it wrong about compositeID. In the reference guide, it said compositeID needs numShards. That means what I describe in question 5 seems wrong cause I intend to plan one shard one

RE : RE : RE : Shards don't return documents in same order

2014-05-06 Thread Francois Perron
ck...@gmail.com] Sent: May 6, 2014 11:39 To: solr-user@lucene.apache.org Subject: Re: RE : RE : Shards don't return documents in same order copyField should be working fine on all servers. What it sounds like to me is that somehow your schema.xml file was different on one machine. Now, this shoul

Re: RE : RE : Shards don't return documents in same order

2014-05-06 Thread Erick Erickson
copyField should be working fine on all servers. What it sounds like to me is that somehow your schema.xml file was different on one machine. Now, this shouldn't be happening if you follow the practice of altering your schema, pushing to ZooKeeper, _and_ restarting or reloading your Solr nodes. So

Re: Re: Re: Need help importing OOXML custom properties into Solr

2014-03-18 Thread Alexandre Rafalovitch
The metadata fields can be all sorts of strange, including spaces and other strange characters. So, often, there is some issue on mapping. But yes, please, add the howto to Wiki. You will need to get your account whitelisted first (due to spammers), so send a separate email with your Apache wiki i

Re: Re: Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread Ahmet Arslan
Hi, Looks like totaltermfreq (ttf) is equal to collection frequency. Please see other relevancy functions: http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions Ahmet On Friday, March 7, 2014 6:38 PM, cqlangyi wrote: hi Ahmet, thank you, quite clear!!! so now i could get 'df' vi

Re: Re: Re: Re: Shard update error when using DIH

2013-05-09 Thread heaven
Thank you all, guys. Your advice works great and I don't see any errors in Solr logs anymore. Best, Alex Monday 29 April 2013, you wrote: On 29 April 2013 14:55, heaven <[hidden email][1]> wrote: > Got these errors after switching the field type to long: > * *crm-test:* > org.apache.

Re: Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Whoops, yes, that works. Will check if that helped to fix the original error now. Monday 29 April 2013, you wrote: On 29 April 2013 14:55, heaven <[hidden email][1]> wrote: > Got these errors after switching the field type to long: > * *crm-test:* > org.apache.solr.common.SolrException:o

Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread Gora Mohanty
On 29 April 2013 14:55, heaven wrote: > Got these errors after switching the field type to long: > * *crm-test:* > org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: > Unknown fieldtype 'long' specified on field _version_ You have probably edited your schema. The def

Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Got these errors after switching the field type to long: * *crm-test:* org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field _version_ * *crm-prod:* org.apache.solr.common.SolrException:org.apache.solr.common.SolrExcep

Re: Re: Re: Solr Cell Questions

2012-09-25 Thread Erick Erickson
OK, I was thinking more along the lines of this blog: http://searchhub.org/dev/2012/02/14/indexing-with-solrj/ which uses Tika directly to process the docs on the client (wherever you run it) and only sends the results to Solr The SolrJ program you're referencing uses a different approach...

RE: Re: Re: Invariants on a specific fq value

2010-09-09 Thread Markus Jelsma
It works as expected. The append, well, appends the parameter and because each collection has a unique value, specifying two filters on different collections will always yield zero results.   This, of course, won't work for values that are shared between collections.   -Original message

RE: Re: Re: Invariants on a specific fq value

2010-09-08 Thread Markus Jelsma
Excellent! You already made my day for tomorrow! I'll check its behavior with fq parameters specifying a filter for the same field! -Original message- From: Chris Hostetter Sent: Wed 08-09-2010 21:04 To: solr-user@lucene.apache.org; Subject: RE: Re: Re: Invariants on a s

RE: Re: Re: Invariants on a specific fq value

2010-09-08 Thread Chris Hostetter
: Sounds great! I'll be very sure to put it to the test tomorrow and : perhaps add documentation on these types to the solrconfigxml wiki page : for reference. SolrConfigXml wouldn't really be an appropriate place to document this -- it's not a general config item, it's a feature of the Search

RE: Re: Re: Invariants on a specific fq value

2010-09-08 Thread Markus Jelsma
Sounds great! I'll be very sure to put it to the test tomorrow and perhaps add documentation on these types to the solrconfigxml wiki page for reference.     -Original message- From: Yonik Seeley Sent: Wed 08-09-2010 19:38 To: solr-user@lucene.apache.org; Subject: Re: Re: Invariants o

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
Otis, And again I wished I were registered. I will check the JIRA and when I feel comfortable with it, I will open it. Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-and-Nutch-Droids-to-use-or-not-to-use-tp900069p904145.html Sent from the Solr - U

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread Otis Gospodnetic
10 12:07:13 PM > Subject: Re: Re: Re: Solr and Nutch/Droids - to use or not to use? > > Otis, you are right. I wasn't aware of this. At least not with such a > large dataList (let's think of an index with 4mio docs, this would mean we > got an ExternalFile with 4m

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
t; From: MitchK >> To: solr-user@lucene.apache.org >> Sent: Thu, June 17, 2010 4:15:27 AM >> Subject: Re: Re: Re: Solr and Nutch/Droids - to use or not to use? >> >> > > >> Solr doesn't know anything about OPIC, but I suppose you can >> feed

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread Otis Gospodnetic
.com/ - Original Message > From: MitchK > To: solr-user@lucene.apache.org > Sent: Thu, June 17, 2010 4:15:27 AM > Subject: Re: Re: Re: Solr and Nutch/Droids - to use or not to use? > > > Solr doesn't know anything about OPIC, but I suppose you can > feed t

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
> Solr doesn't know anything about OPIC, but I suppose you can feed the OPIC > score computed by Nutch into a Solr field and use it during scoring, if > you want, say with a function query. > Oh! Yes, that makes more sense than using the OPIC as doc-boost-value. :-) Anywhere at the Lucene Mail

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Otis Gospodnetic
al Message > From: MitchK > To: solr-user@lucene.apache.org > Sent: Thu, June 17, 2010 1:52:32 AM > Subject: RE: Re: Re: Solr and Nutch/Droids - to use or not to use? > > Good morning! Great feedback from you all. This really helped a lot > to get an impression of what is

RE: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Good morning! Great feedback from you all. This really helped a lot to get an impression of what is possible and what is not. What is interesting to me are some detail questions. Let's assume Solr is able to work on its own with distributed indexing, so that the client does not need to know

RE: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread Markus Jelsma
You're right. Currently clients need to take care of this, in this case, Nutch would be the client but it cannot be configured as such. It would, indeed, be more appropriate for Solr to take care of this. We can already query any server with a set of shard hosts specified, so it would make sense

Re: Re : Re : wildcard searches

2009-10-06 Thread Avlesh Singh
You are right, Angel. The problem would still persist. Why don't you consider putting the original data in some field. While querying, you can query on both the fields - analyzed and original one. Wildcard queries will not give you any results from the analyzed field but would match the data in you

Re : Re : Re : Indexing fields dynamically

2009-09-11 Thread nourredine khadri
Ok, i'll try the transformer (javascript needs jdk1.6 i think) Thanks again. Noble Paul wrote : > >If you use DIH for indexing writing a transformer is the simplest >thing. You can even write it in javascript >

Re: Re : Re : Indexing fields dynamically

2009-09-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you use DIH for indexing, writing a transformer is the simplest thing. You can even write it in javascript On Fri, Sep 11, 2009 at 1:13 PM, nourredine khadri wrote: > > The problem is that I don't control the field names. They can be anything (I want to leave > the developers free on this) > Where and how

Re: Re : Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
thanks for reporting the issue. On Fri, Sep 11, 2009 at 2:54 PM, nourredine khadri wrote: > Great! it works ! > > Thanks Paul. I appreciate your reactivity. > > > > nourredine khadri wrote : >> > >>Thanks! I'll test it ASAP! >> > > > >>Noble Paul wrote : >>> >>>https://issues.apache.org/jira/brow

Re : Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-11 Thread nourredine khadri
Great! it works ! Thanks Paul. I appreciate your reactivity. nourredine khadri wrote : > >Thanks! I'll test it ASAP! > >Noble Paul wrote : >> >>https://issues.apache.org/jira/browse/SOLR-1421 >>

Re : Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-11 Thread nourredine khadri
Thanks! I'll test it ASAP! Noble Paul wrote : > >https://issues.apache.org/jira/browse/SOLR-1421 >

Re: Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
https://issues.apache.org/jira/browse/SOLR-1421 2009/9/10 Noble Paul നോബിള്‍ नोब्ळ् : > I guess there is a bug. I shall raise an issue. > > > > 2009/9/10 Noble Paul നോബിള്‍  नोब्ळ् : >> everything looks fine and it beats me completely. I guess you will >> have to debug this >> >> On Thu, Sep 10,

Re: Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess there is a bug. I shall raise an issue. 2009/9/10 Noble Paul നോബിള്‍ नोब्ळ् : > everything looks fine and it beats me completely. I guess you will > have to debug this > > On Thu, Sep 10, 2009 at 6:17 PM, nourredine khadri > wrote: >> Some fields are null but not the one parsed by XPat

Re: Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
everything looks fine and it beats me completely. I guess you will have to debug this On Thu, Sep 10, 2009 at 6:17 PM, nourredine khadri wrote: > Some fields are null but not the one parsed by XPathEntityProcessor (named > XML) > > 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.LogTra

Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread nourredine khadri
Some fields are null but not the one parsed by XPathEntityProcessor (named XML) 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.LogTransformer transformRow FIN: Map content : {KEYWORDS=pub, SPECIFIC=null, FATHERSID=, CONTAINERID=, ARCHIVEDDATE=0, SITE=12308, LANGUAGE=null, ARCHIVESTATE

Re: Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
what do you see if you keep the logTemplate="${document}". I'm trying to figure out the contents of the map

RE: RE: Re:

2007-12-02 Thread Andrew Nagy
Ugh ... I shouldn't be coding on a sunday night - especially after the eagles lost again! I spelled separator correctly this time :) - But still no luck. curl 'http://localhost:8080/solr/update/csv?header=true&separator=%7C&encapsulator=%22&commit=true&stream.file=import/homes.csv' -H 'Content
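Much of this thread turns on hand-encoding the query string (`&`, `|`, `"`) and on a misspelled parameter name. One way to sidestep both is to build the URL programmatically. The sketch below reuses the host, path, and parameter values visible in the curl command above and relies only on Python's standard `urllib.parse.urlencode`. The helper name is hypothetical, and the code builds the URL without sending a request.

```python
from typing import Dict
from urllib.parse import urlencode


def csv_update_url(base_url: str, params: Dict[str, str]) -> str:
    """Return the CSV update URL with every parameter value percent-encoded."""
    return f"{base_url}/update/csv?{urlencode(params)}"


params = {
    "header": "true",
    "separator": "|",      # encoded as %7C
    "encapsulator": '"',   # encoded as %22
    "commit": "true",
    "stream.file": "import/homes.csv",
}

print(csv_update_url("http://localhost:8080/solr", params))
```

When the resulting URL is passed to curl, it still needs to be wrapped in single quotes so the shell does not interpret the `&` characters.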

Re: RE: Re:

2007-12-02 Thread Brian Whitman
On Dec 2, 2007, at 6:00 PM, Andrew Nagy wrote: On Dec 2, 2007, at 5:43 PM, Ryan McKinley wrote: try \& rather then %26 or just put quotes around the whole url. I think curl does the right thing here. I tried all the methods: converting & to %26, converting & to \& and encapsulating

Re: RE: Re:

2007-12-02 Thread Brian Whitman
On Dec 2, 2007, at 5:29 PM, Andrew Nagy wrote: Sorry for not explaining myself clearly: I have header=true as you can see from the curl command and there is a header line in the csv file. was this your actual curl request? curl http://localhost:8080/solr/update/csv?header=true%26seper

Re: Re: Re: Re: Solr replication

2007-10-01 Thread ycrux
Perfect. Thanks for all, guys. cheers Y. Original Message >Date: Tue, 2 Oct 2007 01:01:37 +1000 >From: climbingrose >To: solr-user@lucene.apache.org >Subject: Re: Re: Re: Solr replication > >sh

Re: Re: Re: Solr replication

2007-10-01 Thread climbingrose
sh /bin/commit should trigger a refresh. However, this command should be executed as part of snapinstaller so you shouldn't have to run it manually. On 10/1/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > One more question about replication. > > Now that the replication is working, how can I see

Re: Re: Re: Solr replication

2007-10-01 Thread ycrux
One more question about replication. Now that the replication is working, how can I see the changes on slave nodes ? The page statistics : "http://solr.slave1:8983/solr/admin/stats.jsp" doesn't reflect the correct number of indexed documents and still shows numDocs=0. Is there any command

Re: Re: Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Mike Klaas
On 11/2/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 11/2/06, Mike Klaas <[EMAIL PROTECTED]> wrote: > The one thing I'm worried about is closing the writer while documents > are being added to it. IndexWriter is nominally thread-safe, but I'm > not sure what happens to documents that are being

Re: Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Yonik Seeley
On 11/2/06, Mike Klaas <[EMAIL PROTECTED]> wrote: The one thing I'm worried about is closing the writer while documents are being added to it. IndexWriter is nominally thread-safe, but I'm not sure what happens to documents that are being added at the time. Looking at IndexWriter.java, it seems l

Re: Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Mike Klaas
On 11/2/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 11/1/06, Mike Klaas <[EMAIL PROTECTED]> wrote: > DUH2.doDeletions() would also highly benefit from sorting the id terms > before looking them up in these types of cases (as it would trigger > optimizations in lucene as well as being kinder to

Re: Re: Re: downloaded wars can't deploy

2006-10-29 Thread Mike Klaas
On 10/29/06, netsql <[EMAIL PROTECTED]> wrote: I take it that others can get it to work. I read the tutorial. (I did not see anything about solar config xml there). I did the post, that seemed to work. I then surfed to /admin and it gave me jsp errors on all 3 containers. It is possible that y

Re: Re: Re: IIS web server and Solr integration

2006-09-10 Thread Tim Archambault
Thanks Jeff, I am going to run Solr for our beta site, mobile.bangordailynews.net, the mobile device version of our site. I'm just running it on Jetty right now as a completely separate web app under a different port. The Jetty port is not available on the web. I'm using Coldfusion to "get" the r