Display name of a facet

2014-12-18 Thread david fernandes
Hi guys, I have this field in my schema: And I need to use this field as a facet but with a different display name; that is, instead of displaying ds_orgao_julgador I'd like to display Órgão Julgador. I tried this: {!key='Órgão Julgador'}ds_orgao_julgador_colegiado but I got: "ERROR 40
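
The snippet above does not show what caused the error, so the following is only a sketch of how a facet.field value carrying a {!key=...} local param could be URL-encoded before it is sent to Solr. The helper name facet_params and the q/rows values are assumptions made for illustration; only the field name and the display name come from the message.

```python
from urllib.parse import urlencode

def facet_params(field, display_name):
    """Build select parameters that facet on `field` but label it `display_name`.

    Hypothetical helper: the name and the q/rows defaults are illustrative only.
    """
    # {!key=...} is a Solr local param that renames the facet in the response.
    local_param = "{!key='%s'}" % display_name
    return {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": local_param + field,
    }

params = facet_params("ds_orgao_julgador_colegiado", "Órgão Julgador")
# urlencode percent-encodes the braces, quotes, space and non-ASCII characters.
query_string = urlencode(params)
print(query_string)
```

The resulting string would be appended to the core's select URL; whether encoding was actually the problem in this thread is not known from the excerpt.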

Re: unable to upload the solr configuration to zookeeper

2014-12-31 Thread David Philip
Hi Aman, This error could be because the solr instance is looking for the dependent logger jars. You should copy the jar files from solr download ( solr/example/lib/ext) to tomcat lib[1]. [1] https://wiki.apache.org/solr/SolrLogging#Using_the_example_logging_setup_in_containers_other_than_Jet

Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread David Philip
Hi Leonid, Have you had a look at the edismax query parser[1]? Might that be of use for your requirement? I am not sure whether it is what you are looking for, but your question seemed to be related to it. [1] http://wiki.apache.org/solr/ExtendedDisMax#Query_Syntax On Th

Re: Frequent deletions

2015-01-11 Thread David Santamauro
expect search slowdown and best not to index while this is going on either. David On Sun, 2015-01-11 at 06:46 -0700, ig01 wrote: > Hi, > > It's not an option for us, all the documents in our index have same deletion > probability. > Is there any other solution to perform an opti

Slow faceting performance on a docValues field

2015-01-13 Thread David Smith
econd query, the Solr JVM pegs one CPU, with little or no I/O activity detected on the drive that holds the 175GB index.  I have 48GB of RAM, 1/2 of that dedicated to the OS and the other to the Solr JVM. I do NOT have any fieldValue caches configured as yet, because my (perhaps too simplistic?) reading of the documentation was that DocValues eliminates the need for a field-level cache on this facet field. Any suggestions welcome. Regards, David

Re: Slow faceting performance on a docValues field

2015-01-13 Thread David Smith
Shawn, Thanks for the suggestion, but experimentally, in my case the same query with facet.method=enum returns in almost the same amount of time. Regards David On Tuesday, January 13, 2015 12:02 PM, Shawn Heisey wrote: On 1/13/2015 10:35 AM, David Smith wrote: > I have a qu

Re: Slow faceting performance on a docValues field

2015-01-13 Thread David Smith
ilters. The API is different, you'll have to provide the intervals yourself. Tomás On Tue, Jan 13, 2015 at 10:01 AM, Shawn Heisey wrote: > On 1/13/2015 10:35 AM, David Smith wrote: > > I have a query against a single 50M doc index (175GB) using Solr 4.10.2, > that exhibits th

Re: Slow faceting performance on a docValues field

2015-01-13 Thread David Smith
I missing?   Regards, David On Tuesday, January 13, 2015 1:12 PM, Tomás Fernández Löbbe wrote: No, you are not misreading, right now there is no automatic way of generating the intervals on the server side similar to range faceting... I guess it won't work in your case. Maybe you

Re: Slow faceting performance on a docValues field

2015-01-13 Thread David Smith
f day, my facet time drops to 2.5 seconds.  Now, I can't meet my user needs this way, but it does show the relationship between # of buckets and faceting time. Regards, David

Solr Cloud Stress Test

2015-01-16 Thread david mitche
the recommended way to get started with it? Thanks. David

Problem with faceting

2015-02-06 Thread david . davila
"0336Z", 2, "0002J", 2, "04889446Z", 2, "H", 2, "12312312K", 2, "12345655Z", 2, "48261207P", 2, "77760302T", 2, "77760631F", 2, "77763453T", 2, "77765788N", 2, We are using Solr 4.7 in cloud configuration with 2 shards. Any idea what it is happening? Thanks in advance, David Dávila Atienza AEAT - Departamento de Informática Tributaria Subdirección de Tecnologías de Análisis de la Información e Investigación del Fraude

Re: Problem with faceting

2015-02-06 Thread david . davila
ting Hi David, Yes it sounds weird. Just for testing purpose, It would be nice to have the ID_bent fieldtype definition. Regards. On Fri, Feb 6, 2015 at 9:05 AM, wrote: > Hello, > > we have been using faceting for a long time, but now I have discovered a > problem that I can'

Re: Problem with faceting

2015-02-09 Thread david . davila
Hi, that was the problem. I don't know why, but we have some documents duplicated in the two shards; maybe our config file was wrong at some point. Thanks very much, David Dávila Atienza AEAT From: Erick Erickson To: solr-user@lucene.apache.org, Date: 08/02/2015 11:21 A

Re: Solr indexer and Hadoop

2013-06-26 Thread David Larochelle
Pardon, my unfamiliarity with the Solr development process. Now that it's in the trunk, will it appear in the next 4.X release? -- David On Wed, Jun 26, 2013 at 9:42 AM, Erick Erickson wrote: > Well, it's been merged into trunk according to the comments, so > > Tr

RE: Newbie SolR - Need advice

2013-07-02 Thread David Quarterman
Hi Fabio, Like Jack says, try the tutorial. But to answer your question, SOLR isn't a bolt on to SQLServer or any other DB. It's a fantastically fast indexing/searching tool. You'll need to use the DataImportHandler (see the tutorial) to import your data from the DB into the indices that SOLR u

RE: Newbie SolR - Need advice

2013-07-02 Thread David Quarterman
obile ---- Original message From: "David Quarterman [via Lucene]" Date: 02/07/2013 16:57 (GMT+00:00) To: fabio1605 Subject: RE: Newbie SolR - Need advice Hi Fabio, Like Jack says, try the tutorial. But to answer your question, SOLR isn't a bolt on to S

RE: Newbie SolR - Need advice

2013-07-03 Thread David Quarterman
Hi Fabio, Sandeep is right - it'll take time. SOLR isn't straightforward when you first start out but the tutorial is the best first step. You can then adapt the various config files in the tutorial to suit your situation. I'd recommend a simple approach to get the hang of it and just index

SOLR 4.0 frequent admin problem

2013-07-04 Thread David Quarterman
Hi, About once a week the admin system comes up with SolrCore Initialization Failures. There's nothing in the logs and SOLR continues to work in the application it's supporting and in the 'direct access' mode (i.e. http://123.465.789.100:8080/solr/collection1/select?q=bingo:*). The cure is to

RE: SOLR 4.0 frequent admin problem

2013-07-04 Thread David Quarterman
m Yes :-) see SOLR-118, seems an old issue... On 4 Jul 2013 06:43, "David Quarterman" wrote: > Hi, > > About once a week the admin system comes up with SolrCore > Initialization Failures. There's nothing in the logs and SOLR > continues to work in the application

RE: Commit different database rows to solr with same "id" value?

2013-07-10 Thread David Quarterman
Hi Jason, Assuming you're using DIH, why not build a new, unique id within the query to use as the 'doc_id' for SOLR? We do something like this in one of our collections. In MySQL, try this (don't know what it would be for any other db but there must be equivalents): select @rownum:=@rownum+1

RE: Facet sorting seems weird

2013-07-15 Thread David Quarterman
Hi Henrik, Try setting up a copyfield in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyfield. Regards, DQ -Original Message- From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] Sent:

SolrCloud and Joins

2013-07-29 Thread David Larochelle
iteID autorouting.) Any suggestions? -- Thanks, David

Re: SolrCloud and Joins

2013-07-29 Thread David Larochelle
ng all 600 million documents or writing complicated application logic to find out which sentences to update. Hence joins seem like a cleaner solution. -- David On Mon, Jul 29, 2013 at 11:22 AM, Walter Underwood wrote: > Denormalize. Add media_set_id to each sentence document. Done. > >

Re: SolrCloud and Joins

2013-07-31 Thread David Larochelle
that). Then you reindex > them. With small documents like this, it is probably fairly fast. > > If you can't estimate how often the media sets will change or the size of > the changes, then you aren't ready to choose a design. > > wunder > > On Jul 29, 2013, at 8:41

Problem with SynonymFilter and StopFilterFactory

2013-09-16 Thread david . davila
have worked with, I had always put the SynonymFilter before the StopFilter, but in this one I preferred this order because of the large number of synonyms in the list (i.e. I don't want to generate a lot of synonyms for a word that I really wanted to remove). Thanks, David Dá

Problems with gaps removed with SynonymFilter

2013-09-22 Thread david . davila
worked with, I had always put the SynonymFilter before the StopFilter, but in this one I preferred this order because of the large number of synonyms in the list (i.e. I don't want to generate a lot of synonyms for a word that I really wanted to remove). Thanks, David Dávila Atienza

Using CachedSqlEntityProcessor with delta imports in DIH

2013-09-23 Thread David Larochelle
uration won't work as expected. -- Thanks, David

dynamic field question

2013-10-08 Thread Twomey, David
I am having trouble trying to return a particular dynamic field only instead of all dynamic fields. Imagine I have a document with an unknown number of sections. Each section can have a 'title' and a 'body' I have each section title and body as dynamic fields such as section_title_* and se

Re: dynamic field question

2013-10-09 Thread Twomey, David
" >field >to tie them together. > >Dynamic fields work best when used in moderation. Your use case seems >like >an excessive use of dynamic fields. > >-- Jack Krupansky > >-Original Message- >From: Twomey, David >Sent: Tuesday, October 08, 2013 6:5

Solr's Filtering approaches

2013-10-09 Thread David Philip
get the entire result set of only those groupid that intersected. Is this better way? Can I use any cache technique in this case? - David.

Re: Solr's Filtering approaches

2013-10-11 Thread David Philip
roups - finally, collect only those documents to return to user. How do I try this approach? Any pointers for bit set? Thanks - David On Thu, Oct 10, 2013 at 5:25 PM, Erick Erickson wrote: > Well, my first question is why 50K groups is necessary, and > whether you can simplify that. How a user

Storing 2 dimension array in Solr

2013-10-11 Thread David Philip
hieve this? I am thankful to the forum for replying with patience, on achieving this, i will blog and will share it with all. Thanks - David

Re: Storing 2 dimension array in Solr

2013-10-12 Thread David Philip
isease2 disease3 group1exist slight not found groups2 slightnot foundexist group3slight exist groupK-na exist Thanks - David On Sat, Oct 12, 2013 at 11:39 PM, Erick Erickson wrote: > David: > > Th

Slow queries for common terms

2013-03-21 Thread David Parks
d to consider more?) . Long: Implement sharding, get more hardware resources for these boxes and split up the index across multiple servers. Am I on track in my thinking here? Thanks, David My long query: 0 15464 true true cook book fourth bab

RE: Slow queries for common terms

2013-03-21 Thread David Parks
how much RAM, whether you utilize disk caching well enough and many other things which could affect this situation. But the pure fact that only a few common search words trigger such a delay would suggest commongrams as a possible way forward. -- Jan Høydahl, search solution architect Cominvent AS

Slow queries for common terms

2013-03-21 Thread David Parks
d to consider more?) * Long: Implement sharding, get more hardware resources for these boxes and split up the index across multiple servers. Am I on track in my thinking here? Thanks, David My long query: 0 15464 true true cook book fourth baby xml

RE: Slow queries for common terms

2013-03-21 Thread David Parks
to "work" for a few users. If you could elaborate a bit on your thinking I'd be quite grateful. David -Original Message- From: Jan Høydahl [mailto:jan@cominvent.com] Sent: Thursday, March 21, 2013 8:01 PM To: solr-user@lucene.apache.org Subject: Re: Slow queries for

RE: Slow queries for common terms

2013-03-23 Thread David Parks
the webservers). I'll work towards some improved IO performance and maybe more shards and see how things go. I'll also be able to up the RAM in just a couple of weeks. Are there any settings I should think of in terms of improving cache performance when I can give it say 10GB of RAM?

RE: Slow queries for common terms

2013-03-25 Thread David Parks
ng as it doesn't have an fq clause. Best Erick On Sat, Mar 23, 2013 at 3:10 AM, David Parks wrote: > I see the CPU working very hard, and at the same time I see 2 MB/sec > disk access for that 15 seconds. I am not running it this instant, but > it seems to me that there

RE: MoreLikeThis - Odd results - what am I doing wrong?

2013-04-02 Thread David Parks
Isn't this an AWS security groups question? You should probably post this question on the AWS forums, but for the moment, here's the basic reading material - go set up your EC2 security groups and lock down your systems. http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-s

SolrCloud loadbalancing, replication, and failover

2013-04-18 Thread David Parks
Step 1: distribute processing We have 2 servers on which we'll run 2 SolrCloud instances. We'll define 2 shards so that both servers are busy for each request (improving response time of the request). Step 2: Failover We would now like to ensure that if either of the servers goes down (we

RE: SolrCloud loadbalancing, replication, and failover

2013-04-18 Thread David Parks
On Apr 18, 2013 3:11 AM, "David Parks" wrote: > Step 1: distribute processing > > We have 2 servers in which we'll run 2 SolrCloud instances on. > > We'll define 2 shards so that both servers are busy for each request > (improving response time of the req

RE: SolrCloud loadbalancing, replication, and failover

2013-04-18 Thread David Parks
disk performance, and CPU regardless of how you lay out the cluster otherwise performance will suffer. My guess is if each Solr had sufficient resources, you wouldn't actually notice much difference in query performance. Tim On Thu, Apr 18, 2013 at 8:03 AM, David Parks wrote: > But my con

RE: SolrCloud loadbalancing, replication, and failover

2013-04-18 Thread David Parks
---Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, April 19, 2013 11:51 AM To: solr-user@lucene.apache.org Subject: Re: SolrCloud loadbalancing, replication, and failover On 4/18/2013 8:12 PM, David Parks wrote: > I think I still don't understand something h

RE: SolrCloud loadbalancing, replication, and failover

2013-04-19 Thread David Parks
uery over every single GB of data. If you only actually query over, say, 500MB of the 120GB data in your dev environment, you would only use 500MB worth of RAM for caching. Not 120GB On Fri, Apr 19, 2013 at 7:55 AM, David Parks wrote: > Wow! That was the most pointed, concise discussion of h

RE: SolrCloud loadbalancing, replication, and failover

2013-04-19 Thread David Parks
day, April 19, 2013 4:19 PM To: solr-user@lucene.apache.org Subject: Re: SolrCloud loadbalancing, replication, and failover On 4/19/2013 2:15 AM, David Parks wrote: > Interesting. I'm trying to correlate this new understanding to what I > see on my servers. I've got one server

RE: SolrCloud loadbalancing, replication, and failover

2013-04-19 Thread David Parks
Wow, thank you for those benchmarks Toke, that really gives me some firm footing to stand on in knowing what to expect and thinking out which path to venture down. It's tremendously appreciated! Dave -Original Message- From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk] Sent: Frida

RE: SolrCloud loadbalancing, replication, and failover

2013-04-19 Thread David Parks
I'm presuming this is doable in solr cloud, but I haven't put it to task yet). If I could purpose Hadoop to index the shards, that would be ideal, though I haven't quite figured out how to go about it yet. David -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent:

Bug? JSON output changes when switching to solr cloud

2013-04-21 Thread David Parks
We just took an installation of 4.1 which was working fine and changed it to run as solr cloud. We encountered the most incredibly bizarre apparent bug: In the JSON output, a colon ':' changed to a comma ',', which of course broke the JSON parser. I'm guessing I should file this as a bug, but it

RE: Bug? JSON output changes when switching to solr cloud

2013-04-22 Thread David Parks
Subject: Re: Bug? JSON output changes when switching to solr cloud Thanks David, I've confirmed this is still a problem in trunk and opened https://issues.apache.org/jira/browse/SOLR-4746 -Yonik http://lucidworks.com On Sun, Apr 21, 2013 at 11:16 PM, David Parks wrote: > We just

Atomic update issue with 4.0 and 4.2.1

2013-04-25 Thread David Fennessey
Hi everyone, We have hit this strange bug using the atomic update functionality of both SOLR 4.0 and SOLR 4.2.1. We're currently posting a JSON formatted file to the core's updater using a simple curl method; however, we've run into a very bizarre error where periodically it will fail and return a 4

Indexing off of the production servers

2013-05-06 Thread David Parks
s from degraded performance during large index processing periods? Thanks! David

RE: Indexing off of the production servers

2013-05-06 Thread David Parks
of them and every shard has 2 replica. When you > > send a query into a SolrCloud every replica will help you for > > searching and if > you > > add more replicas to your SolrCloud your search performance will improve. > > > > > > 2013/5/6 David Parks >

RE: Indexing off of the production servers

2013-05-06 Thread David Parks
So, am I following this correctly by saying that, this proposed solution would present us a way to index a collection on an offline/dev solr cloud instance and *move* that pre-prepared index to the production server using an alias/rename trick? That seems like a reasonably doable solution. I also

RE: Solr Cloud with large synonyms.txt

2013-05-06 Thread David Parks
Wouldn't it make more sense to only store a pointer to a synonyms file in zookeeper? Maybe just make the synonyms file accessible via http so other boxes can copy it if needed? Zookeeper was never meant for storing significant amounts of data. -Original Message- From: Jan Høydahl [mailto:

RE: Solr Cloud with large synonyms.txt

2013-05-08 Thread David Parks
I can see your point, though I think edge cases would be one concern, if someone *can* create a very large synonyms file, someone *will* create that file. What would you set the zookeeper max data size to be? 50MB? 100MB? Someone is going to do something bad if there's nothing to tell them not to

RE: More Like This and Caching

2013-05-09 Thread David Parks
f memory. There was a very extensive discussion on this list not long back titled: "Re: SolrCloud loadbalancing, replication, and failover" look that thread up and you'll get a lot of in-depth information on the topic. David -Original Message- From: Giammarco Schisani [mailto:giamma...

RE: Is the CoreAdmin RENAME method atomic?

2013-05-09 Thread David Parks
Find the discussion titled "Indexing off the production servers" just a week ago in this same forum, there is a significant discussion of this feature that you will probably want to review. -Original Message- From: Lan [mailto:dung@gmail.com] Sent: Friday, May 10, 2013 3:42 AM To: so

Boosting documents with terms derived from clustering - good idea?

2013-05-14 Thread David Parks
We have a number of queries that produce good results based on the textual data, but are contextually wrong (for example, an "SSD hard drive" search matches the music album "SSD hip hop drives us crazy"). Textually a fair match, but SSD is a term that strongly relates to technical documents.

Continue committing after out of memory of contrib library. (tika)

2013-05-14 Thread David Vdd
I'm using a combination of tika and custom code to extract text from files. (with solrj) I was looking at the amount of files I had in my index and noticed many of them were missing. Then I went to the solradmin panel and noticed this in the logfiles: SEVERE SolrCore java.

Aggregate word counts over a subset of documents

2013-05-16 Thread David Larochelle
mple, TermFrequencyComponent would tell me that car occurs 3 times over all documents in the index and 1 time in document 1 but not that it occurs 2 times over cat1 documents and 1 time over cat2 documents. Is there a good way to use Solr/Lucene to gather aggregate results like this? I've been focusing on just using Solr with XML files but I could certainly write Java code if necessary. Thanks, David

Re: Aggregate word counts over a subset of documents

2013-05-16 Thread David Larochelle
Jason, Thanks so much for your suggestion. This seems to do what I need. -- David On Thu, May 16, 2013 at 3:59 PM, Jason Hellman < jhell...@innoventsolutions.com> wrote: > David, > > A Pivot Facet could possibly provide these results by the following syntax: > > facet.pi

Re: Fast faceting over large number of distinct terms

2013-05-22 Thread David Larochelle
nd one that provided aggregate word counts for just the documents matching a particular query rather than individual documents or the entire index. -- David On Wed, May 22, 2013 at 10:32 PM, Brendan Grainger < brendan.grain...@gmail.com> wrote: > Hi David, > > Out of interest, w

Re: Fast faceting over large number of distinct terms

2013-05-23 Thread David Larochelle
ldn't find any documentation on using the results of a solr query to feed a map reduce operation. -- David On Wed, May 22, 2013 at 11:12 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Here's a possibility: > > At index time extract important terms (and/or p

strField

2011-08-30 Thread Twomey, David
I have a string fieldtype defined as so And I have a field defined as The fields are of this format 92E8EF8FC9F362BBE0408CA5785A29D4 But in the index they are like this: [B@520ed128 I thought it must be compression but compression=true|false is no longer supported by strField I don't see

Re: strField

2011-08-30 Thread Twomey, David
ing a toString on a Java object. You're >sending over a Java object address, not the string itself. A simple >change to your indexer should fix this. > >Erik > >On Aug 30, 2011, at 08:42 , Twomey, David wrote: > >> >> I have a string fieldtype defined

Re: strField

2011-08-30 Thread Twomey, David
Ok. Figured it out. Thanks for the pointer. The field was of type "RAW" in Oracle so it was being converted to a java string by DIH with the behaviour below. I just changed the SQL query in DIH to add RAWTOHEX(guid) On 8/30/11 11:03 AM, "Twomey, David" wrote: >Hmmm
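
The fix described above amounts to turning the raw bytes into a hex string before they reach Solr, which is what Oracle's RAWTOHEX does inside the SQL query. As a rough illustration of the same conversion outside the database, here is a minimal sketch; the function name raw_to_hex is hypothetical and this is not how DIH itself performs the conversion.

```python
def raw_to_hex(raw: bytes) -> str:
    """Render raw bytes as an uppercase hex string.

    Hypothetical helper that mirrors the effect of RAWTOHEX(guid) in the
    thread; it is not part of Solr or DIH.
    """
    return raw.hex().upper()

# The 16 raw bytes behind the GUID quoted earlier in this thread.
guid = bytes.fromhex("92E8EF8FC9F362BBE0408CA5785A29D4")

# Indexing the bytes object directly is what produces unreadable values;
# indexing the hex string gives the expected 32-character form.
print(raw_to_hex(guid))
```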

How to obtain the Explained output programmatically ?

2011-10-03 Thread David Ryan
umDocs=26) 0.15625 = fieldNorm(field=text, doc=1) I could see the explained result by clicking the "toggle explain" button in the web browser. Is there a way to access the explained output programmatically? Regards, David

Re: How to obtain the Explained output programmatically ?

2011-10-03 Thread David Ryan
Thanks Hoss! debug.explain.structured is definitely helpful. It adds some structure to the plain explained output. Is there a way to access these structured outputs in Java code (e.g., via Solr plugin class)? We could wr
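
The question above asks about server-side Java access, which this excerpt does not answer. As a client-side alternative, here is a sketch of walking a structured explain tree once it has been parsed from a JSON response. It assumes each node is a mapping with "value", "description" and an optional "details" list; the sample tree is hypothetical apart from the fieldNorm line quoted in the earlier message.

```python
def flatten_explain(node, depth=0):
    """Return (depth, value, description) rows for an explain node and its children.

    Assumes the structured explain layout described in the lead-in; the
    function name is illustrative, not a Solr API.
    """
    rows = [(depth, node["value"], node["description"])]
    for child in node.get("details", []):
        rows.extend(flatten_explain(child, depth + 1))
    return rows

# Hypothetical sample: only the fieldNorm leaf is taken from the thread.
sample = {
    "match": True,
    "value": 1.0,
    "description": "hypothetical parent node",
    "details": [
        {"match": True, "value": 0.15625,
         "description": "fieldNorm(field=text, doc=1)"},
    ],
}

for depth, value, description in flatten_explain(sample):
    print("  " * depth + f"{value} = {description}")
```

Such a tree would be requested with debugQuery=true and debug.explain.structured=true on the query.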

Scoring of DisMax in Solr

2011-10-04 Thread David Ryan
Hi, When I examine the score calculation of DisMax in Solr, it looks to me that DisMax is using tf x idf^2 instead of tf x idf. Does anyone have insight why tf x idf is not used here? Here is the score contribution from one field: score(q,c) = queryWeight x fieldWeight =

How to empty SolR Cache

2011-10-05 Thread David GUYOT
n with all SolR caches empty; is there a way to erase SolR caches by a command or to restart SolR with an option to avoid cache autowarming? Thank you in advance. Kind regards. -- David GUYOT Sys admin Europe Camions Interactive Moulin Collot F-88500 AMBACOURT Tel.:03.29.30.47.85

Re: Scoring of DisMax in Solr

2011-10-05 Thread David Ryan
Thanks! What's the procedure to report this if it's a bug? EDisMax has similar behavior. On Tue, Oct 4, 2011 at 11:24 PM, Bill Bell wrote: > This seems like a bug to me. > > On 10/4/11 6:52 PM, "David Ryan" wrote: > > >Hi, > > > > > >When

Re: Scoring of DisMax in Solr

2011-10-05 Thread David Ryan
Hi Markus, The idf calculation itself is correct. What I am trying to understand here is why the idf value is multiplied twice in the final score calculation. Essentially, tf x idf^2 is used instead of tf x idf. I'd like to understand the rationale behind that. On Wed, Oct 5, 2011 at 9:43 AM, Ma

Re: Scoring of DisMax in Solr

2011-10-05 Thread David Ryan
Ok, here is the calculation of the score: 0.18314168 = 2.3121865 * 0.15502669 * 1.4142135 * 2.3121865 * 0.15625. The value 2.3121865 is multiplied twice here. That is what I mean when I say tf x idf^2 is used instead of tf x idf. On Wed, Oct 5, 2011 at 10:42 AM, Markus Jelsma wrote: > Hi, > > I don't see
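
The arithmetic in this message can be checked directly. The sketch below groups the five factors into queryWeight and fieldWeight as in the earlier post; labelling 2.3121865 as idf and 0.15625 as fieldNorm follows the thread, while reading 0.15502669 as the query norm and 1.4142135 as tf is an assumption.

```python
import math

idf = 2.3121865          # the value the thread says is multiplied twice
query_norm = 0.15502669  # assumed to be the query normalization factor
tf = 1.4142135           # assumed to be the term frequency factor
field_norm = 0.15625     # fieldNorm(field=text, doc=1) from the explain output

query_weight = idf * query_norm       # idf appears once here...
field_weight = tf * idf * field_norm  # ...and once more here
score = query_weight * field_weight

# Same product written as tf x idf^2 x the two normalization factors.
regrouped = tf * idf ** 2 * query_norm * field_norm

print(round(score, 8), math.isclose(score, regrouped))
```

Under that reading, idf enters once through the query side and once through the document side, which is where the squared term comes from.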

New scoring models in LUCENE/SOLR (LUCENE-2959)

2011-10-05 Thread David Ryan
Hi, According to the JIRA issue 2959, https://issues.apache.org/jira/browse/LUCENE-2959 BM25 will be included in the next release of LUCENE. 1). Will BM25F be included in the next release as well as part of LUCENE-2959? 2). What's the timeline of the next release that new scoring modules will be

Re: Scoring of DisMax in Solr

2011-10-05 Thread David Ryan
The example does not include the evidence. But we do use eDisMax for scoring in Solr. The following is from solrconfig.xml: edismax Here is a short snippet of the explained result, where 0.1 is the Tie breaker in DisMax/eDisMax. 6.446447 = (MATCH) max plus 0.1 times others of: 0.63826215

Re: New scoring models in LUCENE/SOLR (LUCENE-2959)

2011-10-05 Thread David Ryan
Do you mean both BM25 and BM25F? On Wed, Oct 5, 2011 at 11:44 AM, Robert Muir wrote: > On Wed, Oct 5, 2011 at 2:23 PM, David Ryan wrote: > > Hi, > > > > According to the IRA issue 2959, > > https://issues.apache.org/jira/browse/LUCENE-2959 > > > > BM25

Improving Solr Spell Checker Results

2012-01-13 Thread David Radunz
d being searched. i.e. Searching for an actor would only use the dictionary fields from the actor. This makes sense on many levels, as when you are field searching it's useless to get a correction from another field as no values would match in any case. Hopefully someone can help! Thanks in advance, David

Re: Improving Solr Spell Checker Results

2012-01-20 Thread David Radunz
r.runChild(LuceneTestCaseRunner.java:57) Should I ignore this (and other failed tests) and continue anyway? Cheers, David On 17/01/2012 5:32 AM, Dyer, James wrote: David, The spellchecker normally won't give suggestions for any term in your index. So even if "wever" is

Re: Improving Solr Spell Checker Results

2012-01-21 Thread David Radunz
James, Thanks again for your lengthy and informative response. I updated from SVN trunk again today and was successfully able to run 'ant test'. So I proceeded with trying your suggestions (for question 1 so far): On 17/01/2012 5:32 AM, Dyer, James wrote: David, The sp

Re: Improving Solr Spell Checker Results

2012-01-21 Thread David Radunz
On 19/01/2012 12:21 AM, O. Klein wrote: Dyer, James wrote David, The spellchecker normally won't give suggestions for any term in your index. So even if "wever" is misspelled in context, if it exists in the index the spell checker will not try correcting it. There are 3 work

Failure noticed from new...@zju.edu.cn

2012-01-21 Thread David Radunz
Hey, Every time I send a reply to the list I get a failure for new...@zju.edu.cn. Should I just ignore this? I am unsure if the message has been delivered... Cheers, David

Re: Improving Solr Spell Checker Results

2012-01-22 Thread David Radunz
+series_name:"sigorney+wever"^50&spellcheck.q=sigorney+wever&fq=store_id:"1"&rows=5 Cheers, David On 22/01/2012 2:03 AM, David Radunz wrote: James, Thanks again for your lengthy and informative response. I updated from SVN trunk again today and was successf

Re: Improving Solr Spell Checker Results

2012-01-22 Thread David Radunz
o whats entered surrounding the term. Example: "Sigourney Wever" would never appear in a document ever. "Sigourney Weaver" however has many 'hits' in exactly that order of words. So there needs to be a way to boost suggestions based on adjacency... M

Re: Improving Solr Spell Checker Results

2012-01-22 Thread David Radunz
Hey, I am trying to send this again as 'plain-text' to see if it delivers ok this time. All of the previous messages I sent should be below.. Cheers, David On 22/01/2012 11:42 PM, David Radunz wrote: Hey James, I have played around a bit more with the settings and trie

Re: Failure noticed from new...@zju.edu.cn

2012-01-22 Thread David Radunz
Hey, That seems to have helped, I didn't get a failure notice re-sending the message. I'll have to keep that in mind. Thanks very much, David On 23/01/2012 12:41 PM, Erick Erickson wrote: I've seen the spam filter be pretty aggressive with HTML formatting etc, what h

Re: Improving Solr Spell Checker Results

2012-01-22 Thread David Radunz
Hey Erick, Sure, can you explain the process to create the patch and upload it, and I'll do it first thing tomorrow. Thanks again for your help, David On 23/01/2012 12:51 PM, Erick Erickson wrote: I can't help with your *real* problem, but when looking at patches, if the &

Re: Improving Solr Spell Checker Results

2012-01-23 Thread David Radunz
Hey, Thanks for that, I have uploaded a new patch as advised. Cheers, David On 23/01/2012 1:01 PM, Erick Erickson wrote: David: There's some good info here: http://wiki.apache.org/solr/HowToContribute#Working_With_Patches But the short form is to go into solr_home and issue

Re: solr not working with magento enterprise 1.11

2012-01-24 Thread David Radunz
Hey, Shouldn't you be asking this question to the Magento people? You have an Enterprise edition, so you have paid for their support. Cheers, David On 25/01/2012 2:57 PM, vishal_asc wrote: I am integrating solr 3.5 with jetty in magento EE 1.11. I have followed all the necessary

Re: solr not working with magento enterprise 1.11

2012-01-24 Thread David Radunz
urces go towards the OS. Cheers, David On 25/01/2012 3:30 PM, vishal_asc wrote: Thanks David. As of now we are configuring it on local WAMP server and we have only development version provided by sales team. Do you know where solr saves information or pushes the xml docs when we run index manag

Re: Multiple Data Directories and 1 SOLR instance

2012-01-26 Thread David Radunz
olr instance (which will scale better as the resources will be shared). Have a read anyway: http://wiki.apache.org/solr/CoreAdmin Cheers, David On 27/01/2012 8:18 AM, Nitin Arora wrote: Hi, We are using SOLR/Lucene to index/search the data about the user's of an organization. The nature

Re: SpellCheck Help

2012-01-26 Thread David Radunz
something Magento is doing, so try seeking support there first (try their mailing lists if they have any, or on IRC: irc.freenode.org #magento). I am not trying to be rude, but rather to save you time and others effort. Cheers, David On 27/01/2012 5:37 PM, vishal_asc wrote: Downloaded Apache Sol

Multiple Solr servers and a shared index vs master+slaves

2010-06-30 Thread David Thompson
I'm a newbie looking at setting up an intranet search service using Solr, so I'm having a hard time understanding why I should forego the high availability and clustering mechanisms we already have available, and use Solr's implementations instead. I'm hoping some experienced Solr architects co

PDF remote streaming extract with lots of multiValues

2010-07-09 Thread David Thompson
How would I go about setting a large number of literal values in a call to index a remote PDF? I'm currently calling: http://host/solr/update/extract?literal.id=abc&literal.mycategory=blah&stream.url=http://otherhost/some/file.pdf And that works great, except now I'm coming across usecases wh
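
One way to pass many literal values is to repeat the same literal.* parameter once per value. The sketch below only shows how such a URL could be assembled; it assumes mycategory is a multiValued field and that repeated parameters are accepted for it, and the second category value and the helper name extract_url are made up for illustration.

```python
from urllib.parse import urlencode

def extract_url(base, doc_id, categories, stream_url):
    """Build an extract URL that repeats literal.mycategory once per value.

    Hypothetical helper; the parameter names are the ones used in the
    message above.
    """
    params = [("literal.id", doc_id)]
    params += [("literal.mycategory", category) for category in categories]
    params.append(("stream.url", stream_url))
    # A list of pairs keeps duplicate keys, unlike a dict.
    return base + "?" + urlencode(params)

url = extract_url(
    "http://host/solr/update/extract",
    "abc",
    ["blah", "other"],  # "other" is an invented second value
    "http://otherhost/some/file.pdf",
)
print(url)
```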

Re: PDF remote streaming extract with lots of multiValues

2010-07-09 Thread David Thompson
cepts all the parameters. -dKt ____ From: David Thompson To: solr-user@lucene.apache.org Sent: Fri, July 9, 2010 12:17:59 PM Subject: PDF remote streaming extract with lots of multiValues How would I go about setting a large number of literal values in a call to

Solr 3.1 and ExtractingRequestHandler resulting in blank content

2010-07-26 Thread David Thibault
Hello all, I’m working on a project with Solr. I had 1.4.1 working OK using ExtractingRequestHandler except that it was crashing on some PDFs. I noticed that Tika bundled with 1.4.1 was 0.4, which was kind of old. I decided to try updating to 0.7 as per the directions here: http://wiki.apac

Re: How to Combine Drupal solrconfig.xml with Nutch solrconfig.xml?

2010-07-26 Thread David Stuart
ation http://drupal.org/project/nutch Regards, David Stuart On 26 Jul 2010, at 21:37, Savannah Beckett wrote: > I am using Drupal ApacheSolr module to integrate solr with drupal. I already > integrated solr with nutch. I already moved nutch's solrconfig.xml and > schema.xm

Re: How to Combine Drupal solrconfig.xml with Nutch solrconfig.xml?

2010-07-27 Thread David Stuart
ache solr module. I encounter a field that is not mergeable. > From drupal module: > > From solr/nutch setup: > required="true"/> > I am not sure if there is any more stuff like this that is not mergeable. > > Is there an easy way to deal with schem

RE: Extracting PDF text/comment/callout/typewriter boxes with Solr CELL/Tika/PDFBox

2010-07-27 Thread David Thibault
Alessandro & all, I was having the same issue with Tika crashing on certain PDFs. I also noticed the bug where no content was extracted after upgrading Tika. When I went to the SOLR issue you link to below, I applied all the patches, downloaded the Tika 0.8 jars, restarted tomcat, posted a f

How to 'filter' facet results

2010-07-27 Thread David Thompson
Is there a way to tell Solr to only return a specific set of facet values? I feel like the facet query must be able to do this, but I'm not really understanding the facet query. In my specific case, I'd like to only see facet values for the same values I pass in as query filters, i.e. if I run
