RE: Extracting PDF text/comment/callout/typewriter boxes with Solr CELL/Tika/PDFBox

2010-07-28 Thread David Thibault
Hope this helps. Tommaso 2010/7/27 David Thibault > Alessandro & all, > > I was having the same issue with Tika crashing on certain PDFs. I also > noticed the bug where no content was extracted after upgrading Tika. > > When I went to the SOLR issue you link to below, I applied a

RE: Solr 3.1 and ExtractingRequestHandler resulting in blank content

2010-07-28 Thread David Thibault
uments and bring back the file name. Your app has to then use the file name. Solr/Lucene is not intended as a general-purpose content store, only an index. The ERH wiki page doesn't quite say this. It describes what the ERH does rather than what it does not do :) On Mon, Jul 26, 2010 at 12:0

RE: Extracting PDF text/comment/callout/typewriter boxes with Solr CELL/Tika/PDFBox

2010-07-28 Thread David Thibault
Patch is more Stable. Until 4.0 will be released .... 2010/7/28 David Thibault > Yesterday I did get this working with version 4.0 from trunk. I haven't > fully tested it yet, but the content doesn't come through blank anymore, so > that's good. Would it be more stable to st

RE: Extracting PDF text/comment/callout/typewriter boxes with Solr CELL/Tika/PDFBox

2010-07-28 Thread David Thibault
ectory but forgot to update the war inside the example/webapps directory (that is inside Jetty). Hope this helps. Tommaso 2010/7/27 David Thibault > Alessandro & all, > > I was having the same issue with Tika crashing on certain PDFs. I also > noticed the bug where no content was

Solr 1.4.1 and 3x: Grouping of query changes results

2010-08-08 Thread David Benson
? The exact placement of the parens isn't key, just adding a level of nesting changes the query results. Thanks, David

A few query issues with solr

2010-08-26 Thread David Yang
for "macys" is different - currently I force macy's to macys * Searching for "at&t" gets converted to "at", "t" which are both stop worded - I am forced to convert at&t=>att before indexing and querying Is there a nice way to handle these or will I always need to resort to manual fixes for these? Cheers David
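A minimal sketch of the manual workaround described above, assuming the same rewrite is applied to text before indexing and to queries before searching. The two table entries come from the message; the function name and the whole-token matching rule are illustrative assumptions, not anything Solr provides.

```python
import re

# Hypothetical rewrite table. The two entries are the ones named in the
# message above; extend it with whatever brand names need special handling.
REWRITES = {
    "macy's": "macys",
    "at&t": "att",
}


def normalize(text: str) -> str:
    """Lowercase the text and rewrite known terms before they reach Solr."""
    result = text.lower()
    for source, target in REWRITES.items():
        # Only rewrite the term when it is not glued to other word characters,
        # so "cat&tom" is left alone while "at&t" becomes "att".
        pattern = r"(?<!\w)" + re.escape(source) + r"(?!\w)"
        result = re.sub(pattern, target, result)
    return result
```

Calling the same function on both the indexing and the query side keeps the two in step, which is the point of the manual fix.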

How to use TermsComponent when I need a filter

2010-09-07 Thread David Yang
E.g. Names beginning with Bob in association A5. Is this possible? I would prefer not to have to have one index per association, since the number of associations is pretty large Cheers, David

How to use TermsComponent when I need a filter

2010-09-08 Thread David Yang
E.g. Names beginning with Bob in association A5. Is this possible? I would prefer not to have to have one index per association, since the number of associations is pretty large Cheers, David

Delta Import with something other than Date

2010-09-08 Thread David Yang
Hi, I have a table that I want to index, and the table has no datetime stamp. However, the table is append only so the primary key can only go up. Is it possible to store the last primary key, and use some delta query="select id where id>${last_id_value}" Cheers, David
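One way this could look, assuming the last primary key is tracked outside Solr and passed in as a request parameter (the approach mentioned in the follow-up message). The handler path and the function name are assumptions for illustration; the parameter would be read in data-config.xml as ${dataimporter.request.last_primary_key}.

```python
from urllib.parse import urlencode


def delta_import_url(handler_url: str, last_id: int) -> str:
    """Build a DIH delta-import request that carries the last indexed key."""
    params = {
        "command": "delta-import",
        # Visible to the DIH config as ${dataimporter.request.last_primary_key}
        "last_primary_key": last_id,
    }
    return handler_url + "?" + urlencode(params)
```

The caller is responsible for storing the highest id it has seen after each run, which is why this approach can feel brittle.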

RE: Delta Import with something other than Date

2010-09-08 Thread David Yang
xternally and calling the delta-import with a parameter and using ${dataimporter.request.last_primary_key} but that seems like a very brittle approach Cheers, David -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, September 08, 2010 6:38 PM

Autocomplete with Filter Query

2010-09-10 Thread David Yang
r issue is that I use DisMax to do my search (name, title, phone number etc) - so it might be more complex to do autocomplete. I could have a copy field to copy all dismax terms into one big field. Cheers, David

RE: Autocomplete with Filter Query

2010-09-10 Thread David Yang
eperated word as they type it, this won't do THAT either. Trying to get all those things to work becomes even more complicated -- especially with the requirement that you want to be able to apply the 'fq's from your current search context to the auto-complete. I haven't entirely

Our SOLR instance seems to be single-threading and therefore not taking advantage of its multi-proc host

2010-09-13 Thread David Crane
f SOLR. Thanks, David Crane -- View this message in context: http://lucene.472066.n3.nabble.com/Our-SOLR-instance-seems-to-be-single-threading-and-therefore-not-taking-advantage-of-its-multi-proc-t-tp1470282p1470282.html Sent from the Solr - User mailing list archive at Nabble.com.

DataImportHandler with multiline SQL

2010-09-16 Thread David Yang
do random stuff, but the first select x needs to return results. Does anybody know exactly how DIH handles multiple sql statements in the query? Cheers, David

Re: DataImportHandler dynamic fields clarification

2010-09-30 Thread David Stuart
Two things. First, are your DB columns uppercase? That would affect the output. Second, what does your db-data-config.xml look like? Regards, Dave On 30 Sep 2010, at 03:01, harrysmith wrote: > > Looking for some clarification on DIH to make sure I am interpreting this > correctly. > > I have a wide

Query slop vs. phrase slop

2010-10-05 Thread David Boxenhorn
Can anyone explain to me the practical difference (i.e. in terms of results) between query slop and phrase slop?

Re: Query slop vs. phrase slop

2010-10-05 Thread David Boxenhorn
Thank you. I am talking about dismax's parameters. This is how I understand things, please tell me where I'm wrong: Query slop (qs) = how many words you can move the query to match the text. Phrase slop (ps) (when used in conjunction with &pf=text - is there another possibility?) = how many words

Re: Query slop vs. phrase slop

2010-10-07 Thread David Boxenhorn
Does anybody know the answer to this? On Tue, Oct 5, 2010 at 7:19 PM, David Boxenhorn wrote: > Thank you. I am talking about dismax's parameters. > > This is how I understand things, please tell me where I'm wrong: > > Query slop (qs) = how many words you can move th

Re: Query slop vs. phrase slop

2010-10-07 Thread David Boxenhorn
d out of the entire "q" param. > > qs (Query Phrase Slop) affects matching. If you play with qs, numFound > changes. This parameter is about when you have explicit phrase query in your > raw query. i.e. &q="apache lucene" > > --- On Thu, 10/7/10, David Boxenh
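To make the distinction concrete, here is a small sketch that builds dismax request parameters with both slops set. The field name and the slop values are placeholders; the comments restate the explanation quoted in this thread rather than adding anything new.

```python
from urllib.parse import urlencode


def dismax_params(user_query: str, field: str, qs: int, ps: int) -> str:
    """Encode a dismax request that sets both query slop and phrase slop."""
    params = {
        "defType": "dismax",
        "q": user_query,
        "qf": field,
        "pf": field,
        "qs": qs,  # slop for explicit phrases in q, e.g. "apache lucene"; affects matching
        "ps": ps,  # slop for the phrase built from q for the pf boost; affects scoring
    }
    return urlencode(params)
```

With q="apache lucene" (quoted), changing qs can change numFound; changing ps only reorders results.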

Re: Query slop vs. phrase slop

2010-10-07 Thread David Boxenhorn
Got it! Thanks a lot. On Thu, Oct 7, 2010 at 3:00 PM, Ahmet Arslan wrote: > > > ps = ? - something that affects boosting, but how? > > Lets say your query is apache solr. (without quotation marks) > > Lets say these three documents contains all of these words and returned. > > 1-) solr is built

Re: Query slop vs. phrase slop

2010-10-10 Thread David Boxenhorn
"phrase > match" means to the pf boost. > > > David Boxenhorn wrote: > >> Does anybody know the answer to this? >> >> On Tue, Oct 5, 2010 at 7:19 PM, David Boxenhorn >> wrote: >> >> >> >>> Thank you. I am talking about dism

Re: Does Solr reload schema.xml dynamically?

2010-10-26 Thread David Stuart
If you are using Solr Multicore http://wiki.apache.org/solr/CoreAdmin you can issue a Reload command http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0 On 26 Oct 2010, at 11:09, Swapnonil Mukherjee wrote: > Hi Everybody, > > If I change my schema.xml to, do I have to restart Solr.
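For scripting this, a short sketch that builds the same CoreAdmin RELOAD URL shown above. The base URL and core name are whatever your install uses; the function name is made up for the example.

```python
from urllib.parse import urlencode


def core_reload_url(solr_base: str, core: str) -> str:
    """Build the CoreAdmin URL that reloads one core's config and schema."""
    query = urlencode({"action": "RELOAD", "core": core})
    return solr_base.rstrip("/") + "/admin/cores?" + query
```

Requesting the resulting URL after editing schema.xml avoids a full restart when running multicore.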

Shuffle results a little

2010-11-12 Thread David Yang
index. Cheers, David

Configuring multiple spell checkers

2012-11-24 Thread Giannone, David
f so, what is the configuration required? Any help is much appreciated. Thanks, David Giannone textSpell default NAME_SPELL true ./spellchecker_name_spell_en 0.7 textSpell_es spe

RE: inconsistent number of results returned in solr cloud

2012-11-29 Thread Buttler, David
u're seeing _sure_ sounds similar Best Erick On Mon, Nov 19, 2012 at 12:49 PM, Buttler, David wrote: > Answers inline below > > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Saturday, November 17, 2012 6:40 AM > To: solr-us

Performance improvement in large OR query using boosting (also, cache doesn't work?)

2012-12-14 Thread David Radunz
s:"eye+gouging"^5+OR+matching_keywords:"cat+suit"^5+OR+matching_keywords:"sabotage"^5+OR+matching_keywords:"interrogation"^5+OR+matching_keywords:"humanity+in+peril"^5+OR+matching_keywords:"cnn+reporter"^5+OR+matching_keywords:"prisoner"^5+OR+matching_keywords:"hit+with+a+chair"^5+OR+matching_keywords:"stabbed+in+the+chest"^5+OR+matching_keywords:"scepter"^5+OR+matching_keywords:"action+hero"^5+OR+matching_keywords:"plane+crash"^5+OR+matching_keywords:"escape"^5+OR+matching_keywords:"walkie+talkie"^5+OR+matching_keywords:"transformation"^5+OR+matching_keywords:"fighter+jet"^5+OR+matching_keywords:"rocket"^5+OR+matching_keywords:"exploding+car"^5+OR+matching_keywords:"news+report"^5+OR+matching_keywords:"gatling+gun"^5+OR+matching_keywords:"orchestra"^5+OR+matching_keywords:"wormhole"^5+OR+matching_keywords:"lasersight"^5+OR+matching_keywords:"laser"^5+OR+matching_keywords:"fighting+in+the+air"^5+OR+matching_keywords:"war+veteran"^5+OR+matching_keywords:"masked+hero"^5+OR+matching_keywords:"fictional+war"^5+OR+matching_keywords:"female+assassin"^5+OR+matching_keywords:"war+hero"^5+OR+matching_keywords:"mixed+martial+arts"^5+OR+matching_keywords:"shapeshifting"^5+OR+matching_keywords:"female+agent"^5+OR+matching_keywords:"exploding+bus"^5+OR+matching_keywords:"eye+scanning"^5+OR+matching_keywords:"machinegun"^5+OR+matching_keywords:"laser+gun"^5+OR+matching_keywords:"electrocution"^5+OR+matching_keywords:"title+appears+in+text"^5+OR+matching_keywords:"super+villain"^5+OR+matching_keywords:"ejector+seat"^5+OR+matching_keywords:"impalement"^5+OR+matching_keywords:"artificial+intelligence"^5+OR+matching_keywords:"manipulation"^5+OR+matching_keywords:"car+chase"^5+OR+matching_keywords:"technology"^5+OR+matching_keywords:"grenade"^5+OR+matching_keywords:"masked+man"^5+OR+matching_keywords:"fight"^5+OR+matching_keywords:"manhattan+new+york+city"^5+OR+matching_keywords:"ex+soldier"^5+OR+matching_keywords:"rescue"^5+OR+matching_keywords:"shot+to+death"^5+OR+matching_keywords:
"shot+in+the+throat"^5+OR+matching_keywords:"warrior"^5+OR+matching_keywords:"held+at+gunpoint"^5+OR+matching_keywords:"betrayal"^5+OR+matching_keywords:"fighting+brothers"^5+OR+matching_keywords:"drone"^5+OR+matching_keywords:"critically+acclaimed"^5+OR+matching_keywords:"hologram"^5+OR+matching_keywords:"stabbed+in+the+back"^5+OR+matching_keywords:"death+of+friend"^5+OR+directors_attr_opt_id:"389553"^20+OR+writers_attr_opt_id:"389555"^10+OR+writers_attr_opt_id:"386357"^10+OR+writers_attr_opt_id:"433326"^10+OR+writers_attr_opt_id:"418867"^10+OR+producers_attr_opt_id:"414426"^10+OR+producers_attr_opt_id:"374643"^10+OR+producers_attr_opt_id:"435974"^10+OR+producers_attr_opt_id:"453901"^10+OR+producers_attr_opt_id:"453902"^10+OR+producers_attr_opt_id:"435975"^10+OR+producers_attr_opt_id:"413760"^10+OR+actors_attr_opt_id:"378011"^50+OR+actors_attr_opt_id:"393240"^50+OR+actors_attr_opt_id:"380532"^50+OR+actors_attr_opt_id:"381965"^50+OR+actors_attr_opt_id:"374641"^5+OR+actors_attr_opt_id:"380065"^5+OR+actors_attr_opt_id:"378007"^5+OR+actors_attr_opt_id:"375604"^5+OR+actors_attr_opt_id:"384913"^5+OR+actors_attr_opt_id:"374116"^5+OR+actors_attr_opt_id:"389785"^5+OR+actors_attr_opt_id:"431500"^5+OR+actors_attr_opt_id:"375806"^5+OR+actors_attr_opt_id:"388110"^5+OR+actors_attr_opt_id:"386208"^5+OR+actors_attr_opt_id:"456229"^5+OR+actors_attr_opt_id:"372628"^5+OR+actors_attr_opt_id:"387507"^5+OR+actors_attr_opt_id:"391970"^5+OR+actors_attr_opt_id:"456230"^5+OR+actors_attr_opt_id:"386655"^5+OR+actors_attr_opt_id:"385701"^5+OR+actors_attr_opt_id:"381993"^5+OR+actors_attr_opt_id:"456231"^5+OR+actors_attr_opt_id:"403517"^5+OR+actors_attr_opt_id:"456232"^5+OR+actors_attr_opt_id:"451976"^5+OR+year_made:[2010+TO+2014]^500)&wt=php&fq=website_id:"1"+AND+avail_status_attr_opt_id:"available"+AND+(format_attr_opt_id:"372745")&rows=4} hits=2319 status=0 QTime=713 As you may have guessed what I am trying to do is produce a dynamic related products list based on keywords and other things. 
Now, this seems to work really well - but as you can see it's very slow. As you might also notice the boost figure gradually decreases for each type (which is configurable, like 1st keyword might have 1000, the next one 900, the next 800, the next 700.. and so on.. then the rest say 200 -- so it matches the most appropriate result.. being ones with higher rated keywords/genres to this one). This kind of needs to happen like this because we have 30,000 products and some products can have up to 200 keywords. As I said, it works pretty well -- just slow :( Is there anything I can do to improve the load time? Also, any ideas why the second query is about as slow as the first? Thanks heaps, David

Re: Performance improvement in large OR query using boosting (also, cache doesn't work?)

2012-12-14 Thread David Radunz
solr) 2nd Query: http://pastebin.com/4NbSdEHC (as before, just the same query again) Seemingly, the slowness happens in 'processing'. But yeah, i'm sure you guys would better understand all of that :) Cheers, David On 14/12/2012 11:13 PM, Markus Jelsma wrote: Hi, T

Getting a patch evaluated/committed? Using stats with multivalued facet (SOLR-1782)

2012-12-16 Thread David Christianson
Hi, I have a real need to use the StatsComponent with multivalued facets but have run into bugs, and am wondering what I can do to help fix this bug in particular: https://issues.apache.org/jira/browse/SOLR-1782 For a while I've been using a patch filed on an older distributions b

MoreLikeThis supporting multiple document IDs as input?

2012-12-25 Thread David Parks
I'm unclear on this point from the documentation. Is it possible to give Solr X # of document IDs and tell it that I want documents similar to those X documents? Example: - The user is browsing 5 different articles - I send Solr the IDs of these 5 articles so I can present the user other simi

RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-26 Thread David Parks
ually merge the values from the base documents > and then you could POST that text back to the MLT handler and find > similar documents using the posted text rather than a query. Kind of > messy, but in theory that should work. > > -- Jack Krupansky > > -Original

RE: solr + jetty deployment issue

2012-12-27 Thread David Parks
Do you see any errors coming in on the console, stderr? I start solr this way and redirect the stdout and stderr to log files, when I have a problem stderr generally has the answer: java \ -server \ -Djetty.port=8080 \ -Dsolr.solr.home=/opt/solr \ -Dsolr.data.dir=/

MoreLikeThis only returns 1 result

2012-12-27 Thread David Parks
I'm doing a query like this for MoreLikeThis, sending it a document ID. But the only result I ever get back is the document ID I sent it. The debug response is below. If I read it correctly, it's taking "id:1004401713626" as the term (not the document ID) and only finding it once. But I want it to

RE: MoreLikeThis only returns 1 result

2012-12-27 Thread David Parks
23.102.164:8080/solr/mlt?q=... Or, use the MoreLikeThis search component: http://localhost:8983/solr/select?q=...&mlt=true&... See: http://wiki.apache.org/solr/MoreLikeThis -- Jack Krupansky -Original Message- From: David Parks Sent: Thursday, December 27, 201

RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-27 Thread David Parks
the components as they are. You would have to manually merge the values from the base documents and then you could POST that text back to the MLT handler and find similar documents using the posted text rather than a query. Kind of messy, but in theory that should work. -- Jack Krupansky

RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-27 Thread David Parks
each search request. If you open solrconfig.xml you will see how they are defined and used. HTH Otis Solr & ElasticSearch Support http://sematext.com/ On Dec 28, 2012 12:06 AM, "David Parks" wrote: > I'm somewhat new to Solr (it's running, I've been through the books, &g

What do I need to research to solve the problem of returning good results for a generic term?

2012-12-28 Thread David Parks
I'm sure this is a complex problem requiring many iterations of work, so I'm just looking for pointers in the right direction of research here. I have a base term, such as let's say "black dress" that I might search for. Someone searching on this term is most logically looking for black dresses

RE: MoreLikeThis supporting multiple document IDs as input?

2013-01-03 Thread David Parks
et to). So add in a debugQuery=on parameter and I see this, possibly useful reference: LuceneQParser It also appears that the MoreLikeThisComponent did indeed run So maybe I should ask exactly what results I should be expecting here? Thanks very much! David -Original Message

RE: MoreLikeThis supporting multiple document IDs as input?

2013-01-04 Thread David Parks
Aha! &mlt=true, that was the key I hadn't worked out before (thought it was &qt=mlt that achieved that), things are looking rosy now, and these results are a perfect fit for my needs. Thanks very much for your time to help explain this!! David -Original Message- From: J
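A rough sketch of the request shape that ended up working here: a normal /select query for the source document with the MoreLikeThis component switched on via mlt=true. The field names passed to mlt.fl are placeholders, and the function name is invented for the example.

```python
from urllib.parse import urlencode


def mlt_select_url(solr_base: str, doc_id: str, fields: list, count: int = 5) -> str:
    """Build a /select request that asks for documents similar to doc_id."""
    params = {
        "q": f"id:{doc_id}",         # the source document
        "mlt": "true",               # enable the MoreLikeThis search component
        "mlt.fl": ",".join(fields),  # fields to mine for interesting terms
        "mlt.count": count,          # similar documents to return per result
    }
    return solr_base.rstrip("/") + "/select?" + urlencode(params)
```

The similar documents come back in a separate moreLikeThis section of the response, next to the normal result list.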

Search strategy - improving search quality for short search terms such as "doll"

2013-01-16 Thread David Parks
I'm a beginner-intermediate solr admin, I've set up the basics for our application and it runs well. Now it's time for me to dig in and start tuning and improving queries. My next target is searches on simple terms such as "doll" which, in google, would return documents about, well, "toy do

RE: Search strategy - improving search quality for short search terms such as "doll"

2013-01-16 Thread David Parks
ex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Wed, Jan 16, 2013 at 4:40 AM, David Parks wr

RE: Search strategy - improving search quality for short search terms such as "doll"

2013-01-16 Thread David Parks
result set. What I understood is that you are talking about the context of the query. For example, if you search "books on MK Gandhi" and "books by MK Gandhi", the two queries have different contexts. Context-based search is at some level achieved by natural language processing. This on

Field Collapsing - Anything in the works for multi-valued fields?

2013-01-17 Thread David Parks
I want to configure Field Collapsing, but my target field is multi-valued (e.g. the field I want to group on has a variable # of entries per document, 1-N entries). I read on the wiki (http://wiki.apache.org/solr/FieldCollapsing) that grouping doesn't support multi-valued fields yet. Anything in

RE: Field Collapsing - Anything in the works for multi-valued fields?

2013-01-17 Thread David Parks
18, 2013 2:32 AM To: solr-user Subject: Re: Field Collapsing - Anything in the works for multi-valued fields? David, what are the documents and the field? Knowing that can help to suggest a workaround. On Thu, Jan 17, 2013 at 5:51 PM, David Parks wrote: > I want to configure Field Collapsing, but m

RE: Field Collapsing - Anything in the works for multi-valued fields?

2013-01-18 Thread David Parks
ti-valued fields, would a parent-child setup work for you here? See http://search-lucene.com/?q=solr+join&fc_type=wiki Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Jan 17, 2013 at 8:04 PM, David Parks wrote: > The documents are individual products which come from 1 or

Re: How to combine Qparsers in a plugin?

2013-01-22 Thread David Vdd
Yes it worked like you explained. This is my new query class: public class QObject extends org.apache.lucene.search.Query { Query q; public QObject(Query q) { this.q = q; } @Override public void extractTerms(Set terms) { q.ex

Edismax odd results

2013-02-19 Thread David Quarterman
hing but 'engineer boots', same for 'ankle boots' - in fact, same result set of 1,873 mostly boots but a few other products mixed in. We're on SOLR 4.0 and the field we're querying is stemmed (snowball), lowercased on WhiteSpaceTokenizer. Any ideas? Regards, David Q

RE: Edismax odd results

2013-02-19 Thread David Quarterman
ing scored - you may have to add some specific query phrases to force "engineer boot" into the top results to comparing the scoring. -- Jack Krupansky -Original Message- From: David Quarterman Sent: Tuesday, February 19, 2013 6:21 AM To: solr-user@lucene.apache.org Subject: Edis

RE: Edismax odd results

2013-02-19 Thread David Quarterman
em there, then the Solr log (assuming you haven't changed the default log level of INFO) should have a record of what parameters were actually received when the query was made. Thanks, Shawn On 2/19/2013 9:14 AM, David Quarterman wrote: > Hi Jack, > > Here's q test query we&

RE: Edismax odd results

2013-02-19 Thread David Quarterman
Hi Shawn/Jack, The log shows the query going in okay, nothing gets stripped out so we're still at a loss to understand this. Could it be that the Snowball stemming is too invasive? Regards, DQ -Original Message- From: David Quarterman [mailto:da...@corexe.com] Sent: 19 February 20

RE: Edismax odd results

2013-02-19 Thread David Quarterman
mming reduces 'engineer' to 'engin' so I'd have expected a lot more results. Anyone got any ideas? Regards, DQ -----Original Message- From: David Quarterman [mailto:da...@corexe.com] Sent: 19 February 2013 17:09 To: solr-user@lucene.apache.org Subject: RE: Edismax od

Re: Edismax odd results

2013-02-19 Thread David Quarterman
Hi Shawn, Now finished for the day but will post the schema tomorrow. Thanks for the help (and Jack too). Regards, DQ P.S. did reindex after changing schema and the analyzer/query stuff matches precisely!! Shawn Heisey wrote: On 2/19/2013 11:16 AM, David Quarterman wrote: > This

RE: Edismax odd results

2013-02-20 Thread David Quarterman
2013 11:16 AM, David Quarterman wrote: > This is definitely driving us mad now! Changed to PorterStemming and there's > very little difference. > > If we add fq=engineer, we get 0 results. Add fq=engineer* and we get the 90 > in the system. Try with fq=ankle* and we get 2

RE: Edismax odd results

2013-02-20 Thread David Quarterman
rg Subject: Re: Edismax odd results When you get back to this tomorrow, also try and paste the parsed query bits you get back when you append &debug=all. Sometimes it's surprising what the parsed query _really_ looks like Best Erick On Tue, Feb 19, 2013 at 3:13 PM, David Quart

RE: Edismax odd results

2013-02-20 Thread David Quarterman
Finally, you may be getting bitten by scoring, field norms and all that. If you have a doc ID that you _know_ contains "engineers boots", try using debug with explainOther ( http://wiki.apache.org/solr/CommonQueryParameters#explainOther) which might help you understand what's

RE: If we Open Source our platform, would it be interesting to you?

2013-02-21 Thread David Quarterman
Hi Marcelo, Looked through your site and the framework looks very powerful as an aggregator. We do a lot of data aggregation from many different sources in many different formats (XML, JSON, text, CSV, etc) using RDBMS as the main repository for eventual SOLR indexing. A 'one-stop-shop' for all

RE: Edismax odd results

2013-02-22 Thread David Quarterman
Hi Erick, Funnily enough, I cracked it about 5 minutes before your email arrived! Problem was using WhiteSpaceTokenizer instead of Standard AND had the LowerCaseFilter after the PorterStemmingFilter. Getting them in the right order has solved all the problems and we get all our engineer boots,

Re: Get search results in the order of fields names searched

2013-02-26 Thread David Philip
Hi, Thank you for the references. I used edismax and it works. Thanks a lot. David On Tue, Feb 26, 2013 at 7:33 PM, Jan Høydahl wrote: > Check out dismax (http://wiki.apache.org/solr/ExtendedDisMax) > > q="John Hopkins"&defType=edismax&qf=Author^1000 Editors^

Solrj. How to set different default field on a (join)filter query?

2013-02-26 Thread David Vdd
Hi, I'm wondering if it's possible to use a default field on a filter query? This is what I'm doing. (shortened example in solrJ) String qry = "{!qJoin}" + searchText; sq.setQuery(qry); sq.setParam("df", "pageTxt"); sq.addFilterQuery("{!join fromInd

Re: Solrj. How to set different default field on a (join)filter query?

2013-02-27 Thread David Vdd
Hi, I think I fixed it by combining the join with edismax. sq.addFilterQuery("{!join fromIndex=pageCore from=pageId to=fileId }{!edismax qf=description}Solr rocks") It seems to work now. -- View this message in context: http://lucene.472066.n3.nabble.com/Solrj-How-to-set-different-default-fiel

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

2013-03-04 Thread David Philip
= fieldNorm(field=title, doc=7) Thanks - David On Sat, Mar 2, 2013 at 12:23 PM, Chris Hostetter wrote: > > : In the explain tag (debugQuery=true) > : what does the *fieldWeight* value refer to?, > > fieldWeight is just a label being put on the the product of the tf, idf, >

How should I configure Solr to support multi-word synonyms?

2013-03-04 Thread David Sharpe
atch; however, KeywordTokenizer will treat the entire field as a single token, so a document containing "something phrase one something" will not match the equivalent synonym, and also a query for "phrase" or "one" will not find the document. Thank you for your time. Sincerely, David Sharpe

RE: Building a central index with Lucene + Solr

2013-03-05 Thread David Quarterman
Hi Alvaro, I agree with Otis & Alexandre (esp. Windows + PHP!). However, there are plenty of people using Solr & PHP out there very successfully. There's another good package at http://code.google.com/p/solr-php-client/ which is easy to implement and has some example usage. Regards, DQ Fr

After upgrade to solr4, search doesn't work

2013-03-05 Thread David Parks
"*:*" but any query I do with a term returns 0 results, even though it's clear from the "*:*" query that solr has that document. Any ideas on where to start looking here? David

Re: After upgrade to solr4, search doesn't work

2013-03-05 Thread David Parks
"df" parameter in the /select request handler in solrconfig.xml to be your default query field name if it is not "text". -- Jack Krupansky -----Original Message- From: David Parks Sent: Wednesday, March 06, 2013 1:26 AM To: solr-user@lucene.apache.org Subject: After upgra

Re: After upgrade to solr4, search doesn't work

2013-03-05 Thread David Parks
6, 2013 at 11:56 AM, David Parks wrote: > I just upgraded from solr3 to solr4, and I wiped the previous work and > reloaded 500,000 documents. > > I see in solr that I loaded the documents, and from the console, if I do a > query "*:*" I see documents returned. > > I

Re: After upgrade to solr4, search doesn't work

2013-03-05 Thread David Parks
Oops, I didn't include the full XML there, hopefully this formats ok. From: David Parks To: "solr-user@lucene.apache.org" Sent: Wednesday, March 6, 2013 1:58 PM Subject: Re: After upgrade to solr4, search doesn't work All but the uni

Re: After upgrade to solr4, search doesn't work

2013-03-05 Thread David Parks
ied a comma separated list of my fields here but that was invalid. dvddvdid:dvdid:dvd From: David Parks To: "solr-user@lucene.apache.org" Sent: Wednesday, March 6, 2013 1:52 PM Subject: Re: After upgrade to solr4, search doesn't work Good th

RE: After upgrade to solr4, search doesn't work

2013-03-07 Thread David Parks
ry much for all your help on this, it certainly helped me get my configuration straight and the upgrade to 4 is now complete. All the best, David -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Wednesday, March 06, 2013 7:56 PM To: solr-user@lucene.apache.org;

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

2013-03-12 Thread David Philip
Hi, any reply on this? How are the documents ordered when the product of tf, idf, coord and fieldNorm is the same for both documents? Thanks - David P.S.: This link was very useful for understanding the scoring in detail: http://mail-archives.apache.org/mod_mbox/lucene-java-user

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread David Philip
Informative. Useful. Thanks On Thu, Mar 14, 2013 at 1:59 PM, Chantal Ackermann < c.ackerm...@it-agenten.com> wrote: > Hi all, > > > this is not a question. I just wanted to announce that I've written a blog > post on how to set up Maven for packaging and automatic testing of a SOLR > index config

Is Solr more CPU bound or IO bound?

2013-03-17 Thread David Parks
t wouldn't otherwise use the disk. Thoughts on this? Thanks, David

RE: Is Solr more CPU bound or IO bound?

2013-03-17 Thread David Parks
seems likely that we'll be CPU bound (heavy queries against a static index updated nightly from the master), thus nullifying the advantage of dual-purposing a box with another CPU bound app. Very useful discussion, I'll get proper load tests done in time but this helps direct my think

correct XPATH syntax

2012-04-30 Thread Twomey, David
Is this possible in DataImportHandler I want the following XML to all collapse into one Author field Sørlie T T Perou C M CM Tibshirani R R ... So my XPATH is like

Re: correct XPATH syntax

2012-04-30 Thread Twomey, David
Sorry hit send too soon. Continued the email below On 4/30/12 4:46 PM, "Twomey, David" wrote: > >Is this possible in DataImportHandler > >I want the following XML to all collapse into one mult-valued Author field > > > > Sørlie > T > T > >

Re: correct XPATH syntax

2012-04-30 Thread Twomey, David
Answering my own question: I think I can do this by writing a script that concats the Lastname, Forname and Initials and adding that to xpath = /AuthorList/Author Yes? On 4/30/12 4:49 PM, "Twomey, David" wrote: >Sorry hit send too soon. Continued the email below > >
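A sketch of that concatenation idea, written as a standalone preprocessing step rather than as a DIH transformer. The element names (AuthorList, Author, LastName, ForeName, Initials) are assumptions, since the tags were stripped from the XML quoted in this thread; adjust them to the real document.

```python
import xml.etree.ElementTree as ET

# Illustrative input only; the element names are assumed, not confirmed.
SAMPLE = """
<AuthorList>
  <Author><LastName>Perou</LastName><ForeName>C M</ForeName><Initials>CM</Initials></Author>
  <Author><LastName>Tibshirani</LastName><ForeName>R</ForeName><Initials>R</Initials></Author>
</AuthorList>
"""


def author_names(xml_text: str) -> list:
    """Return one 'LastName ForeName Initials' string per Author element."""
    root = ET.fromstring(xml_text.strip())
    names = []
    for author in root.findall("Author"):
        parts = [
            author.findtext(tag, default="").strip()
            for tag in ("LastName", "ForeName", "Initials")
        ]
        # Skip empty parts so a missing element does not leave double spaces.
        names.append(" ".join(part for part in parts if part))
    return names
```

Each returned string would become one value of the multi-valued Author field.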

Logging from data-config.xml

2012-05-01 Thread Twomey, David
I'm getting this error (below) when doing an import. I'd like to add a Log line so I can see if the file path is messed up. So my data-config.xml looks like below but I'm not getting any extra info in the solr.log file under jetty. Is there a way to log to this log file from data-import.xml?

Re: Logging from data-config.xml

2012-05-01 Thread Twomey, David
fixed the error, stupid typo, but log msg didn't appear until typo was fixed. I would have thought they would be unrelated. On 5/1/12 10:42 AM, "Twomey, David" wrote: > > >I'm getting this error (below) when doing an import. I'd like to add a >Log

Re: correct XPATH syntax

2012-05-01 Thread Twomey, David
onfig: XML Snippet for Author: Malathi K K Xiao Y Y Mitchell A P AP Response from SOLR: Journal of cancer research and clinical oncology Thanks David On 5/1/12 8:05 AM, "lboutros" wrote: >Hi Dav

Re: correct XPATH syntax

2012-05-03 Thread Twomey, David
When I set Xpath like this I get this in the index Starremans Patrick G J F PG Van der Kemp Annemiete W C M AW . . note: the forename field is included My author field in the schema.xml is So is this even possible with XPathEntityProcessor? Thanks David On 5/3/12 8:40 AM, "lboutros&

Re: Advanced search with results matrix

2012-05-04 Thread David Radunz
; count, if you ONLY want the results count simply set rows to 0 (but im guessing you will want both the results and the count as to avoid 2 trips). - The 'results count' is here: start="0"/> (being numFound) David On 4/05/2012 4:46 PM, Gnanakumar wrote: Hi, First o

Re: indexing unstructured text (tweets)

2012-05-28 Thread David Radunz
an use any programming language you're comfortable with and load it into Solr via various means. Obviously you can add more 'meta' fields that you get from twitter if you want as well. David On 28/05/2012 9:37 PM, Giovanni Gherdovich wrote: Hi all. I am in the process of setting up Solr for my a

RE: DIH full-import failure, no real error message

2010-11-16 Thread Buttler, David
I am using the solr cloud branch on 6 machines. I first load PubMed into HBase, and then push the fields I care about to solr. Indexing from HBase to solr takes about 18 minutes. Loading to hbase takes a little longer (2 hours?), but it only happens once so I haven't spent much time trying to

Per field facet limit

2010-11-17 Thread David Yang
I tried a_id 30 b_id 3 Which didn't work, as well as plain 'facet.mincount' twice which also didn't work. Cheers, David.

RE: Per field facet limit

2010-11-17 Thread David Yang
Thanks! Is there any way to apply this to facet queries as well? (I could just apply a f.field.facet.limit to each and every field, and then apply a global facet.limit for facet queries.) Cheers david -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday

RE: Per field facet limit

2010-11-17 Thread David Yang
Sorry for the typo, I meant mincount, not limit... :p Cheers, David -Original Message- From: David Yang [mailto:dy...@nextjump.com] Sent: Wednesday, November 17, 2010 6:15 PM To: solr-user@lucene.apache.org Subject: RE: Per field facet limit Thanks! Is there any way to apply this to

RE: Per field facet limit

2010-11-17 Thread David Yang
's up to you to supply a query that will give the count you want; it won't use facet.limit or facet.mincount. Those parameters apply to ordinary faceting, where you get many values per field, to filter the values per field. Each facet.query only gives you one count already. David Yang wro
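
A minimal sketch of the per-field approach discussed in this thread, assuming the f.<field>.facet.mincount parameter form implied by the messages above. The field names a_id and b_id and the values 30 and 3 come from the original question; the helper facet_params and the match-all query are illustrative only.

```python
from urllib.parse import urlencode


def facet_params(query, mincounts):
    """Build a query string with one facet.field and one per-field
    mincount override for every entry in `mincounts`."""
    params = [("q", query), ("facet", "true")]
    for field, mincount in mincounts.items():
        params.append(("facet.field", field))
        # Per-field override: applies only to this field's facet values.
        params.append((f"f.{field}.facet.mincount", mincount))
    return urlencode(params)


if __name__ == "__main__":
    print(facet_params("*:*", {"a_id": 30, "b_id": 3}))
```

As the reply above notes, these overrides do not affect facet.query, which already returns a single count per query.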

Re: Solr DataImportHandler (DIH) and Cassandra

2010-12-01 Thread David Stuart
This is good timing. I am/was just about to embark on a spike, if anyone is keen to help out. On 30 Nov 2010, at 00:37, Mark wrote: > The DataSource subclass route is what I will probably be interested in. Are > there are working examples of this already out there? > > On 11/29/10 12:32 PM, Aaron Mort

Searchers and Warmups

2011-01-13 Thread David Cramer
I'm trying to understand the mechanics behind warming up, when new searchers are registered, and their costs. A quick Google didn't point me in the right direction, so hoping for some of that here. -- David Cramer

Adding metadata to a Solr schema

2011-01-19 Thread David McLaughlin
Hi, I need to add some metadata to a schema file in Solr, such as a version and current transaction id. I need to be able to query Solr to get this information. What would be the best way to do this? Thanks, David

Re: Adding metadata to a Solr schema

2011-01-19 Thread David McLaughlin
able to set/get this data at runtime? Thanks, David On Wed, Jan 19, 2011 at 8:56 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > David, > > I'm not sure if you are asking about adding this to the schema.xml file or > to > the Solr schema and therefore the S

Tokenizer that Protects Phrases

2011-02-22 Thread David Yang
ether is a token. For ex. "laptop battery", "security camera". Kind of like protwords, but like protphrases. Is this a good idea to solve this problem? How do I implement it if it is the right way? If there is a better way of dealing with this what is it? Thanks for your time, David

RE: DIH and updating specific record

2011-02-22 Thread David Yang
Chris Hostetter answered this just recently: http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters My addition: Pass a parameter like command=delta-import&idz=31415 And access it via 'sql where id=${dataimporter.request.idz}' If the idz is a string you might need to prequote
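
The request side of this tip might look like the sketch below. The parameter names command and idz and the value 31415 are from the message; the handler URL is an assumption about a default local setup, and delta_import_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def delta_import_url(handler_url, record_id):
    """Build a DIH request that passes `idz` as an extra request parameter.

    Per the message above, the data-config SQL can then reference the
    value as ${dataimporter.request.idz}.
    """
    return handler_url + "?" + urlencode(
        {"command": "delta-import", "idz": record_id}
    )


if __name__ == "__main__":
    # Assumed handler location; adjust to the actual deployment.
    print(delta_import_url("http://localhost:8983/solr/dataimport", 31415))
```

Using urlencode rather than string concatenation also takes care of escaping when the id is a string with special characters, though quoting inside the SQL itself is still a separate concern.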

RE: Sort Stability With Date Boosting and Rounding

2011-02-22 Thread David Yang
One suggestion: use logarithms to compress the large time range into something easier to compare: 1/log(ms(now,date)) -Original Message- From: Stephen Duncan Jr [mailto:stephen.dun...@gmail.com] Sent: Tuesday, February 22, 2011 6:03 PM To: solr-user@lucene.apache.org Subject: Sort Stabili
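
To see why the logarithm helps, here is a small numeric sketch of the 1/log(age in ms) idea. It assumes a base-10 logarithm and ages well above 1 ms; the helper name recency_boost and the two sample ages are illustrative, not from the thread.

```python
import math

HOUR_MS = 60 * 60 * 1000
YEAR_MS = 365 * 24 * HOUR_MS


def recency_boost(age_ms):
    """Return 1 / log10(age_ms); newer documents get a larger value.

    Ages of 1 ms or less are rejected because log10 would be zero or
    negative there, which makes the reciprocal meaningless.
    """
    if age_ms <= 1:
        raise ValueError("age_ms must be greater than 1")
    return 1.0 / math.log10(age_ms)


if __name__ == "__main__":
    for label, age in (("1 hour", HOUR_MS), ("1 year", YEAR_MS)):
        print(label, recency_boost(age))
```

The raw ages differ by a factor of 8760, while the two boosts differ by less than a factor of two, so the boost nudges the ranking rather than dominating it.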

Detecting an empty index during start-up

2011-03-24 Thread David McLaughlin
mpler way to do what I want? Or will I just need to have a thread which waits until the Searcher is available before setting the state? Thanks, David

Re: Detecting an empty index during start-up

2011-03-25 Thread David McLaughlin
Thanks Chris. I dug into the SolrCore code and after reading some of the code I ended up going with core.getNewestSearcher(true) and this fixed the problem. David On Thu, Mar 24, 2011 at 7:20 PM, Chris Hostetter wrote: > : I am not familiar with Solr internals, so the approach I wanted to t

Dismax and worddelimiterfilter

2011-03-25 Thread David Yang
ams but that would impose a pretty large overhead, since the field could be a long normal string with one model number in it. I noticed when I used WordDelimiterFilterFactory the dismax would convert the parsed query to some pre-analyzed query. Cheers, David

does overwrite=false work with json

2011-04-03 Thread David Murphy
I'm doing some performance benchmarking of Solr and I started with a single big JSON file containing all the docs that I'm sending via curl. The results are fantastic - I'm achieving an indexing rate of about 44,000 docs/sec using this method (these are really small test docs). In the past I hav

Re: does overwrite=false work with json

2011-04-04 Thread David Murphy
I tried it with the example json documents, and even if I add overwrite=false to the URL, it still overwrites. Do this twice: curl 'http://localhost:8983/solr/update/json?commit=true&overwrite=false' --data-binary @books.json -H 'Content-type:application/json' Then do this query: curl 'http://l

Re: Solr architecture diagram

2011-04-07 Thread David MARTIN
Hi, Thank you for this contribution. Such a diagram could be useful in the official documentation. David On Thu, Apr 7, 2011 at 12:15 PM, Jeffrey Chang wrote: > This is awesome; thank you! > > On Thu, Apr 7, 2011 at 6:09 PM, Jan Høydahl wrote: > > > Hi, > > > >

problem getting Solr to commit

2011-05-27 Thread David Hill
We verified with the fiddler proxy server that when we use the Java CommonsHttpSolrServer to communicate with our Solr server we are not able to get the client to post a message back to Solr. The result is that we can't force the tail end of a batch job to commit after it has run and we can't
