Re: Responses getting truncated

2009-09-03 Thread Rupert Fiasco
So we have been running LucidWorks for Solr for about a week now and have seen no problems - so I believe it was due to that buffering issue in Jetty 6.1.3, estimated here: >>> It really looks like you're hitting a lower-level IO buffering bug >>> (esp when you see a response starting off with the

Re: Sanity check: ResonseWriter directly to a database?

2009-09-03 Thread seanoc5
Avlesh, Great response, just what I was looking for. As far as QueryResponseWriter vs RequestHandler: you're absolutely right, request handling is the way to go. It looks like I can start with something like : public class SearchSavesToDBHandler extends RequestHandlerBase implements SolrCoreAwa

Re: Exact Word Search

2009-09-03 Thread bhaskar chandrasekar
Hi shalin,   Thanks for your reply. I am not sure as how the query is formed in Solr. If you could throw some light on this , it will be helpful. Is it achievable?.   Regards Bhaskar --- On Thu, 9/3/09, Shalin Shekhar Mangar wrote: From: Shalin Shekhar Mangar Subject: Re: Exact Word Search To

Re: Sanity check: ResonseWriter directly to a database?

2009-09-03 Thread Avlesh Singh
> > Are there any hidden gotchas--or even basic suggestions--regarding > implementing something like a DBResponseWriter that puts responses right > into a database? > Absolutely not! A QueryResponseWriter with an empty "write" method fulfills all interface obligations. My only question is, why do y

Re: Problem querying for a value with a "space"

2009-09-03 Thread Chris Hostetter
: Use +specific_LIST_s:(For Sale) : or : +specific_LIST_s:"For Sale" those are *VERY* different queries. The first is just syntactic sugar for... +specific_LIST_s:For +specific_LIST_s:Sale ...which is not the same as the second query (especially when using StrField, or KeywordTokenizer) -Ho
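To make the distinction in the reply above concrete, here is a minimal Python sketch that builds both query strings. The helper names are hypothetical, and the grouped form simply mirrors the expansion quoted in the reply; it is not output from Solr's query parser.

```python
from urllib.parse import urlencode


def grouped_query(field, value):
    """Mirror the expansion quoted above: one required clause per word."""
    return " ".join(f"+{field}:{word}" for word in value.split())


def phrase_query(field, value):
    """Quote the whole value so it is sent as a single phrase."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'+{field}:"{escaped}"'


if __name__ == "__main__":
    print(grouped_query("specific_LIST_s", "For Sale"))
    print(phrase_query("specific_LIST_s", "For Sale"))
    # URL-encode the phrase form before sending it as the q parameter.
    print(urlencode({"q": phrase_query("specific_LIST_s", "For Sale")}))
```

Only the phrase form keeps "For Sale" together as one value, which is what matters for an untokenized field.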

Re: Logging solr requests

2009-09-03 Thread Chris Hostetter
: - I think that the use of log files is discouraged, but i don't know if i : can modify solr settings to log to a server (via rmi or http) : - Don't want to drop down solr response performance discouraged by who? ... having a separate process tail your log file and build an index that way is th

Solr, JNDI config, dataDir, and solr home problem

2009-09-03 Thread Archon810
Here's my problem. I'm trying to follow a multi Solr setup, straight from the Solr wiki - http://wiki.apache.org/solr/SolrTomcat#head-024d7e11209030f1dbcac9974e55106abae837ac. Here's the relevant code:

Re: Re : Using SolrJ with Tika

See https://issues.apache.org/jira/browse/SOLR-1411 On Sep 3, 2009, at 6:47 AM, Angel Ice wrote: Hi This is the solution I was testing. I got some difficulties with AutoDetectParser but I think it's the solution I will use in the end. Thanks for the advice anyway :) Regards, Laurent

Re: Sorting performance + replication of index between cores

Did you guys find a solution? I am having a similar issue. Setup: One indexer box & 2 searcher boxes, each having 6 different solr-cores. We have a lot of updates (in the range of a couple thousand items every few mins). The Snappuller/Snapinstaller pulls and commits every 5 mins. Query response time

Re: Optimal Cache Settings, complicated by regular commits

: I'm trying to work out the optimum cache settings for our Solr server, I'll : begin by outlining our usage. ...but you didn't give any information about what your cache settings look like ... size is only part of the picture, the autowarm counts are more significant. : Commit frequency: some

Re: Impact of compressed=true attribute (in schema.xml) on Indexing/Query

: Now the question is, how the compressed=true flag impacts the indexing : and Querying operations. I am sure that there will be CPU utilization : spikes as there will be operation of compressing(during indexing) and : uncompressing(during querying) of the indexed data. I am mainly looking : f

Re: SnowballPorterFilterFactory stemming word question

: If i give "machine" why is that it stems to "machin", now from where does : this word come from : If i give "revolutionary" it stems to "revolutionari", i thought it should : stem to revolution. : : How does stemming work? the porter stemmer (and all of the stemmers provided with solr) are pr
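As a rough illustration of why stems such as "machin" need not be dictionary words, here is a toy Python stemmer. It is NOT the Porter algorithm and is not what Solr runs; the three rules below are assumptions chosen only to reproduce the two examples in the question. The point is that the same reduction is applied at index time and at query time, so related forms meet at the same token.

```python
def toy_stem(word):
    """Toy suffix stripper (an illustration only, not the Porter stemmer)."""
    w = word.lower()
    # Rule 1: drop a plural "s" (but leave words ending in "ss" alone).
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]
    # Rule 2: turn a final "y" into "i" when the rest contains a vowel.
    if w.endswith("y") and any(c in "aeiou" for c in w[:-1]):
        w = w[:-1] + "i"
    # Rule 3: drop a final "e" from longer words.
    if w.endswith("e") and len(w) > 4:
        w = w[:-1]
    return w


if __name__ == "__main__":
    for term in ("machine", "machines", "revolutionary"):
        print(term, "->", toy_stem(term))
```

Because "machine" and "machines" both reduce to the same token, a query for either form matches documents containing the other, even though the token itself is not a word.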

Re: Searching with or without diacritics

Take a look at the MappingCharFilterFactory (in Solr 1.4) and/or the ISOLatin1AccentFilterFactory. : Date: Thu, 27 Aug 2009 16:30:08 +0200 : From: "[ISO-8859-1] György Frivolt" : Reply-To: solr-user@lucene.apache.org : To: solr-user : Subject: Searching with or without diacritics : : Hello, :
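For readers unfamiliar with accent folding, the following Python sketch shows the general idea behind searching "with or without diacritics". It uses Unicode decomposition from the standard library, which is an assumption made for illustration; the Solr filters named above work from their own mapping tables, and the function name here is hypothetical.

```python
import unicodedata


def fold_diacritics(text):
    """Strip combining marks so accented and plain spellings compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(fold_diacritics("György"))
```

Applying the same folding to both the indexed text and the query is what lets either spelling find the document.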

Single Core or Multiple Core?

It seems like it is really hard to decide when the Multiple Core solution is more appropriate. As I could understand from this list and wiki the Multiple Core feature was designed to address the need of handling different sets of data within the same solr instance, where the sets of data don't need

Re: Problem with ResponseBuilder

: DocListAndSet results = new DocListAndSet(); : Hits h = searcher.search(rb.getQuery()); ... : Is this the correct way to obtain the docs? Uh not really. why are you using the Hits method at all? why don't you call the searcher.search method that returns a DocListAndSet instead?

Re: how to get highlighter to only show matched term

: text. Basically, I just want to know which of the terms in my query : matched and in which field they matched (could be different from my : example). I assume that I may need to write my own Formatter for just : outputting nothing. But, I'm not sure where to start to get only my : needed ter

Re: Clarifications to Synonym Filter Wiki entry? (2 of 2)

: Earlier on the thread repeats the claim that, if you use index side : expansion, you won't have a problem. But it doesn't explain how/why that : fixes it, given that the Lucene parser still breaks on white space. because at query time, nothing knows (or cares) that that multiple variants were

Re: Best way to do a lucene matchAllDocs not using q.alt=*:*

The statistics page will also give you numDocs (it is an xml response). On Fri, Sep 4, 2009 at 2:24 AM, Uri Boness wrote: > you can use LukeRequestHandler http://localhost:8983/solr/admin/luke > > > Marc Sturlese wrote: > >> Hey there, >> I need a query to get the total number of documents in my

Re: Clarifications to Synonym Filter Wiki entry? (1 of 2)

: I believe the following section is a bit misleading; I'm sure it's correct : for the case it describes, but there's another case I've tested, which on : the surface seemed similar, but where the actual results were different and : in hindsight not really a conflict, just a surprise. the crux of

Sanity check: ResonseWriter directly to a database?

Hello all, Are there any hidden gotchas--or even basic suggestions--regarding implementing something like a DBResponseWriter that puts responses right into a database? My specific questions are: 1) Any problems adding non-trivial jars to a solr plugin? I'm thinking JDBC and then perhaps Hibernate

Re: Using scoring from another program

Function queries are what you need: http://wiki.apache.org/solr/FunctionQuery Paul Tomblin wrote: Every document I put into Solr has a field "origScore" which is a floating point number between 0 and 1 that represents a score assigned by the program that generated the document. I would like it t

Using scoring from another program

Every document I put into Solr has a field "origScore" which is a floating point number between 0 and 1 that represents a score assigned by the program that generated the document. I would like it that when I do a query, it uses that origScore in the scoring, perhaps multiplying the Solr score to

Re: Best way to do a lucene matchAllDocs not using q.alt=*:*

you can use LukeRequestHandler http://localhost:8983/solr/admin/luke Marc Sturlese wrote: Hey there, I need a query to get the total number of documents in my index. I can get if I do this using DismaxRequestHandler: q.alt=*:*&facet=false&hl=false&rows=0 I have noticed this query is very memory

Re: Field Collapsing (was Re: Schema for group/child entity setup)

The collapsed documents are represented by one "master" document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents in account (ignoring the collapsed ones). As for the scoring, the "master" document is

Re: how to scan dynamic field without specifying each field in query

I am thinking that my example was too simple/generic :-U. It is possible for several more dynamic fields to exist and other functionality to be required. i.e. what about if my example had read: http://localhost:8994/solr/select?q=((Foo1:3 OR Foo2:3 OR Foo3:3 OR … Foo999:3) AND (Bar1:1 OR Bar2:1

Re: how to scan dynamic field without specifying each field in query

Hi, maybe SIREn [1] can help you with this task. SIREn is a Lucene plugin that allows you to index and query tabular data. You can for example create a SIREn field "foo", index n values in n cells, and then query a specific cell or a range of cells. Unfortunately, the Solr plugin is not yet availa

Re: how to scan dynamic field without specifying each field in query

A query parser, maybe. But that would not help either. At the end of the day, someone has to create those many boolean queries in your case. Cheers Avlesh On Thu, Sep 3, 2009 at 10:59 PM, gdeconto wrote: > > thx for the reply. > > you mean into a multivalue field? possible, but was wondering if there

Re: How to use DataImportHandler with ExtractingRequestHandler?

Hi Khai, a few weeks ago, I was facing the same problem. In my case, this workaround helped (assuming, you're using Solr 1.3): For each row, extract the content from the corresponding pdf file using a parser library of your choice (I suggest Apache PDFBox or Apache Tika in case you need to pr

RE: how to scan dynamic field without specifying each field in query

thx for the reply. you mean into a multivalue field? possible, but was wondering if there was something more flexible than that. the ability to use a function (ie myfunction) would open up some possibilities for more complex searching and search syntax. I could write my own query parser with s

Re: how to scan dynamic field without specifying each field in query

> > I know I can do this via this: http://localhost:8994/solr/select?q=(Foo1:3 OR > Foo2:3 OR Foo3:3 OR ... Foo999:3) > Careful! You may hit the upper limit for MAX_BOOLEAN_CLAUSES this way. > You can copy the dynamic fields value into a different field and query on > that field. > Good idea! Ch
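The warning above can be checked before a query is sent. This Python sketch builds the long OR query from the original question and counts its clauses against a limit. The value 1024 is an assumption based on the commonly shipped default; the real limit is whatever the Solr instance is configured with, and the function names are hypothetical.

```python
# Assumed limit; the actual value is set in the Solr configuration.
MAX_BOOLEAN_CLAUSES = 1024


def build_or_query(prefix, value, count):
    """Return the OR query over prefix1..prefixN and its clause count."""
    clauses = [f"{prefix}{i}:{value}" for i in range(1, count + 1)]
    return "(" + " OR ".join(clauses) + ")", len(clauses)


def exceeds_limit(clause_count, limit=MAX_BOOLEAN_CLAUSES):
    """True when the query would have more clauses than the limit allows."""
    return clause_count > limit


if __name__ == "__main__":
    query, n = build_or_query("Foo", 3, 999)
    print(n, exceeds_limit(n))
    query, n = build_or_query("Foo", 3, 2000)
    print(n, exceeds_limit(n))
```

Copying the dynamic fields into one catch-all field, as suggested in the thread, replaces all of these clauses with a single one.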

RE: how to scan dynamic field without specifying each field in query

You can copy the dynamic fields value into a different field and query on that field. Thanks, Kalyan Manepalli -Original Message- From: gdeconto [mailto:gerald.deco...@topproducer.com] Sent: Thursday, September 03, 2009 12:06 PM To: solr-user@lucene.apache.org Subject: how to scan dynam

how to scan dynamic field without specifying each field in query

say I have a dynamic field called Foo* (where * can be in the hundreds) and want to search Foo* for a value of 3 (for example) I know I can do this via this: http://localhost:8994/solr/select?q=(Foo1:3 OR Foo2:3 OR Foo3:3 OR … Foo999:3) However, is there a better way? i.e. is there some way to

RE: Solr question

Response with id:doc4 is OK 0 3 on 0 id:doc4 2.2 10 Sami Siren application/pdf Example PDF document Tika Solr Cell This is a sample piece of content for Tika Solr Cell article. Wed Dec 31 10:17:13 CET 2008 Writer OpenOffice.org 3.0 applicati

Default Query Type For Facet Queries

We have a custom query parser plugin registered as the default for searches, and we'd like to have the same parser used for facet.query. Is there a way to register it as the default for FacetComponent in solrconfig.xml? I know I can add {!type=customparser} to each query as a workaround, but I'd

Best way to do a lucene matchAllDocs not using q.alt=*:*

Hey there, I need a query to get the total number of documents in my index. I can get it if I do this using DismaxRequestHandler: q.alt=*:*&facet=false&hl=false&rows=0 I have noticed this query is very memory consuming. Is there any more optimized way in trunk to get the total number of documents of
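For context, with rows=0 the only useful part of the response is the numFound attribute on the result element. The Python sketch below rebuilds the parameter string from the message and reads numFound from an XML response. The sample response is made up for illustration, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Parameters copied from the message above.
COUNT_PARAMS = {"q.alt": "*:*", "facet": "false", "hl": "false", "rows": 0}

# Made-up sample response, trimmed to the part that matters here.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="42" start="0"/>
</response>"""


def count_query_string(params=COUNT_PARAMS):
    """URL-encode the count-only parameters."""
    return urlencode(params)


def parse_num_found(xml_text):
    """Read the total hit count from a Solr XML response body."""
    result = ET.fromstring(xml_text).find("result")
    return int(result.attrib["numFound"])


if __name__ == "__main__":
    print(count_query_string())
    print(parse_num_found(SAMPLE_RESPONSE))
```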

Re: score = sum of boosts

You could start with a TF formula that ignores frequencies above 1. "onOffTF", I guess, returning 1 if the term is there one or more times. Or, you could tell us what you are trying to achieve. wunder On Sep 3, 2009, at 12:28 AM, Shalin Shekhar Mangar wrote: On Thu, Sep 3, 2009 at 4:09 AM, J

RE: Solr question

Thanks. My idea was that if I have in schema.xml Everything was stored in the index. The query "solr" or other stuff works well only with text given in the sample files Rgds Bruno > -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: Thursday, 3 September 200

Re : Using SolrJ with Tika

Hi This is the solution I was testing. I got some difficulties with AutoDetectParser but I think it's the solution I will use in the end. Thanks for the advice anyway :) Regards, Laurent From: Abdullah Shaikh To: solr-user@lucene.apache.org Sent: Thu

Re: Exact Word Search

On Thu, Sep 3, 2009 at 1:33 PM, bhaskar chandrasekar wrote: > Hi, > > Can any one help me with the below scenario?. > > Scenario : > > I have integrated Solr with Carrot2. > The issue is > Assuming i give "bhaskar" as input string for search. > It should give me search results pertaining to bhaska

Re: Using SolrJ with Tika

Hi Laurent, I am not sure if this is what you need, but you can extract the content from the uploaded document (MS Docs, PDF etc) using TIKA and then send it to SOLR for indexing. String CONTENT = extract the content using TIKA (you can use AutoDetectParser) and then, SolrInputDocument doc = ne

Indexing docs using TIKA

I am not sure if this went to Mailing List before.. hence forwarding again Hi All, I want to search for a document containing "string to search", price between 100 to 200 and weight 10-20. SolrQuery query = new SolrQuery(); query.setQuery( "DOC_CONTENT: string to search"); query.setFilterQuerie
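The search described above (a phrase plus two numeric ranges) maps onto one q parameter and two fq range filters. Here is a hedged Python sketch of the request parameters. The field names price and weight are assumptions, since the message is cut off before the filters are shown, and quoting the text as a phrase is an editorial choice rather than what the poster wrote.

```python
from urllib.parse import urlencode


def build_search_params(text, price_range, weight_range):
    """Encode a phrase query plus two inclusive range filters."""
    return urlencode([
        ("q", f'DOC_CONTENT:"{text}"'),
        # Hypothetical field names; adjust to the actual schema.
        ("fq", f"price:[{price_range[0]} TO {price_range[1]}]"),
        ("fq", f"weight:[{weight_range[0]} TO {weight_range[1]}]"),
    ])


if __name__ == "__main__":
    print(build_search_params("string to search", (100, 200), (10, 20)))
```

Keeping the ranges in fq rather than in q means they restrict the result set without taking part in relevance scoring.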

Re: Solr question

On Sep 3, 2009, at 1:24 AM, SEZNEC Bruno wrote: Hi, Following the solr tutorial, I send doc to solr by request : curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&map.content=attr_content&commit=true' --F "myfi...@oxiane.pdf" 023717 Reply seems OK, content is in the

Re: Question: How do I run the solr analysis tool programtically ?

Hi Yatir, The FieldAnalysisRequestHandler has the same behavior as the analysis tool. It will show you the list of tokens that are created after each of the filters have been applied. It can be used through normal HTTP requests, or you can use SolrJ's support. Thanks, Chris On Thu, Sep 3, 2009

Question: How do I run the solr analysis tool programtically ?

From Java code I want to contact solr through HTTP and supply a text buffer (or a url that returns text, whatever is easier) and I want to get in return the final list of tokens (or the final text buffer) after it went through all the query time filters defined for this solr instance (stemming, st

Re: Field Collapsing (was Re: Schema for group/child entity setup)

Thanks Uri. How do paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness wrote: > The development on this patch is quite active. It works well for single > solr instance, but distributed search (ie

Re: questions about solr

On Wed, Sep 2, 2009 at 10:44 PM, Zhenyu Zhong wrote: > Dear all, > > I am very interested in Solr and would like to deploy Solr for distributed > indexing and searching. I hope you are the right Solr expert who can help > me > out. > However, I have concerns about the scalability and management ov

Solr question

Hi, Following the solr tutorial, I send doc to solr by request : curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&map.content=attr_content&commit=true' --F "myfi...@oxiane.pdf" 023717 Reply seems OK, content is in the index, but after no query match the doc... TIA Regar

Exact Word Search

Hi,   Can any one help me with the below scenario?.   Scenario :   I have integrated Solr with Carrot2. The issue is Assuming i give "bhaskar" as input string for search. It should give me search results pertaining to bhaskar only.  Example: It should not display search results as "chandarbhaskar"

Re: Field Collapsing (was Re: Schema for group/child entity setup)

The development on this patch is quite active. It works well for single solr instance, but distributed search (ie. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing - adjacent and non-adjacent, the for
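The message above is cut off before it explains the two flavors, so the following Python sketch rests on an assumption about what they mean: adjacent collapsing merges only consecutive results that share the field value, while non-adjacent collapsing keeps one result per value across the whole list. It illustrates the idea on plain dictionaries and is not the patch's implementation.

```python
from itertools import groupby


def collapse_adjacent(docs, field):
    """Keep the first doc of each run of consecutive docs sharing a value."""
    return [next(group) for _, group in groupby(docs, key=lambda d: d[field])]


def collapse_non_adjacent(docs, field):
    """Keep the first doc seen for each distinct value, wherever it occurs."""
    seen = set()
    kept = []
    for doc in docs:
        if doc[field] not in seen:
            seen.add(doc[field])
            kept.append(doc)
    return kept


if __name__ == "__main__":
    results = [
        {"id": 1, "site": "A"},
        {"id": 2, "site": "A"},
        {"id": 3, "site": "B"},
        {"id": 4, "site": "A"},
    ]
    print([d["id"] for d in collapse_adjacent(results, "site")])
    print([d["id"] for d in collapse_non_adjacent(results, "site")])
```

In both cases the first document of each group plays the role of the representative that appears in the normal result list.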

Re: Return 2 fields per facet.. name and id, for example? / facet value search

On Fri, Aug 28, 2009 at 12:57 AM, Rihaed Tan wrote: > Hi, > > I have a similar requirement to Matthew (from his post 2 years ago). Is > this > still the way to go in storing both the ID and name/value for facet values? > I'm planning to use id#name format if this is still the case and doing a > p

Re: Problem querying for a value with a "space"

On Thu, Sep 3, 2009 at 1:45 AM, Adam Allgaier wrote: > > omitNorms="true"/> > ... > > > I am indexing the "specific_LIST_s" with the value "For Sale". > The document indexes just fine. A query returns the document with the > proper value: >For Sale > > However, when I try to query on that

Re: score = sum of boosts

On Thu, Sep 3, 2009 at 4:09 AM, Joe Calderon wrote: > hello *, what would be the best approach to return the sum of boosts > as the score? > > ex: > a dismax handler boosts matches to field1^100 and field2^50, a query > matches both fields hence the score for that row would be 150 > > Not really.

Re: WordDelimiterFilter to QueryParser to MultiPhraseQuery?

On Mon, Aug 31, 2009 at 10:47 PM, jOhn wrote: > This is mostly my misunderstanding of catenateAll="1" as I thought it would > break down with an OR using the full concatenated word. > > Thus: > > Jokers Wild -> { jokers, wild } OR { jokerswild } > > But really it becomes: { jokers, {wild, jokersw