So we have been running LucidWorks for Solr for about a week now and
have seen no problems - so I believe it was due to that buffering
issue in Jetty 6.1.3, as diagnosed here:
>>> It really looks like you're hitting a lower-level IO buffering bug
>>> (esp when you see a response starting off with the
Avlesh,
Great response, just what I was looking for.
As far as QueryResponseWriter vs RequestHandler: you're absolutely right,
request handling is the way to go. It looks like I can start with something
like :
public class SearchSavesToDBHandler extends RequestHandlerBase implements
SolrCoreAware
Hi shalin,
Thanks for your reply.
I am not sure how the query is formed in Solr.
If you could shed some light on this, it would be helpful.
Is it achievable?
Regards
Bhaskar
--- On Thu, 9/3/09, Shalin Shekhar Mangar wrote:
From: Shalin Shekhar Mangar
Subject: Re: Exact Word Search
To
>
> Are there any hidden gotchas--or even basic suggestions--regarding
> implementing something like a DBResponseWriter that puts responses right
> into a database?
>
Absolutely not! A QueryResponseWriter with an empty "write" method fulfills
all interface obligations. My only question is, why do y
: Use +specific_LIST_s:(For Sale)
: or
: +specific_LIST_s:"For Sale"
those are *VERY* different queries.
The first is just syntactic sugar for...
+specific_LIST_s:For +specific_LIST_s:Sale
...which is not the same as the second query (especially when using
StrField, or KeywordTokenizer)
-Ho
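The desugaring above can be sketched as a toy rewrite (illustrative only; the real work happens inside Lucene's query parser, and the field/term names are taken from this thread):

```python
def desugar_group(field, words, required=True):
    """Rewrite field:(w1 w2 ...) into the separate per-term clauses the
    Lucene query parser produces. A quoted "w1 w2 ..." would instead be
    parsed as a single phrase/term query against the whole string."""
    prefix = "+" if required else ""
    return " ".join("%s%s:%s" % (prefix, field, w) for w in words)

print(desugar_group("specific_LIST_s", ["For", "Sale"]))
# +specific_LIST_s:For +specific_LIST_s:Sale
```

With a StrField, only the second (quoted) form can match the stored value "For Sale", since the index holds it as one un-tokenized term.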
: - I think that the use of log files is discouraged, but i don't know if i
: can modify solr settings to log to a server (via rmi or http)
: - Don't want to drop down solr response performance
discouraged by who? ... having a separate process tail your log file and
build an index that way is th
Here's my problem.
I'm trying to follow a multi Solr setup, straight from the Solr wiki -
http://wiki.apache.org/solr/SolrTomcat#head-024d7e11209030f1dbcac9974e55106abae837ac.
Here's the relevant code:
Re: Re : Using SolrJ with Tika
See https://issues.apache.org/jira/browse/SOLR-1411
On Sep 3, 2009, at 6:47 AM, Angel Ice wrote:
Hi
This is the solution I was testing.
I got some difficulties with AutoDetectParser but I think it's the
solution I will use in the end.
Thanks for the advice anyway :)
Regards,
Laurent
Did you guys find a solution?
I am having a similar issue.
Setup:
One indexer box & 2 searcher boxes, each having 6 different solr-cores.
We have a lot of updates (in the range of a couple thousand items every few
mins).
The Snappuller/Snapinstaller pulls and commits every 5 mins.
Query response time
: I'm trying to work out the optimum cache settings for our Solr server, I'll
: begin by outlining our usage.
...but you didn't give any information about what your cache settings look
like ... size is only part of the picture, the autowarm counts are more
significant.
: Commit frequency: some
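For reference, those settings live in solrconfig.xml; a sketch with placeholder numbers (tune size and autowarmCount against your own hit rates; documentCache cannot usefully autowarm because internal doc ids change between searchers):

```xml
<!-- solrconfig.xml excerpt; all numbers are placeholders to tune -->
<filterCache      class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="1024"/>
<documentCache    class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="0"/>
```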
: Now the question is, how the compressed=true flag impacts the indexing
: and Querying operations. I am sure that there will be CPU utilization
: spikes as there will be operation of compressing(during indexing) and
: uncompressing(during querying) of the indexed data. I am mainly looking
: f
: If i give "machine" why is that it stems to "machin", now from where does
: this word come from
: If i give "revolutionary" it stems to "revolutionari", i thought it should
: stem to revolution.
:
: How does stemming work?
the porter stemmer (and all of the stemmers provided with solr) are
pr
Take a look at the MappingCharFilterFactory (in Solr 1.4) and/or the
ISOLatin1AccentFilterFactory.
: Date: Thu, 27 Aug 2009 16:30:08 +0200
: From: György Frivolt
: Reply-To: solr-user@lucene.apache.org
: To: solr-user
: Subject: Searching with or without diacritics
:
: Hello,
:
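A schema.xml sketch combining the advice above (the fieldType name is made up; ISOLatin1AccentFilterFactory folds é→e, ö→o, etc. at both index and query time, so queries with or without diacritics match the same terms):

```xml
<!-- schema.xml excerpt; "text_folded" is a hypothetical name -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```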
It seems like it is really hard to decide when the Multiple Core solution is
more appropriate. As I could understand from this list and the wiki, the Multiple
Core feature was designed to address the need of handling different sets of
data within the same solr instance, where the sets of data don't need
: DocListAndSet results = new DocListAndSet();
: Hits h = searcher.search(rb.getQuery());
...
: Is this the correct way to obtain the docs?
Uh, not really. Why are you using the Hits method at all? Why don't
you call the searcher.search method that returns a DocListAndSet instead?
: text. Basically, I just want to know which of the terms in my query
: matched and in which field they matched (could be different from my
: example). I assume that I may need to write my own Formatter for just
: outputting nothing. But, I'm not sure where to start to get only my
: needed ter
: Earlier on the thread repeats the claim that, if you use index side
: expansion, you won't have a problem. But it doesn't explain how/why that
: fixes it, given that the Lucene parser still breaks on white space.
because at query time, nothing knows (or cares) that that multiple
variants were
The statistics page will also give you numDocs (it is an xml response).
On Fri, Sep 4, 2009 at 2:24 AM, Uri Boness wrote:
> you can use LukeRequestHandler http://localhost:8983/solr/admin/luke
>
>
> Marc Sturlese wrote:
>
>> Hey there,
>> I need a query to get the total number of documents in my
: I believe the following section is a bit misleading; I'm sure it's correct
: for the case it describes, but there's another case I've tested, which on
: the surface seemed similar, but where the actual results were different and
: in hindsight not really a conflict, just a surprise.
the crux of
Hello all,
Are there any hidden gotchas--or even basic suggestions--regarding
implementing something like a DBResponseWriter that puts responses right
into a database? My specific questions are:
1) Any problems adding non-trivial jars to a solr plugin? I'm thinking JDBC
and then perhaps Hibernate
Function queries is what you need: http://wiki.apache.org/solr/FunctionQuery
Paul Tomblin wrote:
Every document I put into Solr has a field "origScore" which is a
floating point number between 0 and 1 that represents a score assigned
by the program that generated the document. I would like it t
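A sketch of what the function-query suggestion looks like on the request, assuming the dismax handler (`bf` adds the function's value into the score; Solr 1.4's `{!boost}` query parser multiplies instead):

```python
from urllib.parse import urlencode

# Additive boost: dismax's bf parameter folds origScore into the ranking.
params = {
    "defType": "dismax",
    "q": "some query",
    "bf": "origScore",            # boost function: adds the field's value
    "fl": "id,score,origScore",   # return the pieces for inspection
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```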
Every document I put into Solr has a field "origScore" which is a
floating point number between 0 and 1 that represents a score assigned
by the program that generated the document. I would like it that when
I do a query, it uses that origScore in the scoring, perhaps
multiplying the Solr score to
you can use LukeRequestHandler http://localhost:8983/solr/admin/luke
Marc Sturlese wrote:
Hey there,
I need a query to get the total number of documents in my index. I can get
if I do this using DismaxRequestHandler:
q.alt=*:*&facet=false&hl=false&rows=0
I have noticed this query is very memory
The collapsed documents are represented by one "master" document which
can be part of the normal search result (the doc list), so pagination
just works as expected, meaning taking only the returned documents in
account (ignoring the collapsed ones). As for the scoring, the "master"
document is
I am thinking that my example was too simple/generic :-U. It is possible for
several more dynamic fields to exist and other functionality to be required.
i.e. what about if my example had read:
http://localhost:8994/solr/select?q=((Foo1:3 OR Foo2:3 OR Foo3:3 OR …
Foo999:3) AND (Bar1:1 OR Bar2:1
Hi,
maybe SIREn [1] can help you with this task. SIREn is a Lucene plugin
that lets you index and query tabular data. You can, for example, create
a SIREn field "foo", index n values in n cells, and then query a
specific cell or a range of cells. Unfortunately, the Solr plugin is not
yet availa
A query parser, maybe.
But that would not help either. At the end of the day, someone has to create
those many boolean queries in your case.
Cheers
Avlesh
On Thu, Sep 3, 2009 at 10:59 PM, gdeconto wrote:
>
> thx for the reply.
>
> you mean into a multivalue field? possible, but was wondering if there
Hi Khai,
a few weeks ago, I was facing the same problem.
In my case, this workaround helped (assuming, you're using Solr 1.3):
For each row, extract the content from the corresponding pdf file using
a parser library of your choice (I suggest Apache PDFBox or Apache Tika
in case you need to pr
thx for the reply.
you mean into a multivalue field? possible, but was wondering if there was
something more flexible than that. the ability to use a function (ie
myfunction) would open up some possibilities for more complex searching and
search syntax.
I could write my own query parser with s
>
> I know I can do this via this: http://localhost:8994/solr/select?q=(Foo1:3 OR
> Foo2:3 OR Foo3:3 OR ... Foo999:3)
>
Careful! You may hit the upper limit for MAX_BOOLEAN_CLAUSES this way.
> You can copy the dynamic fields value into a different field and query on
> that field.
>
Good idea!
Ch
You can copy the dynamic fields value into a different field and query on that
field.
Thanks,
Kalyan Manepalli
-Original Message-
From: gdeconto [mailto:gerald.deco...@topproducer.com]
Sent: Thursday, September 03, 2009 12:06 PM
To: solr-user@lucene.apache.org
Subject: how to scan dynam
say I have a dynamic field called Foo* (where * can be in the hundreds) and
want to search Foo* for a value of 3 (for example)
I know I can do this via this:
http://localhost:8994/solr/select?q=(Foo1:3 OR Foo2:3 OR Foo3:3 OR …
Foo999:3)
However, is there a better way? i.e. is there some way to
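The brute-force expansion can at least be generated rather than typed out; a sketch (mind Solr's maxBooleanClauses, 1024 by default, long before Foo999):

```python
def dynamic_field_query(prefix, value, n):
    """Build (Foo1:v OR Foo2:v OR ... OR Foon:v) for a dynamic field family."""
    clauses = " OR ".join("%s%d:%s" % (prefix, i, value) for i in range(1, n + 1))
    return "(%s)" % clauses

print(dynamic_field_query("Foo", 3, 3))
# (Foo1:3 OR Foo2:3 OR Foo3:3)
```

The copyField suggestion elsewhere in the thread avoids the expansion entirely by searching one aggregate field instead.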
Response with id:doc4 is OK. (The raw XML response lost its markup to the
browser's XML viewer; stripped of the "−" collapse toggles, it showed a
responseHeader of status 0 with QTime 3, the echoed params (on, 0, id:doc4,
2.2, 10), and one matching document whose Tika-extracted fields included:
Sami Siren, application/pdf, "Example PDF document Tika Solr Cell", "This is
a sample piece of content for Tika Solr Cell article.", Wed Dec 31 10:17:13
CET 2008, Writer, OpenOffice.org 3.0, applicati
We have a custom query parser plugin registered as the default for searches,
and we'd like to have the same parser used for facet.query.
Is there a way to register it as the default for FacetComponent in
solrconfig.xml?
I know I can add {!type=customparser} to each query as a workaround, but I'd
Hey there,
I need a query to get the total number of documents in my index. I can get
if I do this using DismaxRequestHandler:
q.alt=*:*&facet=false&hl=false&rows=0
I have noticed this query is very memory consuming. Is there any more
optimized way in trunk to get the total number of documents of
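Two cheap ways to read the total, sketched as URLs (host/port assumed; numFound on a rows=0 match-all equals numDocs, and numTerms=0 keeps the Luke handler from computing the expensive per-field top-term statistics):

```python
from urllib.parse import urlencode

select_url = ("http://localhost:8983/solr/select?"
              + urlencode({"q": "*:*", "rows": 0}))   # read numFound
luke_url = ("http://localhost:8983/solr/admin/luke?"
            + urlencode({"numTerms": 0}))             # read numDocs
print(select_url)
print(luke_url)
```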
You could start with a TF formula that ignores frequencies above 1.
"onOffTF", I guess, returning 1 if the term is there one or more times.
Or, you could tell us what you are trying to achieve.
wunder
On Sep 3, 2009, at 12:28 AM, Shalin Shekhar Mangar wrote:
On Thu, Sep 3, 2009 at 4:09 AM, J
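In Lucene you would implement this by overriding Similarity.tf(float) in Java; the formula itself is trivial (sketched in Python here, next to DefaultSimilarity's sqrt(freq) for comparison):

```python
import math

def on_off_tf(freq):
    """'onOffTF': 1.0 whenever the term occurs, regardless of how often."""
    return 1.0 if freq > 0 else 0.0

def default_tf(freq):
    """Lucene DefaultSimilarity.tf() for comparison: sqrt(freq)."""
    return math.sqrt(freq)

assert on_off_tf(7) == on_off_tf(1) == 1.0
assert default_tf(4) == 2.0
```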
Thanks
My idea was that if I have
in schema.xml
everything was stored in the index.
The query "solr" or other terms works well only with the text given in the sample
files
Rgds
Bruno
> -Original Message-
> From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
> Sent: Thursday, September 3, 200
Hi
This is the solution I was testing.
I got some difficulties with AutoDetectParser but I think it's the solution I
will use in the end.
Thanks for the advice anyway :)
Regards,
Laurent
From: Abdullah Shaikh
To: solr-user@lucene.apache.org
Sent: Thu
On Thu, Sep 3, 2009 at 1:33 PM, bhaskar chandrasekar
wrote:
> Hi,
>
> Can anyone help me with the below scenario?
>
> Scenario :
>
> I have integrated Solr with Carrot2.
> The issue is
> Assuming i give "bhaskar" as input string for search.
> It should give me search results pertaining to bhaska
Hi Laurent,
I am not sure if this is what you need, but you can extract the content from
the uploaded document (MS Docs, PDF etc) using TIKA and then send it to SOLR
for indexing.
String CONTENT = extract the content using TIKA (you can use
AutoDetectParser)
and then,
SolrInputDocument doc = ne
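The SolrJ fragment above boils down to posting Solr's <add> update XML; a stand-alone sketch of what a populated SolrInputDocument becomes on the wire (field names are hypothetical):

```python
import xml.etree.ElementTree as ET

def to_add_xml(fields):
    """Render a dict of field name -> value as a Solr <add><doc> update
    message, i.e. the XML a SolrInputDocument is serialized to."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", name=name).text = str(value)
    return ET.tostring(add, encoding="unicode")

print(to_add_xml({"id": "doc1", "content": "text extracted by Tika"}))
```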
I am not sure if this went to the Mailing List before... hence forwarding again
Hi All,
I want to search for a document containing "string to search", price between
100 to 200 and weight 10-20.
SolrQuery query = new SolrQuery();
query.setQuery( "DOC_CONTENT: string to search");
query.setFilterQuerie
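setQuery/setFilterQueries translate to a q parameter plus repeated fq parameters; a sketch of the resulting request (the field names PRICE and WEIGHT are assumptions; the bracketed range syntax is Solr's):

```python
from urllib.parse import urlencode

params = [
    ("q", "DOC_CONTENT:(string to search)"),
    ("fq", "PRICE:[100 TO 200]"),   # filter queries are cached separately
    ("fq", "WEIGHT:[10 TO 20]"),    # and do not influence scoring
]
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```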
On Sep 3, 2009, at 1:24 AM, SEZNEC Bruno wrote:
Hi,
Following solr tuto,
I send doc to solr by request :
curl
'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&map.content=attr_content&commit=true' -F "myfi...@oxiane.pdf"
023717
Reply seems OK, content is in the
Hi Yatir,
The FieldAnalysisRequestHandler has the same behavior as the analysis tool.
It will show you the list of tokens that are created after each of the
filters have been applied. It can be used through normal HTTP requests, or
you can use SolrJ's support.
Thanks,
Chris
On Thu, Sep 3, 2009
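That boils down to a single GET against the handler (registered at /analysis/field in the Solr 1.4 example solrconfig.xml; parameter names are FieldAnalysisRequestHandler's):

```python
from urllib.parse import urlencode

params = {
    "analysis.fieldtype": "text",                  # or analysis.fieldname
    "analysis.fieldvalue": "Running RUNS runner",  # text to run the filters on
    "analysis.query": "run",                       # optional: mark matching tokens
    "analysis.showmatch": "true",
}
url = "http://localhost:8983/solr/analysis/field?" + urlencode(params)
print(url)
```

The response lists the token stream after each tokenizer/filter stage, exactly like the admin analysis page.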
Form java code I want to contact solr through Http and supply a text buffer
(or a url that returns text, whatever is easier) and I want to get in return
the final list of tokens (or the final text buffer) after it went through
all the query time filters defined for this solr instance (stemming, st
Thanks Uri. How does paging and scoring work when using field collapsing?
What patch works with 1.3? Is it production ready?
R
On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness wrote:
> The development on this patch is quite active. It works well for single
> solr instance, but distributed search (ie
On Wed, Sep 2, 2009 at 10:44 PM, Zhenyu Zhong wrote:
> Dear all,
>
> I am very interested in Solr and would like to deploy Solr for distributed
> indexing and searching. I hope you are the right Solr expert who can help
> me
> out.
> However, I have concerns about the scalability and management ov
Hi,
Following solr tuto,
I send doc to solr by request :
curl
'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&map.content=attr_content&commit=true' -F "myfi...@oxiane.pdf"
023717
Reply seems OK, content is in the index,
but after no query match the doc...
TIA
Regar
Hi,
Can anyone help me with the below scenario?
Scenario :
I have integrated Solr with Carrot2.
The issue is
Assuming I give "bhaskar" as the input string for search.
It should give me search results pertaining to bhaskar only.
Example: It should not display search results as "chandarbhaskar"
The development on this patch is quite active. It works well for single
solr instance, but distributed search (ie. shards) is not yet supported.
Using this patch you can group search results based on a specific field.
There are two flavors of field collapsing - adjacent and non-adjacent,
the for
On Fri, Aug 28, 2009 at 12:57 AM, Rihaed Tan wrote:
> Hi,
>
> I have a similar requirement to Matthew (from his post 2 years ago). Is
> this
> still the way to go in storing both the ID and name/value for facet values?
> I'm planning to use id#name format if this is still the case and doing a
> p
On Thu, Sep 3, 2009 at 1:45 AM, Adam Allgaier wrote:
>
> omitNorms="true"/>
> ...
>
>
> I am indexing the "specific_LIST_s" with the value "For Sale".
> The document indexes just fine. A query returns the document with the
> proper value:
>For Sale
>
> However, when I try to query on that
On Thu, Sep 3, 2009 at 4:09 AM, Joe Calderon wrote:
> hello *, what would be the best approach to return the sum of boosts
> as the score?
>
> ex:
> a dismax handler boosts matches to field1^100 and field2^50, a query
> matches both fields hence the score for that row would be 150
>
>
Not really.
On Mon, Aug 31, 2009 at 10:47 PM, jOhn wrote:
> This is mostly my misunderstanding of catenateAll="1" as I thought it would
> break down with an OR using the full concatenated word.
>
> Thus:
>
> Jokers Wild -> { jokers, wild } OR { jokerswild }
>
> But really it becomes: { jokers, {wild, jokersw