Hi. Lance.
Thanks for replying.
Yes. In particular, I checked the schema.xml and ran another simple test.
The broker is running on localhost:7499/solr. A solr instance is running on
localhost:7498/solr. For this test, I only use these 2 instances. 7499's
index is empty. 7498 has 12 documents in inde
Hi Otis, Thanks for the update.
My parametric search has to span the customer table and 30 child tables.
We have close to 1 million customers. Do you think Lucene/Solr is the right
solution for such requirements, or would a database search be more optimal?
Regards,
Amit
I want to integrate a document's timestamp into search scoring, and I
found an example in the book "Solr 1.4 Enterprise Search Server" about
function queries. I want to boost documents that are newer, so it may
be a function such as 1/(timestamp+1). But the function query is
added to the final resu
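A sketch of the kind of request-handler configuration this describes, assuming a dismax handler and a date-typed field named `timestamp` (both assumptions; adjust to your schema). `recip(ms(NOW,timestamp),3.16e-11,1,1)` is the recency-boost idiom from the Solr wiki and plays the same role as 1/(timestamp+1):

```xml
<!-- solrconfig.xml (sketch): boost newer documents via a function query.
     Field name "timestamp" and handler name are illustrative. -->
<requestHandler name="/recent" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">text</str>
    <!-- recip(x,m,a,b) = a/(m*x + b); here x is the document age in ms,
         and 3.16e-11 is roughly 1/(one year in ms), so scores decay
         smoothly with age and newer docs get a boost closer to 1.0 -->
    <str name="bf">recip(ms(NOW,timestamp),3.16e-11,1,1)</str>
  </lst>
</requestHandler>
```

With `bf`, the boost is added to the final score; multiplying it in instead requires a different approach, so whether this matches the book's example is worth double-checking.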
Do all of the Solr instances, including the broker, use the same schema.xml?
On 6/22/10, Scott Zhang wrote:
> Hi. All.
> I was using distributed search over 30 Solr instances; previously it
> was using the standard query handler, and the result was returned correctly.
> each result has 2 fie
You need to share with us the Solr request you made, and any custom
request handler settings that it might map to. Chances are you just need
to twiddle with the highlighter parameters (see wiki for docs) to get
it to do what you want.
Erik
On Jun 22, 2010, at 4:42 PM, n...@frameweld.
Martijn - Maybe the patches to SolrIndexSearcher could be extracted
into a new issue so that we can put in the infrastructure at least.
That way this could truly be a drop-in plugin without it actually
being in core. I haven't looked at the specifics, but I imagine we
could get the core s
you want a combination of WhitespaceTokenizer and EdgeNGramFilter
http://lucene.apache.org/solr/api/org/apache/solr/analysis/WhitespaceTokenizerFactory.html
http://lucene.apache.org/solr/api/org/apache/solr/analysis/EdgeNGramFilterFactory.html
the first will create tokens for each word; the second
Hi,
Can you guys recommend which types/filters to use to accomplish
the following partial keyword match:
A. Actual Indexed Term: "bank of america"
B. User Enters Search Term: "of ameri"
I would like SOLR to match document "bank of america" with the partial string
"of ameri"
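Following the WhitespaceTokenizer + EdgeNGramFilter recommendation in the reply above, a schema.xml fieldType sketch (type name and gram sizes are illustrative) that would let "of ameri" match "bank of america" by indexing leading-edge grams of every word:

```xml
<!-- schema.xml (sketch): edge n-grams at index time, plain tokens at query
     time, so each query word matches as a prefix of an indexed word. -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "america" is indexed as a, am, ame, ameri, ... up to 15 chars -->
    <filter class="solr.EdgeNGramFilterFactory"
            minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note this matches per-word prefixes ("of" and "ameri" each match), not arbitrary substrings; for mid-word matches you would need NGramFilterFactory instead.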
I checked your stacktrace and I can't remember putting
SolrIndexSearcher.getDocListAndSet(...) in the doQuery(...) method. I
guess the patch was modified before it was applied.
I think the error occurs when you do a field collapse search with a fq
parameter. That is the only reason I can think of w
What exactly did not work? Patching, compiling or running it?
On 22 June 2010 16:06, Rakhi Khatwani wrote:
> Hi,
> I tried checking out the latest code (rev 956715); the patch did not
> work on it.
> In fact I even tried hunting for the revision mentioned earlier in this
> thread (i.e. rev 955
Hi, I need help with highlighting fields that match a query. So far, my
results only highlight if the field is all_text, and I would like it to
use other fields as well. Simply turning highlighting on doesn't do it. Any
ideas why it only applies to all_text? Here is my schema:
The fields I'm sorting over are dynamic, so one query sorts on
erick_time_1, erick_timeA_1 and another sorts on erick_time_2, and so
on. What we see in the heap are a lot of arrays, most of them filled
with 0s, maybe because these timestamp fields are not
present in all the documents.
By the w
Muneeb Ali wrote:
>
> Hi Blargy,
>
> Nice to hear that I am not alone ;)
>
> Well we have been using Hadoop for other data-intensive services, those
> that can be done in parallel. We have multiple nodes, which are used by
> Hadoop for all our MapReduce jobs. I personally don't have much expe
as always: it depends.
take a look into hibernate search also, which is lucene powered.
Peter.
> I have a complex data model with bidirectional relations. I use Hibernate
> as the ORM provider, so I have several model objects representing the data model. All
> together my model objects are 75 to 100 an
We (Attensity Group) have been using SOLR-1301 for 6+ months now
because we have a ready Hadoop cluster and need to be able to re/index
up to 3 billion docs. I read the various emails and wasn't sure what
you're asking.
Cheers...
On Tue, Jun 22, 2010 at 8:27 AM, Neeb wrote:
>
> Hey James,
>
> J
Hi,
I just want to know if there will be any overhead / performance degradation
if I use the dismax search handler instead of the standard search handler.
We are planning to index millions of documents and not sure if using Dismax
will slow down the search performance. Would be great if someone can
Sounds like what you want is to override Solr's "query" component.
Have a look at the built-in one and go from there.
Erik
On Jun 22, 2010, at 1:38 PM, sarfaraz masood wrote:
I am a novice in Solr/Lucene, but I have gone
through the documentation of both. I have even implemented prog
I am a novice in Solr/Lucene, but I have gone
through the documentation of both. I have even implemented programs in
Lucene for searching, etc.
My problem is to apply a new search technique other than the one used by Solr.
Now, as I know, Lucene has its own searcher,
which is used by Solr as wel
Hmmm, I'm missing something here then. Sorting over 15 fields of type long
shouldn't use much memory, even if all the values are unique. When you say
"12-15 dynamic fields", are you talking about 12-15 fields per query out of
XXX total fields? And is XXX large? At a guess, how many different fields
I was playing around with Sqoop the other day; it's a simple Cloudera tool for
imports (mysql -> hdfs) @ http://www.cloudera.com/developers/downloads/sqoop/
It seems to me it would be pretty efficient to dump to HDFS and have
something like Data Import Handler be able to read from hdfs:// directl
Hi Erick,
the index is quite small (1,691,145 docs) but sorting is massive and
often on unique timestamp fields.
OOMs occur after somewhere between three and four hours,
depending as well on whether users browse a part of the application.
We use solrj to make the queries so we did not use Readers obje
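A back-of-the-envelope sketch of why this setup runs out of memory. It assumes (this is an assumption about Lucene's FieldCache, not a measurement) that sorting loads one full-index array per distinct sort field, 8 bytes per document for a long, even when the dynamic field is sparse, which would also explain the zero-filled arrays seen in the heap:

```python
# Rough FieldCache memory estimate for sorting.
# Document count is from this thread; 8 bytes/doc/long field is an
# assumed FieldCache cost, not a measured one.
def fieldcache_bytes(num_docs: int, num_sort_fields: int,
                     bytes_per_value: int = 8) -> int:
    # One full-size array per distinct sort field, even if the field
    # is only present in a few documents (the rest are stored as 0).
    return num_docs * num_sort_fields * bytes_per_value

docs = 1_691_145   # index size reported above
fields = 15        # distinct dynamic sort fields in play
mb = fieldcache_bytes(docs, fields) / (1024 * 1024)
print(f"~{mb:.0f} MB held by the FieldCache")
```

Since each new dynamic field name (erick_time_1, erick_time_2, ...) adds another full-size array, memory grows with the number of distinct fields users have sorted on, which would fit OOMs appearing only after hours of browsing.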
Hi Blargy,
Nice to hear that I am not alone ;)
Well we have been using Hadoop for other data-intensive services, those that
can be done in parallel. We have multiple nodes, which are used by Hadoop
for all our MapReduce jobs. I personally don't have much experience with its
use and hence wouldn
Well, the patch consumes the data from a CSV. You have to modify the input to
use TableInputFormat (I don't remember if it's called exactly that) and
it will work.
Once you've done that, you have to specify as many reducers as shards you
want.
I know 2 ways to index using hadoop
method 1 (so
On 6/18/2010 2:42 PM, Chris Hostetter wrote:
: > I don't think DIH can do that, but who knows, let's see what others say.
: Looks like the ExtractingRequestHandler uses Tika as well. I might just use
: this but I'm wondering if there will be a large performance difference between
: using it to
Neeb,
Seems like we are in the same boat. Our index consists of 5M records, which
is roughly around 30 gigs. All in all that's not too bad; however, our
indexing process (we use DIH but I'm now revisiting that idea) takes a
whopping 30+ hours!!!
I just bought the Hadoop In Action early edition b
I wanted to add a Jira issue about exactly what Otis is asking here.
Unfortunately, I haven't had time for it because of my exams.
However, I'd like to add a question to Otis's:
If you distribute the indexing process this way, are you able to replicate
the different documents correctly?
Thank y
Thanks Marc,
Well, I have an HBase storage architecture and a Solr master-slave setup with
two slave servers.
Would this patch work with my setup? Do I need sharding in place? and what
tasks would be run at map and reduce phases?
I was thinking something like:
At Map: read documents as key/value
I think a good solution could be to use hadoop with SOLR-1301 to build solr
shards and then use solr distributed search against these shards (you will
have to copy to local from HDFS to search against them)
--
View this message in context:
http://lucene.472066.n3.nabble.com/solr-with-hadoop-tp48
I think there are people using this patch in production:
https://issues.apache.org/jira/browse/SOLR-1301
I have tested it myself indexing data from CSV and from HBase and it works
properly
--
View this message in context:
http://lucene.472066.n3.nabble.com/anyone-use-hadoop-solr-tp485333p914553.ht
Hi,
there are reasons for both options. Usually it is a good idea to put the
default
configuration into the solrconfig.xml (and even fix some of the
configuration) in
order to have simple client-side code.
But sometimes it is necessary to have some flexibility for the actual query.
In this
si
Hi,
Sometimes I do both. I put the defaults in solrconfig.xml and thus have one
place to define all kind of low-level default settings.
But then I also make it possible at the application level to add/override any
parameters. This gives you great flexibility to let server
administrators (
Hi,
We currently have a master-slave setup for Solr with two slave servers. We
are using SolrJ (StreamingUpdateSolrServer) to index the master, which
takes 6 hours to index around 15 million documents.
I would like to explore hadoop, in particularly for indexing job using
mapreduce approach.
Hey James,
Just wondering if you ever had a chance to try out hadoop with solr? Would
appreciate any information/directions you could give.
I am particularly interested in indexing using a mapreduce job.
Cheers,
-Ali
--
View this message in context:
http://lucene.472066.n3.nabble.com/anyone-u
Hi. All.
I was using distributed search over 30 Solr instances; previously it
was using the standard query handler, and the result was returned correctly.
Each result has 2 fields, "ID" and "type".
Today I want to search with dismax, so I tried searching each
instance with dismax. It wo
Hmmm. A couple of details I'm wondering about. How many
documents are we talking about in your index? Do you get
OOMs when you start fresh or does it take a while?
You've done some good investigations, so it seems like there
could well be something else going on here than just "the usual
suspect
Have you already looked at Hibernate Search?
It combines Hibernate ORM with indexing/searching functionality of
Lucene.
The latest version even comes with the Solr analyzers.
http://www.hibernate.org/subprojects/search.html
Regards,
Tom
-Original Message-
From: fachhoch [mailto:fachh...@
Perhaps my answer is useless, because I don't have an answer to your direct
question, but:
You *might* want to consider whether your concept of a Solr document is at the
correct granular level, i.e.:
your problem as posted could be tackled (afaik) by defining a document as a
'sub-event' with only 1 daterang
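A sketch of what the sub-event modeling could look like (all field names here are illustrative, not from the original post): each document carries a single date range as a start/end field pair, and "which events cover date X?" becomes a pair of range filters.

```
# One sub-event document per date range, with fields such as:
#   event_id, start_date, end_date
# "which events cover 2010-06-22?" is then two range clauses:
q=*:*&fq=start_date:[* TO 2010-06-22T00:00:00Z] AND end_date:[2010-06-22T00:00:00Z TO *]
```

The trade-off is index size: one logical event with many coverage ranges becomes many documents, and results may need de-duplication by event_id.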
Curiosity is good. Do be aware, though, that the behavior is not
guaranteed;
it's just "how things happen to work" and may change without warning.
Erick
On Tue, Jun 22, 2010 at 4:01 AM, Marc Sturlese wrote:
>
> >>Well, sorting requires that all the unique values in the target field
> >>get l
I have a complex data model with bidirectional relations. I use Hibernate
as the ORM provider, so I have several model objects representing the data model. All
together my model objects are 75 to 100, and in my database each table has
several records, like 20,000.
Please suggest whether in my case text search
Hi,
I tried checking out the latest code (rev 956715); the patch did not
work on it.
In fact I even tried hunting for the revision mentioned earlier in this
thread (i.e. rev 955615) but cannot find it in the repository. (it has
revision 955569 followed by revision 955785).
Any pointers??
Regar
Hi all,
Firstly, I apologise for the length of this email but I need to
describe properly what I'm doing before I get to the problem!
I'm working on a project just now which requires the ability to store
and search on temporal coverage data - ie. a field which specifies a
date range durin
I'm prototyping using StreamingUpdateSolrServer. I want to send a commit
(or optimize) after I'm done adding all of my docs, rather than wait for the
autoCommit to kick in. However, since StreamingUpdateSolrServer is
multi-threaded, I can't simply call commit when I'm done, because that can
happe
Hi Solr users,
If you are using Apache Solr via PHP, I have some good news for you.
There is a new response writer for the PHP native extension, currently
available as a plugin.
This new feature adds a new response writer class to the
org.apache.solr.request package.
This class is used by the P
Thanks Peter :)
On Tue, Jun 22, 2010 at 3:08 PM, Peter Karich wrote:
> Oops, sorry. I meant Martijn! Not the germanized Martin :-/
>
> Peter.
>
> > Hi,
> > I wanted to apply field collapsing on the title(type string). but
> > want to show only one document (and the count of such documents
Oops, sorry. I meant Martijn! Not the germanized Martin :-/
Peter.
> Hi,
> I wanted to apply field collapsing on the title(type string). but
> want to show only one document (and the count of such documents) per title
> rather than show all the documents.
>
> Regards
> Raakhi
>
>
> On Tue,
Hi Raakhi,
yes, then the collapse patch works perfectly in our case. If you don't
get the patch applied correctly, try asking directly here:
https://issues.apache.org/jira/browse/SOLR-236
I did the same and got an immediate response from Martin & Co, or try the
latest patch:
2010-06-17 03:08 PM Mar
First of all, thanks for your answers.
Those OOMEs are pretty nasty for our production environment.
I didn't try the solution of ordering by function, as it is a Solr 1.5
feature and we prefer to use the stable 1.4 version.
I made a temporary patch that looks to be working fine.
I patched the lucene-
E.g. take a look at:
http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html
Peter.
> Huh? Read through the wiki: See http://wiki.apache.org/solr/LocalParams but I
> still don't understand its utility?
>
> Can someone explain to me why this would even be used? Any examples t
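One concrete use, the multi-select faceting trick from the linked post (field and tag names here are illustrative): LocalParams let you tag a filter and then exclude that tag when computing a facet, so selecting one facet value doesn't zero out the counts of its siblings.

```
# Tag the filter...
fq={!tag=fmt}format:book

# ...and exclude that tag when faceting on the same field,
# so counts for the other formats are still returned:
facet=true&facet.field={!ex=fmt}format
```

Without the `{!ex=fmt}` exclusion, the facet counts would be computed against the already-filtered result set and every other format would show 0.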
>>Well, sorting requires that all the unique values in the target field
>>get loaded into memory
That's what I thought, thanks.
>>But a larger question is whether what your doing is worthwhile
>>even as just a measurement. You say
>>"This is good for me, I don't care for my tests". I claim that
>>
It's OK.
It was a problem with my schema.
Thanks anyway.
-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Monday, June 21, 2010 5:09 PM
To: solr-user@lucene.apache.org
Subject: Re: solr string field
Or even better for an exact string query:
q={!raw f=field_
Hi,
I wanted to apply field collapsing on the title(type string). but
want to show only one document (and the count of such documents) per title
rather than show all the documents.
Regards
Raakhi
On Tue, Jun 22, 2010 at 12:59 AM, Peter Karich wrote:
> Hi Raakhi,
>
> First, field collap
No, this is basic to how Lucene works. You will need larger EC2 instances.
On Mon, Jun 21, 2010 at 2:08 AM, Matteo Fiandesio
wrote:
> Will compiling Solr with Lucene 2.9.3 instead of 2.9.1 solve this issue?
> Regards,
> Matteo
>
> On 19 June 2010 02:28, Lance Norskog wrote:
>> The Lucene impleme
Solr depends on Lucene's implementation of queries and how it returns
document hits. I can't help you architect these changes.
On Mon, Jun 21, 2010 at 7:47 AM, sarfaraz masood
wrote:
> Mr Lance
>
> Thanks
> a lot for your reply. I am a novice at Solr/Lucene, but I have gone
> through the documentati