Kindly guide me on how I should configure Solr/Lucene for the following
kind of requirement:
The query is "abc"
Documents are:
a) abc
b) abcd
c) xyz ab c mno
d) ab
I require the score for each of the above-mentioned documents against
the above-mentioned query to be displayed as:
For document (
Why not stick to the Lucene score for each document rather than building your
own? The easiest way of getting the relevance score for each document is to
add the "debugQuery=true" parameter to your request.
Cheers
Avlesh
On Mon, Aug 17, 2009 at 12:32 PM, Sushan Rungta wrote:
> Kindly guide me that
near the bottom of my web.xml (just above </web-app>) I got:

<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>path/to/solr</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
While you're at it you might want to make sure the following line in your
solrconfig.xml is commented out
next you should copy the solr directory (the one with the conf, da
Thanks for the help. I commented out that line in solrconfig.xml like
you said. My web.xml file has this entry in it:
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/usr/share/tomcat5/solr</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
And here is my file structure for solr home:
/usr/share/tomcat5/solr/
/usr/share/tomcat5/solr/bin
/usr/share/to
This does not solve my purpose, as my requirement is different. Kindly
check document "d", which I mentioned; the computation of the score for
that kind of document will be different.
Hence some different sort of query must be applied, which I am unable
to ascertain.
Regards,
Sushan R
I am definitely missing something here.
Do you want to fetch a document if one of its fields contains "ab" given a
search term "abc"? If you can design a field and query your index so that
you can fetch such a document, Lucene (and hence Solr) would automagically
give you the relevance score.
Cheer
Not sure what's going on, but I see Jetty stuff scrolling by; that can't be
right :)
Jetty and Tomcat are two separate web servers for serving Java applications.
Mixing the two doesn't sound like a good idea.
Jetty is included in the examples for .. well .. example purposes ... but
it's not a par
Hi folks,
I'm trying to use the Debug Now button in the development console to test
the effects of some changes in my data import config (see attached).
However, each time I click it, the right-hand frame fails to load -- it just
gets replaced with the standard 'connection reset' message from Fi
Apparently I do not see any full-import or delta-import command being
fired. Is that true?
On Mon, Aug 17, 2009 at 5:55 PM, Andrew Clegg wrote:
>
> Hi folks,
>
> I'm trying to use the Debug Now button in the development console to test
> the effects of some changes in my data import config (see atta
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> Apparently I do not see any full-import or delta-import command being
> fired. Is that true?
>
It seems that way -- they're not appearing in the logs. I've tried Debug Now
with both full and delta selected from the dropdown, no difference either
way.
If
I am no longer getting this error. I downloaded the latest nightly
build this morning and the document I wanted worked without any problems.
Kevin Miller
Web Services
-Original Message-
From: Kevin Miller [mailto:kevin.mil...@oktax.state.ok.us]
Sent: Thursday, August 13, 2009 3:35 PM
To:
Anybody have any suggestions or hints? I'd love to score my queries in a
way that pays attention to how close together terms appear.
Michael
On Thu, Aug 13, 2009 at 12:01 PM, Michael wrote:
> Hello,
> I'd like to score documents higher that have the user's search terms nearer
> each other. For
Dismax QueryParser with pf and ps params?
http://wiki.apache.org/solr/DisMaxRequestHandler
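A minimal SolrJ sketch of that suggestion (field names and boost values are
illustrative, not taken from your setup):

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery q = new SolrQuery("apache lucene");
q.set("defType", "dismax"); // use the dismax query parser
q.set("qf", "text");        // query fields: where the individual terms must match
q.set("pf", "text^5");      // phrase field: boosts docs whose terms occur close together
q.set("ps", "100");         // phrase slop: how far apart terms may be and still get the pf boost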
--
- Mark
http://www.lucidimagination.com
Michael wrote:
Anybody have any suggestions or hints? I'd love to score my queries in a
way that pays attention to how close together terms appear.
Michael
Sounds like you just need a buzzword field (indexed, stored, analyzed) that
contains each of the terms associated with that buzzword.
Then, just do the search against that field and return that field.
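For example, a field definition along these lines in schema.xml (field name
and type are illustrative):

<field name="buzzwords" type="text" indexed="true" stored="true" multiValued="true"/>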
On Aug 15, 2009, at 11:03 PM, Ninad Raut wrote:
I want searchable buzzword word and b
Hello all,
I'm trying Data Import Handler for the first time to generate my index based
on my db.
Looking at the server's logs, I can see the index process is opening a new
searcher for every doc. Is this what we should expect? Why? If not, how can
I avoid it? I think if this wasn't being done, could
PhraseQuery's do score higher if the terms are found closer together.
Does that imply that during the computation of the score for "a b c"~100,
sloppyFreq() will be called?
Yes. PhraseQuery uses PhraseWeight, which creates a SloppyPhraseScorer, which
takes into account Similarity.sloppy
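For reference, a sketch of how the default implementation weights sloppy
matches (this mirrors DefaultSimilarity; verify against your Lucene version):

// The larger the edit distance between the matched positions and the exact
// phrase, the smaller this factor, so closer matches score higher.
public float sloppyFreq(int distance) {
    return 1.0f / (distance + 1);
}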
Hi,
Just a quick update to the list. Mike and I were able to apply it to 1.4 and
it works. We have it loaded on a few production servers and there is an odd
"StringIndexOutOfBoundsException" error but most of the time it seems to
work just fine.
On Fri, Aug 7, 2009 at 7:30 PM, mike anderson wrote
Thanks for the suggestion. Unfortunately, my implementation requires the
Standard query parser -- I sanitize and expand user queries into deeply
nested queries with custom boosts and other bells and whistles that make
Dismax unappealing.
I see from the docs that Similarity.sloppyFreq() is a method
DIH does not open searchers for each doc. Do you have any autocommit enabled?
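The setting to check lives in solrconfig.xml under <updateHandler>; an entry
like this (values are illustrative), with a very low maxDocs, would commit
and reopen a searcher after nearly every doc:

<autoCommit>
  <maxDocs>10000</maxDocs> <!-- commit after this many pending docs -->
  <maxTime>60000</maxTime> <!-- or after this many milliseconds -->
</autoCommit>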
On Mon, Aug 17, 2009 at 8:17 PM, Lucas F. A. Teixeira wrote:
> Hello all,
>
> I'm trying Data Import Handler for the first time to generate my index based
> on my db.
> Looking the server's logs, I can see the index proc
In my personal experience: ramBufferSizeMB=8192 helps to keep many things in
RAM and to delay index merges almost forever (I have a single 10G segment with
almost 100 million docs after 24 hours).
Heavy I/O was a problem before, and I solved it
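For reference, the corresponding solrconfig.xml settings (a sketch using the
values mentioned above):

<indexDefaults>
  <ramBufferSizeMB>8192</ramBufferSizeMB> <!-- buffer this much in RAM before flushing a segment -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>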
-Original Message-
From: Archon810 [mailto:archon...@gmai
Great, thank you Mark!
Michael
On Mon, Aug 17, 2009 at 10:48 AM, Mark Miller wrote:
> PhraseQuery's do score higher if the terms are found closer together.
>
> does that imply that during the computation of the score for "a b
>>> c"~100, sloppyFreq() will be called?
>>>
>>
> Yes. PhraseQuer
Ian Connor wrote:
Hi,
Just a quick update to the list. Mike and I were able to apply it to 1.4 and
it works. We have it loaded on a few production servers and there is an odd
"StringIndexOutOfBoundsException" error but most of the time it seems to
work just fine.
Do you happen to have the st
I have an index of documents which contain these two fields:
Using the MLT handler with city_id as the similarity field works fine and as
expected; however, with categories it does not work at all. I tried looking
at "interestingTerms" in the latter case, but the handler does not return
anything.
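For what it's worth, a diagnostic request along these lines may help (handler
path and parameter values are illustrative; adjust to your config). MLT's
default term-frequency cutoffs (mlt.mintf=2, mlt.mindf=5) silently drop terms
that occur too rarely, which is a common reason interestingTerms comes back
empty for a sparse field:

http://localhost:8983/solr/mlt?q=id:123&mlt.fl=categories&mlt.interestingTerms=list&mlt.mintf=1&mlt.mindf=1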
Ok. Did that. Still got that error. Here is the log (it's not adding
Jetty stuff anymore). I included the exception this time. It looks like
it's blowing up on something related to XPath. Do you think it's having
an issue with one of my XML files?
Aug 17, 2009 2:37:35 AM org.apache.c
No I don't. It's commented.
This is giving me 40 docs/sec indexing, which is a very poor rate.
(I know this rate depends on a lot of things, including that my database is
not on the same network, and other stuff, but I think I can get more than
this).
Any clues on what is probably happening to ope
Not sure SOLR can work in such an environment without asking Hosting Support
to make a lot of specific changes... such as giving specific permissions
to specific folders, setting ulimit -n, dealing with exact versions and
vendors of Java, memory parameters, and even libraries which may overwrite
SOL
Sorry Fuad, that isn't very helpful. I also mentioned that this is a
dedicated server, so none of those things are an issue. I am using SSH
right now to set up solr home, etc., though.
--Aaron
On Mon, Aug 17, 2009 at 10:00 AM, Fuad Efendi wrote:
> Not sure SOLR can work in such environment without a
What is solr.xml for?
INFO: looking for solr.xml: /usr/share/tomcat5/solr/solr.xml
Aug 17, 2009 2:37:36 AM org.apache.solr.core.SolrResourceLoader
java.lang.NoClassDefFoundError: org.apache.solr.core.Config
- can't find configuration... XPath needs to load XML to configure Config.
solr.xml??
Fuad,
I'd recommend indexing in Hadoop, then copying the new indexes to Solr
slaves. This removes the need for Solr master servers. Of course
you'd need a Hadoop cluster larger than the number of master servers
you have now. The merge indexes command (which can be taxing on the
servers because
Aaron,
Do you have solr.war in your %TOMCAT%/webapps folder? Is your solr/home in a
location other than /webapps? Try to install a sample Tomcat with SOLR on a
local dev-box and check that it works...
-Original Message-
From: Fuad Efendi [mailto:f...@efendi.ca]
Sent: August-17-09 1:33 PM
To:
Looks like you are using SOLR multicore, with solr.xml... I never tried
it...
The rest looks fine, except the suspicious solr.xml.
-Original Message-
From: Fuad Efendi [mailto:f...@efendi.ca]
Sent: August-17-09 1:33 PM
To: solr-user@lucene.apache.org
Subject: RE: Cannot get solr 1.3.0 to run
On Mon, Aug 17, 2009 at 10:58 AM, Fuad Efendi wrote:
> Looks like you are using SOLR multicore, with solr.xml... I never tried
> it...
> The rest looks fine, except the suspicious solr.xml.
What's suspicious about it? Is it in the wrong place? Is it not supposed
to be there?
technically my war file is n
Any help?
--
View this message in context:
http://www.nabble.com/delta-import-using-a-full-import-command-is-not-working-tp24989144p25011540.html
Sent from the Solr - User mailing list archive at Nabble.com.
Are Solr and your database on different machines? If yes, are their dates
synchronized?
If you have access to your database server logs, looking at the queries that
DIH generated might help.
Cheers
Avlesh
On Mon, Aug 17, 2009 at 11:40 PM, djain101 wrote:
>
> Any help?
> --
> View this message in c
Yes, the database and Solr are on different machines and their dates are not
synchronized. Could that be the issue? Why would the date difference between
the Solr and DB machines prevent the timestamp from the dataimport.properties
file from being applied?
Thanks,
Dharmveer
Avlesh Singh wrote:
>
> Solr and your database are di
Hi Jason,
After moving to more RAM and CPUs and setting ramBufferSizeMB=8192 the problem
disappeared; I had 100 million documents added in 24 hours almost without any
index merge (mergeFactor=10). Lucene flushes the segment to disk when the RAM
buffer is full; then the MergePolicy orchestrates...
However, 500Gb
I'm attempting to write a query as follows:
($query^10) OR (NOT ($query)) which effectively would return everything, but if
it matches the first query it will get a higher score and thus be sorted first
in the result set. Unfortunately the results are not coming back as expected.
($query) w
If I put an object into a SolrInputDocument and store it, how do I
query for it back? For instance, I stored a java.net.URI in a field
called "url", and I want to query for all the documents that match a
particular URI. The query syntax only seems to allow Strings, and if
I just try query.setQuer
Assuming you have written the SolrInputDocument to the server, you would next
query. See ClientUtils.escapeQueryChars. Also you need to be cognizant of
URLEncoding at times.
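For what it's worth, a SolrJ sketch of the suggested approach (the "url" field
name comes from the original question; the value is illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.util.ClientUtils;

// The URI was stored via its toString() form, so query with the same string,
// escaping the characters (':', '/', etc.) that are special to the query parser.
String uri = "http://xcski.com/pharma/";
SolrQuery q = new SolrQuery("url:" + ClientUtils.escapeQueryChars(uri));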
-Original Message-
From: ptomb...@gmail.com [mailto:ptomb...@gmail.com] On Behalf Of Paul Tomblin
Sent: Monday,
You can escape the string with
org.apache.lucene.queryParser.QueryParser.escape(String query)
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/queryParser/QueryParser.html#escape%28java.lang.String%29
> -Original Message-
> From: ptomb...@gmail.com [mailto:ptomb...@gmail.com]
The rows parameter would prevent you from getting all docs back. It is set by
default to 10 I believe.
-Original Message-
From: Matt Schraeder [mailto:mschrae...@btsb.com]
Sent: Monday, August 17, 2009 2:04 PM
To: solr-user@lucene.apache.org
Subject: Query not working as expected
I'm a
On Mon, Aug 17, 2009 at 5:28 PM, Harsch, Timothy J. (ARC-SC)[PEROT
SYSTEMS] wrote:
> Assuming you have written the SolrInputDocument to the server, you would next
> query.
I'm sorry, I don't understand what you mean by "you would next query."
There appear to be some words missing from that sente
On Mon, Aug 17, 2009 at 5:30 PM, Ensdorf Ken wrote:
> You can escape the string with
>
> org.apache.lucene.queryParser.QueryParser.escape(String query)
>
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/queryParser/QueryParser.html#escape%28java.lang.String%29
>
Does this mean I should
> Does this mean I should have converted my objects to string before
> writing them to the server?
>
I believe SolrJ takes care of that for you by calling toString(), but you would
need to convert explicitly when you query (and then escape).
That isn't the problem, as I am looking at "numFound" and not actual rows
returned. In all searches the rows returned is less than the number found.
>>> timothy.j.har...@nasa.gov 8/17/2009 4:30:38 PM >>>
The rows parameter would prevent you from getting all docs back. It is set by
default to 1
On Mon, Aug 17, 2009 at 5:36 PM, Ensdorf Ken wrote:
>> Does this mean I should have converted my objects to string before
>> writing them to the server?
>>
>
> I believe SolrJ takes care of that for you by calling toString(), but you
> would need to convert explicitly when you query (and then esca
Matt Schraeder wrote:
I'm attempting to write a query as follows:
($query^10) OR (NOT ($query)) which effectively would return everything, but if it matches the first query it will get a higher score and thus be sorted first in the result set. Unfortunately the results are not coming back as e
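For context, one common gotcha with queries shaped like this, assuming the
standard Lucene query parser: a purely negative clause such as (NOT ($query))
matches nothing on its own, so that half of the OR contributes no documents.
The usual workaround is to anchor the negation to the full document set:

($query^10) OR (*:* -$query)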
All,
We are considering some new changes to our Solr schema to better support
some new functionality for our application. To that end, we want to add
an additional field that is multi-valued, but will contain a large number of
values per document. Potentially up to 2000 values on this field per
Your term dictionary will grow somewhat, which means the term
index could consume more memory. Because the term dictionary has
grown, term lookups could be slightly slower, but
that is unlikely to affect your application. How many unique
terms will there be?
On Mon, Aug 17, 2009 at 3:5
Hi,
The possibility is that all items in this field could be unique. Let me
clarify.
The main Solr index is for a list of products. Some products belong to
catalogues. So, the consideration is to add a multi-valued field to put the
id of the catalogue in each product as a multi-valued field to
Hi,
We are using Solr's DataImportHandler to populate the Solr index from a
SQL Server database of nearly 4,000,000 rows. Whereas the population
itself is very fast (around 1000 rows per second), the delta import is
only processing around one row a second.
Is this a known performance issue? We
After debugging the dataimporter code, I found that it is a bug in the
dataimporter code itself. doFullImport() in the DataImporter class is not
loading the last index time, whereas doDeltaImport() is. The code snippet from
doFullImport() is:
if (requestParams.commit)
setIndexStartTime(new Date());
On Mon, Aug 17, 2009 at 5:47 PM, Paul Tomblin wrote:
> Hmmm. It's not working right. I've added 5 documents, 3 with the
> URL set to "http://xcski.com/pharma/" and 2 with the URL set to
> "http://xcski.com/nano/". Doing other sorts of queries seems to be
> pulling back the right data:
Of
Looks like this issue has been fixed on Sept 20, 2008 against issue SOLR-768.
Can someone please let me know which one is a stable jar after Sept 20,
2008.
djain101 wrote:
>
> After debugging dataimporter code, i found that it is a bug in the
> dataimporter 1.3 code itself. doFullImport() in
After running an application which heavily uses an MD5 HEX representation as
<uniqueKey>, for SOLR v.1.4-dev-trunk:
1. After 30 hours:
101,000,000 documents added
2. Commit:
numDocs = 783,714
maxDoc = 3,975,393
3. Upload new docs to SOLR during 1 hour(!!!), then commit, then
optimize:
numDocs=1,281,851
Can you tell me please how many non-tokenized single-valued fields your
schema uses, and how many documents?
Thanks,
Fuad
Rahul R wrote:
>
> My primary issue is not Out of Memory error at run time. It is memory
> leaks:
> heap space not being released after doing a force GC also. So after
> som
I'd say you have a lot of documents that share the same id.
When you add a doc with an existing id, first the old one is deleted, then the
new one is added (atomically, though).
The deleted docs are not removed from the index immediately - the doc
id is just marked as deleted.
Over time though,
But how do you explain that within an hour (after commit) I have had about
500,000 new documents, and within 30 hours (after commit) only 1,300,000?
The same _random_enough_ documents...
BTW, the SOLR Console was showing only a few hundred "deletesById" although I
don't use any deleteById explicitly; only
It is NOT a sample war, it is the SOLR application: solr.war - as it should
be!!! I usually build from source and use dist/apache-solr-1.3.war instead, so
I am not sure about solr.war.
solr.xml contains the configuration for multicore; most probably something is
wrong with it.
Would be better if you tr
One more hour, and I have +0.5 million more (after commit/optimize).
Something strange is happening with the SOLR buffer flush (if we have a single
segment???)... an explicit commit prevents it...
30 hours, with index flush, commit: 783,714
+ 1 hour, commit, optimize: 1,281,851
+ 1 hour, commit, optimize: 1,786
BTW, you should really prefer JRockit, which really rocks!!!
"Mission Control" has the necessary tooling; and JRockit produces a _nice_
exception stacktrace (explaining almost everything) even in case of OOM,
which the Sun JVM still fails to produce.
SolrServlet still catches "Throwable":
} catch (Th
UPDATE:
After a few more minutes (after the previous commit):
docsPending: about 7,000,000
After commit:
numDocs: 2,297,231
Increase = 2,297,231 - 1,281,851 = 1,000,000 (average)
So I have 7 docs with the same ID on average.
Having 100,000,000 and then dropping below 1,000,000 is strange; it is a
Sorry for the typo in my previous message:
Increase = 2,297,231 - 1,786,552 = 500,000 (average)
RATE (non-unique-id:unique-id) = 7,000,000 : 500,000 = 14:1
but 125:1 (initial 30 hours) was very strange...
Funtick wrote:
>
> UPDATE:
>
> After few more minutes (after previous commit):
> docsPending: about
You can take a nightly build of the DIH jar alone. It is quite stable.
On Tue, Aug 18, 2009 at 8:21 AM, djain101 wrote:
>
> Looks like this issue has been fixed on Sept 20, 2008 against issue SOLR-768.
> Can someone please let me know which one is a stable jar after Sept 20,
> 2008.
>
>
>
> djain101 wrote:
>
Delta imports are likely to be far slower than full imports
because they make one db call per changed row. If you can write the
"query" in such a way that it gives only the changed rows, then write
a separate entity (directly under <document>) and just run a
full-import with that entity only (sketched below).
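A sketch of that pattern in data-config.xml (entity, table, and column names
are illustrative; ${dataimporter.last_index_time} is the timestamp DIH records
in dataimport.properties):

<document>
  <entity name="changed_rows"
          query="SELECT * FROM item
                 WHERE last_modified > '${dataimporter.last_index_time}'">
    <field column="id" name="id"/>
  </entity>
</document>

Then trigger it with something like:

http://localhost:8983/solr/dataimport?command=full-import&entity=changed_rows&clean=false

(clean=false keeps the rest of the index intact instead of wiping it first.)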
On Tue, Aug
Can you please point me to the url for downloading latest DIH? Thanks for
your help.
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> You can take a nightly build of the DIH jar alone. It is quite stable.
>
> On Tue, Aug 18, 2009 at 8:21 AM, djain101 wrote:
>>
>> Looks like this issue has been fixed on Sept 20, 2
http://people.apache.org/builds/lucene/solr/nightly/
You can just replace the dataimporthandler jar in your current
installation and it should be fine.
On Tue, Aug 18, 2009 at 11:18 AM, djain101 wrote:
>
> Can you please point me to the url for downloading latest DIH? Thanks for
> your help.
>
>
>
I replaced the dataimporthandler.jar from the 8/7/2009 build in WEB-INF/lib of
solr.war, but on restarting JBOSS it threw me the following exception; if
I revert back to the 1.3 jar then it loads the class fine. Is there any
compatibility issue between the latest dataimporthandler.jar and solr1.3.war?
INFO:
OK, I thought you were using an older version of 1.4. The new DIH is
not compatible with 1.3.
On Tue, Aug 18, 2009 at 11:37 AM, djain101 wrote:
>
> I replaced the dataimporthandler.jar from 8/7/2009 build in WEB-INF/lib of
> solr.war but on restarting of JBOSS, it threw me following exception but i
How can I get a version of DIH which fixes this issue and is compatible
with 1.3?
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> OK, I thought you were using an older version of 1.4. The new DIH is
> not compatible with 1.3.
>
> On Tue, Aug 18, 2009 at 11:37 AM, djain101
> wrote:
>>
>> I replaced th
The only way is to backport the patch to 1.3. If you are comfortable
doing that, just modify the relevant code and do an "ant dist" to get
the jar.
On Tue, Aug 18, 2009 at 11:42 AM, djain101 wrote:
>
> How can i get the version of DIH which fixes this issue and is compatible
> with 1.3?
>
>
> Noble