Not yet. I've been stuck trying to figure out what the hell is happening with
my delta-imports:
http://n3.nabble.com/Need-help-with-StackOverflowError-td704451.html#a704451
http://n3.nabble.com/MoreLikeThis-function-queries-tp692377p707308.html
Is there any way to facet on a multi-valued field at a particular index?
For example, I have a multi-valued field, category_ids, containing category
ids. The first value in that field is always the root category and I would
like to be able to facet on just that first value. Is this possible
w
I am about to deploy Solr into our production environment and I would like to
do some benchmarking to determine how many slaves I will need to set up.
Currently the only way I know how to benchmark is to use ApacheBench (ab),
but I would like to be able to send random requests to Solr... not ju
I have my loadbalancer (HAProxy) configured to check Solr for a healthcheck
file every 2 seconds.
<admin>
  <defaultQuery>solr</defaultQuery>
  <healthcheck type="file">solr/conf/healthcheck.txt</healthcheck>
</admin>
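On the HAProxy side the relevant bit looks roughly like this (the backend
name, hosts and ports below are made up, not my real config):

backend solr_slaves
    option httpchk GET /solr/items/admin/file?file=healthcheck.txt
    server slave1 10.0.0.11:8983 check inter 2000
    server slave2 10.0.0.12:8983 check inter 2000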
However it keeps marking my slaves as down and I am seeing this error:
Apr 10, 2010 12:29:20 PM org.apache.solr.core.SolrCore execute
INFO: [items] webapp
Lance,
We have thousands of searches per minute so a minute of downtime is out of
the question. If for whatever reason one of our Solr slaves goes down I want
to remove it ASAP from the load balancer's rotation, hence the 2 second
check.
Maybe I am doing something wrong but my HAProxy hea
Taking HAProxy out of the picture, I still see the same results if I hit
my Solr instance:
http://localhost:8983/solr/items/admin/file?file=healthcheck.txt from my
browser
..
java    4729  root   48u   REG   8,17   0   817622
/var/solr/home/items/conf/healthcheck.txt
java47
Mark,
Cool. I didn't think that was the expected behavior. Will you guys at Lucid
be rolling this patch into your 1.4 distribution?
As per your 1.5 comment, do you think 1.5 trunk is stable enough for
production or should I just be keeping an eye on it? I know it's never really
known, but do yo
Tim Underwood wrote:
>
> Have you tried hitting /admin/ping (which handles checking for the
> existence of your health file) instead of
> /admin/file?file=healthcheck.txt?
>
OK, this is what I was looking for. I was wondering if the way I was doing it
was the preferred way or not.
I didn't even
Does anyone know if this would help?
onError: (abort|skip|continue). The default value is 'abort'. 'skip' skips
the current document. 'continue' continues as if the error did not
happen. (Solr 1.4)
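In data-config.xml that would look something like this (the entity name and
query are placeholders, not my real config):

<entity name="item" query="select * from items" onError="skip">
  ...
</entity>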
I have a root entity item with 2 sub-entities. Is there any way I can defer
the calculation of a ScriptTransformer to after the 2 sub-entities are
processed? I need to access the variables from both of the sub-entities in
order to add a third field using the ScriptTransformer.
Is there any way that a sub-entity can delete/rewrite fields from the
document? Is there any way sub-entities can get access to the document's
current value for a given field?
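For reference, the setup I'm describing is roughly this (entity, field and
function names are just placeholders):

<dataConfig>
  <script><![CDATA[
    function addThird(row) {
      // needs values produced by the two sub-entities, which haven't run yet
      row.put('third_field', row.get('field_a') + '_' + row.get('field_b'));
      return row;
    }
  ]]></script>
  <document>
    <entity name="item" query="..." transformer="script:addThird">
      <entity name="sub_a" query="...">...</entity>
      <entity name="sub_b" query="...">...</entity>
    </entity>
  </document>
</dataConfig>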
http://n3.nabble.com/DIH-questions-tp719892p722651.html
Can you please explain how you solved your problem? I'm going crazy over here!
:)
http://n3.nabble.com/DIH-questions-tp719892p722710.html
Is there any way to instruct copyField to overwrite an existing field, or to
only accept the first value?
Basically I want to copy source1 to dest (if it exists). If source1 doesn't
exist, then copy source2 into dest.
Is this possible?
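For reference, what I have in schema.xml is essentially the following; as far
as I can tell copyField only ever appends, so both sources end up in dest
(field names are placeholders):

<copyField source="source1" dest="dest"/>
<copyField source="source2" dest="dest"/>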
Thanks for the suggestion but I think I explained it wrong.
I have 3 values
valueA
valueB
valueC
I would like to only add the last one if possible, i.e. if I have values A, B,
C then add valueC. If I have A, B, then valueB...
Thanks Koji... this is more of what I wanted.
Is there any class or processor like this for the DataImportHandler that can
accomplish this?
Thanks a lot!
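Something along these lines is what I have in mind as a custom DIH
transformer, if nothing built-in exists (the class and column names are just
placeholders):

import java.util.List;
import java.util.Map;
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

// keeps only the last value of a multi-valued column
public class LastValueTransformer extends Transformer {
  @Override
  public Object transformRow(Map<String, Object> row, Context context) {
    Object value = row.get("my_field");
    if (value instanceof List && !((List) value).isEmpty()) {
      List values = (List) value;
      row.put("my_field", values.get(values.size() - 1));
    }
    return row;
  }
}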
http://n3.nabble.com/CopyField-tp722785p723060.html
OK, stupid question I know (it's been a while since I played around with
Java). Once I have the jar file compiled and I include it in my home/lib
directory, how do I go about using it? Will this override the existing
behavior or will this be a new command?
or
I have the following text field:
...
When I search for women's, womens or women I correctly get back all the
results I want. However when I use the highlighting feature it only
highlights women in the women's cases. Ho
Same general question about highlighting the full word "sunglasses" when I
search for glasses. Is this possible?
Thanks
http://n3.nabble.com/Highlighting-apostrophe-tp731155p731305.html
Is it possible to use a boost function across the whole index / an empty
search term?
I'm guessing the next question that would be asked is "Why would you want to
do that?". Well, we have a bunch of custom business metrics included in each
document (a product). I would like to only show the best produc
Correct, I am using dismax by default.
I actually accomplished what I was looking for by creating a separate
request handler with a defType of "lucene" and then using the _val_ hook.
I tried using the {!func} syntax as you describe but couldn't get it to work.
Is there any difference between the tw
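For reference, the handler I ended up with boils down to a plain lucene-parser
query along these lines (the function and field name are just examples, not my
real metrics):

http://localhost:8983/solr/items/select?q=_val_:"ord(popularity)"&fl=*,score&rows=10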
Can someone explain a useful case for the RandomSortField?
http://lucene.472066.n3.nabble.com/Random-Field-tp770087p770087.html
Can someone please point me in the right direction (classes) on how to create
my own custom DIH variable that can be used in my data-config.xml.
So instead of ${dataimporter.last_index_time} I want to be able to create
${dataimporter.foo}
Thanks
Thanks Paul, that will certainly work. I was just hoping there was a way I
could write my own class that would inject this value as needed instead of
precomputing this value and then passing it along in the params.
My specific use case is instead of using dataimporter.last_index_time I want
to us
Thanks Noble, this is exactly what I was looking for.
What is the preferred way to query Solr within these sorts of classes?
Should I grab the core from the context that is being passed in? Should I be
using SolrJ?
Can you provide an example and/or some tutorials/documentation?
Once aga
I know one can create custom event listeners for update or query events, but
is it possible to create one for any DIH event (Full-Import, Delta-Import)?
Thanks
http://lucene.472066.n3.nabble.com/Custom-DIH-EventListeners-tp780517p780517.html
I am working on creating my own custom DataImportHandler evaluator class
and I keep running across this error when I am trying to delta-import. It
told me to post this exception to the mailing list so that's what I am doing
;)
[java] SEVERE: java.util.concurrent.RejectedExecutionException
So I came up with the following class.
public class LatestTimestampEvaluator extends Evaluator {

  private static final Logger logger =
      Logger.getLogger(LatestTimestampEvaluator.class.getName());

  @Override
  public String evaluate(String expression, Context context) {
    List params = Evalu
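For context, the evaluator is wired up in data-config.xml roughly like this
and then referenced from a delta query (the names, package and SQL are
placeholders):

<dataConfig>
  <function name="latestTimestamp" class="com.example.LatestTimestampEvaluator"/>
  <document>
    <entity name="item"
            query="select * from items"
            deltaQuery="select id from items
                        where updated_at > '${dataimporter.functions.latestTimestamp('items')}'">
      ...
    </entity>
  </document>
</dataConfig>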
FYI, the code that is causing this exception and an explanation of my
specific use case is all listed in this thread:
http://lucene.472066.n3.nabble.com/Custom-DIH-variables-td777696.html
Can someone please explain the use cases in which one would use one over the
other?
All I got from the wiki was: (In reference to Embedded) "If you need to use
solr in an embedded application, this is the recommended approach. It allows
you to work with the same interface whether or not you hav
Thanks for the tip Lance. Just for reference, why is it dangerous to use the
HTTP method? I realized that the embedded method is probably not the way to
go (obviously since I was getting that "SEVERE:
java.util.concurrent.RejectedExecutionException")
Thanks for the input Lance.
My use case was actually pretty simple so my solution was relatively simple.
I ended up using the HTTP method. The code is listed here:
http://pastie.org/952040. I would appreciate any comments.
iorixxx you may find this solution to be of some use to you.
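In case the pastie link dies, the gist of the HTTP approach is just SolrJ
pointed at the core, roughly along these lines (the core URL, sort field and
query are assumptions, not the exact pastie code):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class LatestTimestampLookup {
  public static void main(String[] args) throws Exception {
    // query the core over HTTP for the most recently updated document
    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr/items");
    SolrQuery query = new SolrQuery("*:*");
    query.setSortField("updated_at", SolrQuery.ORDER.desc);
    query.setRows(1);
    QueryResponse response = server.query(query);
    System.out.println(response.getResults().get(0).getFieldValue("updated_at"));
  }
}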
I posted about this a few weeks ago but no one seemed to respond. Has anyone
seen this before? Why is this happening and, more importantly, how can I fix
it? Thanks in advance!
May 11, 2010 12:05:45 PM org.apache.solr.handler.dataimport.DataImporter
doDeltaImport
SEVERE: Delta Import Failed
java.lang
FYI I am using the mysql-connector-java-5.1.12-bin.jar as my JDBC driver
http://lucene.472066.n3.nabble.com/StackOverflowError-during-Delta-Import-tp811053p811058.html
How can one accomplish a MoreLikeThis search using boost functions?
If it's not possible out of the box, can someone point me in the right
direction on what I would need to create to get this working? Thanks
Mike,
This only happens when I attempt to do a delta-import without having deleted
the index dir and done a full-import first.
For example, these steps work correctly:
1) Delete /home/corename/data
2) Full-Import
3) Delta-Import
However, if I attempt to do the following, it results in an error:
1)
Anyone know of any way to accomplish (or at least simulate) this?
Thanks again
http://lucene.472066.n3.nabble.com/MLT-Boost-Function-tp811227p813982.html
Is there any way to configure this so it only takes effect if you match more
than one word?
For example if I search for: "foo" it should have no effect on scoring, but
if I search for "foo bar" then it should.
Is this possible? Thanks
Does anyone know of any documentation that is more in-depth than the wiki and
the Solr 1.4 book? I'm past the basic usage of Solr and creating simple
support plugins. I really want to know all about the inner workings of Solr
and Lucene. Can someone recommend anything?
Thanks
Can you please share the DIH settings and JDBC driver you are using?
I'll start...
jdbc driver = mysql-connector-java-5.1.12-bin
batchSize = "-1"
readOnly = "true"
Would someone mind explaining what "convertType" and "transactionIsolation"
actually do? The wiki doesn't really explain
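For completeness, the dataSource definition I'm talking about looks roughly
like this (the connection URL and credentials are placeholders):

<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="solr"
            password="secret"
            batchSize="-1"
            readOnly="true"/>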
Which driver is the "best" for use with Solr?
I am currently using mysql-connector-java-5.1.12-bin.jar in my production
setting. However I recently tried downgrading and did some quick indexing
using mysql-connector-java-5.0.8-bin.jar and saw close to a 2x improvement in
speed!!! Unfortunately I ke
Shawn, first off thanks for the reply and links!
"As far as the error in the 5.0.8 version, does the import work, or does it
fail when the exception is thrown?"
- The import "works" for about 5-10 minutes, then it fails and everything is
rolled back once the above exception is thrown.
" You might
Lucas.. was there a reason you went with 5.1.10 or was it just the latest
when you started your Solr project?
Also, how many items are in your index and how big is your index size?
Thanks
What is the preferred way to implement this feature? Using facets or the
terms component (or maybe something entirely different). Thanks in advance!
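The two approaches I'm weighing look roughly like this (the field name and
prefix are made up):

facets: /select?q=*:*&rows=0&facet=true&facet.field=suggest_field&facet.prefix=ipo&facet.limit=10
terms:  /terms?terms.fl=suggest_field&terms.prefix=ipo&terms.limit=10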
http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818430.html
"Easiest and oldest is wildcards on facets. "
- Does this allow partial matching or is this only prefix matching?
"It and facets allow limiting the database with searches. Using the spelling
database does not allow this."
- What do you mean?
So there is no generally accepted preferred way to do
Thanks for your help and especially your analyzer.. probably saved me a
full-import or two :)
http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818712.html
Andrzej, is this ready for production use?
"Hopefully in the future we can include user click through rates to boost
those terms/phrases higher"
- This could be huge!
http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p819762.html
Maybe I should have phrased it as: "Is this ready to be used with Solr 1.4?"
Also, as Grang asked in the thread, what is the actual status of that patch?
Thanks again!
http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p819765.html
Is there any more information I can post so someone can give me a clue as to
what's happening?
http://lucene.472066.n3.nabble.com/StackOverflowError-during-Delta-Import-tp811053p824516.html
I just found out that if I remove my deletedPkQuery the import will work. Is
it possible that there is some conflict between my delta indexing and my
delta deleting?
Any suggestions?
What's the best way to get to the instance of the DataImportHandler from the
current context?
Thanks
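The only thing I've come up with so far is looking it up by name, something
like this (the handler name depends on how it is registered in solrconfig.xml):

import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.dataimport.DataImportHandler;
import org.apache.solr.request.SolrRequestHandler;

public class DihLookup {
  // returns the registered DIH instance, or null if the handler isn't one
  public static DataImportHandler find(SolrCore core) {
    SolrRequestHandler handler = core.getRequestHandler("/dataimport");
    return (handler instanceof DataImportHandler) ? (DataImportHandler) handler : null;
  }
}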
http://lucene.472066.n3.nabble.com/DataImporter-from-context-tp825517p825517.html
Thanks for the info Hoss.
I will probably need to go with one of the more complicated solutions. Is
there any online documentation for this task? Thanks.
http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p827329.html
Basically, for some use cases I would like to show duplicates and for others I
want them ignored.
If I have overwriteDupes=false and just create the dedupe hash, how can I
query for only unique hash values... i.e. something like a SQL GROUP BY.
Thanks
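For reference, the dedupe setup I'm describing is along these lines in
solrconfig.xml (the signature and source fields are placeholders):

<updateRequestProcessorChain name="dedupe">
  <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <bool name="overwriteDupes">false</bool>
    <str name="signatureField">sig</str>
    <str name="fields">name,description</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>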
I am trying to subclass DIH and I am having a hard time trying to get
access to the current Solr Context. How can I do this?
Is there any way to get access to the current DataSource, DataImporter, etc.?
On a related note... when working with an onImportEnd or onImportStart
listener, how can I get a r
OK, to further explain myself.
Well, first off I was experiencing a StackOverflow error during my
delta-imports after doing a full-import. The strange thing was, it only
happened sometimes. Thread is here:
http://lucene.472066.n3.nabble.com/StackOverflowError-during-Delta-Import-td811053.html#a824780
I am trying to send out email notifications when our full/delta imports fail.
I tried working with the onImportEnd EventListener but that only fires when
the import succeeds.
Can anyone recommend a good way to send out email notifications on import
failures?
Awesome thanks
http://lucene.472066.n3.nabble.com/DIH-post-import-event-listener-for-errors-tp834645p836955.html
Smiley, I don't follow. Can you explain how one could do this?
I'm guessing Log4J would parse the logs looking for a "ROLLBACK" and then it
would send out a notification? Sorry, but I'm not really familiar with Log4J.
BTW, loved your book. Have you thought about putting out another, more
advanced
OK... I just read up on Log4J email notification. It sounds like it would be a
good idea; however, can you have separate SMTPAppenders based on which
exception is thrown and/or by searching for a particular string?
I.e., if the log level is SEVERE and the message contains "rollback", then use
SMTPAppender foo.
Thanks
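Something like this is what I had in mind in log4j.properties (SMTP host and
addresses are placeholders, and this assumes the slf4j binding is switched
over to log4j, since Solr logs through slf4j). From what I can tell, matching
on a particular message string needs org.apache.log4j.varia.StringMatchFilter,
which is only configurable through the XML config, not the properties file:

log4j.logger.org.apache.solr.handler.dataimport=INFO, importMail
log4j.appender.importMail=org.apache.log4j.net.SMTPAppender
log4j.appender.importMail.SMTPHost=smtp.example.com
log4j.appender.importMail.From=solr@example.com
log4j.appender.importMail.To=ops@example.com
log4j.appender.importMail.Subject=DIH import failed
log4j.appender.importMail.Threshold=ERROR
log4j.appender.importMail.layout=org.apache.log4j.PatternLayout
log4j.appender.importMail.layout.ConversionPattern=%d %p %c - %m%n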
Narrowed down the issue to this block in DocBuilder.java in the
collectDelta method. Any ideas?
Set<Map<String, Object>> deletedSet = new HashSet<Map<String, Object>>();
Set<Map<String, Object>> deltaRemoveSet = new HashSet<Map<String, Object>>();
while (true) {
  Map<String, Object> row = entityProcessor.nextDeletedRowKey();
  if (row == null)
    break;
Forgot to mention, the entity that is causing this is the root entity
http://lucene.472066.n3.nabble.com/StackOverflowError-during-Delta-Import-tp811053p837451.html
Is it possible to limit the number of snapshots taken by the replication
handler? ...http://localhost:8983/solr/replication?command=backup
Thanks
http://lucene.472066.n3.nabble.com/Snapshooter-question-tp838914p838914.html
What are the correct settings to get highlighting excerpting working?
Original Text: "The quick brown fox jumps over the lazy dog"
Query: "jump"
Result: " fox jumps over "
Can you do something like the above with the highlighter or can it only
surround matches with pre and post tags?
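The kind of request I'm experimenting with looks like this (the field name is
just an example):

/select?q=jump&hl=true&hl.fl=description&hl.snippets=1&hl.fragsize=30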
We have user-entered item listings that have a title and contain HTML in
their descriptions. I would like to index the full descriptions (minus the
HTML, which I'm stripping out via the DIH HTMLStripTransformer) so I can
search across it as well as perform highlighting/excerpting.
Can someone
There will never be any need to search the actual HTML (tags, markup, etc) so
as far as functionality goes it seems like the DIH HTMLStripTransformer is
the way to go.
Are there any significant performance differences between the two?
Can someone explain to me what the state of Solr/Lucene is... didn't they
recently combine?
I know I am running version 1.4 but I keep seeing version numbers out there
like 3.0 and 4.0??? Can someone explain what that means?
Also, is the state of trunk (1.4 or 4.0??) "good enough" for production
Yonik Seeley-2-2 wrote:
>
> Lots of other stuff has changed. For example, trunk is now always the
> next *major* version number.
> So the trunk of the combined lucene/solr is 4.0-dev
>
> There is now a branch_3x that is like trunk for all future 3.x releases.
>
> The next version of Solr will
How would this be any different than simply using the function to alter the
scoring of the final results and then sorting by score?
http://lucene.472066.n3.nabble.com/Sort-by-function-workaround-for-Solr-1-4-tp851922p852471.html
I'll give the deletedEntity "trick" a try... ingenious!
http://lucene.472066.n3.nabble.com/Subclassing-DIH-tp830954p863108.html
We have around 5 million items in our index and each item has a description
located in a separate physical database. These item descriptions vary in
size and for the most part are quite large. Currently we are only indexing
items and not their corresponding description and a full import takes arou
> As a data point, I routinely see clients index 5M items on normal
> hardware in approx. 1 hour (give or take 30 minutes).
Our master Solr machine is running 64-bit RHEL 5.4 on a dedicated machine with
4 cores and 16GB of RAM, so I think we are good on the hardware. Our DB is MySQL
version 5.0.67 (exa
Andrzej Bialecki wrote:
>
> On 2010-06-02 12:42, Grant Ingersoll wrote:
>>
>> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>>
>>>
>>> We have around 5 million items in our index and each item has a
>>> description
>>> located on a
> As a data point, I routinely see clients index 5M items on normal hardware
> in approx. 1 hour (give or take 30 minutes).
Also wanted to add that our main entity (item) consists of 5 sub-entities
(i.e., joins). Two of those five are fairly small, so I am using
CachedSqlEntityProcessor for them, but the ot
> One thing that might help indexing speed - create a *single* SQL query
> to grab all the data you need without using DIH's sub-entities, at
> least the non-cached ones.
>
Not sure how much that would help. As I mentioned, without the item
description import the full process takes 4 h
me
>> fields
>> in about 8mins I know it's not quite the scale bit with batching...
>>
>> David Stuar
>>
>> On 2 Jun 2010, at 17:58, Blargy wrote:
>>
>>>
>>>
>>>
>>>> One thing that might help indexing speed - create
Erik Hatcher-4 wrote:
>
> One thing that might help indexing speed - create a *single* SQL query
> to grab all the data you need without using DIH's sub-entities, at
> least the non-cached ones.
>
> Erik
>
> On Jun 2, 2010, at 12:21 PM, Blargy wrote:
>
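For what it's worth, I read the single-query suggestion as flattening the
sub-entities into one statement, something like this against my schema (table
and column names are made up; GROUP_CONCAT is the MySQL trick for folding a
multi-valued child into the parent row):

SELECT i.id,
       i.title,
       d.description,
       GROUP_CONCAT(c.category_id) AS category_ids
FROM items i
JOIN item_descriptions d ON d.item_id = i.id
LEFT JOIN item_categories c ON c.item_id = i.id
GROUP BY i.id, i.title, d.description;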
Would dumping the databases to a local file help at all?
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866538.html
I believe I'll need to write some custom code to accomplish what I want
(efficiently that is) but I'm unsure of what would be the best route to
take. Will this require a custom request handler? Search component?
Ok the easiest way to explain is to show you what I want.
http://shop.ebay.com/?_fro
What is the preferred way to index HTML using DIH (my HTML is stored in a
blob field in our database)?
I know there is the built-in HTMLStripTransformer but that doesn't seem to
work well with malformed/incomplete HTML. I've created a custom transformer
to first tidy up the HTML using JTidy, then
Does the HTMLStripCharFilter apply at index time or query time? Would it
matter to use one over the other?
As a side question, if I want to perform highlighter summaries against this
field do I need to store the whole field or just index it with
TermVector.WITH_POSITIONS_OFFSETS?
Wait... do you mean I should try the HTMLStripCharFilterFactory analyzer at
index time?
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory
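Just so I'm clear, that would mean a field type along these lines (the
tokenizer and filters below are only an example):

<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>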
http://lucene.472066.n3.nabble.com/Indexing-HTML-tp884497p884592.html
Do I even need to tidy/clean up the HTML if I use the
HTMLStripCharFilterFactory?
http://lucene.472066.n3.nabble.com/Indexing-HTML-tp884497p885797.html
: ...you've already got the conceptual model of how to do it, all you need
: now is to implement it as a Component that does the secondary-faceting in
: the same requests (which should definitley be more efficient since you can
: reuse the DocSets) instead of issuing secondary requets from your cl
Got it. Thanks!
http://lucene.472066.n3.nabble.com/Custom-faceting-question-tp868015p897390.html
Can someone please explain what the inform method should accomplish? Thanks
http://lucene.472066.n3.nabble.com/SolrCoreAware-tp899064p899064.html
Can someone explain how to register a SolrEventListener?
I am actually interested in using the SpellCheckerListener and it appears
that it would build/rebuild a spellchecker index on commit and/or optimize
but according to the wiki "the only events that can be "listened" for are
firstSearcher a
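From what I can piece together, listeners get wired up in solrconfig.xml, e.g.
(the event names are the standard ones; the listener classes and settings here
are only examples taken from the stock config):

<updateHandler class="solr.DirectUpdateHandler2">
  <listener event="postCommit" class="solr.RunExecutableListener">
    <str name="exe">solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <bool name="wait">true</bool>
  </listener>
</updateHandler>

<query>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">solr</str><str name="start">0</str><str name="rows">10</str></lst>
    </arr>
  </listener>
</query>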