Hi,
I am using multi-core Solr on Tomcat on 2 servers, 3 languages per server.
I am adding documents to Solr at up to 200 docs/sec. When the update process
starts, everything is fine (update performance is at most 200 ms/doc, with
about 800 MB of memory used and minimal CPU usage).
After 15-17 hours it's bec
Hi Lee,
On Mon, Dec 6, 2010 at 10:56 PM, lee carroll
wrote:
> Hi Erik
Nope, Erik is the other one. :-)
> thanks for the reply. I only want the synonyms to be in the index
> how can I achieve that ? Sorry probably missing something obvious in the
> docs
Exactly what he said, use the => syntax.
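For reference, the => (explicit mapping) syntax in synonyms.txt looks like the lines below; when the SynonymFilterFactory that uses it sits in the index-time analyzer only, the right-hand terms are what ends up in the index (the mappings themselves are just illustrative):

```
# explicit mappings: tokens on the left are replaced by tokens on the right
pretty => scenic
sea biscuit, sea biscit => seabiscuit
```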
Hi Erik thanks for the reply. I only want the synonyms to be in the index
how can I achieve that ? Sorry probably missing something obvious in the
docs
On 7 Dec 2010 01:28, "Erick Erickson" wrote:
> See:
>
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>
> wi
Hi
I want to configure DataImport scheduling, but I don't know how to do it.
I just created and compiled the scheduling classes with NetBeans, and now have
Scheduling.jar.
Q: how do I set it up on Tomcat or Solr? (I'm using Tomcat 6 on Windows 2008.)
Thanks in advance
Does anyone know about it?
How do I extract the dictionary generated by default? And how do I read the
.cfs files generated in the index folder?
Awaiting a reply
On Mon, Dec 6, 2010 at 7:54 PM, rajini maski wrote:
> Yeah.. I wanna use this Spell-check only.. I want to create myself the
> dictionary.. And
Thanks Koji.
Problem seems to be that template transformer is not used when delete
is performed.
...
Dec 7, 2010 7:19:43 AM org.apache.solr.handler.dataimport.DocBuilder
collectDelta
INFO: Completed ModifiedRowKey for Entity: entry rows obtained : 0
Dec 7, 2010 7:19:43 AM org.apache.solr.handler.
Batch size "-1"??? Strange but could be a problem.
Note also you can't provide parameters to default startup.sh command; you
should modify setenv.sh instead
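For example, a minimal setenv.sh sketch (Tomcat's catalina.sh sources this file if it exists; the heap sizes here are placeholders only):

```shell
# $CATALINA_HOME/bin/setenv.sh -- sourced by catalina.sh at startup.
# Heap sizes below are illustrative; tune them for your index.
export CATALINA_OPTS="$CATALINA_OPTS -Xms512m -Xmx1024m"
```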
--Original Message--
From: sivaprasad
To: solr-user@lucene.apache.org
ReplyTo: solr-user@lucene.apache.org
Subject: Out of memory
Hi,
When I am trying to import the data using DIH, I am getting an Out of memory
error. Below are the configurations I have.
Database: MySQL
OS: Windows
No. of documents: 15525532
In db-config.xml I set the batch size to "-1"
The Solr server is running on a Linux machine with Tomcat.
i set tomcat a
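For what it's worth, batchSize="-1" is the documented way to ask DIH for a streaming MySQL ResultSet (DIH translates it to Statement.setFetchSize(Integer.MIN_VALUE), which makes Connector/J stream rows instead of buffering them all). A minimal db-data-config.xml sketch, with the URL, credentials, and query as placeholders:

```xml
<!-- Sketch: DIH dataSource for MySQL with streaming results.
     batchSize="-1" becomes fetchSize=Integer.MIN_VALUE on the JDBC
     statement, so rows are streamed rather than held in memory. -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="user" password="pass"
              batchSize="-1"/>
  <document>
    <entity name="item" query="SELECT id, name FROM item">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>
```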
Hi,
First time poster here - I'm not entirely sure where I need to look for this
information.
What I'm trying to do is extract some (presumably) structured information
from non-uniform data (e.g., prices from a Nutch crawl) that needs to show up
in search queries, and I've come up against a wall.
I'v
That is correct. Solr is a search engine, not a text analysis engine.
There are a few open source text analysis systems: Weka, OpenNLP,
UIMA.
Someone is working on integrating UIMA with Solr:
https://issues.apache.org/jira/browse/SOLR-2129
But you should generally assume you will have a batch pro
(10/12/06 23:52), CRB wrote:
Koji,
Thank you for the reply.
Being something of a novice with Solr, I would be grateful if you could clarify
my next steps.
I infer from your reply that there is no current implementation yet contributed
for the FVH similar
to the regex fragmenter.
Thus I need
See:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
with the => syntax, I think that's what you're looking for
Best
Erick
On Mon, Dec 6, 2010 at 6:34 PM, lee carroll wrote:
> Hi Can the following usecase be achieved.
>
> value to be analysed at index time
Hi John,
sounds like this bug in NIO:
http://jira.codehaus.org/browse/JETTY-937
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
I think recent versions of jetty work around this bug, or maybe try
the non-NIO socket connector
Kent
On Tue, Dec 7, 2010 at 9:10 AM, John Russell wrote:
Hi Can the following usecase be achieved.
value to be analysed at index time "this is a pretty line of text"
synonym list is pretty => scenic , text => words
valued placed in the index is "scenic words"
That is to say only the matching synonyms. Basically i want to produce a
normalised set of p
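For what it's worth, the desired output can be sketched in plain Python. This is only a toy model of the analysis, not Solr code; note that as I understand it, SynonymFilterFactory with the => syntax replaces the matching tokens but passes non-matching tokens through, so dropping everything else would need an extra step (e.g. a keep-words filter):

```python
# Toy model of the requested index-time behavior: replace tokens that
# have a synonym mapping and drop all other tokens. NOT Solr's
# implementation, just an illustration of the desired output.
synonyms = {"pretty": "scenic", "text": "words"}

def analyze(value: str) -> list[str]:
    """Lowercase, whitespace-tokenize, emit only the mapped replacements."""
    return [synonyms[t] for t in value.lower().split() if t in synonyms]

print(analyze("this is a pretty line of text"))  # ['scenic', 'words']
```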
I'm not understanding this response. My main table does have a one to many
relationship with the other tables. What should I be anticipating/wanting
for each document if I want to return to the user the values while allowing
them to search on the other terms?
Thanks.
Hi,
I'm using solr and have been load testing it for around 4 days. We use the
solrj client to communicate with a separate jetty based solr process on the
same box.
After a few days solr's CPU% is now consistently at or above 100% (multiple
processors available) and the application using it is mo
Thanks for all the help! It is really appreciated.
For now, I can afford the parallel requests problem, but when I put
synchronous=true in the delta import, the call still returns with
outdated items.
Examining the log, it seems that the commit operation is being
executed after the operation retur
Thanks for the quick response!
I was thinking more about the idea of having both structured and unstructured
data coming into a system to be indexed/searched. I would like these
documents to be processed by some sort of entity/keyword/semantic
processing. I have a well defined taxonomy for my
I'm not sure, but maybe you mean something like clustering? Carrot2
can do this (at index time, I think):
http://search.carrot2.org/stable/search?query=jetwick&view=visu
(There is a plugin for solr)
Or do you already know the categories of your docs. E.g. you already
have a category tree and
> When you say "two parallel requests from two users to single DIH
> request handler", what do you mean by "request handler"?
I mean DIH.
> Are you
> refering to the HTTP request? Would that mean that if I make the
> request from different HTTP sessions it would work?
No.
It means that when you h
Alex:
Thanks for the quick reply.
When you say "two parallel requests from two users to single DIH
request handler", what do you mean by "request handler"? Are you
refering to the HTTP request? Would that mean that if I make the
request from different HTTP sessions it would work?
Cheers!
Juan M.
> I have a table that contains the data values I'm wanting to return when
> someone makes a search. This table has, in addition to the data values, 3
> id's (FKs) pointing to the data/info that I'm wanting the users to be able
> to search on (while also returning the data values).
>
> The general
Koji,
Thank you for the reply.
Being something of a novice with Solr, I would be grateful if you could
clarify my next steps.
I infer from your reply that there is no current implementation yet
contributed for the FVH similar to the regex fragmenter.
Thus I need to write my own custom exte
Has anyone been able to get Saxon 9 working with Solr 3.1?
I was following the wiki page
(http://wiki.apache.org/solr/XsltResponseWriter), placing all the
saxon-*.jars in Jetty's lib/ext folder and starting with
java
-Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImp
Yes, that's my conclusion as well Grant.
As for the example output:
The symposium of Tg(RX3fg+and) gene studies
Should end up tokenizing to:
symposium tg the rx3fg and gene studi
Assuming I guessed right on the stemming.
Anyhow, thanks for the confirmation guys.
Matt
On 12/4/2010 8:18 PM,
2010/12/6 Ahmet Arslan :
>
> If you are already using DIH,
> http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer can do
> what you want.
Indeed it can. Many thanks.
I'm new to Solr (and indexing in general) and am having a hard time making
the transition from an RDBMS to indexing in terms of the DIH/data-config.xml
file. I've successfully created a working index (so far) for the simple
queries in my db, but I'm struggling to add a more "complex" query. When I
I've seen references to score filtering in the list archives with frange
being the suggested solution, but I have a slightly different problem that I
don't think frange will solve. I basically want to drop a portion of the
results based on their score in relation to the other scores in the result
s
I think this is expected behavior. You have to issue the "details"
command to get the real indexversion for slave machines.
Thanks,
Xin
On Mon, Dec 6, 2010 at 11:26 AM, Markus Jelsma
wrote:
> Hi,
>
> The indexversion command in the replicationHandler on slave nodes returns 0
> for indexversion a
> - I have zero control over what is stored in the database
> - using the Solr XML update protocol i could probably
> transform the
> data before sending it
> - ... but I'd much rather continue using DataImportHandler
> to access
> the database
If you are already using DIH,
http://wiki.apache.or
Ahhh, right... in dismax, you pre-define the fields that will be searched
upon, is that right? Is it also true that the query is parsed and all special
characters are escaped?
On 6 December 2010 16:25, Peter Karich wrote:
> for dismax just pass an empty query all q= or none at all
>
>
> Hello,
>>
>>
I have been digging through the user lists for Solr and Nutch, as well as
reading lots of blogs, etc. I have yet to find a clear answer (maybe there
is none).
I am trying to find the best way ahead for choosing a technology that will
allow the ability to use a large taxonomy for classifying stru
Hi,
You can create a custom update request processor [1] to strip unwanted input
as it is about to enter the index.
[1]: http://wiki.apache.org/solr/UpdateRequestProcessor
Cheers,
On Monday 06 December 2010 17:36:09 Emmanuel Bégué wrote:
> Hello,
>
> Is it possible to manipulate the value of
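For context, wiring a custom processor into solrconfig.xml looks roughly like this sketch (com.example.StripEntitiesProcessorFactory is a hypothetical class name for such a processor; the Log/Run factories are the stock ones that should close the chain):

```xml
<!-- Sketch: an update request processor chain. The first factory is a
     hypothetical custom one; LogUpdate/RunUpdate are stock factories. -->
<updateRequestProcessorChain name="strip-entities">
  <processor class="com.example.StripEntitiesProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```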
Hello,
Is it possible to manipulate the value of a field before it is stored?
I'm indexing a database where some field contain raw HTML, including
named character entities.
Using solr.HTMLStripCharFilterFactory on the index analyzer results
in this HTML being correctly stripped, and named chara
Hi,
The indexversion command in the replicationHandler on slave nodes returns 0
for indexversion and generation while the details command does return the
correct information. I haven't found an existing ticket on this one although
https://issues.apache.org/jira/browse/SOLR-1573 has similarities
for dismax just pass an empty query all q= or none at all
Hello,
shouldn't that query syntax be *:* ?
Regards,
-- Savvas.
On 6 December 2010 16:10, Solr User wrote:
Hi,
First off thanks to the group for guiding me to move from default search
handler to dismax.
I have a question related
With dismax, I didn't get any results with *:*. I did the query with
these options (q is empty) and got the full rowcount:
q=&rows=0&qt=dismax
I have q.alt defined in my dismax handler as *:*, don't know if that is
required or not.
Shawn
On 12/6/2010 9:17 AM, Savvas-Andreas Moysidis wrote
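For reference, the q.alt default mentioned above can be sketched in the handler configuration like this (the handler declaration and qf value are illustrative; adjust to your solrconfig.xml):

```xml
<!-- Sketch: dismax handler with q.alt as a default, so an empty q
     still matches all documents. The qf value is only illustrative. -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^2 body</str>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
```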
Hey Juan,
It seems that DataImportHandler is not the right tool for your scenario,
and you'd be better off using the Solr XML update protocol.
* http://wiki.apache.org/solr/UpdateXmlMessages
You still can work around your outdated GUI view problem with calling
DIH synchronously, by adding synchronous=true to you
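A minimal add message in that XML update protocol looks like this (field names are illustrative, not from the poster's schema):

```xml
<!-- Sketch: Solr XML update message; field names are illustrative -->
<add>
  <doc>
    <field name="id">1</field>
    <field name="title">An example document</field>
  </doc>
</add>
```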
Hi,
First off thanks to the group for guiding me to move from default search
handler to dismax.
I have a question related to getting all the search results. In the past
with the default search handler I was getting all the search results (8000)
if I passed q=* as the search string, but with dismax I was
> After issueing a dataimport, I've noticed solr returns a response prior to
> finishing the import. Is this correct? Is there anyway i can make solr not
> return until it finishes?
Yes, you can add synchronous=true to your request. But be aware that
it could take a long time and you can see ht
* Do you use EdgeNGramFilter in the index analyzer only, or do you also use
it on the query side?
* What if you create an additional field first_letter (string) and put the
first character/characters (multivalued?) there in your external
processing code? Then during search you can filter all documents
t
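To make the second suggestion concrete, a schema sketch (the field name and attributes are only illustrative; the field would be populated by your external processing code and then filtered with something like fq=first_letter:a):

```xml
<!-- Sketch: a string field holding the first character(s), filled by
     external indexing code; query-time filter: fq=first_letter:a -->
<field name="first_letter" type="string" indexed="true" stored="false"
       multiValued="true"/>
```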
Yeah, I want to use this spell-check only. I want to create the dictionary
myself and give it as input to Solr, because my indexes also have
misspelled content, so I want Solr to refer to this file and not an
autogenerated one. How do I get this done?
I will try the spell check as suggested by micha
Are you sure you want spellcheck/autosuggest?
Because what you're talking about almost sounds like
synonyms.
Best
Erick
On Mon, Dec 6, 2010 at 1:37 AM, rajini maski wrote:
> How does the solr file based spell check work?
>
> How do we need to enter data in the spelling.txt...I am not clear
Hi. As I know, for file-based spellcheck you need to:
- configure your spellcheck search component in solrconfig.xml, for example
with solr.FileBasedSpellChecker as the classname, "file" as its name,
spellings.txt as the source file, UTF-8 as the encoding, and
./spellcheckerFile as the spellcheck index directory
- then you must get or form spell
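A solrconfig.xml sketch of the component described above (my reconstruction of the standard FileBasedSpellChecker configuration; verify the element names against your Solr version):

```xml
<!-- Sketch: FileBasedSpellChecker inside the stock SpellCheckComponent;
     check names/paths against your Solr version and layout. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="name">file</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>
```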
maybe encoding !?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Dataimport-Could-not-load-driver-com-mysql-jdbc-Driver-tp2021616p2027138.html
Sent from the Solr - User mailing list archive at Nabble.com.