Hi Robert,
> On Fri, Sep 24, 2010 at 3:54 AM, Mathias Walter wrote:
>
> > Hi,
> >
> > I've combined the WordDelimiterFilter with the PositionFilter to prevent the
> > creation of expensive Phrase and MultiPhraseQueries. But
> > if I now parse an escaped string consisting of two terms, the analyser
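For reference, the combination described above would look roughly like this in
schema.xml (the field type name and attribute values here are illustrative, not
taken from the original mail):

  <fieldType name="text_wdf" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1"
              catenateWords="0" catenateNumbers="0" catenateAll="0"/>
      <!-- flattens token positions so the query parser builds BooleanQueries
           instead of (Multi)PhraseQueries -->
      <filter class="solr.PositionFilterFactory"/>
    </analyzer>
  </fieldType>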
Hi,
I have been using solrsharp to integrate Solr into my project. Everything was
going fine until I tried to incorporate shard queries.
I tested the shard query using the browser and everything went fine. I tried
to do the same in solrsharp by adding the following line
queryBuilder.AddSearchParame
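For comparison, a shard query issued straight from the browser generally looks
something like this (host names and core paths are placeholders):

  http://localhost:8983/solr/select?q=ipod&shards=host1:8983/solr,host2:8983/solr

so whatever solrsharp sends should reduce to a plain shards=... parameter on
the request.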
Hello,
We were testing nutch configurations and apparently we got heavy-handed with
our approach to stopping things.
Now when nutch starts indexing solr, we are seeing these messages:
org.apache.solr.common.SolrException: Lock obtain timed out:
SingleInstanceLock: write.lock
org.apache.lucene.st
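For what it's worth, the usual quick fix is to stop Solr and delete the stale
write.lock file by hand. The related knob lives in the <mainIndex> section of
solrconfig.xml (a sketch for Solr 1.4):

  <mainIndex>
    <!-- set to true to clear a stale write.lock left by a killed process -->
    <unlockOnStartup>false</unlockOnStartup>
  </mainIndex>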
Hello, I am writing a clustering component for Solr. It registers, loads and
works properly. However, whenever there is an exception inside my plugin, I
cannot get tomcat to show me the line numbers. It always says "Unknown source"
for my classes. The stack trace in tomcat shows line numbers for
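"Unknown Source" usually just means the classes were compiled without debug
info. If the plugin is built with Ant, something like this (paths assumed)
should bring the line numbers back:

  <javac srcdir="src" destdir="build/classes" debug="true" debuglevel="lines,vars,source"/>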
Stop running 32-bit operating systems. You'll never get good performance with a
toy like that. --wunder
On Sep 29, 2010, at 8:18 PM, newsam wrote:
> Thanks for your reply.
>
> Our box is Windows Server 2003 (32-bit) with 6 GB of RAM total. A large heap
> (>2 GB) may not be helpful for a JVM on a 32-bit box. T
Thanks for your reply.
Our box is Windows Server 2003 (32-bit) with 6 GB of RAM total. A large heap
(>2 GB) may not be helpful for a JVM on a 32-bit box. Therefore we set
JAVA_OPTIONS to "-Xms512m -Xmx1400m". Is my understanding right?
Thanks.
>From: Lance Norskog
>Reply-To: solr-user@lucene.apache.org
>To:
How much ram does the JVM have?
Wildcard queries are slow, and queries starting with '*' are even slower. If
you want all values, try "field:[* TO *]". This is a range query that lets
you pick a range of values; this one picks everything.
The "*:*" is not a wildcard. It is a magic syntax for "all documents"
and do
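The two forms side by side, as URL sketches (the field name is a placeholder,
and the spaces in the range need URL-encoding):

  http://localhost:8983/solr/select?q=myfield:[* TO *]   <- all docs with a value in myfield
  http://localhost:8983/solr/select?q=*:*                <- all documents, no term scan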
Some of these are big questions; try them in separate emails.
On Wed, Sep 29, 2010 at 9:40 AM, Sharma, Raghvendra
wrote:
> Some questions.
>
> 1. I have about 3-5 tables. Now designing schema.xml for a single table looks
> ok, but the direction for handling multiple table structures is
>
How many documents are there? How many unique words are in a text
field? Both of these numbers can have a non-linear effect on the
amount of space used.
But usually a 22 GB index (on disk) might need 6-12 GB of RAM total.
There is something odd going on here.
Lance
On Wed, Sep 29, 2010 at 4:34 PM,
This would be a Java VM option, not something Solr or other apps can know about.
Using this or procset seems like a great way to handle it.
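The workaround that blog arrives at, adapted to a Java process, would be
something along these lines (a sketch; your start command will differ):

  numactl --interleave=all java -Xmx4g -jar start.jar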
On Wed, Sep 29, 2010 at 8:46 AM, Glen Newton wrote:
> In a recent blog entry ("The MySQL “swap insanity” problem and the
> effects of the NUMA architecture"
Simple text .txt files and MS Office .doc files are very, very different beasts.
You can do simple .txt files with some more lines in your
DataImportHandler script.
With DOC files it is easiest to use the extracting request handler
"/extract". This is on the wiki.
If you want to do this inside the D
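A rough example of the extracting handler invocation, in the style of the wiki
page (the id literal is made up; the stock solrconfig.xml registers the handler
at /update/extract):

  curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" \
       -F "myfile=@report.doc"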
Can anybody help with this?
Many thanks
2010/9/29 Floyd Wu
> Hi there
>
> I have a problem, the situation is when I issue a query to single instance,
> Solr response XML like following
> as you can see, the score is normal
> ===
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">23</int>
>     <lst name="params">
>       <str name="fl">_l_title,score</str>
>       <str name="start">0</str>
Looking for some clarification on DIH to make sure I am interpreting this
correctly.
I have a wide DB table, 100 columns. I'd rather not have to add 100 values
in schema.xml and data-config.xml. I was under the impression that if the
column name matched a dynamicField name, it would be added. I
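As far as I know that impression matches DIH's implicit mapping: a column is
picked up automatically when its name matches an explicit field or a
dynamicField pattern. The schema.xml side would be a couple of lines like
these (suffix conventions assumed):

  <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  <dynamicField name="*_i" type="int" indexed="true" stored="true"/>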
My server has 128 GB of RAM and the index is 22 GB. It seems the memory
consumption goes up on every query and the garbage collector will never free
up as much memory as I expect it to. The memory consumption looks like a
curve, it eventually levels off but the old gen is always 60 or 70GB. I have
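It may help to turn on GC logging so you can see what the collector is actually
reclaiming (standard Sun JVM flags):

  java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log ...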
No, I am using xpath for html; that is not the question. I am indexing pure
text in addition to the html I was already indexing - pure text like a TXT file
or a Microsoft Word doc. So, with no xpath for TXT, how do I index a TXT file
into different fields in my index the way I use xpath to index html into
Yes, this is almost certainly stemming. Take a look at solr/admin, [schema
browser], then click on Home>fields>>. Then the index and query "details" link
shows you exactly what's happening.
You can also get some joy from the admin [analysis] page. That takes input and
shows you exactly what transf
Can you provide a few more details? You mention xpath, which leads me
to believe that you are using DIH, is that true? How are you getting
your documents to index? Parts of a filesystem?
Because it's possible to do many things. If you're using DIH against a
filesystem,
you could use two fileDataSo
Make sure your index and query analyzers are identical, and pay special
attention if you're using any of the
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming
analyzers
- many of them have a number of configurable attributes that could
cause differences.
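The safest pattern is a single <analyzer> element with no type attribute, which
Solr then uses for both indexing and querying (field type name illustrative):

  <fieldType name="text_plain" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>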
-L
On Wed, Sep 29, 2010
Hi
I issue a request like the following, in order to get a list of search-terms in
a particular field:
http://localhost:8983/solr/terms?terms.limit=-1&terms.fl=bodytext
But some of the terms which are returned are not quite the same as those which
were indexed (or which are returned in a searc
: In Solrconfig.xml, default request handler is set to "standard". I am
: planning to change that to use dismax as the request handler but when I
: set "default=true" for dismax - Solr does not return any results - I get
: results only when I comment out "dismax".
you need to elaborate on what yo
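For reference, a dismax handler that works as the default usually looks roughly
like this in solrconfig.xml (the qf fields are placeholders); note that dismax
needs a qf list, since it does not fall back on a default field the way the
standard handler does:

  <requestHandler name="dismax" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="qf">title^2 body</str>
    </lst>
  </requestHandler>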
: What's a GA release?
http://en.wikipedia.org/wiki/Software_release_life_cycle#General_availability
-Hoss
--
http://lucenerevolution.org/ ... October 7-8, Boston
http://bit.ly/stump-hoss ... Stump The Chump!
I am using HAProxy for load balancing on my Solr installation, for
redundancy. Very recently, request throttling (and by extension, DoS
mitigation) was added to the development branch (1.5) of HAProxy.
You could probably use that, even if you don't need actual load balancing.
http://haproxy.
No, these new documents are not HTML; they are pure text, like the ones you see
in Notepad or Microsoft Word. I have no problem indexing HTML, but I got stuck
with these pure text files.
From: Scott Gonyea
To: solr-user@lucene.apache.org
Sent: Wed, September 29,
Break your HTML pages into the desired fields and format them as follows:
http://wiki.apache.org/solr/UpdateXmlMessages
And away you go. You may want to search / review the wiki. Also, if
you're indexing websites and want to place them in Solr, you should look
at Nutch. It can do all that work for yo
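The update message format from that page boils down to (field names are
placeholders):

  <add>
    <doc>
      <field name="id">page-001</field>
      <field name="title">Example title</field>
      <field name="body">extracted page text...</field>
    </doc>
  </add>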
Hi,
I am using xpath to index different parts of the html pages into different
fields. Now, I have some pure text documents that have no html. So I can't use
xpath. How do I index this pure text into different fields of the index? How
do I make nutch/solr understand these different parts b
This kind of thing is not limited to Solr and you normally wouldn't solve it in
software - it's more a network concern. I'd be looking at a web server solution
such as Apache mod_evasive combined with a good firewall for more conventional
DOS attacks. Just hide your Solr install behind the firew
Hi, I'm curious as to what approaches one would take to defend against users
attacking a Solr service, especially if exposed to the internet as opposed
to an intranet. I'm fairly new to Solr, is there anything built in?
Is there anything in place to prevent the search engine from getting
overwhel
I saw there had been a previous discussion on commit failing for
EmbeddedSolrServer here:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg28236.html
But it was never resolved. I have an embedded solr server and it does not
seem to pick up changes in the index after a commit through Solr
Hi,
For us this is a usability concern. You either don't show Sweden in a pick-list
called Country and some users go away thinking you don't *ever* support Sweden
(not true). Or you allow a user to execute an empty-result search - but at
least they know you do support Sweden.
It is we believe
Yes, just after sending the email I reread the wiki and noticed the 4.0
requirement. I will try that, thanks.
From: ysee...@gmail.com on behalf of Yonik Seeley
Sent: Wed 9/29/2010 8:12 AM
To: solr-user@lucene.apache.org
Subject: Re: Queries, Functions, and Param
I don't understand why you would want to show Sweden if it isn't in the
index; what will your UI do if the user selects Sweden?
However, one way to handle this would be to make a second document type.
Have a field called type or some such, and make the new document type be
'dummy' or 'system' or
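A sketch of that approach (field and type names made up): index one placeholder
document per country, for example

  <add><doc>
    <field name="id">dummy-se</field>
    <field name="type">dummy</field>
    <field name="country">Sweden</field>
  </doc></add>

then keep the initial *:* facet request as-is so every country shows up, and
append &fq=-type:dummy to real searches so the placeholders never appear as
results.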
If at all possible, denormalize the data. Anytime you find yourself trying to
make Solr behave like a database, the probability is high that you're misusing
Solr or the DB.
Best
Erick
On Wed, Sep 29, 2010 at 12:40 PM, Sharma, Raghvendra <
sraghven...@corelogic.com> wrote:
> Some questions.
>
>
Yep, I was thinking of this on a field. I was assuming that there was
a PK in the database that you were mapping to the uniqueKey field, but if
that's not so then it's more of a problem.
But you'd have problems anyway if you *don't* have a uniqueKey when it comes
time to update any records, so it
Some questions.
1. I have about 3-5 tables. Now designing schema.xml for a single table looks
ok, but the direction for handling multiple table structures is something
I am not sure about. Would it be one big huge xml, wherein those three
tables (assuming it's three) would show up as
Think about what fields you need to return. For this, you probably only need
the id. That could be a lot faster than the default set of fields.
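Concretely, something like this URL sketch:

  http://localhost:8983/solr/select?q=yourquery&fl=id&rows=1000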
wunder
On Sep 29, 2010, at 9:04 AM, dshvadskiy wrote:
>
> Actually retrieving 1000 docs via search isn't that bad. It turned out to take
> under 1 sec.
Actually retrieving 1000 docs via search isn't that bad. It turned out to take
under 1 sec. I still like the idea of using TermsComponent and will use it
in the future if the number of docs in the index grows. Thanks for all the
suggestions.
Dmitriy
--
View this message in context:
http://lucene.47206
In a recent blog entry ("The MySQL “swap insanity” problem and the
effects of the NUMA architecture"
http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/),
Jeremy Cole describes a particular but common problem with large-memory
installations of MySQL on multi-core
On Tue, Sep 28, 2010 at 6:08 PM, Robert Thayer
wrote:
> On the http://wiki.apache.org/solr/FunctionQuery page, the following query
> function is listed:
>
> q={!func}add($v1,$v2)&v1=sqrt(popularity)&v2=100.0
>
> When run against the default solr instance, the server returns error 400:
> "undefi
Check
http://doc.ez.no/Extensions/eZ-Find/2.2/Advanced-Configuration/Using-multi-core-features
It's for eZ-Find, but it's the basic setup for multiple cores in any
environment.
We have cores designed like so:
solr/sfx/
solr/forum/
solr/mail/
solr/news/
solr/tracker/
each of those core directori
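The matching solr.xml for a layout like that would be roughly (Solr 1.4
multicore format, core names taken from the directories above):

  <solr persistent="true">
    <cores adminPath="/admin/cores">
      <core name="sfx" instanceDir="sfx"/>
      <core name="forum" instanceDir="forum"/>
      <core name="mail" instanceDir="mail"/>
      <core name="news" instanceDir="news"/>
      <core name="tracker" instanceDir="tracker"/>
    </cores>
  </solr>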
Hi Andy!
I configured this a few days ago, and found a good resource --
http://wiki.apache.org/solr/MultipleIndexes
That page has links that will give you the instructions for setting up
Tomcat, Jetty and Resin. I used the Tomcat ones the other day, and it gave
me everything that I needed to get
Regenerating the index is a slow operation due to limitations of the source
systems. We run several complex SQL statements to generate one Solr document.
A full reindex takes about 24 hours.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-check-Solr-index-for-completenes
I installed Solr according to the tutorial. My schema.xml & solrconfig.xml are
in
~/apache-solr-1.4.1/example/solr/conf
Everything so far is just like that in the tutorial. But I want to set up a 2nd
index (separate from the "main" index) just for the purpose of auto-complete.
I understand that
Using TermsComponent is an interesting suggestion. However, my understanding is
that it will work only for unique terms. For example, compare a database primary
key with the Solr id field. A variation of that is to calculate some kind of
unique record hash and store it in the index. Then retrieve id and hash via
T
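The TermsComponent request for that would be something like this (assuming the
unique key field is called id):

  http://localhost:8983/solr/terms?terms.fl=id&terms.limit=-1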
OK good to know I'm not going bonkers :)
Cheers
On Sep 29, 2010, at 9:45 AM, Luke Crouch wrote:
> We had to do the same thing - we draw our facet navigation links by looping
> over the full result set from our database, and then we add the facet counts
> and draw the link URLs using the solr da
Hi Allistair,
On Wed, 2010-09-29 at 15:37 +0200, Allistair Crossley wrote:
> Hello list,
>
> I am implementing a directory using Solr. The user is able to search with a
> free-text query or 2 filters (provided as pick-lists) for country. A
> directory entry only has one country.
>
> I am usin
We had to do the same thing - we draw our facet navigation links by looping
over the full result set from our database, and then we add the facet counts
and draw the link URLs using the solr data.
-L
On Wed, Sep 29, 2010 at 8:42 AM, Markus Jelsma wrote:
> I'm afraid you'd have to add the missin
I'm afraid you'd have to add the missing countries in your application. If it's
not in the index, it will not be returned. Your last question is possible; the
facet.query parameter allows you to rely on other conditions to generate a
facet count. But if the missing countries are not in the index,
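A facet.query request would look something like this (country field name
assumed):

  http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.query=country:Sweden&facet.query=country:Norway

but again, a facet.query over a value that no document contains can only ever
return a count of 0.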
Hello list,
I am implementing a directory using Solr. The user is able to search with a
free-text query or 2 filters (provided as pick-lists) for country. A directory
entry only has one country.
I am using Solr facets for country and I use the facet counts generated
initially by a *:* search t
This sounds like https://issues.apache.org/jira/browse/SOLR-1711. It is a
known issue in Solr 1.4.0, which is apparently fixed in Solr 1.4.1. We also
encountered it when indexing large numbers of documents with SolrJ, and are
therefore in the process of upgrading to 1.4.1.
-- Avi
On Wed, Sep 29,
How long does it take to get 1000 docs?
Why not ensure this while indexing?
I think besides your suggestion or Luke's there is no
other way...
Regards,
Peter.
> Hello,
> What would be the best way to check Solr index against original system
> (Database) to make sure index is up t