I have the same problem. At 4.1, a Solr instance could take 8,000,000,000
docs, but at 4.2.1 an instance only takes 400,000,000 docs; it will OOM at
facet query. The facet field was tokenized by space.
May 27, 2013 11:12:55 AM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeExcep
I am sorry about a typo: 8,000,000,000 -> 800,000,000
2013/5/27 Jam Luo
> I have the same problem. At 4.1, a Solr instance could take 8,000,000,000
> docs, but at 4.2.1 an instance only takes 400,000,000 docs; it will OOM at
> facet query. The facet field was tokenized by space.
>
> May 27,
On Sun, May 26, 2013 at 8:16 PM, Jack Krupansky wrote:
> The only comment I was trying to make here is the relationship between the
> RemoveDuplicatesTokenFilterFactory and the KeywordRepeatFilterFactory.
>
> No, stemmed terms are not considered the same text as the original word. By
> definition,
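For context, a typical analyzer chain pairing the two filters Jack mentions might look like the sketch below (the fieldType name and the choice of stemmer are illustrative, not from the thread): KeywordRepeatFilterFactory emits a copy of each original token alongside the stemmed one, and RemoveDuplicatesTokenFilterFactory drops the copy when stemming left the token unchanged.

```xml
<!-- Hypothetical schema.xml fieldType illustrating the filter pairing -->
<fieldType name="text_stem_keep_original" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- duplicates each token, marking one copy as KEYWORD so the stemmer skips it -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- removes the duplicate when the stemmed form equals the original -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```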
On 27 May 2013 12:58, Arkadi Colson wrote:
> Hi
>
> We would like to index our messages system. We should be able to search for
> messages for specific recipients due to performance issues on our databases.
> But the message is of course the same for all recipients and the message
> text should b
Hi
We would like to index our messages system. We should be able to search
for messages for specific recipients due to performance issues on our
databases. But the message is of course the same for all recipients and
the message text should be saved only once! Is it possible to have some
kin
Hi, thanks for the response. Seems like this is the case, because there are no
other applications that could fire commit/optimize calls. All commits
are triggered by Solr and the optimize is triggered by a cron task.
Because of all that it looks like a bug in Solr. It probably should not run
co
Yes indeed... Thx!
On 05/27/2013 09:33 AM, Gora Mohanty wrote:
On 27 May 2013 12:58, Arkadi Colson wrote:
Hi
We would like to index our messages system. We should be able to search for
messages for specific recipients due to performance issues on our databases.
But the message is of course th
Thanks for the help.
@Alexandre: Thanks for the suggestion, I'll try to use an
ExtractingRequestHandler, I thought that I was missing some DIH option :).
@Erik: I'm interested in knowing them all to do various forms of analysis. I
have documents coming from heterogeneous sources and I'm interested
Hi,
We have setup the SOLR cloud with zookeeper.
Zookeeper (localhost:8000)
1 shard (localhost:9000)
2 Replica (localhost:9001,localhost:9002)
Question :
We load the Solr index from a relational DB using DIH. Based on the SolrCloud
documentation, the request to load the data will be forwarded
The intent is that optimize is obsolete and should no longer be used,
especially with tiered merge policy running. In other words, merging should
be occurring on the fly in Lucene now. What release of Solr are you running?
-- Jack Krupansky
-Original Message-
From: heaven
Sent: Monda
Setting the uprefix parameter of SolrCell (ERH) to something like "attr_"
will result in all metadata attributes that are not named in the Solr
schema being given the "attr_" prefix to their metadata attribute names. For
example,
curl "http://localhost:8983/solr/update/extract?literal.id=doc-
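The same uprefix behavior can be set as a handler default rather than per request; a sketch of the solrconfig.xml wiring (the handler path matches the curl above, the dynamic field is the usual companion and is an assumption here):

```xml
<!-- Hypothetical solrconfig.xml entry: any extracted metadata field that is
     not declared in the schema gets the "attr_" prefix -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="uprefix">attr_</str>
  </lst>
</requestHandler>
```

A catch-all dynamic field in schema.xml, e.g. `<dynamicField name="attr_*" type="text_general" indexed="true" stored="true" multiValued="true"/>`, then picks up the prefixed attributes.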
Standalone Tika can also run in a network server mode. That increases data
roundtrips but gives you more options. Even in .net .
Regards,
Alex
On 27 May 2013 04:22, "Gian Maria Ricci" wrote:
> Thanks for the help.
>
> @Alexandre: Thanks for the suggestion, I'll try to use an
> ExtractingR
400M docs is quite a large number of documents for a single piece of
hardware, and
if you're faceting over a large number of unique values, this will
chew up memory.
So it's not surprising that you're seeing OOMs, I suspect you just have too many
documents on a single machine.
Best
Erick
On Mo
There's no requirement to send the document to any leader, send updates to
any node in the system. The documents will be automatically forwarded to
the appropriate leaders.
You may be getting confused by the "leader aware" solr client stuff. It's
slightly more efficient to send updates to the lead
Hi Jack,
I'd like to ask as a person who contributed a case study article about
"Automatically acquiring synonym knowledge from Wikipedia" to the book.
(13/05/24 8:14), Jack Krupansky wrote:
To those of you who may have heard about the Lucene/Solr book that I and two
others are writing on Luce
Hello,
I'm writing my first little Solrj program, but don't get it running because of
an RemoteSolrException: Server at http://localhost:8983/solr returned non ok
status:404
The server is definitely running and the url works in the browser.
I am working with Solr 4.3.0.
This is my source code
If you would like to Solr-ize your contribution, that would be great. The
focus of the book will be hard-core Solr.
-- Jack Krupansky
-Original Message-
From: Koji Sekiguchi
Sent: Monday, May 27, 2013 8:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Note on The Book
Hi Jack,
I'd
You did not open source it by any chance? :-)
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working. (Anonymous
No, the implementation was very specific to my needs.
On 5/27/2013 8:28 AM, Alexandre Rafalovitch wrote:
> You did not open source it by any chance? :-)
>
> Regards,
>Alex.
Hello,
I am working on implementation of system to categorize URLs/Web Pages.
I would have categories like:
Adult, Health, Business,
Arts, Home, Science
I am looking at how Lucene/Solr could help me out to achieve this.
I came across links that mention MoreLik
Hello
Our team faced the problem regarding the sourceId of JMX when getting the
information of JMX from tomcat manager.
Command:
curl http://localhost:${PORT}/manager/jmxproxy?qry=solr:type=documentCache,*
Here is the error log (tomcat/manager log).
Hi;
I want to use Solr for an academical research. One step of my purpose is I
want to store tokens in a file (I will store it at a database later) and I
don't want to index them. For such kind of purposes should I use core
Lucene or Solr? Is there an example for writing a custom analyzer and just
Hello!
Take a look at custom posting formats. For example
here is a nice post showing what you can do with Lucene SimpleText
codec:
http://blog.mikemccandless.com/2010/10/lucenes-simpletext-codec.html
However please remember that it is not advised to use that codec in
production environmen
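In Solr 4.x a postings format such as SimpleText can be selected per field type in schema.xml; a sketch (the fieldType name and tokenizer choice are made up for illustration, and SimpleText is for debugging, not production):

```xml
<!-- Hypothetical schema.xml fieldType selecting the SimpleText postings
     format; requires <codecFactory class="solr.SchemaCodecFactory"/> in
     solrconfig.xml so the per-field postingsFormat attribute is honored -->
<fieldType name="text_simpletext" class="solr.TextField"
           postingsFormat="SimpleText">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

With this in place, the on-disk postings for fields of this type are written as plain text, which is exactly what makes the codec useful for the kind of inspection the blog post describes.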
On Mon, May 27, 2013 at 7:11 AM, Jack Krupansky wrote:
> The intent is that optimize is obsolete and should no longer be used
That's incorrect.
People need to understand the cost of optimize, and that its use is optional.
It's up to the developer to figure out if the benefits of calling
optimiz
I downloaded solr 4.3.0, started it up with java -jar start.jar (from
inside the example directory) and executed your program. No exceptions are
thrown. Is there something you did differently?
On Mon, May 27, 2013 at 5:45 PM, Hans-Peter Stricker
wrote:
> Hello,
>
> I'm writing my first little So
forceMerge is very useful if you delete a significant portion of an index. It
can take a very long time before any merge policy decides to finally merge them
all away, especially for a static or infrequently changing index. Also, having
a lot of deleted docs in the index can be an issue if your
As the wiki does say: "if at all ... Segments are normally merged over time
anyway (as determined by the merge policy), and optimize just forces these
merges to occur immediately."
So, the only real question here is if the optimize really does lie outside
the "if at all" category and whether "
This is a bug. The sourceId should have been removed from the
SolrDynamicMBean. I'll create an issue.
On Mon, May 27, 2013 at 6:39 PM, 菅沼 嘉一 wrote:
> Hello
>
> Our team faced the problem regarding the sourceId of JMX when getting the
> information of JMX from tomcat manager.
>
> Command:
> curl
I am on 4.2.1
@Yonik Seeley I do understand the cost and run it once per 24 hours and
perhaps later this interval will be increased up to a few days.
In general I am optimizing not to merge the fragments but to remove deleted
docs. My index refreshes quickly and number of deleted docs could reach
I created SOLR-4862 ... I found no way to assign the ticket to somebody though
(I guess it is under "Workflow", but the button is greyed out).
Thanks,
André
I opened https://issues.apache.org/jira/browse/SOLR-4863
On Mon, May 27, 2013 at 7:35 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> This is a bug. The sourceId should have been removed from the
> SolrDynamicMBean. I'll create an issue.
>
>
> On Mon, May 27, 2013 at 6:39 PM, 菅沼 嘉一
Hello, guys!
Well, I've done some tests and I think that there is some kind of bug
related to distributed search. Currently I'm setting a key field that
cannot be duplicated, and I have experienced the same wrong
behavior with the numFound field while changing the rows parameter. Has any
Yes, I started it up with java -Dsolr.solr.home=example-DIH/solr -jar
start.jar.
Without the java options I don't get the exceptions either! (I should have
checked.)
What now?
--
From: "Shalin Shekhar Mangar"
Sent: Monday, May 27, 2013 3:58 P
On 5/27/2013 6:15 AM, Hans-Peter Stricker wrote:
> I'm writing my first little Solrj program, but don't get it running because
> of an RemoteSolrException: Server at http://localhost:8983/solr returned non
> ok status:404
>
> The server is definitely running and the url works in the browser.
>
On 5/27/2013 8:24 AM, Hans-Peter Stricker wrote:
> Yes, I started it up with java -Dsolr.solr.home=example-DIH/solr -jar
> start.jar.
That explains it. See my other reply. The solr.xml file for
example-DIH does not have a defaultCoreName attribute.
Thanks,
Shawn
Dear Shawn, dear Shalin,
thanks for your valuable replies!
Could/should I have known better (by reading the manual more carefully)?
I'll try to fix it - and I am confident that it will work!
Best regards
Hans
--
From: "Shawn Heisey"
Sent: Mond
Thanks a lot, other useful hints, and probably standalone Tika could be a
solution.
I've another little question: how can I express filters in the DIH
configuration to run the import incrementally?
Actually I have two distinct scenarios.
In the first scenario I have documents stored inside a datab
Hello,
Sorry for cross post. I just wanted to announce that I've written a blog post on
how to create synonyms.txt file automatically from Wikipedia:
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html
Hope that the article gives someone a good experience!
koji
On 5/27/2013 8:34 AM, Hans-Peter Stricker wrote:
> Dear Shawn, dear Shalin,
>
> thanks for your valuable replies!
>
> Could/should I have known better (by reading more carefully the manual)?
I just looked at the wiki. The SolrJ wiki page doesn't mention using
the core name, which I find surpris
Now my contribution can be read on soleami blog in English:
Automatically Acquiring Synonym Knowledge from Wikipedia
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html
koji
(13/05/27 21:16), Jack Krupansky wrote:
If you would like to Solr-ize your contributio
Hi,
I'm executing a search and retrieve more like this results. For the search
results, I can specify the columns to be returned via the "fl" parameter. The
"mlt.fl" parameter defines the columns to be used for similarity calculation.
The mlt-results seem to return the columns specified in "fl"
I start the SOLR example with
java -Dsolr.solr.home=example-DIH/solr -jar start.jar
and run
public static void main(String[] args) {
    String url = "http://localhost:8983/solr/rss";
    SolrServer server;
    SolrQuery query;
    try {
        server = new HttpSolrServer
Your program is not specifying a command. You need to add:
query.setParam("command", "full-import");
On Mon, May 27, 2013 at 9:31 PM, Hans-Peter Stricker
wrote:
> I start the SOLR example with
>
> java -Dsolr.solr.home=example-DIH/solr -jar start.jar
>
> and run
>
> public static void main(Stri
Marvelous!!
Once again: where could/should I have read this? What kinds of
concepts/keywords are "command" and "full-import"? (Couldn't find them in
any config file. Where are they explained?)
Anyway: Now it works like a charm!
Thanks
Hans
Details about the DataImportHandler are on the wiki:
http://wiki.apache.org/solr/DataImportHandler
In general, the SolrJ client just makes HTTP requests to the corresponding
Solr APIs so you need to learn about the http parameters for the
corresponding solr component. The solr wiki is your best b
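To make the mapping concrete: DIH is just a request handler registered in solrconfig.xml, so SolrJ's setParam("command", "full-import") ends up as the HTTP request /dataimport?command=full-import (or /rss/dataimport in the example setup). A sketch of the registration (the config file name is illustrative):

```xml
<!-- Hypothetical solrconfig.xml entry: registers DIH at /dataimport, so
     command=full-import, delta-import, status etc. are plain HTTP params -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">rss-data-config.xml</str>
  </lst>
</requestHandler>
```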
On 5/27/2013 10:20 AM, Hans-Peter Stricker wrote:
> Marvelous!!
>
> Once again: where could/should I have read this? What kinds of
> concepts/keywords are "command" and "full-import"? (Couldn't find them
> in any config file. Where are they explained?)
>
> Anyway: Now it works like a charm!
http
I've a test VM where I usually test Solr installations. In that VM I already
configured Solr 4.0 and everything went well. Today I downloaded the 4.3
version, unpacked everything, and configured Tomcat as I did for the 4.0
version, but the application does not start, and in the catalina log I find only
May 2
The usual answer (which may or may not be relevant) is that Solr 4.3 has
moved the logging libraries around and you need to copy specific library
implementations to your Tomcat lib files. If that sounds plausible,
search the mailing list for a number of detailed discussions on this topic.
Rega
Hi.
Searching terms with wildcard in their start, is solved with
ReversedWildcardFilterFactory. But, what about terms with wildcard in both
start AND end?
This query is heavy, and I want to disallow such queries from my users.
I'm looking for a way to cause these queries to fail.
I guess there i
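For reference, the index-time analyzer that enables leading-wildcard queries looks roughly like this (a sketch based on the stock example schema; the tuning attributes shown are the real factory parameters, but their values here are illustrative). Note this only handles *term; for *term* both directions stay expensive, and as the poster suggests, rejecting such queries needs custom query handling:

```xml
<!-- Hypothetical schema.xml fieldType: indexes reversed copies of tokens so
     leading-wildcard queries can run against the reversed form -->
<fieldType name="text_rev" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2"
            maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```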
On 5/27/2013 12:00 PM, Alexandre Rafalovitch wrote:
> The usual answer (which may or may not be relevant) is that Solr 4.3 has
> moved the logging libraries around and you need to copy specific library
> implementations to your Tomcat lib files. If that sounds as a possible,
> search the mailing li
Thanks, I'll check :)
--
Gian Maria Ricci
Mobile: +39 320 0136949
-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Monday, May 27, 2013 8:00 PM
To: solr-user@lucene.apache.org; alkamp...@nablasoft.com
Subject: Re: Unable to start solr 4.3
The usual
You are right that starting to parse the query before the query component
can soon get very ugly and complicated. You should take advantage of the
flex parser; it is already in lucene contrib - but if you are interested in
the better version, look at
https://issues.apache.org/jira/browse/LUCENE-501
Thanks a lot, it seems that Solr probably won't start because of all the
missing log libraries. Once I copied all the needed log libraries inside
c:\tomcat\libs, Solr started with no problem.
If other people are interested, here is the link on the wiki that states the
changes in logging libraries in Solr
Hi,
SolrCloud now has the same index aliasing as Elasticsearch. I can't look up
the link now, but Zoie from LinkedIn has Hourglass, which it uses for a
circular-buffer sort of index setup, if I recall correctly.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 24, 2013 10:26 AM, "Saikat
But how is Hourglass going to help Solr? Or is it a portable implementation?
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't se
Thanks Roman.
Based on some of your suggestions, will the steps below do the work?
* Create (and register) a new SearchComponent
* In its prepare method: Do for Q and all of the FQs (so this
SearchComponent should run AFTER QueryComponent, in order to see all of the
FQs)
* Create org.apache.lucene
Thank you, Shalin.
I'll see it.
>-Original Message-
>From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
>Sent: Monday, May 27, 2013 11:11 PM
>To: solr-user@lucene.apache.org
>Subject: Re: sourceId of JMX
>
>I opened https://issues.apache.org/jira/browse/SOLR-4863
>
>
>On Mon, May
Hi Solr experts,
I have a solr 4.3 schema
and xml data
51.1164,6.9612
52.3473,9.77564
If I run this query:
fq={!geofilt pt=51.11,6.9 sfield=location_geo d=20}
I get no result.
But if I remove the second geo line and only have this geo coordinate it
works:
51.1164,6.9612
*Thus it seems that
Hi Issac,
it is as you say, with the exception that you create a QParserPlugin, not a
search component
* create QParserPlugin, give it some name, eg. 'nw'
* make a copy of the pipeline - your component should be at the same place,
or just above, the wildcard processor
also make sure you are setti
I think I found the reason/bug
the type was wrong, it should be
On Tue, May 28, 2013 at 1:37 AM, Eric Grobler wrote:
> Hi Solr experts,
>
> I have a solr 4.3 schema
> "solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025"
> maxDistErr="0.09" units="degrees" />
>
> multi
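Piecing together the fragments quoted in this thread, the corrected definitions presumably look like the sketch below: a SpatialRecursivePrefixTreeFieldType (rather than LatLonType, which is single-valued) plus a multiValued field, so that two points per document work with {!geofilt}. The attribute values are the ones quoted above; the fieldType name is an assumption:

```xml
<!-- Reconstructed schema.xml fragment: RPT spatial type supports multiple
     points per document when the field is multiValued -->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="true" distErrPct="0.025" maxDistErr="0.09" units="degrees"/>
<field name="location_geo" type="location_rpt" indexed="true" stored="true"
       multiValued="true"/>
```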
Shalin,
We tried using it after removing staticStats.add("sourceId"), and it seems to
work with no problem.
Do you know any other side effects by removing it ?
Regards
suganuma
>-Original Message-
>From: 菅沼 嘉一 [mailto:yo_sugan...@waku-2.com]
>Sent: Tuesday, May 28, 2013 9:30 AM
>To: solr-user@l
I don't want to affect the (correctness of the) real query parsing, so
creating a QParserPlugin is risky.
Instead, if I parse the query in my search component, it will be
detached from the real query parsing (obviously this causes double
parsing, but assume it's OK)...
On Tue, May 28, 2013
Hello Koji,
This seems like a pretty useful post on how to create a synonyms file.
Thanks a lot for sharing this!
Have you shared the source code / jar for this so that it could be used?
Thanks,
Rajesh
On Mon, May 27, 2013 at 8:44 PM, Koji Sekiguchi wrote:
> Hello,
>
> Sorry for cross post. I jus
Suganuma,
No, there shouldn't be any side effects.
On Tue, May 28, 2013 at 7:13 AM, 菅沼 嘉一 wrote:
> Shalin,
>
> We tried use it after removing staticStats.add("sourceId"), it seems going
> with no problem.
> Do you know any other side effects by removing it ?
>
> Regards
> suganuma
>
> >-Or
Hi Benson,
We typically use https://github.com/sematext/ActionGenerator
As a matter of fact, we are using it right now to test one of our
search clusters...
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Sun, May 26, 2013 at 10:38 AM, Benson Margulies
wrote:
> I'd like to run
Thank you, Shalin.
>-Original Message-
>From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
>Sent: Tuesday, May 28, 2013 2:22 PM
>To: solr-user@lucene.apache.org
>Subject: Re: sourceId of JMX
>
>Suganuma,
>
>No, there shouldn't be any side effects.
>
>
>On Tue, May 28, 2013 at 7:13
Folks;
playing with Solr and an existing (legacy) RDBMS structure which we
can't change much, I am trying to figure out how to best make Solrs
full/delta import work for me. A few thoughts:
(a) The usual tutorials outline something like
WHERE LASTMODIFIED > '${dih.last_index_time}
in order to
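The standard delta-import wiring the tutorials describe can be sketched as a data-config.xml entity like this (table and column names are made up; deltaQuery finds the primary keys changed since the last run, and deltaImportQuery fetches each changed row by that key):

```xml
<!-- Hypothetical data-config.xml entity for DIH delta imports -->
<entity name="item" pk="ID"
        query="SELECT * FROM item"
        deltaQuery="SELECT ID FROM item
                    WHERE LASTMODIFIED &gt; '${dih.last_index_time}'"
        deltaImportQuery="SELECT * FROM item
                          WHERE ID = '${dih.delta.ID}'"/>
```

Running /dataimport?command=delta-import then re-indexes only the rows the deltaQuery returned, which is usually the best fit when the legacy schema cannot be changed.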
It does not seem to be the memory footprint either? It looks too high for my
index.
./zahoor
On 20-May-2013, at 10:55 PM, Jason Hellman
wrote:
> Most definitely not the number of unique elements in each segment. My 32
> document sample index (built from the default example docs data) has the
> fol
Hi
Has anyone tried using HLL for finding unique values of a field in Solr?
I am planning to use it for facet counts on certain fields to reduce the
memory footprint.
./Zahoor
Is it ok to just change the multiValued attribute to true and reindex the
message module data? There are also other modules indexed on the same
schema with multivalued = false. Will it become a problem?
BR,
Arkadi
On 05/27/2013 09:33 AM, Gora Mohanty wrote:
On 27 May 2013 12:58, Arkadi Colson