Thanks all for your responses.
I presume this conversation concludes that indexing around 1 billion
documents per shard won't be a problem. Since I have 10 billion docs to index,
approximately 10 shards with 1 billion each should be fine. And what
about memory: what size of RAM would be appropriate for this?
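For what it's worth, the arithmetic above can be sketched as a quick sanity check (assuming, as this thread does, that roughly 1 billion documents per shard is workable; there is no hard per-shard limit in Solr itself):

```python
# Back-of-the-envelope shard count for the numbers discussed above.
# Assumption: ~1 billion documents per shard is acceptable, per this thread.
def shards_needed(total_docs: int, docs_per_shard: int) -> int:
    # Round up so a final partial shard is still counted.
    return -(-total_docs // docs_per_shard)

print(shards_needed(10_000_000_000, 1_000_000_000))  # 10
```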
Assuming that you just want to sort - have you tried using
sort=id desc
Cheers,
Siegfried Goeschl
On 04 Jun 2014, at 06:19, sachinpkale wrote:
> I have a following field in SOLR schema.
>
>
> required="false" multiValued="false"/>
>
> If I issue following query:
>
> id:(1234 OR 2345 OR 3
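If it helps, here is roughly what that suggested request would look like assembled as URL parameters (a sketch only; the client code is not from the original mails):

```python
# Sketch: the query from the thread plus the suggested sort parameter,
# assembled as a Solr query string.
from urllib.parse import urlencode

params = {"q": "id:(1234 OR 2345 OR 3456)", "sort": "id desc"}
print(urlencode(params))
```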
Hello,
Try boosting every sub-clause accordingly, i.e. id:(34^1000 45^100 76^10), if
you cannot simply sort by id.
On 04.06.2014 at 8:29, "sachinpkale" wrote:
> I have a following field in SOLR schema.
>
>
> required="false" multiValued="false"/>
>
> If I issue following query:
>
> id:(123
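A small sketch of how such a boosted clause could be generated for an arbitrary ordered id list (the decreasing powers of ten follow the example above; the helper name is mine):

```python
# Build a boosted id clause so results come back in the given order.
# Boosts decrease by a factor of 10 per position, as in the example above.
def ordered_id_query(ids, base=1000):
    boosts = (f"{i}^{base // (10 ** n)}" for n, i in enumerate(ids))
    return "id:(" + " ".join(boosts) + ")"

print(ordered_id_query([34, 45, 76]))  # id:(34^1000 45^100 76^10)
```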
I have the following field in the SOLR schema.
If I issue following query:
id:(1234 OR 2345 OR 3456)
SOLR does not return the documents in that order. It returns the document with
id 3456, then 1234, and then 2345.
How do I get it in the same order as in the query?
Hi all,
We launched our new production instance of SolrCloud last week and since
then have noticed a trend with regards to disk usage. The non-leader
replicas all seem to be self-optimizing their index segments as expected,
but the leaders have (on average) around 33% more data on disk. My
assumpt
Thank you Alexandre!
I will check my configurations again.
On Wed, Jun 4, 2014 at 9:14 AM, Alexandre Rafalovitch
wrote:
> Ok, the question was if I understood it now:
>
> "I am importing data from Nutch into Solr. One of the fields is
> "author" and I have defined it in Solr's schema.xml. Unfor
On Tue, Jun 3, 2014 at 9:48 PM, Brett Hoerner wrote:
> Yonik, I'm familiar with your blog posts -- and thanks very much for them.
> :) Though I'm not sure what you're trying to show me with the q=*:* part? I
> was of course using q=*:* in my queries, but I assume you mean to leave off
> the text:l
I meant of course, "Ahmet's answer". Sorry, both.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
On Wed, Jun 4, 2014 at 9:14 AM, Alexandre Rafalovitch
wrote:
> Ok, the question was if I understood
Ok, the question was if I understood it now:
"I am importing data from Nutch into Solr. One of the fields is
"author" and I have defined it in Solr's schema.xml. Unfortunately, it
is always empty when I check the records in the Solr's AdminUI. How
can I confirm that the field was actually indexed
Hi Alexandre,
I've already played with the "fl" parameter in the Admin UI, but the result is
not what I expected.
From what I understand, Solr's database structure is defined in Solr's
schema.xml.
In that file we define, for example, an "author" field to store author content
in the Solr database.
Even when I put "author"
Hi,
I don't know nutch, but I will answer as if you were using solr cell :
http://wiki.apache.org/solr/ExtractingRequestHandler
When a PDF file is sent to the extracting request handler, several metadata fields are
extracted from the PDF. These metadata are assigned to fields. I usually enable
dynamic field
Are you looking for the 'fl' parameter by any chance:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefl(FieldList)Parameter
?
It's in the Admin UI as well.
If not, then you really do need to rephrase your question. Maybe by
giving a very specific
Hi Ahmet,
I was just referring to Solr's schema.xml, which describes this field definition,
in this case for example the "author" field.
I was also referring to the Solr query result, which I queried through the Solr
Admin page and which didn't return the author field.
CMIIW.
Thanks.-
On Wed, Jun 4, 2014 at 5:19 AM, Ahmet
Yonik, I'm familiar with your blog posts -- and thanks very much for them.
:) Though I'm not sure what you're trying to show me with the q=*:* part? I
was of course using q=*:* in my queries, but I assume you mean to leave off
the text:lol bit?
I've done some Cluster changes, so these are my basel
Using one is OK; SolrCloud will route it, but using CloudSolrServer is a good choice.
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-distributed-indexing-tp4138600p4139700.html
Sent from the Solr - User mailing list archive at Nabble.com.
mark.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-maximum-Optimal-Index-Size-per-Shard-tp4139565p4139698.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi Bayu,
I think this is a nutch question, no?
Ahmet
On Wednesday, June 4, 2014 1:13 AM, Bayu Widyasanyata
wrote:
Hi,
I'm sorry if this is a frequently asked question.
In default Solr's schema.xml file we define an "author" field like
following:
But this field seems not parsed (by nu
Hi,
I'm sorry if this is a frequently asked question.
In default Solr's schema.xml file we define an "author" field like
following:
But this field seems not to be parsed (by Nutch) or indexed (by Solr).
My query always returns a null result for the "author" field, even though some
documents (PDF) do have a
On Tue, Jun 3, 2014 at 5:19 PM, Yonik Seeley wrote:
> So try:
> q=*:*
> fq=created_at_tdid:[1400544000 TO 1400630400]
vs
So try:
q=*:*
fq={!cache=false}created_at_tdid:[1400544000 TO 1400630400]
-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache
On Tue, Jun 3, 2014 at 4:44 PM, Brett Hoerner wrote:
> If I run a query like this,
>
> fq=text:lol
> fq=created_at_tdid:[1400544000 TO 1400630400]
>
> It takes about 6 seconds. Following queries take only 50ms or less, as
> expected because my fqs are cached.
>
> However, if I change the query to
This is seemingly where it checks whether to use cache or not, the extra
work is really just a get (miss) and a put:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1216
I suppose it's possible the put is taking 4 seconds, but th
In this case, I have >400 million documents, so I understand it taking a
while.
That said, I'm still not sure I understand why it would take *more* time.
In your example above, wouldn't it have to create an 11.92MB bitset even if
I *don't* cache the bitset? It seems the mere act of storing the wor
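For reference, the bitset sizes being discussed follow from one bit per document in the index (maxDoc); a quick sketch:

```python
# A cached filter is stored as a bitset with one bit per document (maxDoc).
def bitset_mib(max_doc: int) -> float:
    return max_doc / 8 / 2**20  # bits -> bytes -> MiB

print(round(bitset_mib(100_000_000), 2))  # ~11.92 MiB for a 100M-doc index
print(round(bitset_mib(400_000_000), 2))  # ~47.68 MiB for 400M docs
```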
On 6/3/2014 2:44 PM, Brett Hoerner wrote:
> If I run a query like this,
>
> fq=text:lol
> fq=created_at_tdid:[1400544000 TO 1400630400]
>
> It takes about 6 seconds. Following queries take only 50ms or less, as
> expected because my fqs are cached.
>
> However, if I change the query to not cache my
I want to follow up on the ClassCastException issue with custom Plugins.
Turns out that the exception is not happening because an older version of
Solr could somehow be loaded in the classpath.
I had to dig down the rabbit hole, which brought me to SolrResourceLoader.
Turns out that the new classL
Hi,
first I'd like to thank those who've spent their time reading this,
especially Erick, Jason and Shawn. Thank you!
I've finally got it working. Shawn was right, I needed to enable updateLog
in solrconfig.xml and create the tlog directory. After doing so, I could
index documents sent on any nod
If I run a query like this,
fq=text:lol
fq=created_at_tdid:[1400544000 TO 1400630400]
It takes about 6 seconds. Following queries take only 50ms or less, as
expected because my fqs are cached.
However, if I change the query to not cache my big range query:
fq=text:lol
fq={!cache=false}created_a
On 6/3/2014 1:47 PM, Jack Krupansky wrote:
> Anybody care to forecast when hardware will catch up with Solr and we
> can routinely look forward to newbies complaining that they indexed
> "some" data and after only 10 minutes they hit this weird 2G document
> count limit?
I would speculate that Luc
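The "2G" figure above comes from documents being addressed with a Java int inside a single Lucene index; a trivial sketch of the bound (stated here as an assumption from general Lucene knowledge):

```python
# Lucene addresses documents with a Java int, so a single index is bounded
# by roughly 2^31 - 1 documents (the "2G document count limit" above).
MAX_INT32 = 2**31 - 1
print(MAX_INT32)  # 2147483647
```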
Want to follow up on this...I suppose I need to file an issue?
From: tky...@hotmail.com
To: solr-user@lucene.apache.org
Subject: RE: Using multiple facet.prefix on same field with facet.threads
Date: Fri, 30 May 2014 10:27:32 -0700
Sorry, I didn't format it correctly... here is the output without
Anybody care to forecast when hardware will catch up with Solr and we can
routinely look forward to newbies complaining that they indexed "some" data
and after only 10 minutes they hit this weird 2G document count limit?
-- Jack Krupansky
-Original Message-
From: Shawn Heisey
Sent: T
On 6/3/2014 12:54 PM, Jack Krupansky wrote:
> How much free system memory do you have for the OS to cache file
> system data? If your entire index fits in system memory operations
> will be fast, but as your index grows beyond the space the OS can use
> to cache the data, performance will decline.
On 05/20/2014 11:31 AM, Geepalem wrote:
Hi,
What filter should be used to implement stemming for Chinese and Japanese
language field types?
For English, I have used and it's working fine.
For English, I have used and its working fine.
What do you mean by "working fine"?
Try analyzing this with text_en field type:
単語は何個ありますか?
This Japane
On 05/30/2014 08:29 AM, Erick Erickson wrote:
I see errors in both cases. Do you
1> have schemaless configured
or
2> have a dynamic field pattern that matches your "non_exist_field"?
Maybe
is un-commented-out in schema.xml?
Kuro
Hello,
I am migrating from single node Solr to SolrCloud and have run into a problem.
I have timeAllowed set to 5 minutes and am trying to facet on a string field.
With grouping enabled and group.truncate set to true, I consistently get the
following exception as soon as I fire the query. If I
How much free system memory do you have for the OS to cache file system
data? If your entire index fits in system memory operations will be fast,
but as your index grows beyond the space the OS can use to cache the data,
performance will decline.
But there's no hard limit in Solr per se.
-- J
Hi, see comments inline below…
On Jun 2, 2014, at 6:49 AM, Vineet Mishra wrote:
> Hi Wolfgang,
>
> Thanks for your response, can you quote some running example of
> MapReduceIndexerTool
> for indexing through csv files.
> If you are referring to
> http://www.cloudera.com/content/cloudera-content
Thanks for all of your suggestions.
I will use *bq=field:value* and it's working well; I don't think I'll need
to use *bf*.
2014-06-03 8:42 GMT+01:00 rulinma :
> function also can be use.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Boost-documents-having-a-fiel
Yes we are already using it.
> -Original Message-
> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
> Sent: June-03-14 11:41 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Strange behaviour when tuning the caches
>
> Hi,
>
> Have you seen https://wiki.apache.org/solr/Coll
Good question, Mikhail. I started by putting that logic in the prepare()
function by sheer force of habit, but have since moved it first to
distributedProcess() (my system is sharded) and now to handleResponses() (as
in MoreLikeThisComponent.java, which I am mimicking without understanding).
As o
Hi,
Have you seen https://wiki.apache.org/solr/CollapsingQParserPlugin ? May
help with the field collapsing queries.
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
On Tue, Jun 3, 2014 at 8:41 AM, Jean-Sebastien Vachon <
jea
I followed this link
https://cwiki.apache.org/confluence/display/solr/UIMA+Integration to
integrate solr+uima.
I succeeded in integrating them. SentenceAnnotation is working fine, but I want
to use the OpenCalais annotation so that I can fetch person, place, and
organization names. Nowhere is it mentioned whi
I think you need to use parameter substitution for those nested queries
since the "boost" parameter takes a white-space delimited sequence of
function queries.
-- Jack Krupansky
-Original Message-
From: Kamal Kishore Aggarwal
Sent: Tuesday, June 3, 2014 2:22 AM
To: solr-user@lucene.a
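A sketch of what that parameter substitution could look like (the parameter name bq1 and the boost function are illustrative, not taken from the original message):

```python
# Sketch: since "boost" takes whitespace-delimited function queries, move each
# nested query into its own request parameter and dereference it with $name.
# The parameter name bq1 and the recip() boost are illustrative examples.
from urllib.parse import urlencode

params = {
    "q": "{!boost b=$bq1}text:solr",
    "bq1": "recip(ms(NOW,last_modified),3.16e-11,1,1)",
}
print(urlencode(params))
```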
Hi Ahmet,
Thanks a ton.
You were absolutely right; the moment I added batchSize="-1" it
worked.
Thank you so much; it's been 7 days and I just could not figure out what the
issue was.
It's working like a charm now.
Thanks Again
regards
Madhav Bahuguna
On Tue, Jun 3, 2014 at 5:28 PM, Ahmet Ars
Can you extract names, locations etc using OpenNLP in plain/straight java
program?
If yes, here are two separate options:
1) Use http://searchhub.org/2012/02/14/indexing-with-solrj/ as an example to
integrate your NER code into it and write your own indexing code. You have the
full power her
On 6/3/2014 6:18 AM, Martin de Vries wrote:
> I have two questions about upgrading Solr:
>
> - We upgrade Solr often, to match the latest version. We have a number
> of servers in a Solrcloud and prefer to upgrade one or two servers first
> and upgrade the other server a few weeks later when we ar
On 6/3/2014 12:00 AM, madhav bahuguna wrote:
> iam using solr 4.7.1 and trying to do a full import.My data source is a
> table in mysql. It has 1000 rows and 20 columns.
>
> Whenever iam trying to do a full import solr stops responding. But when i
> try to do a import with a limit of 40 or
On 6/3/2014 3:04 AM, Aniket Bhoi wrote:
> I changed the value of removeAbandoned to false,this time the indexing
> failed due to a different exception:
> Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Connection
> reset
This is really the same error, but now the dataimport handle
On 6/3/2014 12:57 AM, binaychap wrote:
> Solr version 4.8.1
> I download the Solr and run via the start.jar and i am able to index the
> json data through post.jar and i read the solr in action book and follow the
> instruction of book and copy the example and rename the realEstate and
> collectio
Hi All,
Has anyone come across a maximum threshold, in document count or size, for each
Solr core to hold?
I have indexed some 10 million documents (18 GB), and when I index
another 5 million (9 GB) on top of these indexes, it responds a little
slowly to stats queries.
Considering I have arou
Hi Otis,
We saw some improvement when increasing the size of the caches. Since then, we
followed Shawn's advice on the filterCache and gave some additional RAM to the
JVM in order to reduce GC. The performance is very good right now, but we are
still experiencing some instability, though not at the sa
Okay, but I didn't understand what you said. Can you please elaborate?
Thanks,
Vivek
On Tue, Jun 3, 2014 at 5:36 PM, Ahmet Arslan wrote:
> Hi Vivekanand,
>
> I have never use UIMA+Solr before.
>
> Personally I think it takes more time to learn how to configure/use these
> uima stuff.
>
>
> If yo
Kamal,
Alexandre was pointing not to the Java version you are compiling with, but
to the Lucene and Solr jar files you compile and run against.
What is your build system: Ant or Maven or ..?
On Tue, Jun 3, 2014 at 2:03 PM, Kamal Kishore Aggarwal <
kkroyal@gmail.com> wrote:
> Even a
Hi,
I have two questions about upgrading Solr:
- We upgrade Solr often, to match the latest version. We have a number
of servers in a Solrcloud and prefer to upgrade one or two servers first
and upgrade the other server a few weeks later when we are sure
everything is stable. Is this the reco
Hi Vivekanand,
I have never used UIMA+Solr before.
Personally I think it takes more time to learn how to configure/use this UIMA
stuff.
If you are familiar with Java, write a class that extends
UpdateRequestProcessor(Factory). Use OpenNLP for NER, add these new fields
(organisation, city, pe
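The same enrichment idea can also be sketched outside Solr, before indexing; here extract_entities is only a stub standing in for a real OpenNLP NameFinder call (the names and fields are illustrative, not a real API):

```python
# Enrich documents with NER fields before sending them to Solr.
# extract_entities is a stub standing in for an OpenNLP NameFinder call.
def extract_entities(text):
    # Placeholder: a real implementation would run OpenNLP models here.
    known = {"Acme Corp": "organisation", "Berlin": "city"}
    found = {}
    for name, kind in known.items():
        if name in text:
            found.setdefault(kind, []).append(name)
    return found

def enrich(doc):
    # Copy each extracted entity list into its own Solr field.
    for field, values in extract_entities(doc.get("content", "")).items():
        doc[field] = values
    return doc

doc = enrich({"id": "1", "content": "Acme Corp opened an office in Berlin."})
print(doc["organisation"], doc["city"])  # ['Acme Corp'] ['Berlin']
```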
On 3 June 2014 11:22, Manoj V wrote:
> I m working on solr. i m interested in getting added to solr user group.
>
> Can you please add me to the group ?
If mail from your address is reaching this list, you are already subscribed
to it. Presumably, you did that from under
https://lucene.apache.org
Hi Madhav,
Just a guess, try using batchSize="-1"
Ahmet
On Tuesday, June 3, 2014 12:48 PM, madhav bahuguna
wrote:
Hi,
I am using Solr 4.7.1 and trying to do a full import. My data source is a
table in MySQL. It has 1000 rows and 20 columns.
Whenever I am trying to do a full import, Solr st
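For reference, the batchSize suggestion corresponds to the dataSource element in the DIH configuration; a sketch (the driver, url, and credentials here are placeholders):

```xml
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="solr" password="secret"
            batchSize="-1"/>
```

With the MySQL driver, batchSize="-1" makes the driver stream rows instead of buffering the whole result set in memory.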
11 * 11 or 121 query terms, which shouldn't be so bad.
But... maybe the Lucene FST for your synonym list is huge. Someone with
deeper Lucene knowledge would have to address that.
-- Jack Krupansky
-Original Message-
From: Branham, Jeremy [HR]
Sent: Tuesday, June 3, 2014 3:57 AM
To:
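The 11 * 11 arithmetic follows from each synonym-expanded token multiplying the term count; a trivial sketch:

```python
# With expand="true", each token matching a synonym line is replaced by all
# of its alternatives, so a multi-token query expands multiplicatively.
def expanded_terms(*line_sizes):
    total = 1
    for n in line_sizes:
        total *= n
    return total

print(expanded_terms(11, 11))  # 121 query terms, as noted above
```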
Hi Ahmet,
I followed what you said:
https://cwiki.apache.org/confluence/display/solr/UIMA+Integration. But how
can I achieve my goal? I mean extracting only the names of organizations or
persons from the content field.
I guess I'm almost there, but something is missing. Please guide me.
Thanks,
Vivek
Even after making the Java versions the same, it is not working. I am using
java.runtime.version 1.7.0_55-b13.
On Tue, Jun 3, 2014 at 2:05 PM, rulinma wrote:
> normal, rewrite filter.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Custom-filter-not-working-wi
I am working on Solr and am interested in getting added to the Solr user group.
Can you please add me to the group?
The entire goal can't be stated, but one of those tasks is like this: we have a
big document (a website or PDF, etc.) indexed in Solr.
Let's say we store the contents of the document. All
I want to do is pick the names of persons and places from it using OpenNLP or
some other means.
Those names should
On Sun, Jun 1, 2014 at 10:55 PM, Shawn Heisey wrote:
> On 5/31/2014 1:54 PM, Aniket Bhoi wrote:
> > Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The result
> >>> set is closed.
>
> I still think this is an indication of the source of the problem.
> Something closed the connection t
normal, rewrite filter.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Custom-filter-not-working-with-solr-4-7-1-tp4136824p4139506.html
Sent from the Solr - User mailing list archive at Nabble.com.
Are you sure you are compiling with the same version of Java and Solr
libraries that you are executing with? I would double-check your
CLASSPATH for those two environments.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating
Hi,
I have changed the code. But now it is showing the following errors:
Caused by: java.lang.NoSuchMethodException:
org.apache.lucene.analysis.ExtendedNameFilterFactory.<init>(java.util.Map)
Here's the new Java code: http://pastebin.com/J8q4JLgP
Urgent help is appreciated. :)
Thanks
Kamal
On We
Hi,
Please tell us what you are trying to do, in a new thread, along with your
high-level goal.
There may be some other ways/tools, such as Apache Stanbol
( https://stanbol.apache.org ), other than OpenNLP.
On Tuesday, June 3, 2014 8:31 AM, Vivekanand Ittigi
wrote:
We'll surely look into UIMA integration.
But before m
Evidently I didn't understand enough about the synonym filter.
I'm not sure anyone would be able to determine the impact based on the example
queries below.
However I'm curious what the best practice is for synonyms.
We have 179 lines of synonyms each with 2 - 6 synonyms per line [all expanded].
A function can also be used.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Boost-documents-having-a-field-value-tp4139342p4139486.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hello,
I'm trying to implement a user-friendly search for phone numbers. These numbers
consist of two digit-tokens, like "12345 67890".
Finally, I want highlighting for the phone number in the search result,
without any concern about whether this search result was hit by the field tel or
the copyFi
What collection are you using and what URL are you giving to post.jar?
Unless it's a default collection1, you need to have the collection
name in the URL.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficie
Hi,
Solr version 4.8.1.
I downloaded Solr and ran it via start.jar, and I am able to index the
JSON data through post.jar. I read the Solr in Action book, followed the
instructions in the book, copied the example, renamed the realEstate and
collection1 to real, and then changed the core.propertie