Hey Sujit, thanks a lot.
But what do you think about the Berryman blog post?
Is it feasible to apply, or should I go with the synonym approach?
Which one is better?
And the third approach you told me about seems difficult and
time-consuming for a student like me, as I have to submit this in the
next 15 days.
The link you provided has no information about customizing.
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-customize-Solr-tp4122551p4122760.html
Sent from the Solr - User mailing list archive at Nabble.com.
We have a cluster running SolrCloud 4.7 built 2/25. 10 shards with 2
replicas each (20 shards total) at about ~20GB/shard.
We index around 1k-1.5k documents/second into this cluster constantly. To
manage growth we have a scheduled job that runs every 3 hours to prune
documents based on business
Hello, I think you are confused between two different index
structures, probably because of the name of the options in solr.
1. indexing term vectors: this means given a document, you can go
lookup a miniature "inverted index" just for that document. That means
each document has "term vectors" whi
Correct, Steve. Alternatively you can also put this option in your query
after the end of the last parenthesis, as in this example from the wiki:
fq=geo:"IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30)))
distErrPct=0"
~ David
Steven Bower wrote
> Only points in the index.. Am I
Really, how can anyone help with this little information?
Please read:
http://wiki.apache.org/solr/UsingMailingLists
Best,
Erick
On Mon, Mar 10, 2014 at 10:03 PM, William Bell wrote:
> Send the queries.
>
>
> On Fri, Mar 7, 2014 at 2:32 PM, EXTERNAL Taminidi Ravi (ETI,
> Automotive-Service-Solut
Send the queries.
On Fri, Mar 7, 2014 at 2:32 PM, EXTERNAL Taminidi Ravi (ETI,
Automotive-Service-Solutions) wrote:
> Hi All,
>
> I am facing a strange behavior with the Solr Server. All my joins are not
> working suddenly after a restart. Individual collections are returning the
> response but
Only points in the index.. Am I correct this won't require a reindex?
On Monday, March 10, 2014, Smiley, David W. wrote:
> Hi Steven,
>
> Set distErrPct to 0 in order to get non-point shapes to always be as
> accurate as maxDistErr. Point shapes are always that accurate. As long as
> you only
This looks like a codec issue, but I'm not sure how to address it. I've
found that a different instance of DocsAndPositionsEnum is instantiated
between my code and Solr's TermVectorComponent.
Mine:
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum
Solr:
org.apache.lucene.cod
Where do you live? Is it possible you're getting fooled by the fact
that Solr uses UTC?
Solr doesn't distinguish between dates and times, they're all just
unix timestamps.
And, taking into account the time difference between now and UTC in my
time zone it works perfectly for me.
Best,
Erick
On
Hello,
is there a fix for the NOW rounding?
Otherwise I have to get the current date and create a range query like
* TO yyyy-MM-ddThh:mm:ssZ
--
View this message in context:
http://lucene.472066.n3.nabble.com/Filter-query-not-working-for-time-range-tp4122441p4122723.html
Sent from the Solr - User
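If the workaround is to build the upper bound client-side, the timestamp has to be rendered in the format Solr expects. A minimal sketch in Python (the field name `StartDate` is just an example, not from this thread):

```python
from datetime import datetime, timezone

def solr_range_filter(field):
    """Build a Solr range filter from the beginning of time up to the
    current UTC instant, in Solr's canonical date format (ISO 8601, 'Z')."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{field}:[* TO {now}]"

print(solr_range_filter("StartDate"))
```

Because Solr stores all dates as UTC, formatting from `timezone.utc` avoids the off-by-a-few-hours surprises discussed elsewhere in this digest.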
I'm trying to use the new InfixSuggester exposed in 4.7 and I'm getting
some errors on startup, they don't seem to necessarily cause any problems,
my app still seems to run, but I get the following:
17:28:54.721 WARN {coreLoadExecutor-4-thread-1} [o.a.s.core.SolrCore] :
[vpr] Solr index directory
Having a couple of docs that aren't being
returned that you think should be would
help.
It's tangential, but you might get better
performance out of this when you get over
your initial problem by using something like
fq=StartDate:[NOW/DAY TO NOW/DAY+1DAY]
That'll filter on all docs with startDate
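Erick's filter, sketched as a full query URL assembled with Python's standard library (host, port and handler are the stock example values, not necessarily the poster's):

```python
from urllib.parse import urlencode

params = {
    "q": "*:*",
    # Date math: everything whose StartDate falls within today (UTC);
    # NOW/DAY rounds down to midnight, so the range is cache-friendly.
    "fq": "StartDate:[NOW/DAY TO NOW/DAY+1DAY]",
    "wt": "json",
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Using rounded date math (`NOW/DAY`) rather than raw `NOW` means the same filter string recurs all day, so Solr's filter cache can actually reuse it.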
You have to separate out a couple of things.
First, data gets written to segments _without_
the segment getting closed and _before_ you
commit. What happens is that when
ramBufferSizeMB in solrconfig.xml is exceeded,
its contents are flushed to the currently-opened
segment. The segment is _not_ cl
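For reference, the buffer Erick describes is configured in solrconfig.xml. A sketch using the stock default of 100 MB (not the poster's actual setting):

```xml
<indexConfig>
  <!-- Flush buffered index data to the currently-open segment once
       this many megabytes accumulate (100 is the 4.x default) -->
  <ramBufferSizeMB>100</ramBufferSizeMB>
</indexConfig>
```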
Hi Sohan,
You would be the best person to answer your question of how to proceed :-).
From your original query term "musical events in New York" rewriting to
"musical nights at ABC place" OR "concerts events" OR "classical music
event" you would have to build into your knowledge base that "ABC pl
Pardon my typo. I meant 1000ms in my last mail.
Thanks,
-Vijay
On Mon, Mar 10, 2014 at 4:22 PM, Vijay Kokatnur wrote:
> Thanks Erick. The links you provided are invaluable.
>
> Here are our commit settings. Since we have NRT search, softCommit is set
> to 1000s which explains why cache is co
Thanks Erick. The links you provided are invaluable.
Here are our commit settings. Since we have NRT search, softCommit is set
to 1000s which explains why cache is constantly invalidated.
60
false
1000
With constant cache invalidation it becom
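The commit settings quoted above lost their XML tags in the archive. A typical shape for an NRT setup in solrconfig.xml looks roughly like this (the values are illustrative placeholders, not necessarily the poster's):

```xml
<autoCommit>
  <!-- hard commit: flush to stable storage without opening a searcher -->
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- soft commit makes new documents visible; every soft commit opens
       a new searcher, which is what invalidates the caches -->
  <maxTime>1000</maxTime>
</autoSoftCommit>
```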
Hi Steven,
Set distErrPct to 0 in order to get non-point shapes to always be as accurate
as maxDistErr. Point shapes are always that accurate. As long as you only
index points, not other shapes (you don’t index polygons, etc.) then distErrPct
of 0 should be fine. In fact, perhaps a future So
I get a different error (but related to the same issue I guess) with
the following simple query:
/opt/code/heliosearch/solr$ curl -XPOST
"http://localhost:8983/solr/select?q=*:*"
The response is "Must specify a Content-Type header
with POST requests" with HTTP status 415.
HTTP does not require a POST body, so it seems like the
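A sketch of supplying that header from client code, using Python's urllib; the request is only constructed here, never sent, and the URL is the stock example host:

```python
from urllib.request import Request

# Build (but do not send) a POST to /select with an explicit
# Content-Type, which the 415 response above insists on.
req = Request(
    "http://localhost:8983/solr/select",
    data=b"q=*:*",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
print(req.get_method())                  # POST, implied by the body
print(req.get_header("Content-type"))
```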
Hello!
Luke 4.7.0 has been released. Download it here:
https://github.com/DmitryKey/luke/releases/tag/4.7.0
Release based on a pull request from Petri Kivikangas
(https://github.com/DmitryKey/luke/pull/2). Thank you, Petri!
Tested against the solr-4.7.0 index.
1. Upgraded maven plugins.
2. Added simp
On 3/10/2014 6:14 AM, leevduhl wrote:
> We just upgraded our dev environment from Solr 4.6 to 4.7 and our search
> "posts" are now returning a "Search requests cannot accept content streams"
> error. We did not install over top of our 4.6 install, we installed into a
> new folder.
>
> org.apache.
Solr has extensive filtering tests.
The first step would be to double check that you see what you think
you are seeing, and then try and create an example to reproduce it.
For example, this works fine with the "example" data, and is of the
same form as your query:
http://localhost:8983/solr/query
What are some example values of the HotelID and StateDate fields that are
not getting filtered out?
Multiple fq queries will be ANDed.
-- Jack Krupansky
-Original Message-
From: Vijay Kokatnur
Sent: Monday, March 10, 2014 4:51 PM
To: solr-user
Subject: Multiple "fq" parameters are no
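Jack's point that every fq arrives as a separate, independently applied filter can be seen by parsing such a URL; each repeated fq survives as its own entry (filter values copied from the thread):

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://localhost:8983/solr/select?q=*:*"
       "&fq=ClientID:2"
       "&fq=HotelID:234-PPP"
       "&fq={!cache=false}StartDate:[NOW/DAY TO *]")

# parse_qs keeps repeated parameters as a list, which mirrors how Solr
# receives them: three independent filter queries, all ANDed together.
filters = parse_qs(urlsplit(url).query)["fq"]
print(len(filters))   # 3
```

If only the first filter appears to apply, the usual suspects are a proxy or client collapsing repeated parameters, not Solr itself.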
Weirdly that same point shows up in the polygon below as well, which in the
area around the point doesn't intersect with the polygon in my first msg...
29.0454,41.2198
29.2349,41.1826
31.1107,40.9956
38.437,40.7991
41.1616,40.8988
<..Spawning this as a separate thread..>
So I have a filter query with multiple "fq" parameters. However, I have
noticed that only the first "fq" is used for filtering. For instance, a
lookup with
...&fq=ClientID:2
&fq=HotelID:234-PPP
&fq={!cache=false}StartDate:[NOW/DAY TO *]
In the above que
Minor edit to the KML to adjust color of polygon
On Mon, Mar 10, 2014 at 4:21 PM, Steven Bower wrote:
> I am seeing a "error" when doing a spatial search where a particular point
> is showing up within a polygon, but by all methods I've tried that point is
> not within the polygon..
>
> First t
I am seeing a "error" when doing a spatial search where a particular point
is showing up within a polygon, but by all methods I've tried that point is
not within the polygon..
First the point is: 41.2299,29.1345 (lat/lon)
The polygon is:
31.2719,32.283
31.2179,32.3681
31.1333,32.3407
30.9356,32.
Salman,
It looks like what you describe has been implemented at Twitter.
Presentation from the recent Lucene / Solr Revolution conference in Dublin:
http://www.youtube.com/watch?v=AguWva8P_DI
On Sat, Mar 8, 2014 at 4:16 PM, Salman Akram <
salman.ak...@northbaysolutions.net> wrote:
> The issue
Maybe I spoke too soon.
The second and third filter parameter *fq={!cache=false cost=50}ClientID:4*and
*fq={!cache=false cost=150}StartDate:[NOW/DAY TO NOW/DAY+1YEAR] *above are
not getting executed, unless I make it the first parameter. And when it's
the first filter parameter the Qtime goes up
Hi Staszek, Tommaso,
Thanks for the clarification.
Ahmet
On Monday, March 10, 2014 8:23 PM, Tommaso Teofili
wrote:
Hi Ahmet, Ale,
right, there's a classification module for Lucene (and therefore usable in
Solr as well), but no clustering support there.
Regards,
Tommaso
2014-03-10 19:15
Hi Ahmet, Ale,
right, there's a classification module for Lucene (and therefore usable in
Solr as well), but no clustering support there.
Regards,
Tommaso
2014-03-10 19:15 GMT+01:00 Ahmet Arslan :
> Hi,
>
> Thats weird. As far as I know there is no such thing. There is
> classification stuff b
>
> Thats weird. As far as I know there is no such thing. There is
> classification stuff but I haven't heard of clustering.
>
> http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
I think the wording on the wiki page needs some clarification -- Solr
cont
Hi,
Thats weird. As far as I know there is no such thing. There is classification
stuff but I haven't heard of clustering.
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
May be others (Dawid Weiss) can clarify?
Ahmet
On Monday, March 10, 2014 4:
OK David, I'll give it a shot.
Thanks again!
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-spatial-search-within-the-polygon-tp4101147p4122647.html
Sent from the Solr - User mailing list archive at Nabble.com.
Lucene has multiple modules, one of which is "spatial". You'll see it in the
source tree checkout underneath the lucene directory.
Javadocs: http://lucene.apache.org/core/4_7_0/spatial/index.html
SpatialExample.java:
https://github.com/apache/lucene-solr/blob/trunk/lucene/spatial/src/test/org/apa
Hi;
If you have any other problems you can ask them too.
Thanks;
Furkan KAMACI
2014-03-10 16:17 GMT+02:00 Vineet Mishra :
> Hi
>
> Got it working!
>
> Many thanks for your response.
>
>
> On Sat, Mar 8, 2014 at 7:40 PM, Furkan KAMACI >wrote:
>
> > Hi;
> >
> > Could you check here:
> >
> >
> ht
Could you please tell me where I can find this .java file?
What do you mean by the "Lucene-spatial module"?
Thanks for your time, David!
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-spatial-search-within-the-polygon-tp4101147p4122642.html
Sent from the Solr - User mailing
You're going to have to use the Lucene-spatial module directly then. There's
SpatialExample.java to get you started.
javinsnc wrote
>
> David Smiley (@MITRE.org) wrote
>> On 3/10/14, 12:56 PM, "javinsnc" <
>> javiersangrador@
>> > wrote:
>> This is indeed the source of the problem.
>>
>> Why
David Smiley (@MITRE.org) wrote
> On 3/10/14, 12:56 PM, "javinsnc" <
> javiersangrador@
> > wrote:
> This is indeed the source of the problem.
>
> Why do you index with Lucene’s API and not Solr’s? Solr not only has a
> web-service API but it also has the SolrJ API that can embed Solr —
> Embed
On 3/10/14, 12:56 PM, "javinsnc" wrote:
>>>
>>>/*/
>>>/* Document contents */
>>>/*/
>>>I have tried with 3 different content for my documents (lat-lon refers
>>>to
>>>Madrid, Spain):
>>
>> Um…. Just to be absolutely sure, are you adding the data in Solr’
Changes from the previous release are primarily off-heap FieldCache
support for strings as well as as all numerics (the previous release
only had integer support).
Benchmarks for string fields here:
http://heliosearch.org/hs-solr-off-heap-fieldcache-performance
Try it out here: https://github.com
Take a look at the "explain" section of the results when you set the
debugQuery=true parameter.
Also set the debug.explain.structured=true parameter to get a structured
representation of the explain section.
-- Jack Krupansky
-Original Message-
From: heaven
Sent: Monday, March 10,
David Smiley (@MITRE.org) wrote
> On 3/10/14, 6:45 AM, "Javi" <
> javiersangrador@
> > wrote:
>
>>/**/
>>/* 1. library */
>>/**/
>>
>>(1) I use "jts-1.13.jar" and "spatial4j-0.4.1.jar"
>>(I think they are the latest version)
>
> You should only need to add JTS; spatial4j
On 3/10/14, 12:12 PM, "Smiley, David W." wrote:
>>
>>
>>
>>c) I tried no WKT format by adding a comma and using "longitude,latitude"
>>
>>
>>
>> 40.442179,-3.69278
>>
>>
>
>That is *wrong*. Remove the comma and it will then be okay. But again,
>see my earlier advise on lat & lon
Hi, I have a few text fields indexed, and when searching I need to know which
field matched. For example I have fields:
{code}
full_name, site_source, tweets, rss_entries, etc
{code}
When searching I need to show results and scores per field, so a
user can see what exactly content match th
The clause limit covers all clauses (terms) in one Lucene BooleanQuery - one
level of a Solr query, where a parenthesized sub-query is a separate level
and counts as a single clause in the parent query.
In this case, it appears that the wildcard is being expanded/rewritten to a
long list of te
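For reference, the limit being hit is configured in solrconfig.xml; 1024 is the stock default, and raising it only postpones the problem if a wildcard rewrites to many thousands of terms:

```xml
<query>
  <!-- Upper bound on clauses in any single BooleanQuery level; a
       wildcard that expands past this throws TooManyClauses -->
  <maxBooleanClauses>1024</maxBooleanClauses>
</query>
```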
On 3/10/14, 6:45 AM, "Javi" wrote:
>Hi all.
>
>I need your help! I have read every post about Spatial in Solr because I
>need to check if a point (latitude,longitude) is inside a Polygon.
>
>/**/
>/* 1. library */
>/**/
>
>(1) I use "jts-1.13.jar" and "spatial4j-0.4.1.ja
You need to either quote your query (after the colon, and another at the
very end), or escape any special characters, or use a different query
parser like “field”. I prefer to use the field query parser:
{!field f=loc}Intersects(POLYGON(...
~ David
On 3/6/14, 10:52 AM, "leevduhl" wrote:
>Gett
The "#" character introduces the "fragment" portion of a URL, so
"/dev/update/extract" is not a part of the "path" of the URL. In this case
the URL "path" is "/solr/" and the server is simply complaining that there
is no code registered to process that path.
Normally, the collection name (core
Hello Furkan,
We are planning to migrate to 3 nodes in an ensemble, but by now we have only
one active zookeeper instance in production.
Actually, I thought about a param somewhere in Solr configuration. I may be
wrong but I thought that the problem was due to the fact that Solr asks or
tell
Thank you, Ahmet, I already know Mahout.
What I was curious about is whether an integration for offline
clustering already exists in Solr ...
Reading the wiki we can find this phrase: "While Solr contains an
extension for full-index clustering (*off-line* clustering) this
section will focus on discussin
"literal.id" should contain a unique identifier for each document (assuming
that the unique identifier field in your solr schema is called "id"); see
http://wiki.apache.org/solr/ExtractingRequestHandler .
I'm guessing that the url for the ExtractingRequestHandler is incorrect, or
maybe you haven't
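A sketch of a well-formed extract URL, built with urlencode; the shell variables from the original are replaced by made-up literals, and the "#" fragment is dropped so the handler path is really part of the URL path:

```python
from urllib.parse import urlencode

params = {
    "stream.file": "/home/priti/document1.pdf",  # example path
    "literal.id": "document1",  # becomes the document's unique key
    "commit": "true",
}
# Note: no "#/" -- a fragment never reaches the server, so the handler
# path /update/extract must follow the core name directly.
url = "http://localhost:8080/solr/dev/update/extract?" + urlencode(params)
print(url)
```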
Hi
Got it working!
Many thanks for your response.
On Sat, Mar 8, 2014 at 7:40 PM, Furkan KAMACI wrote:
> Hi;
>
> Could you check here:
>
> http://lucene.472066.n3.nabble.com/Error-when-creating-collection-in-Solr-4-6-td4103536.html
>
> Thanks;
> Furkan KAMACI
>
>
> 2014-03-07 9:44 GMT+02:00 Vin
Hi Alessandro,
Generally Apache mahout http://mahout.apache.org is recommended for offline
clustering.
Ahmet
On Monday, March 10, 2014 4:11 PM, Alessandro Benedetti
wrote:
Hi guys,
I'm looking around to find out if it's possible to have a full-index
/Offline cluster.
My scope is to make a f
Hi guys,
I'm looking around to find out if it's possible to have full-index
/offline clustering.
My goal is to run full-index clustering and, for each document, have a
cluster field with the id/label of its cluster at indexing time.
Anyone know more details regarding this kind of integration with
Hi,
I have a performance and scoring problem for phrase queries
1. Performance - phrase queries involving frequent terms are very slow
due to the reading of large positions posting list.
2. Scoring - I want to control the boost of phrase and entity (in
gazetteers) matches
Indexing all
Any news regarding this ?
I'm investigating in Solr offline clustering as well ( full index
clustering).
Cheers
2012-09-17 20:16 GMT+01:00 Denis Kuzmenok :
>
>
>
> Sorry for late response. To be strict, here is what i want:
>
> * I get documents all the time. Let's assume those are news (It's
>
Hi;
Did you read here:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
Thanks;
Furkan KAMACI
2014-03-10 15:14 GMT+02:00 RadhaJayalakshmi :
> Hi,
>
> Brief Description of my application:
> We have a java program which reads a flat file, and add
Hi,
How many refreshes do you need? Can you live with a 3-5 minute refresh rate?
If you can afford to query MySQL for every single query, consider using a post
filter:
http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/
Ahmet
On Monday, March 10, 2014 2:56 PM, lavesh wrote:
I
Hi all,
the following throws "The requested resource is not available":
curl "
http://localhost:8080/solr/#/dev/update/extract?stream.file=/home/priti/$file&literal.id=document$i&commit=true
"
I don't understand what literal.id is. Is it mandatory? [Please share
reading links if known]
HTTP Status
Thanks Furkan,
this gives some really good understanding. We have an Amazon instance and right
now it is running on m1.large.
In Amazon we are not finding a way to increase ONLY the RAM! That is our main
concern, and we are actively looking at which instance can help us support
this index size.
Do you
Hi Metin;
I think that timeout value you are talking about is that:
http://zookeeper.apache.org/doc/r3.1.2/zookeeperStarted.html However it is
not recommended to change timeout value of Zookeeper "if you do not have a
specific reason". On the other hand how many Zookeepers do you have at your
infr
Hi,
Brief Description of my application:
We have a java program which reads a flat file, and adds document to solr
using cloudsolrserver.
And we index for every 1000 documents(bulk indexing).
And the Autocommit setting of my application is:
10
false
So after every 100,000
Hi Priti;
100 qps is not much, but 7 GB is too low and it may be a problem for you. I
have tens of nodes of SolrCloud and I send data to them via Map/Reduce from
tens of servers. However, indexing speed has not been a problem for me yet.
Problems occur because of network communication, RAM or something
I want a list of users who are online and fulfill the specified criteria.
Current implementation:
I am sending post parameters of online ids (usually 20k) with the search
criteria.
How I want to optimize it:
I must change the internal code of Solr so that these 20k profiles are
fetched from Solr
i.e
On 3/10/2014 6:20 AM, abhishek jain wrote:
>
> replacement=" punct " replace="all"/>
> Is there a way i can tokenize after application of filter, please suggest i
> know i am missing something basic.
Use PatternReplaceCharFilterFactory instead. CharFilters are performed
before tokenizers, re
Hi,
As a solution, I have tried a combination of PatternTokenizerFactory and
PatternReplaceFilterFactory.
In both the query and index analyzers I have written:
What I am trying to do is tokenize on spaces and then rewrite every
special character as " punct ".
So, A,B becomes A punct B.
but the pro
We just upgraded our dev environment from Solr 4.6 to 4.7 and our search
"posts" are now returning a "Search requests cannot accept content streams"
error. We did not install over top of our 4.6 install, we installed into a
new folder.
org.apache.solr.common.SolrException: Search requests cannot
Excellent, thank you.
Lee
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Production-Installation-tp4122091p4122533.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi all.
I need your help! I have read every post about Spatial in Solr because I
need to check if a point (latitude,longitude) is inside a Polygon.
/**/
/* 1. library */
/**/
(1) I use "jts-1.13.jar" and "spatial4j-0.4.1.jar"
(I think they are the latest version)
/*
As of now the index is at 136 GB.
I want to understand: can we do multiple writes to Solr? I don't have any
partitioning strategy as of now.
On the Amazon instance the disk read/write for Solr is like 5% or so. I am not
able to understand, even though I am processing almost 300 records per min,
how come Sol
Hi all,
we are using SolrCloud with this configuration :
* SolR 4.4.0
* Zookeeper 3.4.5
* one server with zookeeper + 4 solr nodes
* one server with 4 solr nodes
* only one core
* Solr instances deployed on tomcats with mod_cluster
* c
Hi,
When our server crashes the memory fills up fast. So I think it might
be a specific query that causes our servers to crash. I think the query
won't be logged because it doesn't finish. Is there anything we can do
to see the currently running queries in the Solr server (so we can see
them
Does the maxClauseCount limit apply to each field individually, or to all
of them put together? Is it the date fields?
When I execute a query I get this error:
500 93true
Ein PDFchen als Dokument roles:* 1394436617394 xml
.
0.10604319
390 2
On Sun, 2014-03-09 at 19:55 +0100, abhishek jain wrote:
> I am confused should i keep two separate indexes or keep one index with two
> versions or column , i mean col1_stemmed and col2_unstemmed.
1 index with stemmed & unstemmed will be markedly smaller than 2 indexes
(one with stemmed, one with
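The single-index layout can be sketched in schema.xml with a copyField, so the same source text gets analyzed both ways (field and type names here are made up for illustration):

```xml
<field name="body_unstemmed" type="text_general" indexed="true" stored="true"/>
<field name="body_stemmed"   type="text_en"      indexed="true" stored="false"/>
<!-- Index the same source text under both analyzers -->
<copyField source="body_unstemmed" dest="body_stemmed"/>
```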