At the risk of derailing the thread:
We do a lot more in the script than is mentioned here: We pull out parts of the
path and mangle them (for example, turning them into a UNC path for users to use,
or pulling out a client name or job number from a known folder structure). As for
deleted files, here'
Hi again,
Anybody interested in this feature for Solr MailEntityProcessor?
WDYT?
Thanks,
Dileepa
On Thu, Jan 30, 2014 at 11:00 AM, Dileepa Jayakody <
dileepajayak...@gmail.com> wrote:
> Hi All,
>
> I think OAuth2 integration is a valid use case for Solr when it comes to
> importing data from us
Thanks all.
I am following a couple of articles on the same subject.
I am sending data to Solr directly instead of using DIH, and I am able to
index the data successfully.
My concern is how to minimize Solr indexing, so that out of all the data
items, only updated data is indexed each time.
Is this something
Why write a Perl script for that?
# Mark the start of this run, then submit everything changed since the last
# run; only advance the marker once the submission script succeeds.
touch new_timestamp
find . -newer timestamp | script-to-submit && mv new_timestamp timestamp
Neither approach deals with deleted files.
To do this correctly, you need lists of all the files in the index with their
timestamps, and of all the files in the reposit
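A minimal SolrJ sketch of that reconciliation, assuming a schema whose uniqueKey field "id" stores the absolute file path (the core URL, field name, and page size are all hypothetical):

import java.io.File;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;

public class ReapDeletedFiles {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/files");
        // 1. Collect every path currently in the index, paging through the
        //    results (600,000 ids in a single response would be unkind).
        Set<String> indexed = new HashSet<String>();
        int page = 10000;
        for (int start = 0; ; start += page) {
            SolrQuery q = new SolrQuery("*:*").setFields("id")
                                              .setStart(start).setRows(page);
            List<SolrDocument> docs = solr.query(q).getResults();
            for (SolrDocument d : docs) {
                indexed.add((String) d.getFieldValue("id"));
            }
            if (docs.size() < page) break;
        }
        // 2. Anything indexed that no longer exists on disk gets deleted.
        for (String path : indexed) {
            if (!new File(path).exists()) {
                solr.deleteById(path);
            }
        }
        solr.commit();
    }
}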
I'd start by doing the Solr tutorial. It will explain a lot of things.
But in summary, you can send data to Solr (best option) or you can
pull it using DataImportHandler. Take your pick, do the tutorial,
maybe read some books. Then come back with specific questions about where
you got stuck.
Regards,
I had this problem when I started to look at Solr as an index for a file
server. What I ended up doing was writing a Perl script that did this:
- Scan the whole filesystem and create XML that is submitted to Solr for
indexing. As this might be some 600,000 files, I break it down into chunks
Thanks Alex,
Yes, my source system maintains the creation & last modification timestamps of
each document.
As per your inputs, can I assume that the next time Solr starts indexing,
it scans all the documents present in the source but only picks for indexing
those which are either new or have been updated since las
Hi,
Thanks for your reply.
I'm a beginner with Solr; kindly elaborate in more detail, because in my
solrconfig.xml
[solrconfig.xml snippet stripped by the archive; only the parameter values
survive: explicit, 5, name, true, json, true]
Where can I add this qf parameter for those tw
You have read that Solr needs to reindex a full source. That's correct
(unless you use atomic updates). But - the important point is - this
is per document. So, once you've indexed your documents, you don't
need to worry about them until they change.
Just go ahead and index your additional docu
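To make the per-document point concrete, a minimal SolrJ sketch (core URL and field names are hypothetical): re-adding a document with the same uniqueKey replaces just that document, and nothing else is re-indexed.

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class UpdateOneDoc {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-42");            // same uniqueKey as the old version
        doc.addField("title", "Updated title");  // changed content
        solr.add(doc);   // replaces doc-42 only; all other documents stay as-is
        solr.commit();
    }
}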
Hi, I am working on a prototype where I have a content source and I am indexing
all documents and storing the index in Solr. Now I have a pre-condition that my
content source is ever-changing, meaning there is always new content added to
it. As I have read, Solr does indexing on the full source only
ever
Hi ,
I am new to Solr; I need help with the following.
PROBLEM: I have a huge file of 1 lines. I want this to be an inclusion or
exclusion in the query, i.e. each line like (line1 OR line2 OR ...).
How can this be achieved in Solr? Is there a custom implementation that I
would need to implemen
Navaa,
You need the query to be sent to the two fields. In dismax, this is easy.
Paul
On 12 February 2014 14:22:33 HNEC, Navaa
wrote:
>Hi,
>I am using Solr for searching phonetically equivalent strings
>my schema contains...
>[fieldType definition stripped by the archive; only positionIncrementGap="100"
>survives]
Re-posting...
Thanks,
Anand
On 2/12/2014 10:55 AM, anand chandak wrote:
Thanks David, really helpful response.
You mentioned that if we have to add scoring support in Solr then a
possible approach would be to add a custom QueryParser, which might make
use of Lucene's JOIN module. I have t
Chun,
Have you looked at Grouping / Field Collapsing feature in solr?
https://wiki.apache.org/solr/FieldCollapsing
If shop is one of your field, you can use field collapsing on that field
with a maximum of 'n' to return per field value (or group).
Sameer.
--
www.measuredsearch.com
tw: measureds
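A minimal SolrJ sketch of that grouping request, assuming the field is literally named "shop" (the core URL is made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GroupedSearch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/shops");
        SolrQuery q = new SolrQuery("white shirt");
        q.set("group", true);          // enable result grouping
        q.set("group.field", "shop");  // one group per shop
        q.set("group.limit", 5);       // at most 5 documents returned per shop
        QueryResponse rsp = solr.query(q);
        System.out.println(rsp.getGroupResponse().getValues());
    }
}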
Dear all gurus,
I would like to limit the number of search results. Let's say I have many
shops selling shirts. When I search "white shirt", I want to return a
maximum number of results per shop (e.g. 5).
The result should be like this...
-> Shop A
-> Shop A
-> Shop B
-> Shop B
-> Shop B
-> Shop B
-> Shop
Did you mean to use "||" for the OR operator? A single "|" is not treated as
an operator - it will be treated as a term and sent through normal term
analysis.
-- Jack Krupansky
-Original Message-
From: Shamik Bandopadhyay
Sent: Wednesday, February 12, 2014 5:32 PM
To: solr-user@lucen
Hi Erick,
Thank you very much, those are valuable suggestions :-).
I will give it a try.
Appreciate your time.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexing-strategies-tp4116852p4117050.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thanks, I'll take a look at the debug data.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Weird-issue-with-q-op-AND-tp4117013p4117047.html
Sent from the Solr - User mailing list archive at Nabble.com.
Not a direct answer, but the usual next question is: are you
absolutely sure you are using the right jars? Try renaming them and
restarting Solr. If it complains, you got the right ones. If not
Also, unzip those jars and see if your file made it all the way
through the build pipeline.
Regards,
Hello,
I use
icu4j-49.1.jar,
lucene-analyzers-icu-4.6-SNAPSHOT.jar
for one of the fields in the form
I need to change the letter corresponding to one of the accent chars. I made
changes to this file:
lucene/analysis/icu/src/data/utr30/DiacriticFolding.txt
recompiled Solr and Lucene, and replaced t
On 2/12/2014 4:58 PM, shamik wrote:
Thanks a lot Shawn. Changing the appends filtering based on your suggestion
worked. The part which confused me bigtime is the syntax I've been using so
far without an issue (barring the q.op part).
Source:"TestHelp" | Source:"downloads" |
-AccessMode:"
Thanks a lot Shawn. Changing the appends filtering based on your suggestion
worked. The part which confused me bigtime is the syntax I've been using so
far without an issue (barring the q.op part).
Source:"TestHelp" | Source:"downloads" |
-AccessMode:"internal" | -workflowparentid:[* TO *]
On 2/12/2014 3:32 PM, Shamik Bandopadhyay wrote:
Hi,
I'm facing a weird problem while using q.op=AND condition. Looks like it
gets into some conflict if I use multiple "appends" conditions in
conjunction. It works as long as I have one filtering condition in appends.
Source:"TestHelp"
Hi,
I'm facing a weird problem while using q.op=AND condition. Looks like it
gets into some conflict if I use multiple "appends" conditions in
conjunction. It works as long as I have one filtering condition in appends.
Source:"TestHelp"
Now, the moment I add an additional parameter, sear
Your new code should also work, and should be equivalent.
The longer stack trace you have is of the wrapping SolrException which wraps
another exception — InvalidShapeException. You should also see the stack trace
of InvalidShapeException which should originate out of Spatial4j.
~ David
From:
Hi David,
You wrote:
> Perhaps you’ve got some funky UpdateRequestProcessor from experimentation
> you’ve done that’s parsing then toString’ing it?
No, nothing at all. The update processing is straight out-of-the-box Solr.
> And also, your stack trace should have more to it than what you pres
On 2/6/2014 4:00 AM, Shawn Heisey wrote:
I would not recommend it, but if you know for sure that your
infrastructure can handle it, then you should be able to optimize them
all at once by sending parallel optimize requests with distrib=false
directly to the Solr cores that hold the shard replicas
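For anyone scripting that, a rough SolrJ sketch of sending one such request straight to a replica core; the host and core name are hypothetical, and running several of these in parallel is only safe if your hardware can take it:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.UpdateRequest;

public class OptimizeOneCore {
    public static void main(String[] args) throws Exception {
        // Address the core itself, not the collection.
        HttpSolrServer core = new HttpSolrServer(
                "http://host1:8983/solr/collection1_shard1_replica1");
        UpdateRequest req = new UpdateRequest();
        req.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, true, true);
        req.setParam("distrib", "false");  // keep the optimize local to this core
        req.process(core);
    }
}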
Navaa,
you need query expansion for that.
E.g. if your query goes through dismax, you need to add the two field names to
the qf parameter.
The nice thing is that qf can be:
text^3.0 text.stemmed^2 text.phonetic^1
And thus exact matches are preferred to stemmed or phonetic matches.
This is conf
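A small SolrJ sketch of such a request, reusing the qf above; defType=edismax and the core URL are assumptions:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class PhoneticQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery q = new SolrQuery("stephen");
        q.set("defType", "edismax");  // or "dismax"
        // Exact matches outrank stemmed matches, which outrank phonetic ones.
        q.set("qf", "text^3.0 text.stemmed^2 text.phonetic^1");
        System.out.println(solr.query(q).getResults().getNumFound());
    }
}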
And perhaps one other, but very pertinent, recommendation is: allocate only
as much heap as is necessary. By allocating more, you are working against
the OS caching. Knowing how much is enough is a bit tricky, though.
Best,
roman
On Wed, Feb 12, 2014 at 2:56 PM, Shawn Heisey wrote:
> On 2/1
That's pretty weird. It appears that somehow a Spatial4j Point class is having
its toString() called on it (which looks like "Pt(x=-72.544123,y=41.85)"),
and then Spatial4j is trying to parse this, which isn't a valid format; the
toString is more for debug-ability. Your SolrJ code looks to
Tri,
You will most likely need to implement a custom QParserPlugin to
efficiently handle what you described. Inside of this QParserPlugin you
could create the logic that would bring in your outside list of ID's and
build a DocSet that could be applied to the fq and the facet.query. I
haven't attem
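A rough sketch of what such a QParserPlugin could look like against the Solr/Lucene 4.x APIs; fetchIdsForUser() is a hypothetical stand-in for the external lookup, and the plugin still needs a <queryParser name="userids" class="..."/> registration in solrconfig.xml:

import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.index.Term;
import org.apache.lucene.queries.TermsFilter;
import org.apache.lucene.search.ConstantScoreQuery;
import org.apache.lucene.search.Query;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class UserIdsQParserPlugin extends QParserPlugin {
    @Override
    public void init(NamedList args) {}

    @Override
    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        return new QParser(qstr, localParams, params, req) {
            @Override
            public Query parse() {
                // One TermsFilter over the external id list; the resulting
                // query is usable in fq and in facet.query.
                List<Term> terms = new ArrayList<Term>();
                for (String id : fetchIdsForUser(params.get("user"))) {
                    terms.add(new Term("id", id));
                }
                return new ConstantScoreQuery(new TermsFilter(terms));
            }
        };
    }

    // Hypothetical: look up the logged-on user's ids in the external store.
    private List<String> fetchIdsForUser(String user) {
        return new ArrayList<String>();
    }
}

With a registration name of userids, something like fq={!userids user=alice} would then apply the filter.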
On 2/12/2014 12:07 PM, Greg Walters wrote:
Take a look at
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html as it's
a pretty decent explanation of memory mapped files. I don't believe that the
default configuration for solr is to use MMapDirectory but even if it does my
Hi all, I am running a Solr application and I need to implement a feature that requires faceting and filtering on a large list of IDs. The IDs are stored outside of Solr and are specific to the currently logged-on user. An example of this is the articles/tweets the user has read in the last few w
Hi David,
I finally got back to this again, after getting sidetracked for a couple of
weeks.
I implemented things in accordance with my understanding of what you wrote
below. Using SolrJ, the code to index the spatial field is as follows,
private void addSpatialField(double lat, double lon, S
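For comparison, a minimal sketch of the plain "lat,lon" string form, which sidesteps the Point-toString parsing problem described in the replies; the field name "geo" and core URL are made up:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class IndexPoint {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/places");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "place-1");
        // "41.85,-72.544123" -- a lat,lon string, not Point.toString()
        doc.addField("geo", 41.85 + "," + -72.544123);
        solr.add(doc);
        solr.commit();
    }
}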
Shital,
Take a look at
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html as it's
a pretty decent explanation of memory mapped files. I don't believe that the
default configuration for solr is to use MMapDirectory but even if it does my
understanding is that the entire fil
No, Solr doesn't load the entire index in memory. I think you'll find
Uwe's blog most helpful on this matter:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
On Thu, Feb 13, 2014 at 12:27 AM, Joshi, Shital wrote:
> Does Solr4 load entire index in Memory mapped file? What
Does Solr4 load the entire index into a memory-mapped file? What is the eviction
policy of this memory-mapped file? Can we control it?
_
From: Joshi, Shital [Tech]
Sent: Wednesday, February 05, 2014 12:00 PM
To: 'solr-user@lucene.apache.org'
Subject: Solr4 perf
Hi Robert,
I don't think this is possible at the moment, but I hope to get
https://issues.apache.org/jira/browse/SOLR-4478 in for Lucene/Solr 4.7, which
should allow you to inject your own SolrResourceLoader implementation for core
creation (it sounds as though you want to wrap the core's loade
Is price a float/double field?
price:[99.5 TO 100.5] -- price near 100
price:[900 TO 1000]
or
price:[899.5 TO 1000.5]
-- Jack Krupansky
-Original Message-
From: jay67
Sent: Wednesday, February 12, 2014 12:03 PM
To: solr-user@lucene.apache.org
Subject: Using numeric ranges in Solr
When a user enters a price in the price field, for example 1000 USD, I want to
fetch all items with a price around 1000 USD. I found in the documentation that
I can use price:[* TO 1000]. That will get all items from 1 to 1000 USD,
but I want to get results where the price is between 900 and 1000 USD.
Any
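A small SolrJ sketch of turning the entered price into such a band, assuming a numeric (e.g. tdouble) price field; the 900..1000 band mirrors the question and the core URL is made up:

import java.util.Locale;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class PriceBand {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/items");
        double entered = 1000.0;
        double lo = entered * 0.9, hi = entered;  // "around 1000" as 900..1000
        // Locale.ROOT keeps the decimal separator a '.', whatever the JVM locale.
        SolrQuery q = new SolrQuery(
                String.format(Locale.ROOT, "price:[%.2f TO %.2f]", lo, hi));
        System.out.println(solr.query(q).getResults().getNumFound());
    }
}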
There is also an "XSLT" update handler option to transform raw XML to Solr
XML on the fly. If anybody here has used it, feel free to chime in.
See:
http://wiki.apache.org/solr/XsltUpdateRequestHandler
and
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#Uploadi
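A sketch of driving that from SolrJ, assuming Solr 4's tr parameter and the example updateXml.xsl stylesheet in conf/xslt/ (file names are illustrative):

import java.io.File;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class XsltUpdate {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update");
        req.addFile(new File("raw.xml"), "application/xml");  // arbitrary source XML
        req.setParam("tr", "updateXml.xsl");  // stylesheet resolved from conf/xslt/
        req.setParam("commit", "true");
        req.process(solr);
    }
}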
Hello!
Just specify the left boundary, like: price:[900 TO 1000]
--
Regards,
Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
> When user enter a price in price field, for Ex: 1000 USD, i want to fetch all
> items with pri
Thank you so much Erick, I will try to write my own XML parser
--
View this message in context:
http://lucene.472066.n3.nabble.com/Question-about-how-to-upload-XML-by-using-SolrJ-Client-Java-Code-tp4116901p4116936.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thanks a lot, learnt a lot from it
--
View this message in context:
http://lucene.472066.n3.nabble.com/Question-about-how-to-upload-XML-by-using-SolrJ-Client-Java-Code-tp4116901p4116937.html
Sent from the Solr - User mailing list archive at Nabble.com.
The explicit commit will cause your app to be delayed until that commit
completes, and then Solr would be idle until that request completion makes
its way back to your app and you submit another request which finds its way
to Solr, maybe a few ms. That includes network latency. That interval of
Here's some additional background that may shed light on the
performance..
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
Best,
Erick
On Wed, Feb 12, 2014 at 7:40 AM, Dmitry Kan wrote:
> Cross-posting my answer from SO:
>
> According to this
Thanks, the syntax correction solved the problem. I actually thought I had
tried that before I posted.
Thanks
Lee
--
View this message in context:
http://lucene.472066.n3.nabble.com/Newb-Search-not-returning-any-results-tp4116905p4116930.html
Sent from the Solr - User mailing list archive at Nabbl
On 2/12/2014 8:21 AM, Eric_Peng wrote:
> I was just trying to use the SolrJ client to import XML data into the Solr
> server, and I read the SolrJ wiki, which says "SolrJ lets you upload content
> in XML and Binary format"
>
> I realized there is an XML parser in Solr (we can use the DataImportHandler in
> Solr d
Hmmm, before going there let's be sure you're trying to do
what you think you are.
Solr does _not_ index arbitrary XML. There is a very
specific format of XML that describes solr documents
that _can_ be indexed. But random XML is not
supported. See the documents in example/exampledocs
for the XML
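And since the goal is SolrJ anyway, the raw XML can be skipped entirely by building documents in code; this sketch is the programmatic equivalent of the <add><doc> XML in example/exampledocs (field values are made up):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AddDocs {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // Equivalent to: <add><doc><field name="id">1</field>...</doc></add>
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");
        doc.addField("name", "my first document");
        solr.add(doc);
        solr.commit();
    }
}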
Thanks for the comments/advice. I did mess with the drivers (by
deliberately moving the libs) and it did fail as it is supposed to.
When I looked into catalina.out, I realized that the problem lay with
the data directory being owned by root instead of tomcat6. I changed it
so that tomcat6 can write
First, why are you talking about DoubleMetaphone when
your fieldType uses BeiderMorseFilterFactory? Which points
up a basic issue you need to wrap your head around or you'll
be endlessly confused. At least I was...
Your analysis chains _must_ do compatible things at index and
query time. The field
I'd seriously consider a SolrJ program that pulled the necessary data from
two of your systems, held it in cache and then pulled the data from your
main system and enriched it with the cached data.
Or export your information from your remote systems and import them into
a single system where you c
Cross-posting my answer from SO:
According to this wiki:
https://wiki.apache.org/solr/NearRealtimeSearch
the commitWithin is a soft-commit by default. Soft-commits are very
efficient in terms of making the added documents immediately searchable.
But! They are not on the disk yet. That means the
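In SolrJ terms that looks roughly like the following; the 10-second window and core URL are arbitrary examples:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinAdd {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        // commitWithin=10000 ms: a soft commit makes the document searchable
        // within ~10 s, without paying for a hard commit per add.
        solr.add(doc, 10000);
        // No explicit solr.commit() needed for search visibility.
    }
}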
On 12 February 2014 20:57, leevduhl wrote:
[...]
> However, when I try to search specifically where "mailingcity=redford" I
> don't get any results back. See the following query/results.
>
> Query:
> http://{domain}:8983/solr/MIM/select?q=mailingcity=redford&rows=2&fl=id,mailingcity&wt=json&inden
It can be anything from wrong credentials, to a missing driver in the
classpath, to a malformed connection string, etc.
What does the Solr log say?
-Original Message-
From: Maheedhar Kolla [mailto:maheedhar.ko...@gmail.com]
Sent: Wednesday, 12 February 2014 17:23
To: solr-user@lucene.apache.org
Hi ,
I need help with importing data, through DIH. ( using solr-3.6.1, tomcat6 )
I see the following error when I try to do a full-import from my
local MySQL table ( http:/s/solr//dataimport?command=full-import"
).
..
[DIH status response stripped by the archive; all request/row counters read 0]
Indexing failed. Rolled back all changes.
I did search
I was just trying to use the SolrJ client to import XML data into the Solr
server, and I read the SolrJ wiki, which says "SolrJ lets you upload content
in XML and Binary format".
I realized there is an XML parser in Solr (we can use the DataImportHandler in
the Solr default UI, Solr Core "Dataimport").
So I was wonderi
On 12 February 2014 20:53, Maheedhar Kolla wrote:
>
> Hi ,
>
>
> I need help with importing data, through DIH. ( using solr-3.6.1, tomcat6 )
>
> I see the following error when I try to do a full-import from my
> local MySQL table ( http:/s/solr//dataimport?command=full-import"
> ).
>
>
> ..
I absolutely agree and I even read the NRT page before posting this question.
The thing that baffles me is this:
Doing a commit after each add kills the performance.
On the other hand, when I use commitWithin and specify an (absurd) 1 ms delay,
I expect that this behavior will be equivalent to
I set up a Solr core and populated it with documents, but I am not able to get
any results when attempting to search the documents.
A generic search (q=*.*) returns all documents (and fields/values within
those documents), however when I try to search using specific criteria I get
no results back.
Yes, committing after each document will greatly degrade performance. I
typically use autoCommit and autoSoftCommit to set the time interval
between commits, but commitWithin should have a similar effect. I often
see performance of 2000+ docs per second on the load using auto commits.
When explici
Doing a standard commit after every document is a Solr anti-pattern.
commitWithin is a “near-realtime” commit in recent versions of Solr and not a
standard commit.
https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
- Mark
http://about.me/markrmiller
On Feb 12, 2014, at
I am running a very simple performance experiment where I post 2000 documents
to my application, which in turn persists them to a relational DB and sends
them to Solr for indexing (synchronously, in the same request).
I am testing 3 use cases:
1. No indexing at all - ~45 sec to post 2000 docume
Hi,
I am using Solr for searching phonetically equivalent strings
my schema contains...
Note: In SolrCloud terminology, a leader is also a replica. IOW, you have
two replicas, one of which (and it can vary over time) is elected as leader
for that shard.
The other shards remain capable of indexing even if one shard becomes
unavailable. That is expected - and desired - behavior in
We have a system consisting of 2 shards, where every shard has a leader
and one replica.
During indexing, one of the shards (both leader and replica) was shut down.
We got two types of HTTP responses: rm="Service Unavailable" and rm="OK".
From this we came to the conclusion that the shard wh
Hi,
I am a beginner with Solr.
I am trying to implement phonetic search in my application.
My code in schema.xml for the fieldType and the field definition:
[schema snippets stripped by the archive]
When I search stephen, stifn gives me stephen but it won't wo
Hi,
I'm facing a dilemma in choosing an indexing strategy.
My application architecture is:
- I have a listing table in my DB
- For each listing, I make 3 calls to a URL datasource of a different system
I have 200k records.
The time taken to index 25 docs is 1 minute, so for 200k it might take more