Thanks Cris,
I'm going to look at both the UpdateLog and RealTimeGetComponent classes,
but I'm not sure whether I can use them, because I'm working with Apache Solr
version 1.4.1 (I know it's old).
Anyway, I'll tell you my problem. I am developing a custom class extending
UpdateRequestProcessorFactory.
I need some advanced search features for a desktop application. The
application is a .NET (C#) application, so I can't use Lucene, and as I'm
not sure about the future of Lucene.NET I am considering Solr (with
SolrNet).
As I need a cache for the desktop app anyway, it seems to be a good
opportunity.
I should also add that some of the books don't have chapters, so the query
won't succeed for these books.
But in this case I expected that the document wouldn't be added at all,
rather than first added and then deleted (which I now suspect is the
case).
It would be very helpful if I could see a li
How do I configure solrconfig.xml to enable fragmentsBuilder and
highlightMultiTerm on 4.0 and 4.1?
I read the document on the wiki,
but I don't know where the snippet should be placed, or how to call it via the
URL path.
thanks
--
View this message in context:
http://lucene.472066.n3.nab
I am using Solr 4.1.
I created collection1 consisting of 2 leaders and 2 replicas (2 shards) at
boot time.
After the cluster is up, I am trying to create collection2 with 2 leaders
and 2 replicas just like collection1. I am using following collections API
for that:
http://localhost:7575/solr/adm
The leader doesn't really do a lot more work than any of the replicas, so I
don't think it's all that important. If someone starts running into
problems, that's usually when we start looking for solutions.
- Mark
On Feb 21, 2013, at 10:20 PM, "Vaillancourt, Tim" wrote:
> I sent this reques
I sent this request to "ServerA" in this case, which became the leader of all
shards. As far as I know you're supposed to issue this call to just one server
as it issues the calls to the other leaders/replicas in the background, right?
I am expecting the single collections API call to spread the
Thanks Walter for the info, we will disable optimize then and do more testing.
Regards,
Yandong
2013/2/22 Walter Underwood
> That seems fairly fast. We index about 3 million documents in about half
> that time. We are probably limited by the time it takes to get the data
> from MySQL.
>
> Don't opti
The issue may simply be that your indexed data has the mixed case and your
query has only lower case. So, the suggested change won't affect the query
itself, but will cause the indexed data to be indexed differently.
-- Jack Krupansky
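To make that concrete, the usual way to get case-insensitive matching is to lowercase at both index and query time; a rough sketch of a schema.xml field type that does this (the type name here is made up, and your tokenizer may differ):

```xml
<fieldType name="text_lower" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- lowercases both the indexed tokens and the query terms,
         so "McMurdo" in the data matches "mcmurdo" in the query -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```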
-Original Message-
From: scallawa
Sent: Thursday,
: Hi everyone, I am new to Solr and have not found a way to get back
: the original HTML document with hits highlighted in it. What
: configuration, and where, can I use to instruct SolrCell/Tika so that it does
: not strip the tags of the HTML document in the content field?
I _think_ w
thanks
We have 1 master and 5 slave servers, and we use the slaves as production
servers.
We just update the master index file when we have new content.
Now our index file is almost 88G; the server has just 1 core, 8G RAM, JVM:
-Xmx60964M -Xms1024M.
It easily runs out of memory,
so I plan to deploy a new server to
I cannot give an affirmative answer, but I am thinking that it would be a
potential problem, as the index formats in 3.3 and 4.1 are slightly
different.
Why don't you upgrade to 4.1? The only thing you need to do is
1. install solr 4.1
2.1 copy all related config files from 3.3
2.2 back up the in
Which of your three hosts did you point this request at?
Upayavira
On Thu, Feb 21, 2013, at 09:13 PM, Vaillancourt, Tim wrote:
> Correction, I used this curl:
>
> curl -v
> 'http://:8983/solr/admin/collections?action=CREATE&name=test&numShards=3&replicationFactor=2&maxShardsPerNode=2'
>
> So 3
Correction, I used this curl:
curl -v
'http://:8983/solr/admin/collections?action=CREATE&name=test&numShards=3&replicationFactor=2&maxShardsPerNode=2'
So 3 instances, 3 shards, 2 replicas per shard. ServerA becomes leader of all 3
shards in 4.1 with this call.
Tim Vaillancourt
-Original M
And keep in mind you do need quotes around your searchTerm if it consists
of multiple words - q=text_exact_field:"your_unquoted_query"
otherwise Solr will interpret "two words" as: "text_exact_field:two
defaultfield:words"
(Maybe not directly applicable for your problem Kristian, but I just want
to men
You could also do this outside Solr, in your client. If your query is
surrounded by quotes, then strip away the quotes and make
q=text_exact_field:your_unquoted_query. Probably better to do outside Solr in
general keeping in mind the upgrade path.
-sujit
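A minimal sketch of that client-side approach (the field name text_exact_field is taken from earlier in the thread; the class and helper names are made up, and the quotes are restored around the value so multi-word phrases stay intact, per the previous message):

```java
public class ExactQueryBuilder {
    /** If the user wrapped the query in quotes, strip them and target the
     *  exact-match field; otherwise pass the input through unchanged. */
    static String buildQuery(String userInput) {
        String trimmed = userInput.trim();
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            String unquoted = trimmed.substring(1, trimmed.length() - 1);
            // re-quote so multi-word terms are kept as a single phrase on the exact field
            return "text_exact_field:\"" + unquoted + "\"";
        }
        return trimmed;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("\"created\""));   // text_exact_field:"created"
        System.out.println(buildQuery("plain query"));   // plain query
    }
}
```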
On Feb 21, 2013, at 12:20 PM, Van Tasse
Thank you.
So essentially I need to write a custom query parser (extending upon something
like the QParser)?
-Original Message-
From: Upayavira [mailto:u...@odoko.co.uk]
Sent: Thursday, February 21, 2013 12:22 PM
To: solr-user@lucene.apache.org
Subject: Re: Matching an exact word
Solr
Marcelo
In some sense, it sounds like you are aiming at building a topic map
of all your resources.
Jack
On Thu, Feb 21, 2013 at 11:54 AM, Marcelo Elias Del Valle
wrote:
> Hello David,
>
> First of all, thanks for answering!
>
> 2013/2/21 David Quarterman
>
>> Looked through your site and
Hi Gora and Arcadius,
Thanks for your help. I'll try and answer both your questions here.
I am interested in three database tables. "Book" contains information about
books, "page" has the content of each book page by page, and "chapter"
contains the title of each chapter in every book, and the pa
Hello David,
First of all, thanks for answering!
2013/2/21 David Quarterman
> Looked through your site and the framework looks very powerful as an
> aggregator. We do a lot of data aggregation from many different sources in
> many different formats (XML, JSON, text, CSV, etc) using RDBMS a
: With this approach now I can boost (i.e. multiply Solr's score by a factor)
: the results of any query by doing something like this:
: http://localhost:8080/solr/Prueba/select_test?q={!boost
: b=rating(usuario1)}text:grapa&fl=score
:
: Where 'rating' is the name of my function.
:
: Unfortunate
Thanks Mark,
The real driver for me wanting to promote a different leader is when I create a
new Collection via the Collections API across a multi-server SolrCloud, the
leader of each shard is always the same host, so you're right that I'm tackling
the wrong problem with this request, although
Hi,
Our Solr master version is 3.3. Can I install a new box with Solr 4.1 as a
slave, and replicate from the master's data?
thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/can-i-install-new-SOLR-4-1-as-slaver-3-3-Master-tp4041976.html
Sent from the Solr - User mailing list a
Sounds good. I am trying the combination of my patch and 4413 now to see how
it works, and will have to see if I can put unit tests around them, as some
of what I thought may not be true with respect to the commit generation
numbers.
For your issue above in your last post, is it possible that there w
: I have a field in which I have strings with unwanted characters like
: \n\r\n\n. I wanted to know whether there is any way I can remove
: these... actually I had data stored in HTML format in the SQL database
: column which I had to index in Solr... using the HTML stripper I had removed the
: HTML t
: Anybody know how to get content that is put in the index queue but is not
: committed?
I'm guessing you are referring to uncommitted documents in the transaction
log? Take a look at the UpdateLog class, and how it's used by the
RealTimeGetComponent.
If you provide more details as to what you end
Hi Csaba.
Would you mind posting your DIHconfig/data-config.xml and the command
you use for the import?
Thanks.
Arcadius.
On 21 February 2013 17:55, Gora Mohanty wrote:
> On 21 February 2013 19:30, cveres wrote:
>> Thanks Gora,
>>
>> Sorry I might not have been sufficiently clear.
>>
>> I st
: Subject: Solr UIMA
: References: <5123b218.7050...@juntadeandalucia.es>
: In-reply-to: <5123b218.7050...@juntadeandalucia.es>
https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing
Solr will only match terms as they are in the index. If a term is
stemmed in the index, it will match the stemmed form. If it isn't, it'll match
the unstemmed form. All term matches are (by default at least) exact matches;
with stemming you are simply doing an exact match against the stemmed term.
Therefore, there really
You can split an index using the MultiPassIndexSplitter, which is in
Lucene contrib. However, it won't use the same algorithm for assigning
documents to shards, which means the indexes won't work with a SolrCloud
setup.
A splitter that uses the same split technique but uses the shard
assignment al
I'm trying to match the word "created". Given that it is surrounded by quotes,
I would expect an exact match to occur, but instead the entire stemming results
show for words such as create, creates, created, etc.
q="created"&wt=xml&rows=1000&qf=text&defType=edismax
If I copy the text field to a
Hi
I have built a 300GB index using lucene 4.1 and now it is too big to do
queries efficiently. I wonder if it is possible to split it into shards,
then use SolrCloud configuration?
I have looked around the forum but was unable to find any tips on this. Any
help please?
Many thanks!
--
View t
I tried playing with the analyzer before posting and wasn't sure how to
interpret it.
Field type: text
Field value (index): womens-mcmurdo-ii-boots (this is based on the info that
is in the field)
Field value (query): mcmurdo
Results: I only got one match in the index analyzer,
org.apache.solr.analys
That seems fairly fast. We index about 3 million documents in about half that
time. We are probably limited by the time it takes to get the data from MySQL.
Don't optimize. Solr automatically merges index segments as needed. Optimize
forces a full merge. You'll probably never notice the differen
I get 2-second response times on average.
Any config / hardware change suggestions for my use case - a low qps rate?
I would say more shards on the same node, but there would be the disadvantage
of diminished caches.
On Wednesday, February 20, 2013, Walter Underwood wrote:
> In production, you should hav
I'm using the new AnalyzingSuggester (my code is available on
http://pastebin.com/tN9yXHB0)
and I got the synonyms "whisky,whiskey" (they are bi-directional)
So whether the user searches for whiskey or whisky, I want to retrieve all
documents that have any of them.
However, for autosuggest, I wou
How about passing -Dsolr.data.dir=/your/data/dir on the command line to java
when you start the Solr service?
On Thu, Feb 21, 2013 at 9:05 AM, chamara wrote:
> Yes that is what i am doing now? I taught this solution is not elegant for
> a
> deployment? Is there any other way to do this from the Solr
Yes, that is what I am doing now. I thought this solution was not elegant for a
deployment. Is there any other way to do this from the SolrConfig.xml?
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-change-the-index-dir-in-Solr-4-1-tp4041891p4041950.html
Sent from the So
Interesting you should say that. Here is my solrj code:
public Solr3Client(String solrURL) throws Exception {
server = new HttpSolrServer(solrURL);
// server.setParser(new XMLResponseParser());
}
I cannot recall why I commented out the setParser line;
Weird - the only difference I see is that we use XML vs. JSON, but
otherwise, doing the following works for us:
VALU1
VALU2
Result would be:
VALU1
VALU2
On Thu, Feb 21, 2013 at 9:44 AM, Jack Park wrote:
> I am using 4.1. I was not aware of that link. In the absence of being
> able to do
Have you tried leaving: ${solr.data.dir:} in
solrconfig.xml and then setting the data dir for each core in the
solr.xml, i.e.
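Presumably the snippet that got stripped from this message was a solr.xml sketch along these lines (core names and paths here are invented; solrconfig.xml keeps the empty ${solr.data.dir:} default and each core overrides it):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each core supplies its own dataDir, so they no longer
         fight over a single hard-coded index directory -->
    <core name="shard1" instanceDir="shard1" dataDir="/data/solr/shard1"/>
    <core name="shard2" instanceDir="shard2" dataDir="/data/solr/shard2"/>
  </cores>
</solr>
```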
On Thu, Feb 21, 2013 at 7:13 AM, chamara wrote:
> I am having 5 shards in one machine using the new one collection multiple
> cores method. I am trying to change the in
I am using 4.1. I was not aware of that link. In the absence of being
able to do partial updates to multi-valued fields, I just punted to
delete and reindex. I'd like to see otherwise.
Many thanks
Jack
On Thu, Feb 21, 2013 at 8:13 AM, Timothy Potter wrote:
> Hi Jack,
>
> There was a bug for this
AnalyzingSuggester might also be worth having a look at (requires some
Googling and SO reading to get it right for now).
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from
Hi Jack,
There was a bug for this fixed for 4.1 - which version are you on? I
remember this b/c I was on 4.0 and had to upgrade for this exact
reason.
https://issues.apache.org/jira/browse/SOLR-4134
Tim
On Wed, Feb 20, 2013 at 9:16 PM, Jack Park wrote:
> From what I can read about partial upda
Yes, each spellchecker (or "dictionary") in your spellcheck search component
has a "field" parameter to specify the field to be used to generate the
dictionary index for that spellchecker, e.g. a field named "spell".
See the Solr example solrconfig.xml and search for <lst name="spellchecker">.
Also see:
http://wiki.ap
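For reference, the spellchecker entry in the example solrconfig.xml looks roughly like this (the field named "spell" is whatever field you want the dictionary built from):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- the "field" parameter: source field for this dictionary -->
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
</searchComponent>
```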
With Solr's atomic updates, optimistic locking, update log,
openSearcher=false on commits, etc. you can definitely do this.
Biggest question in my mind is whether you're willing to accept Solr's
emphasis on consistency vs. write-availability? With a db like
Cassandra, you can achieve better write-
The word splitting is caused by "splitOnCaseChange: 1". Change that "1" to
"0" and completely reindex your data.
-- Jack Krupansky
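In the field type's analyzer, that setting is an attribute on the word delimiter filter; a sketch of the relevant line in schema.xml (the other attributes shown are typical, not necessarily what your schema has):

```xml
<filter class="solr.WordDelimiterFilterFactory"
        splitOnCaseChange="0"
        generateWordParts="1"
        catenateWords="1"/>
```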
-Original Message-
From: scallawa
Sent: Thursday, February 21, 2013 7:47 AM
To: solr-user@lucene.apache.org
Subject: Solr splitting my words
Let me start
Feed your data into the Analysis form to see the transformations
taking place. Navigate to the Solr admin console, select your
collection name on the left (e.g. collection1). Click on Analysis
link. I suspect it's the WordDelimiterFilterFactory that is not doing
what you expect, which you can fine-
Let me start out by saying that I am just learning Solr now. Solr is
splitting a word and I am not sure why. The word is mcmurdo. If I do a
search for McMurdo it picks it up. If I do a search for just murdo it will
also pick it up. If I search for mcmurdo, I get nothing.
"womens-mcmurdo-ii-bo
Thanks Shawn for the Input, I could actually get RAID10's.
--
View this message in context:
http://lucene.472066.n3.nabble.com/SOLR4-SAN-vs-Local-Disk-tp4041299p4041895.html
Sent from the Solr - User mailing list archive at Nabble.com.
I am having 5 shards in one machine using the new one collection multiple
cores method. I am trying to change the index directory, but if i hard code
that in the SolrConfig.xml , the index dir does not change for other cores
and each core tries to fight over it and ends up as a deadlock. Is there
It's not really any different in SolrCloud than pre-cloud - distributed search
is still the same code, done the same way, by and large.
shards.qt should be just as valid an option as forcing a query component.
- Mark
On Feb 21, 2013, at 7:56 AM, AlexeyK wrote:
> In pre-cloud version of SOLR it wa
Thanks Gora,
Sorry I might not have been sufficiently clear.
I start with an empty index, then add documents.
9000 are added and 6000 immediately deleted again, leaving 3000.
I assume this can only happen with duplicate IDs, but that should not be
possible! So I wanted to get a list of deleted do
Never mind. I just realized the difference between the two. Sorry for the
noise.
Bill
On Thu, Feb 21, 2013 at 8:42 AM, Bill Au wrote:
> There have been requests for supporting multiple facet.prefix for the same
> facet.field. There is an open JIRA with a patch:
>
> https://issues.apache.org
There have been requests for supporting multiple facet.prefix for the same
facet.field. There is an open JIRA with a patch:
https://issues.apache.org/jira/browse/SOLR-1351
Wouldn't using multiple facet.query achieve the same result? I mean
something like:
facet.query=lastName:A*&facet.query=la
In pre-cloud version of SOLR it was necessary to pass shards and shards.qt
parameters in order to make /suggest handler work standalone.
How should it work in SolrCloud?
SpellCheckComponent skips the distributed stage of processing and thus I get
suggestions only when I force distrib=false mode.
Se
I guess the Term Vector Component might satisfy all or most of what
you're trying to do: http://wiki.apache.org/solr/TermVectorComponent
On 21.02.2013 12:58, search engn dev wrote:
I have indexed data of 10 websites in solr. Now i want to dump data of each
website with following format : [Term
Hi,
Look up the Luke page in the Solr admin: /admin/luke?show=index
That page shows the top terms, so I suppose it is possible to get the frequency
of all terms.
On 21/02/2013 12:58, search engn dev wrote:
I have indexed data of 10 websites in solr. Now i want to dump data of each
website with follo
I have indexed data from 10 websites in Solr. Now I want to dump the data of
each website in the following format: [Term, frequency of the term in that
website, IDF].
Can I do this with the Solr admin, or do I need to write a script for that?
--
View this message in context:
http://lucene.472066.n3.nabble.co
Hi Marcelo,
Looked through your site and the framework looks very powerful as an
aggregator. We do a lot of data aggregation from many different sources in many
different formats (XML, JSON, text, CSV, etc) using RDBMS as the main
repository for eventual SOLR indexing. A 'one-stop-shop' for all
Thanks for the patch, we'll try to install these fixes and post if
replication works or not.
I renamed 'index.' folders to just 'index' but it didn't work.
These lines appeared in the log:
INFO: Master's generation: 64594
21-feb-2013 10:42:00 org.apache.solr.handler.SnapPuller fetchLatestIndex
I
Hi Bart,
I think the only way you can do that is by reindexing, or maybe by just
doing a dummy atomic update [1] to each of the documents (e.g. adding or
changing a field of type 'ignored' or something like that) that weren't
"tagged" by UIMA before.
Regards,
Tommaso
[1] : http://wiki.apache.org
Thanks for the links... I have updated SOLR-4471 with a proposed solution
that I hope can be incorporated or amended so we can get a clean fix into
the next version so our operations and network staff will be happier with
not having gigs of data flying around the network :-)
On Thu, Feb 21, 2013
On 21 February 2013 14:27, cveres wrote:
> I am adding documents with data import handler from a mysql database. I
> create a unique id for each document by concatenating a couple of fields in
> the database. Every id is unique.
>
> After the import, over half the documents which were imported are
Hi Amit,
I have come across some JIRAs that may be useful for this issue:
https://issues.apache.org/jira/browse/SOLR-4471
https://issues.apache.org/jira/browse/SOLR-4354
https://issues.apache.org/jira/browse/SOLR-4303
https://issues.apache.org/jira/browse/SOLR-4413
https://issues.apache.org/jira/br
A few others have posted about this too apparently and SOLR-4413 is the
root problem. Basically what I am seeing is that if your index directory is
not index/ but rather index. set in the index.properties a new
index will be downloaded all the time because the download is expecting
your index to be
I am adding documents with data import handler from a mysql database. I
create a unique id for each document by concatenating a couple of fields in
the database. Every id is unique.
After the import, over half the documents which were imported are deleted
again, leaving me with less than half the