Apologies if things were a little vague.
Given the example snippet to index (numbered to show searches needed to
match)...
1: i am a sales-manager in here
2: using asp.net and .net daily
3: working in design.
4: using something called sage 200. and i'm fluent
5: german sausages.
6: busy A&E dept
This is true with Lucene as it stands. It would be much faster if there
were a specialized in-memory index such as is typically used with high
performance search engines.
On Tue, Feb 7, 2012 at 9:50 PM, Lance Norskog wrote:
> Experience has shown that it is much faster to run Solr with a small
But Solr does not have an in-memory index, am I right?
At 2012-02-08 16:17:49,"Ted Dunning" wrote:
>This is true with Lucene as it stands. It would be much faster if there
>were a specialized in-memory index such as is typically used with high
>performance search engines.
>
>On Tue, Feb
A start may be to use a RAM disk for that. Mount it as a normal disk and
have the index files stored there. Have a read here:
http://en.wikipedia.org/wiki/RAM_disk
Cheers,
Patrick
2012/2/8 Ted Dunning
> This is true with Lucene as it stands. It would be much faster if there
> were a speciali
Hi,
This talk has some interesting details on setting up a Lucene index in RAM:
http://www.lucidimagination.com/devzone/events/conferences/revolution/2011/lucene-yelp
Would be great to hear your findings!
Dmitry
2012/2/8 James
> Is there any practice to load index into RAM to accelerate so
Hi,
I am using solr 3.5 version. I moved the data import handler files from solr
1.4(which I used previously) to the new solr. When I tried to start the solr
3.5, I got the following message in my log
WARNING: XML parse warning in "solrres:/dataimport.xml", line 2, column 95:
Include operation fa
On 08/02/2012 09:17, Ted Dunning wrote:
This is true with Lucene as it stands. It would be much faster if there
were a specialized in-memory index such as is typically used with high
performance search engines.
This could be implemented in Lucene trunk as a Codec. The challenge
though is to c
Hi Erick,
if we're not doing geo searches, we filter by "location tags" that we
attach to places. This is simply a hierarchical regional id, which is
simple to filter for, but much less flexible. We use that on Web a
lot, but not on mobile, where we want to perform searches in
arbitrary radii a
Hi,
I found a solution to it.
Adding the Weblogic Server Argument -Dfile.encoding=UTF-8 did not affect
the encoding.
Only a change to the .war file's weblogic.xml and redeployment of the
modified .war solved it.
I added the following to the weblogic.xml:
*
UTF-8
Would it ma
Hello folks,
i want to reindex about 10Mio. Docs. from one Solr(1.4.1) to another
Solr(1.4.1).
I changed my schema.xml (field types sint to slong), standard
replication would fail.
what is the fastest and smartest way to manage this?
this here sounds great (EntityProcessor):
http://www.searchworkin
> i want to reindex about 10Mio. Docs. from one Solr(1.4.1) to
> another
> Solr(1.4.1).
> I changed my schema.xml (field types sing to slong),
> standard
> replication would fail.
> what is the fastest and smartest way to manage this?
> this here sound great (EntityProcessor):
> http://www.searchwo
I concur with this. As long as index segment files are cached in OS file cache
performance is as about good as it gets. Pulling segment files into RAM inside
JVM process may actually be slower, given Lucene's existing data structures and
algorithms for reading segment file data. If you have
Hi Ahmet,
thanks for quick response:)
I've already thought the same...
And it will be a pain to export and import this huge doc-set as CSV.
Do I have another solution?
Regards
Vadim
2012/2/8 Ahmet Arslan :
>> i want to reindex about 10Mio. Docs. from one Solr(1.4.1) to
>> another
>> Solr(1.4.1
Hi,
I am following
http://www.lucidimagination.com/devzone/technical-articles/setting-apache-solr-eclipse
in order to be able to debug Solr in eclipse. I got it working fine.
Now, I usually use ./etc/jetty.xml to set logging configuration. When
starting jetty in eclipse I don't see any log files c
Another problem appeared ;)
how can i export my docs in csv-format?
In Solr 3.1+ i can use the query-param &wt=csv, but in Solr 1.4.1?
Best Regards
Vadim
2012/2/8 Vadim Kisselmann :
> Hi Ahmet,
> thanks for quick response:)
> I've already thought the same...
> And it will be a pain to export and
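One possible workaround for the missing &wt=csv in Solr 1.4.1 is to fetch results with the JSON response writer and convert them client-side. The sketch below is only an illustration under assumptions: it assumes the response was requested with wt=json and already parsed into a dict, and the function name docs_to_csv, the "|" separator for multi-valued fields, and the sample documents are all made up for the example. Only stored fields can be exported this way, and a 10M doc-set would need to be paged with start/rows.

```python
import csv
import io


def docs_to_csv(response, fields):
    """Render response["response"]["docs"] as CSV text with the given columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # header row
    for doc in response["response"]["docs"]:
        row = []
        for name in fields:
            value = doc.get(name, "")  # missing fields become empty cells
            if isinstance(value, list):
                # Multi-valued field: join values (separator is an assumption).
                value = "|".join(str(v) for v in value)
            row.append(value)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical parsed response, shaped like Solr's JSON output.
sample = {"response": {"numFound": 2, "start": 0, "docs": [
    {"id": "1", "title": "first, doc", "tags": ["a", "b"]},
    {"id": "2", "title": "second"},
]}}
csv_text = docs_to_csv(sample, ["id", "title", "tags"])
```

The csv module takes care of quoting values that contain commas, so the title of the first sample document stays in one column.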
Hi all,
I am trying to write a custom document clustering component that should
take all the docs in commit and cluster them; Solr Version:3.5.0
Main Class:
public class KMeansClusteringEngine extends DocumentClusteringEngine
implements SolrEventListener
I added newSearcher event listener, that
Hmmm, seems OK. Did you re-index after any
schema changes?
You'll learn to love admin/analysis for questions like this,
that page should show you what the actual tokenization
results are, make sure to click the "verbose" check boxes.
Best
Erick
On Tue, Feb 7, 2012 at 10:52 PM, geeky2 wrote:
> h
Yes, WDDF creates multiple tokens. But that has
nothing to do with the multiValued suggestion.
You can get exactly what you want by
1> setting multiValued="true" in your schema file and re-indexing. Say
positionIncrementGap is set to 100
2> When you index, add the field for each sentence, so your
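To make the effect of positionIncrementGap concrete, here is a rough sketch of how token positions could be laid out when each sentence is added as a separate value of a multiValued field. It is a simplification, not Lucene's actual code: it splits on whitespace only, the helper name token_positions is hypothetical, and the exact offset at a value boundary may differ slightly from what a real analyzer chain produces.

```python
def token_positions(values, gap=100):
    """Assign positions to whitespace tokens across the values of one field."""
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # jump between values, like positionIncrementGap
        for token in value.split():
            pos += 1
            positions.append((token, pos))
    return positions


sentences = ["i am a sales manager", "using asp.net daily"]
laid_out = token_positions(sentences, gap=100)
# Distance between the last token of sentence 1 and the first of sentence 2.
boundary_distance = laid_out[5][1] - laid_out[4][1]
```

With a gap of 100, the tokens on either side of a sentence boundary end up about 100 positions apart, so a phrase or proximity query with a smaller slop should not match across two sentences.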
What does your schema for the fields look like?
On Wed, Feb 8, 2012 at 2:41 PM, Radu Toev wrote:
> Hi,
>
> I am really new to Solr so I apologize if the question is a little off.
> I was playing with DataImportHandler and tried to index a table in a MS SQL
> database.
> I configured my datasource
The schema.xml is the default file that comes with Solr 3.5, didn't change
anything there.
On Wed, Feb 8, 2012 at 2:45 PM, Dmitry Kan wrote:
> How does your schema for the fields look like?
>
> On Wed, Feb 8, 2012 at 2:41 PM, Radu Toev wrote:
>
> > Hi,
> >
> > I am really new to Solr so I apolo
well, you should add these fields in schema.xml, otherwise solr won't know
them.
On Wed, Feb 8, 2012 at 2:48 PM, Radu Toev wrote:
> The schema.xml is the default file that comes with Solr 3.5, didn't change
> anything there.
>
> On Wed, Feb 8, 2012 at 2:45 PM, Dmitry Kan wrote:
>
> > How does y
I just realized that as I pushed the send button :P
Thanks, I'll have a look.
On Wed, Feb 8, 2012 at 2:58 PM, Dmitry Kan wrote:
> well, you should add these fields in schema.xml, otherwise solr won't know
> them.
>
> On Wed, Feb 8, 2012 at 2:48 PM, Radu Toev wrote:
>
> > The schema.xml is the d
Hi,
run-jetty-run issue #9:
...
In the VM Arguments of your launch configuration set
-Drjrxml=./jetty.xml
If jetty.xml is in the root of your project it will be used (you can also use a
fully
qualified path name).
The UI port, context and WebApp dir are ignored, since you can define them in
j
hello,
thank you for the reply.
yes - i did re-index after the changes to the schema.
also - thank you for the direction on using the analyzer - but i am not sure
if i am interpreting the feedback from the analyzer correctly.
here is what i did:
in the Field value (Index) box - i placed this:
Thanks Erick,
I didn't get confused with multiple tokens vs multiValued :)
Before I go ahead and re-index 4m docs, and believe me I'm using the
analysis page like a mad-man!
What do I need to configure to have the following both indexed with and
without the dots...
.net
sales manager.
£12.50
Add this as well:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.155.5030
On Wed, Feb 8, 2012 at 1:56 AM, Andrzej Bialecki wrote:
> On 08/02/2012 09:17, Ted Dunning wrote:
>
>> This is true with Lucene as it stands. It would be much faster if there
>> were a specialized in-memory inde
Hi,
According to the Solr documentation, the dismax score is calculated with the
formula:
(score of matching clause with the highest score) + ( (tie parameter) *
(scores of any other matching clauses) ).
Is there a way to identify the field on which the matching clause score is
the highest?
For exa
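The formula quoted in this message can be worked through with a few lines of code. This is only a sketch of the arithmetic, not Solr's implementation: the per-field clause scores are made-up numbers, and dismax_score is a hypothetical helper that also reports which field contributed the highest clause score.

```python
def dismax_score(clause_scores, tie=0.0):
    """Return (combined score, best field) for a dict of field -> clause score."""
    if not clause_scores:
        return 0.0, None
    # Field whose matching clause has the highest score.
    best_field = max(clause_scores, key=clause_scores.get)
    best = clause_scores[best_field]
    # Sum of the scores of all other matching clauses.
    others = sum(s for f, s in clause_scores.items() if f != best_field)
    return best + tie * others, best_field


# Hypothetical clause scores for one document.
score, field = dismax_score({"title": 2.0, "body": 0.5, "tags": 1.0}, tie=0.1)
```

Here the highest clause score is 2.0 (title) and the other clauses sum to 1.5, so the combined score is 2.0 + 0.1 * 1.5. With tie=0.0 only the best clause counts; with tie=1.0 the result is a plain sum.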
Hi all,
I want to sort a SolrDocumentList after it has been queried and obtained
from the QueryResponse.getResults(). The reason is i have a SolrDocumentList
obtained after querying using QueryResponse.getResults() and i have added
few docs to it. Now i want to sort this SolrDocumentList based on
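As a language-neutral illustration of the idea, the sketch below merges extra documents into a result list and sorts the combined list on several fields at once. The documents are plain dicts with invented field names (category, price); in SolrJ, SolrDocumentList is an ArrayList of SolrDocument, so the same approach should carry over with Collections.sort and a Comparator.

```python
from operator import itemgetter

# Hypothetical documents as returned by a query.
docs = [
    {"id": "3", "category": "b", "price": 5.0},
    {"id": "1", "category": "a", "price": 9.0},
    {"id": "2", "category": "a", "price": 2.0},
]
# Documents added afterwards by the application.
extra = [{"id": "4", "category": "a", "price": 4.0}]

merged = docs + extra
# Sort by category first, then by price within each category.
merged.sort(key=itemgetter("category", "price"))
sorted_ids = [d["id"] for d in merged]
```

Sorting on a tuple of fields gives the multi-level ordering in a single pass over the list, instead of one sort per field.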
Sorry for inaccurate title.
I have a 3 fields (dc_title, dc_title_unicode, dc_unicode_full)
containing same value:
http://www.tei-c.org/ns/1.0";>cal.lígraf
and these fields are configured accordingly:
If you can not read this mail easily check this ticket:
https://issues.apache.org/jira/browse/SOLR-3106 This is a copy.
Regards!
Dalius Sidlauskas
On 08/02/12 15:44, Dalius Sidlauskas wrote:
Sorry for inaccurate title.
I have a 3 fields (dc_title, dc_title_unicode, dc_unicode_full)
containi
Hi Dalius,
If not already tried, Check http://localhost:8983/solr/admin/analysis.jsp
(enable verbose output for both Field Value index and query for details)
for your queries and see what all filters/tokenizers are being applied.
Hope it helps!
-param
On 2/8/12 10:48 AM, "Dalius Sidlauskas"
wr
I have already tried this and it did not help because it does not
highlight matches if wild-card is used. The field configuration turns
data to:
dc_title: calligraf
dc_title_unicode: cal·lígraf
dc_title_unicode_full: cal·lígraf
Debug parsedquery says:
[Search for *cal·ligraf*]
+Disjunction
Hmmm, that all looks correct, from the output you pasted I'd expect
you to be finding the doc.
So next thing: add &debugQuery=on to your query and look at
the debug information after the list of documents, particularly
the "parsedQuery" bit. Are you searching against the fields you
think you are?
You'll probably have to index them in separate fields to
get what you want. The question is always whether it's
worth it, is the use-case really well served by having a
variant that keeps dots and things? But that's always more
a question for your product manager
Best
Erick
On Wed, Feb 8, 201
Attempting to re-produce legacy behaviour (i know!) of simple SQL
substring searching, with and without phrases.
I feel simply NGram'ing 4m CV's may be pushing it?
---
IntelCompute
Web Design & Local Online Marketing
http://www.intelcompute.com
On Wed, 8 Feb 2012 11:27:24 -0500, Erick Ericks
hello,
thanks for sticking with me on this ...very frustrating
ok - i did perform the query with the debug parms using two scenarios:
1) a successful search (where i insert the period / dot) in to the itemNo
field and the search returns a document.
itemNo:BP2.1UAA
http://hfsthssolr1.intra.sea
> I have already tried this and it did
> not helped because it does not
> highlight matches if wild-card is used. The field
> configuration turns
> data to:
This writeup should explain your scenario :
http://wiki.apache.org/solr/MultitermQueryAnalysis
On Feb 8, 2012, at 10:31 AM, Adeel Qureshi wrote:
> I have been using solr for a while and have recently started getting into
> solrcloud .. i am a bit confused with some of the concepts ..
>
> 1. what exactly is the relationship between a collection and the core ..
> can a core has multiple col
> I want to sort a SolrDocumentList after it has been queried
> and obtained
> from the QueryResponse.getResults(). The reason is i have a
> SolrDocumentList
> obtained after querying using QueryResponse.getResults() and
> i have added
> few docs to it. Now i want to sort this SolrDocumentList
> ba
Hi Adeel,
I just started looking into SolrCloud and had some of the same questions.
I wrote a blog with the understanding I gained so far, maybe it will help
you:
http://outerthought.org/blog/491-ot.html
Regards,
Bruno.
On Wed, Feb 8, 2012 at 4:31 PM, Adeel Qureshi wrote:
> I have been using
Vadim,
Would using xslt output help?
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performance-monitoring/index.html
>
> From: Vadim Kisselmann
>To: solr-user@lucene.apache.org
>Sent: Wednesday, February 8, 2012 7:09 AM
>Subje
Anderson
I would say that this is highly unlikely, but you would need to pay attention
to how they are generated, this would be a good place to start:
http://en.wikipedia.org/wiki/Universally_unique_identifier
Cheers
François
On Feb 8, 2012, at 1:31 PM, Anderson vasconcelos wrote:
>
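Assuming the question was about the chance of two generated UUIDs colliding, the "highly unlikely" can be put into numbers with the usual birthday-bound approximation. The sketch assumes random (version 4) UUIDs, which carry 122 random bits; other UUID versions are generated differently, which is why the generation method matters. The function name is hypothetical.

```python
import uuid


def uuid4_collision_probability(n):
    """Approximate chance of at least one collision among n random UUIDs."""
    # Birthday bound: about n * (n - 1) / 2 pairs, each colliding with
    # probability 1 / 2**122 (a version 4 UUID has 122 random bits).
    return n * (n - 1) / 2 / 2**122


doc_id = str(uuid.uuid4())  # e.g. usable as a unique key value
p = uuid4_collision_probability(10**9)  # one billion documents
```

Even for a billion documents the estimate stays far below anything that would matter in practice, provided the generator has a good source of randomness.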
All,
It appears my attempt at using solr for the application I support is
about to fail. I'm personally and professionally disappointed, but I
wanted to say "Many Thanks" to those of you who have provided so much
help to so many on this list. In the right hands and in the right
environments, it ha
Hi,
I'm running solr+tomcat with the following configuration:
I have 16 slaves, which are being queried by aggregator, while aggregator
being queried by the users.
My slaveUrls variable in solr.xml (on aggregator) looks like - ''
I'm running it on linux machine (not dedicated, there are some other
Please forgive me if this is a dumb question. I've never dealt with SOLR
before, and I'm being asked to determine from the logs when a SOLR index is
kicked off (it is a Windows server). The TOMCAT service runs continually, so
no love there. In parsing the logs, I think
"org.apache.solr.core.
For those that are interested and have not noticed, the latest work on
SolrCloud and distributed indexing is now in trunk.
SolrCloud is our name for a new set of distributed capabilities that improve
upon the old style distributed search and index based replication.
It provides for high availab
Thanks
2012/2/8 François Schiettecatte
> Anderson
>
> I would say that this is highly unlikely, but you would need to pay
> attention to how they are generated, this would be a good place to start:
>
>http://en.wikipedia.org/wiki/Universally_unique_identifier
>
> Cheers
>
> François
>
> O
Good job on this work. A monumental effort.
On Wed, 8 Feb 2012 16:41:13 -0500, Mark Miller
wrote:
> For those that are interested and have not noticed, the latest work on
> SolrCloud and distributed indexing is now in trunk.
>
> SolrCloud is our name for a new set of distributed capabilities th
Hi Matthias-
I'm trying to understand how you have your data indexed so we can give
reasonable direction.
What field type are you using for your locations? Is it using the
solr spatial field types? What do you see when you look at the debug
information from &debugQuery=true?
From my experienc
okay so after reading Bruno's blog post .. lets add slice to the mix as
well .. so we have got collections, cores, shards, partitions and slices :)
..
The whole point with cores is to be able to have different schemas on the
same solr server instance. So how does that change with collections .. m
hi,
I have a question around documents linking in solr and want to know if it's
possible. lets say i have a set of blogs and their authors that i want to
index separately. is it possible to link a document describing a blog to
another document describing an author? if yes, can i search for blogs wit
On Feb 8, 2012, at 5:26 PM, Adeel Qureshi wrote:
> okay so after reading Bruno's blog post .. lets add slice to the mix as
> well .. so we have got collections, cores, shards, partitions and slices :)
> ..
Yeah - heh - this has bugged me, but we have not really all come down on
agreement of ter
I compared locallucene to spatial search and saw a performance
degradation, even using geohash queries, though perhaps I indexed things
wrong? Locallucene across 6 machines handles 150 queries per second fine,
but using geofilt and geohash I got lots of timeouts even when I was doing
only 50 querie
yes, I am using https://github.com/alexwinston/RunJettyRun that apparently is
a fork of the original project that originated in the need to use a
jetty.xml.
So I am already setting an additional jetty.xml, this can be done in the Run
configuration, no need to use -D param. But as I mentioned solr
Mark,
is the recommendation now to have each solr instance be a separate core in
solr cloud? I had thought that the core name was by default the collection
name? Or are you saying that although they have the same name they are
separate because they are in different JVMs?
On Wednesday, February 8,
On Feb 8, 2012, at 9:36 PM, Jamie Johnson wrote:
> Mark,
> is the recommendation now to have each solr instance be a separate core in
> solr cloud? I had thought that the core name was by default the collection
> name? Or are you saying that although they have the same name they are
> separate be
On Feb 8, 2012, at 9:52 PM, Jamie Johnson wrote:
> In solr cloud what is a better approach / use of resources having multiple
> cores on a single instance or multiple instances with a single core? What
> are the benefits and drawbacks of each?
It depends I suppose. If you are talking about on a
Thanks Mark, in regards to failover I completely agree, I am wondering more
about performance and memory usage if the indexes are large and wondering
if the separate Java instances under heavy load would be more or less
performant. Currently we deploy a single core per instance but deploy
multiple in
Thanks for the explanation. It makes sense but I am hoping that you can
clarify things a bit more ..
so now it sounds like in solrcloud the concept of cores has changed a bit
.. as you explained that for me to have 2 cores with different schemas I
will need 2 different collections .. and one good
No that sorting is based on multiple fields. Basically i want to sort them
as the group by statement like in the SQL based on few fields and many
loops to go through. The problem is that i have say 1,000,000 solr docs
after injecting my few solr docs and then i want to do group by these solr
docs b
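A client-side GROUP BY over several fields does not necessarily need a sort or nested loops. The sketch below buckets documents in one pass using a tuple of field values as the key. It is only an illustration: documents are plain dicts, the field names are invented, and group_docs is a hypothetical helper. For a million documents everything still has to fit in client memory.

```python
from collections import defaultdict


def group_docs(docs, fields):
    """Group docs by the tuple of values they hold for the given fields."""
    groups = defaultdict(list)
    for doc in docs:
        key = tuple(doc.get(f) for f in fields)  # missing field -> None
        groups[key].append(doc)
    return dict(groups)


# Hypothetical documents, e.g. query results plus injected docs.
docs = [
    {"id": "1", "country": "de", "type": "pdf"},
    {"id": "2", "country": "de", "type": "doc"},
    {"id": "3", "country": "de", "type": "pdf"},
    {"id": "4", "country": "uk", "type": "pdf"},
]
groups = group_docs(docs, ["country", "type"])
group_sizes = {key: len(members) for key, members in groups.items()}
```

Each document is visited once, so the cost grows linearly with the number of documents rather than with the number of group-by fields.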
Hi all,
I have tried group by in solr with multiple shards but it does not work.
Basically i want to simply do GROUP BY statement like in SQL in solr with
multiple shards. Please suggest me how can i do this as it is not supported
currently OOB by solr.
Thanks & regards,
Kashif Khan
--
View this
Hello,
Have you tried specifying debugQuery=on and looking at the explain section?
It's not really performant, but I propose starting from there.
Regards
On Wed, Feb 8, 2012 at 7:32 PM, crisfromnova wrote:
> Hi,
>
> According solr documentation the dismax score is calculating after the
>
Hey ,
I am using solr as my search engine to search my pdf files. I have 18219
files (different file names) and all the files are in the same directory. But
when I use solr to import the files into the index using the Dataimport method,
solr reports importing only 17233 files. It's very strange. This problem