There is a good book: http://nlp.stanford.edu/IR-book/
See the chapter:
http://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html
15.11.2012 06:16, Floyd Wu wrote:
Hi there,
Can anybody kindly tell me how to set up Solr to use BM25?
By the way, are there any experime
Tim,
Combine them into "lat,lon" format with a ScriptUpdateRequestProcessor written
in JavaScript. I'm in fact doing this already. See the template example
that ships with Solr in update-script.js, referenced by solrconfig.xml. I'd
paste it right here if I had it but I have the excerpt for it on an
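
For readers who would rather do the combination on the client before sending documents, instead of in update-script.js, here is a minimal Python sketch of the same idea. The column names `lat` and `lon`, the target field name `location`, and the helper name are assumptions for illustration only; the sketch just builds the "lat,lon" string mentioned above.

```python
def combine_lat_lon(row, lat_key="lat", lon_key="lon", target="location"):
    """Return a copy of row with a "lat,lon" string stored under target.

    All key names are hypothetical; adjust them to the real columns and schema.
    """
    lat = float(row[lat_key])
    lon = float(row[lon_key])
    # Reject obviously invalid coordinates before they reach the index.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    doc = dict(row)  # copy so the caller's row is left untouched
    doc[target] = f"{lat},{lon}"
    return doc
```

The original columns are kept in the returned document; drop them if the schema has no matching fields.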
See http://wiki.apache.org/solr/SchemaXml#Similarity
class="solr.BM25SimilarityFactory"
The factories for these have javadocs that document the parameters:
http://lucene.apache.org/solr/4_0_0/solr-core/org/apache/solr/search/similarities/package-summary.html
I don't know about comparisons betwee
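
To make the BM25 discussion concrete, here is a minimal Python sketch of the textbook Okapi BM25 formula as presented in the IR book chapter linked in this thread. It is not Solr's code: Lucene's own implementation differs in details such as how idf is smoothed, so the numbers will not match Solr's scores exactly. The function name and argument layout are assumptions for illustration; k1=1.2 and b=0.75 are commonly used defaults.

```python
import math

def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avg_doc_len,
               k1=1.2, b=0.75):
    """Score one document against a query with the textbook BM25 formula.

    doc_freq maps a term to the number of documents that contain it.
    """
    doc_len = len(doc_terms)
    # Length normalisation: documents longer than average are penalised.
    length_norm = (1 - b) + b * (doc_len / avg_doc_len)
    score = 0.0
    for term in set(query_terms):
        tf = doc_terms.count(term)   # term frequency in this document
        df = doc_freq.get(term, 0)   # document frequency in the collection
        if tf == 0 or df == 0:
            continue
        idf = math.log(num_docs / df)
        score += idf * (tf * (k1 + 1)) / (k1 * length_norm + tf)
    return score
```

With b=0 the length normalisation is switched off, and with k1=0 the term frequency is ignored, which is a quick way to see what each parameter controls.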
Thank you, Eric.
I didn't know that we could write a Java class for it. Can you provide me
with some info on how to
Thanks
Sorry some more info. I have a field to store source and another for date.
I currently use faceting to get a temporal distribution across all
sources. What is the best way to get a temporal distribution per source?
Is the only thing I can do to execute 1 query for the list of sources and
then an
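
The "one query per source" fallback described above can be sketched in Python as follows. This is only an illustration of that approach, not a claim that it is the best one. The field names `source` and `date`, the 30-day window and the helper name are assumptions; the sketch builds one range-facet query string per source and does not send anything to Solr.

```python
from urllib.parse import urlencode

def per_source_range_facet_params(sources, source_field="source",
                                  date_field="date",
                                  start="NOW/DAY-30DAYS", end="NOW/DAY",
                                  gap="+1DAY"):
    """Build one URL-encoded query string per source value.

    Field names and the date window are hypothetical defaults.
    Source values containing double quotes are not escaped here.
    """
    queries = {}
    for source in sources:
        params = [
            ("q", "*:*"),
            ("rows", "0"),                             # only facet counts are needed
            ("fq", f'{source_field}:"{source}"'),      # restrict to one source
            ("facet", "true"),
            ("facet.range", date_field),
            ("facet.range.start", start),
            ("facet.range.end", end),
            ("facet.range.gap", gap),
        ]
        queries[source] = urlencode(params)
    return queries
```

Each resulting string can be appended to a select URL; the list of sources itself would come from a first facet query on the source field.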
Yes, it's a subset
On Nov 14, 2012 1:18 PM, "Shawn Heisey" wrote:
> I am using ICUFoldingFilterFactory in my Solr schema. Now I am looking at
> adding CJKBigramFilterFactory, and I've noticed that it often goes with
> CJKWidthFilterFactory. Here are the relevant Javadocs for my question:
>
> htt
Hi there,
Can anybody kindly tell me how to set up Solr to use BM25?
By the way, is there any experiment or research that compares BM25 and the
classical VSM model in terms of recall and precision?
Thanks in advance.
It's included as soon as it has been indexed - though a request won't return
until it's affected all replicas. Low latency eventual consistency.
- Mark
On Nov 14, 2012, at 5:47 PM, Bill Au wrote:
> Will a newly indexed document be included in search results on the shard leader
> as soon as it has
So, from looking at the code and talking to some of the Lucid guys
today, it seems like there is no good way (currently) to control the
shard leader selection, or even to "fail back" if the preferred leader
server comes back up.
We currently let indexing fail if the one master goes down, but addin
You can break your books into individual pages, each a separate Solr
"document", with the full page text as one tokenized text field value. Solr
(Lucene) will take care of indexing the individual terms on each page. Then
when you query on terms, Solr will find all pages that have the specified
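
The page-per-document idea described above can be sketched in Python like this. The field names `id`, `book_id`, `page` and `text`, and the id format, are assumptions for illustration; they would have to match whatever the actual schema defines.

```python
def book_to_page_docs(book_id, pages):
    """Turn a list of page texts into one document (dict) per page.

    Field names are hypothetical; pages are numbered from 1.
    """
    docs = []
    for page_number, text in enumerate(pages, start=1):
        docs.append({
            "id": f"{book_id}_p{page_number}",  # unique per page
            "book_id": book_id,                 # lets hits be grouped by book
            "page": page_number,
            "text": text,                       # the tokenized full-text field
        })
    return docs
```

Keeping the book identifier on every page document makes it possible to report which book and page a hit came from.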
Mikhail-
Let me know how to contribute a test case and I will put it on my to do
list.
When your many-to-many BlockJoin solution matures I would love to see it.
Thanks.
-Gerald
On Tue, Nov 13, 2012 at 11:52 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Gerald,
> Nice to hear the
Howdy,
I now want to try my hand at spatial search. It looks fairly easy but I'm a
bit puzzled about how to set up my schema.xml file. I know that my field
must use the LatLon type but the columns of the database where I'll be
pulling my data for indexing have separate lat and lon columns (both
dou
Thanks mark !
On Sun, Nov 11, 2012 at 5:46 PM, Mark Miller wrote:
> When SolrCloud is in a steady state (eg the number of nodes in the cluster
> is not changing and config is not changing), Solr does not really talk to
> ZooKeeper other than really light stuff like a heartbeat and maintaining a
I am using ICUFoldingFilterFactory in my Solr schema. Now I am looking
at adding CJKBigramFilterFactory, and I've noticed that it often goes
with CJKWidthFilterFactory. Here are the relevant Javadocs for my question:
http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analy
Hi,
I use SolrJ for cross-core search and it works correctly and fast.
First, pay attention to the schema definitions: try to use the same field
names as much as possible. For example, all my schemas
have a subset of common fields like title, summary, date, geo, image, ecc
Hi,
With the same configuration, same core and same data, but on the Solr 4.0 release, my
project and JUnit test case work correctly with SolrCloudServer.
I'm working with Lucid Works Ent., which doesn't use the latest Solr build; we
have asked Lucid to upgrade Solr.
Thanks
-
Complicare è facile, semplificare
thanks anyway, Shawn.
On Wed, Nov 14, 2012 at 5:24 PM, Carlos Alexandro Becker wrote:
> Hmm... the least horrible way I can think of (if Solr doesn't support it by
> default) is to create another core that "mixes" the information from the other
> cores, and then search in that.
>
> But, well, it would
Hmm... the least horrible way I can think of (if Solr doesn't support it by
default) is to create another core that "mixes" the information from the other
cores, and then search in that.
But, well, it would be ugly.
On Wed, Nov 14, 2012 at 5:14 PM, Shawn Heisey wrote:
> On 11/14/2012 10:48 AM, Carlos
I'm sure. I added it to 3.6 ;)
You must have something funky with your tomcat configuration, like an
exploded war with different versions of jars or some other form of jar
hell.
On Wed, Nov 14, 2012 at 9:32 AM, Frederico Azeiteiro
wrote:
> Are you sure about that?
>
> We have it working on:
>
>
On 11/14/2012 10:48 AM, Carlos Alexandro Becker wrote:
Hm, and what if my cores have different schemas?
You might have to do all the heavy lifting yourself, after using SolrJ
to retrieve the results. I will say that I have no idea -- there may be
ways you can avoid doing that. I hope
Hm, and what if my cores have different schemas?
Thanks in advance.
On Wed, Nov 14, 2012 at 3:35 PM, Shawn Heisey wrote:
> On 11/14/2012 10:19 AM, Carlos Alexandro Becker wrote:
>
>> What's the best way to search in multiple cores and merge the results
>> using
>> solrj?
>>
>
> Your bes
On 11/14/2012 10:19 AM, Carlos Alexandro Becker wrote:
What's the best way to search in multiple cores and merge the results using
solrj?
Your best bet really is to have Solr do this for you with distributed
search. You can add the shards parameter to your queries easily with
SolrJ, or you c
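
As an illustration of the shards parameter mentioned above, here is a minimal Python sketch that builds a distributed-search URL. The host names, core names and helper name are assumptions; the sketch only constructs the URL and does not contact Solr. Note that shard addresses are given without the http:// scheme.

```python
from urllib.parse import urlencode

def distributed_query_url(base_core_url, shard_addresses, query, rows=10):
    """Build a select URL that asks one core to query several shards.

    shard_addresses are host:port/path strings without a scheme; all
    hosts and core names passed in are the caller's own (hypothetical here).
    """
    params = {
        "q": query,
        "rows": str(rows),
        "shards": ",".join(shard_addresses),  # comma-separated shard list
    }
    return f"{base_core_url}/select?{urlencode(params)}"
```

For example, `distributed_query_url("http://localhost:8983/solr/core0", ["localhost:8983/solr/core0", "localhost:8983/solr/core1"], "title:solr")` yields a URL that can be opened in a browser or passed to any HTTP client.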
Thanks for your reply, Sergey!
Well, I was a bit puzzled. I tried adding a line to set the character set
before, but then it complained about that as well.
I installed the Russian dictionary and Solr was happy to load that. I
noticed that the character-set was only set in the affix file for Russia
Are you sure about that?
We have it working on:
Solr Specification Version: 3.5.0.2011.11.22.14.54.38
Solr Implementation Version: 3.5.0 1204988 - simon - 2011-11-22 14:54:38
Lucene Specification Version: 3.5.0
Lucene Implementation Version: 3.5.0 1204988 - simon - 2011-11-22 14:46:51
Current Tim
OK, but what are the problems with bringing up multiple instances reading
from the same data directory?
Also, how can I re-open the searchers without restarting Solr?
Thanks,
Rohit
On Tue, Nov 13, 2012 at 11:20 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Hi,
>
> If you have high query
On Wed, 2012-11-14 at 18:50 +0200, Artem Lokotosh wrote:
> See https://issues.apache.org/jira/browse/MAHOUT-1112
> Seems mahout doesn't yet support lucene 4.0
>
That indeed seems to be the reason. Running the test with solr 3.6.1
works fine.
thanks,
--tomw
See https://issues.apache.org/jira/browse/MAHOUT-1112
Seems mahout doesn't yet support lucene 4.0
On Wed, Nov 14, 2012 at 6:38 PM, Jack Krupansky wrote:
> Check the dates for the Solr/Lucene jars - they might be an early snapshot
> before the index format stabilized.
>
> Or, maybe that Mahout sub
Check the dates for the Solr/Lucene jars - they might be an early snapshot
before the index format stabilized.
Or, maybe that Mahout sub-project had a copy of some old Lucene data.
Keeping old Lucene data around as opposed to reindexing is a rather bad
idea.
-- Jack Krupansky
-Original
On Wed, Nov 14, 2012 at 8:12 AM, Frederico Azeiteiro
wrote:
> To do some further testing I installed Solr 3.5.0 using the default Jetty
> server.
>
> When I tried to start Solr using the same schema I got:
>
>
>
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.CJKBigramFilte
On Wed, 2012-11-14 at 17:57 +0200, Artem Lokotosh wrote:
> > Does it mean that Solr is creating the index in some kind of
> > old format? Is it possible to change the format?
>
> Try this
> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/IndexUpgrader.html
>
I'm wondering why a n
Hi,
I've been testing some CJK tokenizers and I managed to get acceptable
results using:
> Does it mean that Solr is creating the index in some kind of
> old format? Is it possible to change the format?
Try this
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/IndexUpgrader.html
On Wed, Nov 14, 2012 at 5:42 PM, tomw wrote:
> Hi folks,
>
> I was trying to use an index
Hi folks,
I was trying to use an index created by Solr 4.0 with Mahout. However,
creating the vectors like:
bin/mahout lucene.vector -d ~/apache-solr-4.0.0/example/solr/data/index
--output /tmp/mahout/vectors --field text --idField id
--dictOut /tmp/mahout/dict.txt --norm 2
fails with an error:
Hi all!
My index is updated dynamically: every day I add new data
and remove unused documents. I know approximately how many
documents I index per day.
Today I tested a scenario. Imagine there is one collection
and two shards wi
Missed the list in my last reply:
This used to work properly - I'm guessing that the zk layout refactoring right
before 4.0 broke it. We likely need a JIRA issue, a fix, and a test.
Mark
On Nov 14, 2012, at 6:43 AM, Gilles Comeau wrote:
> Hi all,
>
> I just wanted to make the simplest repro of
Kobayashi-san
I suspect you are hitting this:
"The NOT operator excludes documents that contain the term after NOT. This is
equivalent to a difference using sets. The symbol ! can be used in place of the
word NOT."
If you append &debugQuery=on to your search URL, you can see the parsed query et
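
A small Python sketch of appending debugQuery=on to an existing search URL, as suggested above. The helper name and the example URL are assumptions; the sketch only rewrites the URL string and replaces any debugQuery value that is already present.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_debug_query(search_url):
    """Return search_url with debugQuery=on set exactly once."""
    parts = urlsplit(search_url)
    # Keep every existing parameter except a previous debugQuery setting.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "debugQuery"]
    params.append(("debugQuery", "on"))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Comparing the parsed query in the debug output for a query with NOT and the same query without it is usually the quickest way to see what the parser did.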
I am almost a beginner with Solr, so this quiz site is very helpful for education
and training.
Could you let me add a question to this site?
Q.
The documents are ranked by the "score", which is calculated from the "nearness"
of the document and the query.
This score tends to increase depending on the length of
Hi.
I am making a "near text search" system with Solr.
Example:
input text : Hello World!
query: Hello World!
response: Hello World!
This case works as expected.
input text : Hello World!
query: World! Hello
response: Hello World!
This does not work.
I need the text to match even when the word order is switched.
How can I do that?
Hi,
Tomás helped me, and we found the issue.
Basically, I had solrconfig.xml, schema.xml, etc. inside my war, and
it looks like ZooKeeper doesn't look for these files on the classpath.
The fix was pretty easy: I just copied the files to the proper location inside
the solr folder, so I got something like this
Hi all,
I just wanted to make the simplest repro of this issue, which now I am thinking
might be related to the decision made in:
https://issues.apache.org/jira/browse/SOLR-3080 ? And this is the expected
behaviour?
1. Download SOLR 4 production and extract.
2. Replace solr.xml in
Rob, as regards your "problem":
'SET charset'
The word 'charset' must be replaced with the name of a character set (i.e. an encoding).
For example, you can write 'SET UTF-8'.
BUT...
Be careful!
At least for Russian-language morphology, HunspellStemFilterFactory has
bug(s) in its algorithm.
Simple co
I'm pretty sure that Solr only checks whether a field is multivalued at
the point at which it receives the second value for a specific field. In
your entry below, you only provided one value, so Solr wouldn't
complain. Add another line to your , and I bet you it will
moan at you.
Upayavira
On W
Hi - and thanks to you and Erik. I have changed to schema version 1.5.
/Peter
-Original Message-
From: Jeevanandam Madanagopal [mailto:je...@myjeeva.com]
Sent: 14. november 2012 10:38
To: solr-user@lucene.apache.org
Subject: Re: Multivalued or not
Okay, I believe you're using Solr 3.6,
Just to wrap up this one. Previously all the lib jars were located in the war
file on our setup, this was mainly to ease deployment as it's just a single
file. Moving the lib directory external to the war seems to have fixed the
issue.
Thanks for the pointer Erick.
-Original Message-
Basic HTTP authentication can be used to filter access to the different
URLs you want, so you can allow access
to the Query, Analysis, etc and Admin ban
2012/11/13 Erick Erickson
> Slap them firmly on the wrist if they do?
>
> The Solr admin is really designed with trusted users in mind. There a
Okay, I believe you're using Solr 3.6, where you can use schema version 1.5.
However, you're currently using version 1.0. It is safer to update your schema
version to 1.1; then multiValued is false by default.
FYI. Schema version info (from schema.xml):
Should be 1.1 I see.
-Original Message-
From: Peter Kirk [mailto:p...@alpha-solutions.dk]
Sent: 14. november 2012 10:24
To: solr-user@lucene.apache.org
Subject: RE: Multivalued or not
Hi, it says version 1.0
/Peter
-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@g
Hi, it says version 1.0
/Peter
-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 14. november 2012 10:22
To: solr-user@lucene.apache.org
Subject: Re: Multivalued or not
But what is your schema version? See the top of schema.xml.
On Nov 14, 2012, at 4:17,
But what is your schema version? See the top of schema.xml.
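
Since the answer in this thread hinges on the schema version, here is a minimal Python sketch that reads the version attribute from the root element of schema.xml and applies the rule stated in the thread (multiValued is false by default from version 1.1 onwards). The function names are hypothetical, and treating a missing version attribute as "1.0" is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

def schema_version(schema_xml_text):
    """Return the version attribute of the root <schema> element as a string."""
    root = ET.fromstring(schema_xml_text)
    # Assumption: a schema without a version attribute is treated as 1.0.
    return root.get("version", "1.0")

def multivalued_by_default(version):
    """True if fields default to multiValued under this schema version."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # Per the thread: from version 1.1 onwards the default is false.
    return (major, minor) < (1, 1)
```

For a schema whose root element is `<schema name="example" version="1.0">`, the second function returns True, which matches the multivalued-looking results reported by the original poster.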
On Nov 14, 2012, at 4:17, Peter Kirk wrote:
> Hi
>
> Thanks for the reply. It is strange, because when I index to a field defined
> like:
>
> name="*_string"
> stored="true"
>
Hi
Thanks for the reply. It is strange, because when I index to a field defined
like:
Then the results I receive are like:
Woodland
Which seems to indicate a multivalued field.
If I change the field definition, so I explicitly say multivalued is false:
Then the result is li
Hello Peter -
In Solr 3.6 multiValued is false by default.
From schema version 1.1 onwards, the multiValued attribute is false by
default (, , )
-Jeeva
Blog: http://www.myjeeva.com
On Nov 14, 2012, at 2:04 PM, Peter Kirk wrote:
> Hi
>
> In Solr 3.6, is multivalued for fields, default tr
Hi
In Solr 3.6, is multiValued for fields true or false by default?
It appears that it defaults to false for normal fields and to true for
dynamic fields - is that correct?
Thanks,
Peter