Thanks
Here was the issue: concatenating the two floats (lat, lng) on the MySQL end
converted the result to a BLOB, and indexing would fail when storing a BLOB in
a 'location' type field. After the BLOB issue was resolved, all worked OK.
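For anyone hitting the same thing, a sketch of the kind of query that avoids
it (table and column names here are made up):

  -- CAST keeps MySQL from returning the concatenation as a BLOB
  SELECT id, CONCAT(CAST(lat AS CHAR), ',', CAST(lng AS CHAR)) AS location
  FROM places;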
Thank you all for your help
I have used that type of location searching, but I have not used spatial
search; I wrote my logic at the application end.
I have cached the location ids and their lat/lng. When queries come in
for any location, say "New Delhi", my location search logic at the
application end calculates the distanc
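For reference, the usual way to do that distance step at the application end
is the haversine formula; a minimal Java sketch, with method and variable
names of my own choosing:

  // great-circle distance between two lat/lng points, in kilometers
  static double haversineKm(double lat1, double lng1,
                            double lat2, double lng2) {
      final double R = 6371.0; // mean Earth radius in km
      double dLat = Math.toRadians(lat2 - lat1);
      double dLng = Math.toRadians(lng2 - lng1);
      double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
               + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
               * Math.sin(dLng / 2) * Math.sin(dLng / 2);
      return R * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
  }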
On Thu, Jan 13, 2011 at 10:47 PM, PeterKerk wrote:
>
> I still have the default Solr example config running on Jetty. I use Cygwin
> to start my current site.
>
> Now I already have fully configured one solr instance with these files:
> \example\example-DIH\solr\db\conf\my-data-config.xml
> \examp
On Fri, Jan 14, 2011 at 1:02 AM, tjpoe wrote:
[...]
> I also tried creating datasources for each local and then using a variable
> datasource in the entity such as:
>   [datasource definitions stripped by the list archive]
> and then the document as:
>   [document/entity definition stripped by the archive; only the fragment
>   rootEntity="false"> survives]
> but the ${local.code} variable is not resolve
On Thu, Jan 13, 2011 at 10:10 PM, supersoft wrote:
>
> On the one hand, I found really interesting those comments about the reasons
> for sharding. The documentation agrees with you about why to split an index
> into several shards (index-size problems), but I don't find any explanation
> about the inconvenien
I could put 1-10,000 fields in any one document, as long as they are told what
type they are or they are dynamically matched by dynamic fields, relative to
what's in the schema.xml file?
It's very much like Google's 'Bigtable' or 'elastic search' that way, right?
It's up to me to enforce any field nam
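(For context, the stock example schema.xml ships with dynamic field rules
along these lines, so any field whose name matches a pattern is accepted
without being declared individually:

  <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  <dynamicField name="*_i" type="int"    indexed="true" stored="true"/>)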
Wait- it does enforce the schema names. What it does not enforce is
field contents when you change the schema. Since Lucene does not have
field replacement, it is not practical to remove or add a field to all
existing documents when you change the schema.
On Thu, Jan 13, 2011 at 8:15 PM, Lance Nor
Correct. Solr and Lucene do not store or enforce the schema. You're on
your own :)
On Thu, Jan 13, 2011 at 8:09 PM, Dennis Gearon wrote:
> I'm going to buy the book for Solr, since it looks like I need to do more of
> the
> work than I thought I would.
>
> But, from looking at it, the schema fil
Spatial does not support separate fields: you don't need
lat/long, only 'coord'.
To get latitude/longitude in the coord field from the DIH, you need to
use a transformer in the DIH script.
It would populate a field 'coord' with a text string made from the lat
and lon fields:
http://wiki.
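A rough sketch of what that data-config could look like (untested; the entity
and column names are assumptions):

  <dataConfig>
    <script><![CDATA[
      // build the "lat,lon" text string the spatial field expects
      function makeCoord(row) {
        row.put('coord', row.get('lat') + ',' + row.get('lon'));
        return row;
      }
    ]]></script>
    <document>
      <entity name="place" transformer="script:makeCoord" query="...">
      </entity>
    </document>
  </dataConfig>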
I'm going to buy the book for Solr, since it looks like I need to do more of
the
work than I thought I would.
But, from looking at it, the schema file only says:
A/ What types of data can be in the 'fields' of the documents
B/ If there are any dynamically assigned fields.
C/ What parsers are av
Ahhh...the fun of open source software ;-). Requires a ton of trial and error!
I found what worked for me and figured it was worth passing it along. If you
don't mind...when you sort everything out on your end, please post results for
the rest of us to take a gander at.
Cheers,
Adam
On Jan 13
1) CheckIndex is not supposed to change a corrupt segment, only remove it.
2) Are you using local hard disks, or do you run on a common SAN or remote
file server? I have seen corruption errors on SANs, where existing
files have random changes.
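(For reference, CheckIndex is run from the command line against the index
directory; something like this, where the jar version should match the one
your Solr runs and the path is a stand-in:

  java -cp lucene-core-2.9.3.jar org.apache.lucene.index.CheckIndex \
      /path/to/solr/data/index
  java -cp lucene-core-2.9.3.jar org.apache.lucene.index.CheckIndex \
      /path/to/solr/data/index -fix

Take a backup first; -fix removes unrecoverable segments, documents and all.)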
On Thu, Jan 13, 2011 at 11:06 AM, Michael McCandless
wrot
Thanks for your reply. However, it doesn't work for my case at all. I think
it's a problem with the query parser or something else. It forces me to put
double quotes around the search query in order to get the results found.
"sim 010"
"sim 010"
+DisjunctionMaxQuery((keyphrase:sim 010)) ()
+(keyphrase:sim
According to the documentation here:
http://wiki.apache.org/solr/SpatialSearch the field that identifies the
spatial point data is "sfield". See the console output below.
Jan 13, 2011 6:49:40 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select
params={spellcheck=true&f.jtyp
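(Per that wiki page, a geofilt filter query ends up looking something like the
following; the pt/d values here are made up:

  http://localhost:8983/solr/select?q=*:*&fq={!geofilt}&sfield=store&pt=45.15,-93.85&d=5

sfield names the location-type field, pt is the center point, and d is the
distance in km.)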
Joan,
make sure that you are running the job on a Hadoop 0.21 cluster. (It
looks like you have compiled the apache-solr-hadoop jar with Hadoop
0.21 but are using it on a 0.20 cluster.)
-Alexander
I'm trying to understand the mechanics behind warming up, when new searchers
are registered, and their costs. A quick Google didn't point me in the right
direction, so hoping for some of that here.
--
David Cramer
You could have tried it and seen for yourself on any Solr server in your
possession in less time than it took to have this thread. And if you don't have
a Solr server, then why do you care?
But the answer is 0.
http://wiki.apache.org/solr/CommonQueryParameters#start
"The default value is "0""
So you're allowed to put the entire original document in a stored field in
Solr, but you aren't allowed to stick it in, say, a redis or couchdb too? Ah,
bureaucracy. But there's no reason what you are doing won't work, as you of
course already know from doing it.
If you actually know the charset of a
Thanks for all the responses.
CharsetDetector does look promising. Unfortunately, we aren't allowed
to keep the original of much of our data, so the solr index is the
only place it exists (to us). I do have a java app that "reindexes",
i.e., reads all documents out of one index, does some transfor
I'm migrating to CTO/CEO status in life due to building a small company. I find
I don't have too much time for theory; I work with what is.
So: what is it, not what should it be?
Dennis Gearon
> Please, read every wiki page you can find and write notes.
NO!!! Once you start down this road, there is no turning back! Soon you will
feel the need to turn your notes into a new wiki page or a blog post, and
people will read those and write notes, and the process will repeat, ad
infinitum
Perhaps it would be more useful to RTFM instead of messing around on the
mailing list: http://wiki.apache.org/solr/CommonQueryParameters#start
Please, read every wiki page you can find and write notes.
> Do I even need a body for this message? ;-)
>
> Dennis Gearon
Hi Joan,
I am not sure whether it applies, but are you really using Solr 1.4 (not
1.4.1), and were you also using the Hadoop jars provided by this patch (0.20.1,
not 0.21.0)?
I ask because I had some other issues with other classes that were related
to different package definitions etc. - in short: so
On Jan 13, 2011, at 1:28 PM, Dennis Gearon wrote:
> Do I even need a body for this message? ;-)
>
> Dennis Gearon
Are you asking "is it" or "should it be"? If the latter, we can also discuss
Emacs and vi.
wunder
--
Walter Underwood
K6WRU
On Thu, Jan 13, 2011 at 2:05 PM, Jonathan Rochkind wrote:
>
> There are various packages of such heuristic algorithms to guess char
> encoding, I wouldn't try to write my own. icu4j might include such an
> algorithm, not sure.
>
it does:
http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Chars
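A minimal sketch of that API (untested; the raw bytes variable is assumed):

  import com.ibm.icu.text.CharsetDetector;
  import com.ibm.icu.text.CharsetMatch;

  byte[] raw = ...;                       // bytes of unknown encoding
  CharsetDetector detector = new CharsetDetector();
  detector.setText(raw);
  CharsetMatch match = detector.detect(); // best heuristic guess
  String charset = match.getName();       // e.g. "UTF-8", "ISO-8859-1"
  int confidence = match.getConfidence(); // 0-100, a guess, not a proof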
Do I even need a body for this message? ;-)
Dennis Gearon
Signature Warning
It is always a good idea to learn from your own mistakes. It is usually a
better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com
I have several similar databases that I'd like to import from, 14 to be exact.
There is also a 15th database where I can get a listing of the 14 databases.
I'm trying to do a variable datasource such as:
  [datasource definition stripped by the list archive]
then my import query looks like this:
  [query stripped by the archive]
The above configuration works, but the $
The tokens that Lucene sees (pre-4.0) are char[] based (ie, UTF16), so
the first place where invalid UTF8 is detected/corrected/etc. is
during your analysis process, which takes your raw content and
produces char[] based tokens.
Second, during indexing, Lucene ensures that the incoming char[]
toke
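A sketch of that first stage with the pre-4.0 API, assuming rawContent holds
your text (exception handling omitted):

  import java.io.StringReader;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.analysis.tokenattributes.TermAttribute;
  import org.apache.lucene.util.Version;

  TokenStream ts = new StandardAnalyzer(Version.LUCENE_29)
      .tokenStream("body", new StringReader(rawContent));
  TermAttribute term = ts.addAttribute(TermAttribute.class);
  while (ts.incrementToken()) {
      char[] buffer = term.termBuffer(); // UTF-16 chars, post-analysis
      int length = term.termLength();
      // ... the indexer consumes these char[] tokens ...
  }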
Generally it's not safe to run CheckIndex if a writer is also open on the index.
It's not safe because CheckIndex could hit FNFE's on opening files,
or, if you use -fix, CheckIndex will change the index out from under
your other IndexWriter (which will then cause other kinds of
corruption).
That
Scanning for only 'valid' utf-8 is definitely not simple. You can
eliminate some obviously not valid utf-8 things by byte ranges, but you
can't confirm valid utf-8 alone by byte ranges. There are some bytes
that can only come after or before other certain bytes to be valid utf-8.
There is no
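In Java, the practical way to confirm a byte sequence at least decodes as
UTF-8 is to run a strict decoder over it; a sketch:

  import java.nio.ByteBuffer;
  import java.nio.charset.*;

  static boolean decodesAsUtf8(byte[] bytes) {
      CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT);
      try {
          decoder.decode(ByteBuffer.wrap(bytes)); // throws on invalid sequences
          return true;
      } catch (CharacterCodingException e) {
          return false;
      }
  }

Note this only proves the bytes are decodable as UTF-8, not that UTF-8 was the
intended charset.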
We're just getting started with Solr and are very interested in using Solr
for search applications.
I've got the RSS example working. 1.4.1 didn't work out of the box, but we
figured it out - then found fixes in the svn. Anyway, we are learning how
to load the data/RSS & Atom feeds into the Solr in
take a look also into icu4j which is one of the contrib projects ...
> converting on the fly is not supported by Solr but should be relatively
> easy in Java.
> Also scanning is relatively simple (accept only a range). Detection too:
> http://www.mozilla.org/projects/intl/chardet.html
>
>> We've crea
I still have the default Solr example config running on Jetty. I use Cygwin
to start my current site.
Now I already have fully configured one solr instance with these files:
\example\example-DIH\solr\db\conf\my-data-config.xml
\example\example-DIH\solr\db\conf\schema.xml
\example\example-DIH\solr
So you are interested in collection frequency of words.
TermsComponent gives you document frequency of terms. You can modify it to give
collection frequency info. http://search-lucene.com/m/of5Fn1PUOHU/
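(TermsComponent is the /terms handler; for example, a request along these
lines returns the top terms in the 'text' field with their document
frequencies:

  http://localhost:8983/solr/terms?terms.fl=text&terms.limit=20&terms.sort=count)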
--- On Wed, 1/12/11, Juan Grande wrote:
> From: Juan Grande
> Subject: Re: Term frequency
On the one hand, I found really interesting those comments about the reasons
for sharding. The documentation agrees with you about why to split an index
into several shards (index-size problems), but I don't find any explanation
about the inconveniences, such as an Access Control List. I guess there should be
some an
I appreciate the reply and blog posting. For now, I just enabled stopwords for
all the fields on "Qf". We have a very short list anyhow and our legacy search
engine didn't even allow field-by-field configuration (stopwords are global on
that system).
I do wonder...what if (e)dismax had a flag
What field type do you recommend for a float stats.field for optimal Solr
1.4.1 StatsComponent performance?
float, pfloat or tfloat?
Do you recommend indexing the field?
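(For reference, the 1.4 example schema.xml defines the trie variant like so,
versus precisionStep="0" for the plain "float" type; the non-zero
precisionStep is what speeds up range queries:

  <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8"
             omitNorms="true" positionIncrementGap="0"/>)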
2011/1/12 stockii
>
> my field type is "double"; maybe "sint" is better? but i need double ...
> =(
It's a known 'issue' in dismax, (really an inherent part of dismax's
design with no clear way to do anything about it), that qf over fields
with different stop word definitions will produce odd results for a
query with a stopword.
Here's my understanding of what's going on:
http://bibwild.wor
I understand less and less what is happening to my Solr.
I did a CheckIndex (without -fix) and there was an error...
So I did another CheckIndex with -fix, and then the error was gone. The
segment was alright.
During CheckIndex I do not shut down the Solr server, I just make sure
no client co
Hi,
the following seems to work pretty well.
Hi,
Is there a way to get the relevant nearby words in the entire index
given a single word?
I want to know all the relevance ranked words before and after the queried
word.
thanks for any tips.
Darren
Ok, thanks.
That's what I expected :D
>
> From: dante stroe
> Sent: Thu Jan 13 15:56:33 CET 2011
> To:
> Subject: Re: Solr boolean operators
>
>
> To my understanding: in terms of the results that will be matched by your
> query ... it's the same. In te
To my understanding: in terms of the results that will be matched by your
query ... it's the same. In terms of the scores of the results, no:
if you are using the first query, the documents that match both
the "a" and the "b" terms will score higher than the ones matching just the
"
To fill the gaps:
b. the old version remains on disk but is flagged for deletion
d. optimize equals merging, the difference is how many segments come out
e. yes
On Thursday 13 January 2011 15:21:54 kenf_nc wrote:
> A/ You have to update all the fields, if you leave one off, it won't be in
> the d
Hi,
with the Lucene query syntax, is:
  a AND (a OR b)
equivalent to:
  a
(absorption)?
A/ You have to update all the fields, if you leave one off, it won't be in
the document anymore. I have my 'persisted' data stored outside of Solr, so
on update I get the stored data, modify it and update Solr with every field
(even if one changed). You could also do a Query/Modify/Update directly
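A sketch of that Query/Modify/Update round-trip in SolrJ (1.4-era API;
exception handling omitted, id and field names made up; note this only works
if every field you care about is stored):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.SolrDocument;
  import org.apache.solr.common.SolrInputDocument;

  CommonsHttpSolrServer server =
      new CommonsHttpSolrServer("http://localhost:8983/solr");
  QueryResponse rsp = server.query(new SolrQuery("id:123"));
  SolrDocument old = rsp.getResults().get(0);

  // re-send EVERY stored field; leaving one out drops it from the document
  SolrInputDocument doc = new SolrInputDocument();
  for (String name : old.getFieldNames()) {
      doc.setField(name, old.getFieldValue(name));
  }
  doc.setField("price", 9.99f); // the one field actually being changed
  server.add(doc);
  server.commit();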
If this is a one-time cleanup, not something you need to do programmatically,
you could delete the index directory ( /data/index ). In my case I
have to stop Tomcat, delete .\index and restart Tomcat. It is very fast and
starts me out with a fresh, empty, index. Noticed you are multi-core, I'm
not
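(The other one-time option, which avoids touching the filesystem and works per
core, is a delete-all query followed by a commit; 'core0' below is a stand-in
for your core name:

  curl 'http://localhost:8983/solr/core0/update?commit=true' \
      -H 'Content-Type: text/xml' \
      --data-binary '<delete><query>*:*</query></delete>')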
Hi,
I'm trying to build a Solr index with MapReduce (Hadoop) and I'm using
https://issues.apache.org/jira/browse/SOLR-1301 but I have a problem with
the Hadoop version and this patch.
When I compile this patch against the 0.21.0 Hadoop version I don't have any
problem, but when I'm trying to run my job in Hadoop
Hi,
I am sorry to ask this silly question but I could not find the
documentation about this and I am very new to Lucene/Solr. I want to run a
range query on one of the multivalued fields, e.g. I have a point, say [10,20],
which is the point of intersection of the diagonals of a rectangle. Now I
w
As I've seen in the code for QueryElevationComponent, there is no support for
Distributed Search, i.e. query elevation does not work with shards.
-
Grijesh
I have done similar work earlier by combining the spell-check component with
auto-suggest.
Autosuggest will provide the words starting with the query term, and spellcheck
returns the words similar to it.
I have combined both suggestions into a single list to display.
-
Grijesh
Hi all,
I have discovered a strange thing with Dismax and Elevation and hope
someone can enlighten me on what to do.
Whenever I search for something using the elevation request handler, the
hits are from a normal Lucene query (with elevated results if the search
term was defined in elevation.xml). El