One more thanks for posting this!
I struggled with the same issue yesterday and solved it with the _version_ hint
from the mailing list.
Alex.
-Original Message-
From: Mark Mandel [mailto:mark.man...@gmail.com]
Sent: Thursday, September 06, 2012 1:53 AM
To: solr-user@lucene.apache.org
Subject
Hi,
Thanks,
I am getting the results with the URL below.
*suggest/?q="michael b"&df=title&defType=lucene&fl=title*
But I want the results in the spellcheck section.
I want to search with title or empname or both.
Aniljayanti
Hi
I am trying to implement some "auto suggest" functionality, and am currently
looking at the terms component (Solr 3.6).
For example, I can form a query like this:
http://solrhost/solr/mycore/terms?terms.fl=title_s&terms.sort=index&terms.limit=5&terms.prefix=Hotel+C
which searches in the "ti
If your interest is focusing on the real textual content of a web page, you
could try this: JReadability (https://github.com/ifesdjeen/jReadability,
Apache 2.0 license), which wraps JSoup (as Lance suggested) and applies a
set of predefined rules to scrap crap (nav, headers, footers, ...) off of
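JReadability and JSoup are Java libraries; as a very rough illustration of the same idea (strip markup and boilerplate elements, keep the text), here is a sketch using only the Python standard library. It applies none of JReadability's real heuristics, and the skip list below is just an assumption:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text content, skipping script/style and (as a crude
    stand-in for readability rules) nav/header/footer elements."""
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())

def extract_text(html):
    p = TextExtractor()
    p.feed(html)
    return " ".join(p.parts)

print(extract_text('<html><nav>menu</nav><p>Real content</p></html>'))
# prints: Real content
```

A real readability pass also scores blocks by text density and link ratio; this sketch only drops whole elements by tag name.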
Hi Peter,
Yes if you want to do complex things in suggest mode, you'd better rely on
the SearchComponent...
For example, this blog post is a good read
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ ,
if you have complex requirements on the searched fields.
(Although y
Commits are not too frequent: each batch is 100 records and takes 40 to 60
seconds before the next commit.
No I am not indexing with multi threads. It uses a single thread executor.
I have seen steady performance for now after increasing the merge factor
from 10 to 25.
Will have to wait and watch if that re
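For reference, the merge factor change described above would typically be made in solrconfig.xml; a sketch (element placement follows the stock Solr 3.x config, so verify against your version):

```xml
<!-- solrconfig.xml (Solr 3.x): raising mergeFactor reduces merge churn
     during heavy indexing, at the cost of more segments at search time -->
<indexDefaults>
  <mergeFactor>25</mergeFactor>
</indexDefaults>
```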
Hi
I have installed Solr 3.6.1 on Tomcat 7.0 following the steps here:
http://ralf.schaeftlein.de/2012/02/10/installing-solr-3-5-under-tomcat-7/
The Solr home page loads fine but the admin page
(http://localhost:8080/solr/admin/) throws error missing core name in path.
I am installing single cor
Hello,
I'm currently developing a custom component in Solr.
This component works fine. The problem I have is that I only have access to the
searcher, which gives me the option to fire e.g. BooleanQueries.
This searcher gives me a result, which I have to iterate over to calculate
information which co
Hi,
just found a solution, but you have to know, what you want to count:
try {
    final SolrIndexSearcher s = rb.req.getSearcher();
    final SolrQueryParser qp = new SolrQueryParser(rb.req.getSchema(), null);
    final String queryString = "entity_type:RELEASE";
    final Query q = qp.parse(queryString);
Hello,
I was under the impression that edismax was supposed to be crash proof
and just ignore bad syntax. But I am either misconfiguring it or have hit a
weird bug. I basically searched for text containing '/' and got this:
{
'responseHeader'=>{
'status'=>400,
'QTime'=>9,
'params'=>{
As far as I understand, / is a special character and needs to be escaped.
Maybe "foo\/bar" should work?
I found this when I looked at the code of ClientUtils.escapeQueryChars:
// These characters are part of the query syntax and must be escaped
if (c == '\\' || c == '+' || c == '-' || c ==
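For illustration, the same escaping logic can be sketched in Python; the character set below is reconstructed from what ClientUtils.escapeQueryChars escapes in 4.x and may differ slightly by version (in particular, whether '/' is included):

```python
# Characters treated as Solr/Lucene query syntax; backslash-escape them.
# Set reconstructed from ClientUtils.escapeQueryChars -- check your version;
# '/' only became special once regex query syntax appeared.
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(s):
    out = []
    for c in s:
        if c in SPECIAL or c.isspace():
            out.append('\\')
        out.append(c)
    return ''.join(out)

print(escape_query_chars('foo/bar'))  # prints: foo\/bar
```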
I believe this is caused by the regex support in
https://issues.apache.org/jira/browse/LUCENE-2039
It certainly seems wrong to interpret a slash in the middle of the
word as the start of a regex, so I've reopened the issue.
-Yonik
http://lucidworks.com
On Thu, Sep 6, 2012 at 9:34 AM, Alexandre
Thanks Rafał and Markus for your comments.
I think Droids has a serious problem with URL parameters in the current version
(0.2.0) from Maven central:
https://issues.apache.org/jira/browse/DROIDS-144
I knew about Nutch, but I haven't been able to implement a crawler with it.
Have you done that or
You have "deletedPKQuery", but the correct spelling is "deletedPkQuery"
(lowercase "k"). Try that and see if it fixes your problem.
Also, you can probably simplify this if you do this as
"command=full-import&clean=false", then use something like this for your query:
select product_id as '$de
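In config form, the corrected spelling looks like this; a sketch where the entity, table, and column names are placeholders, not taken from the original message:

```xml
<!-- data-config.xml sketch: note the lowercase "k" in deletedPkQuery.
     Names below are placeholders. -->
<entity name="product"
        query="SELECT product_id, name FROM products"
        deletedPkQuery="SELECT product_id FROM deleted_products">
</entity>
```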
That's what I was thinking, but when I tried foo/bar in Solr 3.6 and
4.0-BETA it was working fine - it split the term and generated the proper
query without any error.
I think the problem is if you use the default Lucene query parser, not
edismax. I removed &defType=edismax from my query requ
I am on 4.0 alpha. Maybe it was fixed in beta. But I am most
definitely seeing this in edismax. If I get rid of / and use
debugQuery, I get:
'responseHeader'=>{
'status'=>0,
'QTime'=>14,
'params'=>{
'debugQuery'=>'true',
'indent'=>'true',
'q'=>'foobar',
'qf'=>'Ti
Hello!
I think that really depends on what you want to achieve and what parts
of your current system you would like to reuse. If it is only HTML
processing I would let Nutch and Solr do that. Of course you can
extend Nutch (it has a plugin API) and implement the custom logic you
need as a Nutch pl
I do in fact see your problem with an earlier 4.0 build, but not with
4.0-BETA.
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Thursday, September 06, 2012 10:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.0alpha: edismax complaints on certain charac
-Original message-
> From:Lochschmied, Alexander
> Sent: Thu 06-Sep-2012 16:04
> To: solr-user@lucene.apache.org
> Subject: AW: Website (crawler for) indexing
>
> Thanks Rafał and Markus for your comments.
>
> I think Droids it has serious problem with URL parameters in current version
The fix in edismax was made just a few days (6/28) before the formal
announcement of 4.0-ALPHA (7/3), but unfortunately the fix came a few days
after the cutoff for 4.0-ALPHA (6/25).
See:
https://issues.apache.org/jira/browse/SOLR-3467
(That issue should probably be annotated to indicate that
: gpg: Signature made 08/06/12 19:52:21 Pacific Daylight Time using RSA key
: ID 322D7ECA
: gpg: Good signature from "Robert Muir (Code Signing Key) "
: *gpg: WARNING: This key is not certified with a trusted signature!*
: gpg: There is no indication that the signature belongs to the
:
: Some extra information. If I use curl and force it to use HTTP 1.0, it is more
: visible that Solr doesn't allow persistent connections:
a) solr has nothing to do with it, it's entirely something under the
control of jetty & the client.
b) I think you are introducing confusion by trying to fo
Thank you. I did the test with curl the same way you did it and it works.
I still cannot get ab (ApacheBench) to reuse connections to
solr. I'll investigate this further.
$ ab -c 1 -n 100 -k 'http://localhost:8983/solr/select?q=*:*' | grep Alive
Keep-Alive requests:0
-- Aleksey
O
Hey Guys,
I created a program to export Solr index data to XML.
The url is https://github.com/eltu/Solr-Export
Please tell me about any problems.
*** I have only tested with Solr 3.6.1
Thanks,
Helton
I have made a schema change to copy an existing field "name" (Source Field)
to an existing search field "text" (Destination Field).
Since I made the schema change, I updated all the documents, thinking the new
source field would be clubbed together with the "text" field. The search for
a specifi
We have a distributed solr setup with 8 servers and 8 cores on each server in
production. We see this error multiple times in our Solr servers. We are
using Solr 3.6.1. Has anyone seen this error before and have you resolved
it ?
2012-09-04 02:16:40,995 [http-nio-8080-exec-7] ERROR
org.apache.so
Hi Jack,
24-bit => 16M possibilities, that's clear; just to confirm... the rest is
unclear: why can 4-byte have 4 million cardinality? I thought it was 4
billion...
And, just to confirm: UnInvertedField allows 16M cardinality, correct?
On 12-08-20 6:51 PM, "Jack Krupansky" wrote:
>It appears
Hi Lance,
The use case is "keyword extraction", and it could be 2- and 3-grams (2- and
3-word shingles); so theoretically we can have 10,000^3 = 1,000,000,000,000
3-grams for English alone... of course my suggestion is to use statistics and
to build a dictionary of such 3-word combinations (remove top,
It's actually limited to 24 bits to point to the term list in a
byte[], but there are 256 different arrays, so the maximum capacity is
4B bytes of un-inverted terms, but each bucket is limited to 4B/256 so
the real limit can come in at a little less due to luck.
From the comments:
* There is
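The capacity figures above can be sanity-checked with quick arithmetic (this is just the math, not Solr code):

```python
# 24-bit offset into a byte[] -> 2^24 addressable bytes per array
per_array = 1 << 24                 # 16,777,216 (the "16M" figure)
arrays = 256
total = per_array * arrays          # 2^32 = 4,294,967,296 bytes ("4B")
print(per_array, total)
```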
Hi,
I am using Solr with DIH and started getting errors when the database
time/date fields are getting imported into Solr. I have used date as
the field type, but when I looked at the docs it looks like the date
field does not accept (Thu, 06 Sep 2012 22:32:33 +) or (1346976590)
formats
: I am using Solr with DIH and started getting errors when the database
: time/date fields are getting imported in to Solr. I have used the date as
what actual error are you getting?
If you are pulling dates from a SQL Date field, that the jdbc driver
returns as java.util.Date objects, then you
http://www.electrictoolbox.com/article/mysql/format-date-time-mysql/ hth --
H
On 6 Sep 2012 17:23, "kiran chitturi" wrote:
> Hi,
>
> I am using Solr with DIH and started getting errors when the database
> time/date fields are getting imported in to Solr. I have used the date as
> the field type b
: I am facing a strange problem. I am searching for the word "jacke" but Solr
: also returns results where my description contains 'RCA-Jack/'. If I search
: "jacka" or "jackc" or "jackd", it works fine and does not return any
: result, which is what I am expecting in this case.
you need to tell us
I don't know for sure, but I remember something around this being a problem,
yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
Otis
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
- Original Message -
> From: Walter Underwood
>
Hi,
Thank you for your response.
The error I am getting is 'org.apache.solr.common.SolrException: Invalid
Date String: '1345743552'.
I think it was being saved as a string in the DB, so I will use the
DateFormatTransformer.
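Since DateFormatTransformer drives SimpleDateFormat patterns, a bare epoch-seconds string like '1345743552' may need converting separately. As an illustration of the ISO-8601 UTC form Solr's date field type expects (the helper name is mine, not a Solr API):

```python
from datetime import datetime, timezone

def epoch_to_solr_date(epoch_seconds):
    """Convert Unix epoch seconds to the ISO-8601 UTC string Solr's
    date field type expects."""
    dt = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return dt.strftime('%Y-%m-%dT%H:%M:%SZ')

print(epoch_to_solr_date('1345743552'))  # prints: 2012-08-23T17:39:12Z
```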
When I index a text field which has Arabic and English like this tweet
“@a
Hey guys!
I've been attempting to get SolrCloud set up on an Ubuntu VM, but I believe
I'm stuck.
I've got Tomcat set up, the Solr war file in place, and when I browse to
localhost:port/solr, I can see Solr. CHECK
I've set the zoo.cfg to use port 5200. I can start it up and see it's
running (ls
Yes, that is exactly the bug. EdgeNgram should work like the synonym filter.
wunder
On Sep 6, 2012, at 5:51 PM, Otis Gospodnetic wrote:
> I don't know for sure, but I remember something around this being a problem,
> yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
>
> Otis
>
Greetings,
I'm looking to add some additional logging to a Solr 3.6.0 setup to
allow us to determine the actual time spent by Solr responding to a
request.
We have a custom QueryComponent that sometimes returns 1+ MB of data
and while QTime is always on the order of ~100ms, the response time at
the c
On 7 September 2012 06:24, kiran chitturi wrote:
[...]
> When i index a text field which has arabic and English like this tweet
> “@anaga3an: هو سعد الحريري بيعمل ايه غير تحديد الدوجلاس ويختار الكرافته ؟؟”
> #gcc #ksa #lebanon #syria #kuwait #egypt #سوريا
> with field_type as 'text_ar' and when i
I'd still love to see a query lifecycle flowchart, but, in case it
helps any future users or in case this is still incorrect, here's how
I'm tackling this:
1) Override default json responseWriter with my own in solrconfig.xml:
2) Define JSONResponseWriterWithTiming as just extending
JSONRespo
Also, your browser may use a platform default for the encoding instead of
UTF-8. Some MacOS and Windows browsers have this problem.
Tomcat sometimes needs adjustment to use UTF-8. If you are on tomcat, check
this:
http://find.searchhub.org/link?url=http://wiki.apache.org/solr/SolrTomcat
http://f
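From that wiki page, the usual Tomcat adjustment is the connector's URIEncoding attribute; a sketch (port and protocol are typical defaults, check your server.xml):

```xml
<!-- server.xml: make Tomcat decode GET request parameters as UTF-8 -->
<Connector port="8080" protocol="HTTP/1.1"
           URIEncoding="UTF-8"/>
```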
I don't think grouping is defined for tokenized fields. See:
http://wiki.apache.org/solr/FieldCollapsing where it says for
group.field:
"..The field must currently be single-valued..."
Are you sure you don't want faceting?
Best
Erick
On Tue, Sep 4, 2012 at 5:27 AM, mechravi25 wrote:
> Hi,
>
Try using edismax to distribute the search across the fields rather
than using the catch-all field. There's no way that I know of to
reconstruct what field the source was.
But storing the source fields without indexing them is OK too, it won't affect
searching speed noticeably...
Best
Erick
On T
I don't know of any better way to do this. Conflating the fields is
not _that_ error prone, although it is annoying I agree. I think that
idea is better than storing them separately.
Best
Erick
On Tue, Sep 4, 2012 at 4:58 PM, Alexandre Rafalovitch
wrote:
> Hello,
>
> I have some fields that have
And you've illustrated my viewpoint I think by saying
"two obvious choices".
I may prefer the first, and you may prefer the second. Neither is
necessarily more "correct" IMO, it depends on the problem
space. Choosing either one will be unpopular with anyone
who likes the other
And I suspect t
Securing Solr pretty much universally requires that you only allow trusted
clients to access the machines directly, usually secured with a firewall
and allowed IP addresses, the admin handler is the least of your worries.
Consider: if you let me ping Solr directly, I can do something really
annoyin
Guenter:
Are you using SolrCloud or straight Solr? And were you updating in
batches (i.e. updating multiple docs at once from SolrJ by using the
server.add(doclist) form)?
There was a bug in this process that caused various docs to show up
in various shards differently. This has been fixed in 4x,
Erick, thanks for the response!
Our use case is very straight forward and basic.
- no cloud infrastructure
- XMLUpdateRequest handler (transformed library bibliographic data
which is pushed by the post.jar component). For deletions I used to use
the SolrJ component until two months ago but because
Erick,
I think that should be described differently...
You need to set-up protected access for some paths.
/update is one of them.
And you could make this protected at the jetty level or using Apache proxies
and rewrites.
Probably /select should be kept open but you need to evaluate if that can