One more thanks for posting this!
I struggled with the same issue yesterday and solved it with the _version_ hint
from the mailing list.
Alex.
-Original Message-
From: Mark Mandel [mailto:mark.man...@gmail.com]
Sent: Thursday, September 06, 2012 1:53 AM
To: solr-user@lucene.apache.org
Subject
Hi,
Thanks,
I am getting the results with the URL below.
*suggest/?q="michael b"&df=title&defType=lucene&fl=title*
But I want the results in the spellcheck section.
I want to search with title or empname or both.
Aniljayanti
Hi
I am trying to implement some "auto suggest" functionality, and am currently
looking at the terms component (Solr 3.6).
For example, I can form a query like this:
http://solrhost/solr/mycore/terms?terms.fl=title_s&terms.sort=index&terms.limit=5&terms.prefix=Hotel+C
which searches in the "ti
If your interest is focusing on the real textual content of a web page, you
could try this: JReadability (https://github.com/ifesdjeen/jReadability,
Apache 2.0 license), which wraps JSoup (as Lance suggested) and applies a
set of predefined rules to scrap crap (nav, headers, footers, ...) off of
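JReadability and JSoup are Java libraries; as a very rough illustration of the same idea (strip markup and boilerplate elements, keep the text), here is a sketch using only the Python standard library. It applies none of JReadability's real heuristics, and the skip list below is just an assumption:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text content, skipping script/style and (as a crude
    stand-in for readability rules) nav/header/footer elements."""
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())

def extract_text(html):
    p = TextExtractor()
    p.feed(html)
    return " ".join(p.parts)

print(extract_text('<html><nav>menu</nav><p>Real content</p></html>'))
# prints: Real content
```

A real readability pass also scores blocks by text density and link ratio; this sketch only drops whole elements by tag name.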
Hi Peter,
Yes if you want to do complex things in suggest mode, you'd better rely on
the SearchComponent...
For example, this blog post is a good read
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ ,
if you have complex requirements on the searched fields.
(Although y
Commits are not too frequent: each batch is 100 records and takes 40 to 60
seconds before the next commit.
No I am not indexing with multi threads. It uses a single thread executor.
I have seen steady performance for now after increasing the merge factor
from 10 to 25.
Will have to wait and watch if that re
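For reference, the merge factor change described above would typically be made in solrconfig.xml; a sketch (element placement follows the stock Solr 3.x config, so verify against your version):

```xml
<!-- solrconfig.xml (Solr 3.x): raising mergeFactor reduces merge churn
     during heavy indexing, at the cost of more segments at search time -->
<indexDefaults>
  <mergeFactor>25</mergeFactor>
</indexDefaults>
```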
Hi
I have installed Solr 3.6.1 on Tomcat 7.0 following the steps here:
http://ralf.schaeftlein.de/2012/02/10/installing-solr-3-5-under-tomcat-7/
The Solr home page loads fine but the admin page
(http://localhost:8080/solr/admin/) throws error missing core name in path.
I am installing single cor
Hello,
I'm currently developing a custom component in Solr.
This component works fine. The problem I have is that I only have access to the
searcher, which gives me the option to fire e.g. BooleanQueries.
This searcher gives me a result, which I have to iterate over to calculate
information which co
Hi,
just found a solution, but you have to know, what you want to count:
try {
    final SolrIndexSearcher s = rb.req.getSearcher();
    final SolrQueryParser qp = new SolrQueryParser(rb.req.getSchema(), null);
    final String queryString = "entity_type:RELEASE";
    final Query q = qp.parse(queryString);
Hello,
I was under the impression that edismax was supposed to be crash proof
and just ignore bad syntax. But I am either misconfiguring it or have hit a
weird bug. I basically searched for text containing '/' and got this:
{
'responseHeader'=>{
'status'=>400,
'QTime'=>9,
'params'=>{
As far as I understand, / is a special character and needs to be escaped.
Maybe "foo\/bar" should work?
I found this when I looked at the code of ClientUtils.escapeQueryChars:
// These characters are part of the query syntax and must be escaped
if (c == '\\' || c == '+' || c == '-' || c ==
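For illustration, the same escaping logic can be sketched in Python; the character set below is reconstructed from what ClientUtils.escapeQueryChars escapes in 4.x and may differ slightly by version (in particular, whether '/' is included):

```python
# Characters treated as Solr/Lucene query syntax; backslash-escape them.
# Set reconstructed from ClientUtils.escapeQueryChars -- check your version;
# '/' only became special once regex query syntax appeared.
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(s):
    out = []
    for c in s:
        if c in SPECIAL or c.isspace():
            out.append('\\')
        out.append(c)
    return ''.join(out)

print(escape_query_chars('foo/bar'))  # prints: foo\/bar
```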
I believe this is caused by the regex support in
https://issues.apache.org/jira/browse/LUCENE-2039
It certainly seems wrong to interpret a slash in the middle of the
word as the start of a regex, so I've reopened the issue.
-Yonik
http://lucidworks.com
On Thu, Sep 6, 2012 at 9:34 AM, Alexandre
Thanks Rafał and Markus for your comments.
I think Droids has a serious problem with URL parameters in the current version
(0.2.0) from Maven central:
https://issues.apache.org/jira/browse/DROIDS-144
I knew about Nutch, but I haven't been able to implement a crawler with it.
Have you done that or
You have "deletedPKQuery", but the correct spelling is "deletedPkQuery"
(lowercase "k"). Try that and see if it fixes your problem.
Also, you can probably simplify this if you do this as
"command=full-import&clean=false", then use something like this for your query:
select product_id as '$de
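In config form, the corrected spelling looks like this; a sketch where the entity, table, and column names are placeholders, not taken from the original message:

```xml
<!-- data-config.xml sketch: note the lowercase "k" in deletedPkQuery.
     Names below are placeholders. -->
<entity name="product"
        query="SELECT product_id, name FROM products"
        deletedPkQuery="SELECT product_id FROM deleted_products">
</entity>
```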
That's what I was thinking, but when I tried foo/bar in Solr 3.6 and
4.0-BETA it was working fine - it split the term and generated the proper
query without any error.
I think the problem is if you use the default Lucene query parser, not
edismax. I removed &defType=edismax from my query requ
I am on 4.0 alpha. Maybe it was fixed in beta. But I am most
definitely seeing this in edismax. If I get rid of / and use
debugQuery, I get:
'responseHeader'=>{
'status'=>0,
'QTime'=>14,
'params'=>{
'debugQuery'=>'true',
'indent'=>'true',
'q'=>'foobar',
'qf'=>'Ti
Hello!
I think that really depends on what you want to achieve and what parts
of your current system you would like to reuse. If it is only HTML
processing I would let Nutch and Solr do that. Of course you can
extend Nutch (it has a plugin API) and implement the custom logic you
need as a Nutch pl
I do in fact see your problem with an earlier 4.0 build, but not with
4.0-BETA.
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Thursday, September 06, 2012 10:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.0alpha: edismax complaints on certain charac
-Original message-
> From:Lochschmied, Alexander
> Sent: Thu 06-Sep-2012 16:04
> To: solr-user@lucene.apache.org
> Subject: AW: Website (crawler for) indexing
>
> Thanks Rafał and Markus for your comments.
>
> I think Droids it has serious problem with URL parameters in current version
The fix in edismax was made just a few days (6/28) before the formal
announcement of 4.0-ALPHA (7/3), but unfortunately the fix came a few days
after the cutoff for 4.0-ALPHA (6/25).
See:
https://issues.apache.org/jira/browse/SOLR-3467
(That issue should probably be annotated to indicate that
: gpg: Signature made 08/06/12 19:52:21 Pacific Daylight Time using RSA key
: ID 322D7ECA
: gpg: Good signature from "Robert Muir (Code Signing Key) "
: *gpg: WARNING: This key is not certified with a trusted signature!*
: gpg: There is no indication that the signature belongs to the
:
: Some extra information. If I use curl and force it to use HTTP 1.0, it is more
: visible that Solr doesn't allow persistent connections:
a) solr has nothing to do with it, it's entirely something under the
control of jetty & the client.
b) I think you are introducing confusion by trying to fo
Thank you. I did the test with curl the same way you did it and it works.
I still cannot get ab (ApacheBench) to reuse connections to
solr. I'll investigate this further.
$ ab -c 1 -n 100 -k 'http://localhost:8983/solr/select?q=*:*' | grep Alive
Keep-Alive requests:0
-- Aleksey
O
Hey Guys,
I created a program to export Solr index data to XML.
The url is https://github.com/eltu/Solr-Export
Please tell me about any problems.
*** I have only tested with Solr 3.6.1
Thanks,
Helton
I have made a schema change to copy an existing field "name" (Source Field)
to an existing search field "text" (Destination Field).
Since I made the schema change, I updated all the documents, thinking the new
source field would be clubbed together with the "text" field. The search for
a specifi
We have a distributed solr setup with 8 servers and 8 cores on each server in
production. We see this error multiple times in our Solr servers. We are
using Solr 3.6.1. Has anyone seen this error before and have you resolved
it ?
2012-09-04 02:16:40,995 [http-nio-8080-exec-7] ERROR
org.apache.so
Hi Jack,
24-bit => 16M possibilities, that's clear; just to confirm... the rest is
unclear: why can 4-byte have 4 million cardinality? I thought it was 4
billion...
And, just to confirm: UnInvertedField allows 16M cardinality, correct?
On 12-08-20 6:51 PM, "Jack Krupansky" wrote:
>It appears
Hi Lance,
The use case is "keyword extraction", and it could be 2- and 3-grams (2- and
3-word shingles); so theoretically we can have 10,000^3 = 1,000,000,000,000
3-grams for English alone... of course my suggestion is to use statistics and
to build a dictionary of such 3-word combinations (remove top,
It's actually limited to 24 bits to point to the term list in a
byte[], but there are 256 different arrays, so the maximum capacity is
4B bytes of un-inverted terms, but each bucket is limited to 4B/256 so
the real limit can come in at a little less due to luck.
From the comments:
* There is
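The capacity figures above can be sanity-checked with quick arithmetic (this is just the math, not Solr code):

```python
# 24-bit offset into a byte[] -> 2^24 addressable bytes per array
per_array = 1 << 24                 # 16,777,216 (the "16M" figure)
arrays = 256
total = per_array * arrays          # 2^32 = 4,294,967,296 bytes ("4B")
print(per_array, total)
```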
Hi,
I am using Solr with DIH and started getting errors when the database
time/date fields are getting imported into Solr. I have used date as
the field type, but when I looked at the docs it looks like the date
field does not accept (Thu, 06 Sep 2012 22:32:33 +) or (1346976590)
formats
: I am using Solr with DIH and started getting errors when the database
: time/date fields are getting imported in to Solr. I have used the date as
what actual error are you getting?
If you are pulling dates from a SQL Date field, that the jdbc driver
returns as java.util.Date objects, then you
http://www.electrictoolbox.com/article/mysql/format-date-time-mysql/ hth --
H
On 6 Sep 2012 17:23, "kiran chitturi" wrote:
> Hi,
>
> I am using Solr with DIH and started getting errors when the database
> time/date fields are getting imported in to Solr. I have used the date as
> the field type b
: I am facing a strange problem. I am searching for the word "jacke" but Solr
: also returns results where my description contains 'RCA-Jack/'. If I search
: "jacka" or "jackc" or "jackd", it works fine and does not return any
: result, which is what I am expecting in this case.
you need to tell us
I don't know for sure, but I remember something around this being a problem,
yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
Otis
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
- Original Message -
> From: Walter Underwood
>
Hi,
Thank you for your response.
The error I am getting is 'org.apache.solr.common.SolrException: Invalid
Date String: '1345743552'.
I think it was being saved as a string in the DB, so I will use the
DateFormatTransformer.
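Since DateFormatTransformer drives SimpleDateFormat patterns, a bare epoch-seconds string like '1345743552' may need converting separately. As an illustration of the ISO-8601 UTC form Solr's date field type expects (the helper name is mine, not a Solr API):

```python
from datetime import datetime, timezone

def epoch_to_solr_date(epoch_seconds):
    """Convert Unix epoch seconds to the ISO-8601 UTC string Solr's
    date field type expects."""
    dt = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return dt.strftime('%Y-%m-%dT%H:%M:%SZ')

print(epoch_to_solr_date('1345743552'))  # prints: 2012-08-23T17:39:12Z
```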
When I index a text field which has Arabic and English like this tweet
“@a
Hey guys!
I've been attempting to get SolrCloud set up on an Ubuntu VM, but I believe
I'm stuck.
I've got Tomcat set up, the Solr war file in place, and when I browse to
localhost:port/solr, I can see Solr. CHECK
I've set the zoo.cfg to use port 5200. I can start it up and see it's
running (ls
Yes, that is exactly the bug. EdgeNgram should work like the synonym filter.
wunder
On Sep 6, 2012, at 5:51 PM, Otis Gospodnetic wrote:
> I don't know for sure, but I remember something around this being a problem,
> yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
>
> Otis
>
Greetings,
I'm looking to add some additional logging to a Solr 3.6.0 setup to
allow us to determine the actual time spent by Solr responding to a
request.
We have a custom QueryComponent that sometimes returns 1+ MB of data
and while QTime is always on the order of ~100ms, the response time at
the c
On 7 September 2012 06:24, kiran chitturi wrote:
[...]
> When i index a text field which has arabic and English like this tweet
> “@anaga3an: هو سعد الحريري بيعمل ايه غير تحديد الدوجلاس ويختار الكرافته ؟؟”
> #gcc #ksa #lebanon #syria #kuwait #egypt #سوريا
> with field_type as 'text_ar' and when i
I'd still love to see a query lifecycle flowchart, but, in case it
helps any future users or in case this is still incorrect, here's how
I'm tackling this:
1) Override default json responseWriter with my own in solrconfig.xml:
2) Define JSONResponseWriterWithTiming as just extending
JSONRespo
Also, your browser may use a platform default for the encoding instead of
UTF-8. Some MacOS and Windows browsers have this problem.
Tomcat sometimes needs adjustment to use UTF-8. If you are on tomcat, check
this:
http://find.searchhub.org/link?url=http://wiki.apache.org/solr/SolrTomcat
http://f
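From that wiki page, the usual Tomcat adjustment is the connector's URIEncoding attribute; a sketch (port and protocol are typical defaults, check your server.xml):

```xml
<!-- server.xml: make Tomcat decode GET request parameters as UTF-8 -->
<Connector port="8080" protocol="HTTP/1.1"
           URIEncoding="UTF-8"/>
```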
I don't think grouping is defined for tokenized fields. See:
http://wiki.apache.org/solr/FieldCollapsing where it says for
group.field:
"..The field must currently be single-valued..."
Are you sure you don't want faceting?
Best
Erick
On Tue, Sep 4, 2012 at 5:27 AM, mechravi25 wrote:
> Hi,
>
Try using edismax to distribute the search across the fields rather
than using the catch-all field. There's no way that I know of to
reconstruct what field the source was.
But storing the source fields without indexing them is OK too, it won't affect
searching speed noticeably...
Best
Erick
On T
I don't know of any better way to do this. Conflating the fields is
not _that_ error prone, although it is annoying I agree. I think that
idea is better than storing them separately.
Best
Erick
On Tue, Sep 4, 2012 at 4:58 PM, Alexandre Rafalovitch
wrote:
> Hello,
>
> I have some fields that have
And you've illustrated my viewpoint I think by saying
"two obvious choices".
I may prefer the first, and you may prefer the second. Neither is
necessarily more "correct" IMO, it depends on the problem
space. Choosing either one will be unpopular with anyone
who likes the other
And I suspect t
Securing Solr pretty much universally requires that you only allow trusted
clients to access the machines directly, usually secured with a firewall
and allowed IP addresses, the admin handler is the least of your worries.
Consider: if you let me ping Solr directly, I can do something really
annoyin
Guenter:
Are you using SolrCloud or straight Solr? And were you updating in
batches (i.e. updating multiple docs at once from SolrJ by using the
server.add(doclist) form)?
There was a bug in this process that caused various docs to show up
in various shards differently. This has been fixed in 4x,
Erick, thanks for the response!
Our use case is very straight forward and basic.
- no cloud infrastructure
- XMLUpdateRequest handler (transformed library bibliographic data
which is pushed by the post.jar component). For deletions I used to use
the SolrJ component until two months ago but because
Erick,
I think that should be described differently...
You need to set-up protected access for some paths.
/update is one of them.
And you could make this protected at the jetty level or using Apache proxies
and rewrites.
Probably /select should be kept open but you need to evaluate if that can