m the returned JSON.
HTH
-Simon
On Fri, Oct 4, 2019 at 5:41 AM Vipul Sharma wrote:
> Hi All,
>
> After putting all the master data in Solr Text Tagger, I want to parse
> resume text to fetch the top five skills based on their score. Is there any
> way to fetch the result in descending order?
>
Similarly, I had considered a URP which would call the Solr Tagger to add
new metadata fields for indexing to incoming documents (and recall
discussing this with David Smiley), but eventually decided against this
approach on the grounds of complexity.
-Simon
On Wed, Sep 4, 2019 at 2:10 PM
problem. Eventually I cloned our
environment to a new AWS instance, which proved to be the solution. Why, I
have no idea...
-Simon
On Mon, Sep 24, 2018 at 1:13 PM, Susheel Kumar
wrote:
> Got it. I'll have first hardware folks check and if they don't see/find
> anything suspicious then
":"+((name_c:vietnam)^10.0 |
(ancestor_name:viet nam)^1.25 |
(name:viet nam)^1.0)
####
I would really appreciate any support or debugging advice in this matter!
-Simon Bloch
Looking carefully at the documentation for JSON facets, it looks as though
the offset parameter is not supported for range facets, only for term
facets. You'd have to do pagination in your application.
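Paging over the returned range buckets client-side is simple enough; a minimal sketch, assuming a hypothetical range facet named by_date:

```python
def paginate_range_buckets(facet_response, offset, limit):
    """Slice range-facet buckets client-side, since the JSON Facet API's
    'offset' parameter only applies to terms facets, not range facets."""
    buckets = facet_response["facets"]["by_date"]["buckets"]
    return buckets[offset:offset + limit]

# Hypothetical response shape for a range facet named "by_date".
resp = {"facets": {"by_date": {"buckets": [
    {"val": "2018-01-01T00:00:00Z", "count": 10},
    {"val": "2018-02-01T00:00:00Z", "count": 7},
    {"val": "2018-03-01T00:00:00Z", "count": 3},
]}}}
print(paginate_range_buckets(resp, 1, 1))  # -> [{'val': '2018-02-01T00:00:00Z', 'count': 7}]
```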
-Simon
On Tue, Jul 10, 2018 at 11:45 AM, Anil wrote:
> HI Eric,
>
> i mean
Could it be that the header should be 'Content-Type' (which is what I see
in the relevant RFC) rather than 'Content-type' as shown in your email ? I
don't know if headers are case-sensitive, but it's worth checking.
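For what it's worth, RFC 7230 does define header field names as case-insensitive, though not every client or server honors that; a normalizing lookup can be sketched as:

```python
def get_header(headers, name):
    """Look up an HTTP header by name, ignoring case (RFC 7230 defines
    field names as case-insensitive, but sloppy code may compare exactly)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get(name.lower())

hdrs = {"Content-type": "application/json"}
print(get_header(hdrs, "Content-Type"))  # -> application/json
```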
-Simon
On Tue, May 29, 2018 at 11:02 AM, Roee
Thanks Mikhail:
I considered that, but not all queries would request that field, and there
are in fact a couple more similar DocTransformer-generated aliased fields
which we can optionally request, so it's not a general enough solution.
-Simon
On Wed, Feb 28, 2018 at 1:18 AM, Mikhail Khl
hat it's not tied to one particular
> external API - defining a macro, if you will, so that you could supply
> 'fl='a,b,c,%numcites%,...' in the request and have Solr do the expansion.
>
> Is there some way to do this that I've overlooked ? if not, I think it
> would be a useful new feature.
>
>
> -Simon
>
>
>
r configuration so that it's not tied to one particular
external API - defining a macro, if you will, so that you could supply
'fl='a,b,c,%numcites%,...' in the request and have Solr do the expansion.
Is there some way to do this that I've overlooked ? if not, I think it
would be a useful new feature.
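Until such a feature exists, the expansion can be done client-side before the request is sent; a minimal sketch, where the macro's replacement value is entirely hypothetical:

```python
# Hypothetical macro table: %numcites% stands in for some transformer spec.
MACROS = {
    "%numcites%": "numcites:[citations source=crossref]",
}

def expand_fl(fl):
    """Expand %name% macros in an fl parameter before the request is sent."""
    return ",".join(MACROS.get(f, f) for f in fl.split(","))

print(expand_fl("a,b,c,%numcites%"))  # -> a,b,c,numcites:[citations source=crossref]
```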
-Simon
Tim:
How up to date is the SOLR-5410 patch/zip in JIRA? Looking to use the
Span Query parser in 6.5.1, migrating to 7.x sometime soon.
Would love to see these committed !
-Simon
On Mon, Feb 12, 2018 at 10:41 AM, Allison, Timothy B.
wrote:
> That requires a SpanNotQuery. AFAIK, there
y - I
can't say.
best
-Simon
On Tue, Nov 7, 2017 at 1:44 AM, Amin Raeiszadeh
wrote:
> Hi
> i want to use more than one ssd in each server of solr cluster but i don't
> know how to set multiple hdd in solr.xml configurations.
> i set one hdd path in solr.xml by:
> /media
though see SOLR-11078 , which is reporting significant query slowdowns
after converting *Trie to *Point fields in 7.1, compared with 6.4.2
On Wed, Nov 1, 2017 at 9:06 PM, Yonik Seeley wrote:
> On Wed, Nov 1, 2017 at 2:36 PM, Erick Erickson
> wrote:
> > I _always_ prefer to reindex if possible.
, and you
could live with dropping the offending document(s), then you might want to
investigate the TolerantUpdateProcessorFactory (Solr 6.1 or later)
-Simon
On Thu, Sep 14, 2017 at 3:56 PM, arnoldbronley
wrote:
> Thanks for information. Here is the full stack trace. I thought to handle
>
@Arnold: are these non UTF-8 control characters (which is what the Nutch
issue was about) or otherwise legal UTF-8 characters which Solr for some
reason is choking on ?
If you could provide a full stack trace it would be really helpful.
On Thu, Sep 14, 2017 at 2:55 PM, Markus Jelsma
wrote:
>
ctory might work for this.
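If preprocessing outside Solr is an option, stripping control characters before the document is sent is straightforward; a sketch:

```python
import re

# Remove C0/C1 control characters, keeping tab, newline and CR, before
# the document reaches Solr; unlike an index-time token filter, this
# affects the stored value too.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")

def clean(text):
    return CONTROL_CHARS.sub("", text)

print(clean("bad\x00value\x07here"))  # -> badvaluehere
```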
best
-Simon
On Thu, Sep 14, 2017 at 1:46 PM, Arnold Bronley
wrote:
> I know I can apply PatternReplaceFilterFactory to remove control characters
> from indexed value. However, is it possible to do similar thing for stored
> value? Because of some cont
deleted in current versions of Solr - so you'll
have to find a way (outside Solr) to copy it or re-create it.
What is the use case here ?
best
-Simon
On Tue, Sep 12, 2017 at 1:27 PM, Shashank Pedamallu
wrote:
> Hi,
>
> I wanted to know how does Solr pick up cores on startup. Bas
with multiple
tokens. Then construct a query which searches both field1 for an exact
match, and field2 using ComplexQueryParser (use the localparams syntax) to
combine them. Boost the field1 (exact match).
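A sketch of the combined query string, assuming a hypothetical string field name_exact and tokenized field name, with the complexphrase parser invoked via local-params syntax:

```python
def build_query(term):
    """Boosted exact clause on a hypothetical string field name_exact,
    OR'd with a complexphrase clause (local-params syntax) on the
    tokenized field name. Field names and boost are illustrative."""
    exact = f'name_exact:"{term}"^10'
    phrase = f'{{!complexphrase inOrder=true}}name:"{term}"'
    return f"({exact}) OR ({phrase})"

print(build_query("viet nam"))
# -> (name_exact:"viet nam"^10) OR ({!complexphrase inOrder=true}name:"viet nam")
```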
HTH
-Simon
On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater
wrote:
> Thanks Susheel. The c
o it
looks like a bug.
-Simon
On Fri, Jun 9, 2017 at 5:14 AM, Andreas Hubold wrote:
> Hi,
>
> I just tried to update from Solr 6.5.1 to Solr 6.6.0 and observed a
> changed behaviour with regard to unloading cores in Solr standalone mode.
>
> After unloading a core using the Core
Your updateRequestProcessorChain config snippet specifies the "id" field
to generate a signature, but the sample data doesn't contain an "id" field
... check that out first.
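For illustration, signature generation behaves roughly like this (a sketch, not the actual SignatureUpdateProcessor code): a field missing from the document contributes nothing, so the signature silently covers less data than the config intends.

```python
import hashlib

def signature(doc, fields):
    """Rough sketch of signature generation: hash the configured fields'
    values in order. A field missing from the document (here "id")
    contributes an empty string, so the signature covers less than intended."""
    parts = [str(doc.get(f, "")) for f in fields]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

doc = {"title": "hello", "body": "world"}   # note: no "id" field
print(signature(doc, ["id", "title", "body"]))
```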
-Simon
On Wed, May 31, 2017 at 12:06 PM, Lebin Sebastian
wrote:
> Hello,
>
> I am
dexer scripts running concurrently, but the duration goes up
proportionately.
-Simon
On Thu, Apr 27, 2017 at 9:26 AM, simon wrote:
> Nope ... huge file system (600gb) only 50% full, and a complete index
> would be 80gb max.
>
> On Wed, Apr 26, 2017 at 4:04 PM, Erick Erickson
> wr
-Simon
On Tue, May 2, 2017 at 4:04 PM, Erick Erickson
wrote:
> IIRC, the core.properties file _is_ renamed to
> core.properties.unloaded or something like that.
>
> Yeah, this is something of a pain. The inverse of "unload" is "create"
> but you have to know e
I ran into the exact same situation recently. I unloaded from the browser
GUI which does not delete the data or instance dirs, but does delete
core.properties. I couldn't find any API either so I eventually manually
recreated core.properties and restarted Solr.
Would be nice if the core.propert
W
> if you look now and have free space it still may have been all used up
> but had some space reclaimed.
>
> Best,
> Erick
>
> On Wed, Apr 26, 2017 at 12:02 PM, simon wrote:
> > reposting this as the problem described is happening again and there were
> > no
reposting this as the problem described is happening again and there were
no responses to the original email. Anyone ?
I'm seeing an odd error during indexing for which I can't find any reason.
The relevant solr log entry:
2017-03-24 19:09:35.363 ERROR (commitSchedule
will return a boolean if the term is in a specific field.
I've used this for simple cases where it worked well, though I wouldn't
like to speculate on how well this scales if you have an edismax query
where you might need to generate multiple term/field combinations.
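One way to build such a request is via the termfreq() function in the fl list (a per-document count rather than a strict boolean; nonzero means the term is present). The alias and the field/term values here are illustrative:

```python
from urllib.parse import urlencode

def term_check_fl(field, term):
    """fl entry asking Solr to report termfreq(field, term) per document;
    a nonzero count means the term occurs in that field. The has_term
    alias and the field/term values are illustrative."""
    return f"id,score,has_term:termfreq({field},'{term}')"

params = urlencode({"q": "*:*", "fl": term_check_fl("text", "car")})
print(params)
```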
HTH
-Simon
On Thu, Ap
sitions with no need for actual highlighting. The
patch is pretty old - I applied it to Solr 4.10 I think, so will probably
need some work for later releases.
HTH
-Simon
On Tue, Mar 28, 2017 at 4:59 AM, forest_soup wrote:
> Thanks Eric.
>
> Actually solr highlighting function does not
ystem
logs and didn't see any evidence of hardware errors
I'm puzzled as to why this would start happening out of the blue and I
can't find any particularly relevant posts to this forum or Stackexchange.
Anyone have an idea what's going on ?
-Simon
You might want to take a look at
https://issues.apache.org/jira/browse/SOLR-4722
( 'highlighter which generates a list of query term positions'). We used it
a while back; it doesn't appear to have been used in any Solr > 4.10.
-Simon
On Tue, Nov 29, 2016 at 11:43 AM, John
Do you already have a set of terms for which you would want to find out
their co-occurence, or are you trying to do data mining, looking in a
collection for terms which occur together more often than by chance ?
On Sun, Oct 16, 2016 at 3:45 AM, Yangrui Guo wrote:
> Hello
>
> I'm curious to know
thal/defsolr/server/logs --module=http
solrconfig.xml: basically the default with some minor tweaks in the
indexConfig section
5.0
200
1
20
60
20
... everything else is default
Insights as to why this is happening would be welcome.
-Simon
xt mode). It seems some Javascript
creeps into the text version. (See below)
Regards,
Simon
HTML mode sample:
051<?xml
version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">;
<head>
<link
rel="styleshee
Thanks Timothy,
Will give the DIH a try. I have submitted a bug report.
Regards,
Simon
On 31/05/16 13:22, Allison, Timothy B. wrote:
From the same page, extractFormat=text only applies when extractOnly
is true, which just shows the output from tika without indexing the document.
Y, sorry
ng a bug report.
Regards,
Simon
On 27/05/16 20:22, Alexandre Rafalovitch wrote:
I think Solr's layer above Tika was merging in metadata and text all
together without a way (that I could see) to separate them.
That's all I remember of my examination of this issue when I ran into
something sim
"extractOnly" mode resulting in XML output.
The difference between selecting "text" or "xml" format is that the
escaped document in the tag is either the original HTML (xml
mode) or stripped HTML (text mode). It seems some Javascript creeps into
the text version.
Hi,
I am using Solr 6.0 on Ubuntu 14.04.
I am ending up with loads of junk in the text body. It starts like,
The JSON entry output of a search result shows the indexed text starting
with...
body_txt_en: " stream_size 36499 X-Parsed-By
org.apache.tika.parser.DefaultParser X-Parsed-By"
An
Please do push your script to github - I (re)-compile custom code
infrequently and never remember how to setup the environment.
On Thu, Nov 12, 2015 at 5:14 AM, Upayavira wrote:
> Okay, makes sense. As to your question - making a new ValueSourceParser
> that handles 'equals' sounds pretty straig
https://github.com/OpenSextant/SolrTextTagger/
We're using it for country tagging successfully.
On Wed, Nov 4, 2015 at 3:10 PM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:
> David Smiley had a place name and general tagging engine that for the life
> of me I can't find.
>
> It di
it is ingested into our main
Solr collection.
How many documents/product leaflets do you have ? The tagger is very fast
at the Solr level but I'm seeing quite a bit of HTTP overhead.
best
-Simon
On Fri, Sep 11, 2015 at 1:39 PM, Sujit Pal wrote:
> Hi Francisco,
>
> >>
we've been using with
some success for this task.
best
-Simon
On Mon, Aug 24, 2015 at 2:13 PM, afrooz wrote:
> Thanks Erick,
> I will explain the detail scenario so you might give me a solution:
> I want to annotate a medical document base on only medical dictionary. I
> don&
Check out https://issues.apache.org/jira/browse/SOLR-4722, which will
return matching terms (and their offsets). Patch can be applied cleanly to
Solr 4; doesn't appear to have been tried with Solr 5
-Simon
On Tue, Aug 18, 2015 at 11:30 AM, Jack Krupansky
wrote:
> Maybe a spe
every
place where a date format conversion is needed is proving painful indeed ;-(
My thought is to write a custom function of the form
datereformatter(, ) but I thought I'd
check if it's already been done or if someone can suggest a better approach.
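Client-side, the proposed helper is a one-liner with stdlib datetime; a sketch with illustrative format strings:

```python
from datetime import datetime

def datereformatter(value, out_fmt, in_fmt="%Y-%m-%dT%H:%M:%SZ"):
    """Client-side sketch of the proposed helper: parse a Solr-style
    timestamp and re-emit it in another format (formats illustrative)."""
    return datetime.strptime(value, in_fmt).strftime(out_fmt)

print(datereformatter("2015-08-18T11:30:00Z", "%d.%m.%Y"))  # -> 18.08.2015
```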
regards
-Simon
Good morning,
I used Solr 4.7 to post 186,745 XML files and 186,622 files have been
indexed. That means there are 123 XML files with errors. How can I trace
what these files are?
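One way to identify them: collect the ids Solr actually holds (e.g. a query with fl=id) and diff them against the ids you posted; a sketch:

```python
def missing_ids(posted_ids, indexed_ids):
    """Diff the ids that were posted against the ids Solr reports
    (e.g. from a q=*:*&fl=id sweep) to find the failed documents."""
    return sorted(set(posted_ids) - set(indexed_ids))

posted = ["doc1", "doc2", "doc3", "doc4"]
indexed = ["doc1", "doc3", "doc4"]
print(missing_ids(posted, indexed))  # -> ['doc2']
```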
Thank you in advance,
Simon Cheng.
Hi,
Have a look at the generated terms to see how they look.
Simon
On Thu, Apr 2, 2015 at 9:43 AM, Palagiri, Jayasankar <
jayashankar.palag...@honeywell.com> wrote:
> Hello Team,
>
> Below is my field type
>
> positionIncrementGap="100"
compound of
lindor and schlitten
but i get
lindor dorsch schlitten
so the filter is extracting dorsch but the word before (lin) and after
(litten) are not valid word parts.
Is there any better compound word filter for German?
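For comparison, a decompounder that only accepts splits which partition the whole word into dictionary entries would avoid the spurious 'dorsch' match; a recursive sketch (not the Lucene filter):

```python
def decompound(word, dictionary, min_part=3):
    """Accept a split only if the whole word partitions into dictionary
    entries, so an inner match like 'dorsch' can never surface on its
    own. A recursive sketch, not the Lucene filter."""
    if word in dictionary:
        return [word]
    for i in range(min_part, len(word) - min_part + 1):
        head, tail = word[:i], word[i:]
        if head in dictionary:
            rest = decompound(tail, dictionary, min_part)
            if rest:
                return [head] + rest
    return []

words = {"lindor", "schlitten", "dorsch"}
print(decompound("lindorschlitten", words))  # -> ['lindor', 'schlitten']
```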
Thanks, Simon
There's a JIRA ( https://issues.apache.org/jira/browse/SOLR-4722 )
describing a highlighter which returns term positions rather than
snippets, which could then be mapped to the matching words in the indexed
document (assuming that it's stored or that you have a copy elsewhere).
-Sim
hdfs.security.kerberos.principal">solr/@CLUSTER.HADOOP
and on Hadoop' core-site.xml, my hadoop.security.authentication
parameter is set to Kerberos.
Am I missing something ?
Thank you very much for your input, have a great day.
Simon M.
e Analysis
> screen.
>
> Regards,
> Alex.
>
>
> Sign up for my Solr resources newsletter at http://www.solr-start.com/
>
> On 17 February 2015 at 22:36, Simon Cheng wrote:
> > Hi Alex,
> >
> > It's okay after I added in a new field "s_tit
ess releases and articles on policy changes affecting the Singapore
property market] / compiled by the Information Resource Centre, Monetary
Authority of Singapore
dataq
Simon is testing Solr - This one is in English. Color of the Wind. 我是中国人 ,
БOΛbШ OЙ PYCCKO-KИTAЙCKИЙ CΛOBAPb , Français-Chinois
Hi Alex,
It's simply defined like this in the schema.xml :
and it is cloned to the other multi-valued field o_title :
Should I simply change the type to be "string" instead?
Thanks again,
Simon.
On Wed, Feb 18, 2015 at 12:00 PM, Alexandre Rafalovitch
wrote:
>
ard Club of New York City
Nationalist dictatorships versus open society / by George Soros
15891
Soros, George
The new paradigm for financial markets : the credit crisis of 2008 and what
it means / George Soros
Thank you for the help in advance,
Simon.
ud ?
Thank you, Simon M.
n user configure a field to be auto
completion.
Thanks,
Simon
--
View this message in context:
http://lucene.472066.n3.nabble.com/Suggester-on-Dynamic-fields-tp4165270p4165329.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi
I'm trying to get Solr (4.10) to do more of what it does best rather than a
lot of hacking that is currently in our front-end code. One area I'm trying
to fix is date ranges: I have 2 types of date and want to display them in 2
different ways:
dateA - blocks of 25 years, this works but o
e reading!
Cheers
Si
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: 08 October 2014 21:09
To: solr-user@lucene.apache.org
Subject: Re: Solr configuration, memory usage and MMapDirectory
On 10/8/2014 4:02 AM, Simon Fairey wrote:
> I'm currently setting
large for these, are these what are using up the memory?
Thanks
Si
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: 06 October 2014 16:56
To: solr-user@lucene.apache.org
Subject: Re: Solr configuration, memory usage and MMapDirectory
On 10/6/2014 9:24 AM, Simon F
Thanks I will have a read and digest this.
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: 06 October 2014 16:56
To: solr-user@lucene.apache.org
Subject: Re: Solr configuration, memory usage and MMapDirectory
On 10/6/2014 9:24 AM, Simon Fairey wrote:
> I
hints on how to read top:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Meanwhile, Shawn gave you some very good info so I won't repeat any
On Mon, Oct 6, 2014 at 8:24 AM, Simon Fairey wrote:
> Hi
>
> I've inherited a Solr config and am doing some
Hi
I've inherited a Solr config and am doing some sanity checks before making
some updates; I'm concerned about the memory settings.
The system has 1 index in 2 shards split across 2 Ubuntu 64-bit nodes. Each node
has 32 CPU cores and 132GB RAM; we index around 500k files a day spread out
over the da
I am exploring this
approach at the moment.
Simon.
On Sat, Jun 21, 2014 at 7:37 AM, T. Kuro Kurosaka
wrote:
> On 06/20/2014 04:04 AM, Allison, Timothy B. wrote:
>
>> Let's say a predominantly English document contains a Chinese sentence.
>> If the English field uses the Whit
Hi there,
I have posted 190,000 simple XML using POST.JAR and there are only 8 files
that were with errors. But how do I know which are the ones have errors?
Thank you in advance,
Simon Cheng.
problems (and DBI takes care of writing to a database).
I'm probably going to rewrite in Python since the final destination of many
of our extracts is Tableau, which has a Python API for creating TDEs
(Tableau data extracts)
regards
-Simon
On Fri, May 2, 2014 at 7:43 AM, Siegfried Goeschl
MergingIndex is not the case here as I am not doing that. Even though the issue
is gone for now, it is not a relief for me as I am not sure how to explain this
to others (peer, boss and user). I am thinking of implementing a watchdog to
check whenever the total Solr documents exceeds the number of items i
Erick,
It's indeed quite odd. And after I triggered re-indexing of all documents (via
the normal process of the existing program), the duplication was gone. It can
not be reproduced easily. But it did occur occasionally and that makes it a
frustrating task to troubleshoot.
Thanks,
Simon
--
derstanding solr uniqueKey is like a
database primary key. I am wondering how could I end up with two documents
with same uniqueKey in the index.
Thanks,
Simon
adding that worked - thanks.
On Thu, Apr 3, 2014 at 4:18 AM, Dmitry Kan wrote:
> Hi Joshua, Simon,
>
> do you pass the -XX:MaxPermSize=512m to your jvm?
>
> java -XX:MaxPermSize=512m -jar luke-with-deps.jar
>
> My java runtime environment is of the same version as Simon
Also seeing this on Mac OS X.
java version = Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
On Wed, Apr 2, 2014 at 11:01 AM, Joshua P wrote:
> Hi there!
>
> I'm receiving the following errors when trying to run luke-with-deps.jar
>
> SLF4J: Failed to load class "org.slf4j.impl.StaticLogg
February 2014, Apache Solr™ 4.7 available
The Lucene PMC is pleased to announce the release of Apache Solr 4.7
Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted searc
Tika code as I am not using
it).
You should replace StreamingUpdateSolrServer by ConcurrentUpdateSolrServer
and experiment to find the optimal number of threads to configure.
-Simon
On Sun, Jan 26, 2014 at 11:28 AM, Erick Erickson wrote:
> 1> That's what I'd do. For incremen
tware Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using
may not have replicated the release yet. If that is the case, please
try another mirror. This also goes for Maven access.
Happy Searching
Simon
e to share your solutions?
Thanks,
Simon
Hi Solr team,
I am working on a project that needs the Solr 'block join' feature that is
currently available in the 4.6 nightly build. My boss feels more comfortable
with an official release like 4.4. I am wondering if there is any target
release date for Solr 4.6(+) or 5.0?
Thanks,
Simon
May 2013, Apache Solr™ 4.3 available
The Lucene PMC is pleased to announce the release of Apache Solr 4.3.
Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, d
take a look at SOLR-2703, which was committed for 4.0. It provides a Solr
wrapper for the surround query parser, which supports span queries.
On Fri, Nov 23, 2012 at 3:38 PM, Anirudha Jadhav wrote:
> What is the best way to use span queries in solr ?
>
> I see https://issues.apache.org/jira/brow
to it...
-Simon
On Fri, Oct 12, 2012 at 12:27 PM, Phil Hoy wrote:
> Hi,
>
> We have a multi-core set up with a fairly large synonym file, all cores
> share the same schema.xml and synonym file but when solr loads the cores,
> it loads multiple instances of the synonym map, this is a
Some time back I used DreamHost for a Solr-based project. Looks as though
all their offerings, including shared hosting, have Java support - see
http://wiki.dreamhost.com/What_We_Support. I was very happy with their
service and support.
-Simon
On Tue, Oct 9, 2012 at 10:44 AM, Michael Della Bitta
ike to follow it
AFAIK Robert already created an issue here:
https://issues.apache.org/jira/browse/LUCENE-4279
and it seems fixed. Given the massive commit last night it's already
committed and backported, so it will be in 4.0-BETA.
simon
>
> Thanks again
> Saroj
>
>
>
>
>
Cache usage. Are you
sorting / faceting on anything?
simon
On Tue, Jul 10, 2012 at 4:49 PM, Vadim Kisselmann
wrote:
> Hi Robert,
>
>> Can you run Lucene's checkIndex tool on your index?
>
> No, unfortunately not. This Solr should run without stoppage, an
> tomcat-restart i
On Thu, Jan 26, 2012 at 12:05 AM, Frank DeRose wrote:
> Hi Simon,
>
> No, not different entity types, but actually different document types (I
> think). What would be ideal is if we could have multiple elements
> in the data-config.xml file and some way of mapping each different
ittee Chairs:
* Isabel Drost (Nokia & Apache Mahout)
* Jan Lehnardt (CouchBase & Apache CouchDB)
* Simon Willnauer (SearchWorkings & Apache Lucene)
* Grant Ingersoll (Lucid Imagination & Apache Lucene)
* Owen O’Malley (Yahoo Inc. & Apache Hadoop)
* Jim Webber (Neo Tec
sue here is that with *:* you don't have anything to
score, while with q=tag:car you can score the term car with tf-idf etc.
Does that make sense?
simon
>
> I have a JSP file that will take in parameters, do some work on them
> to make them appropriate for Solr, then pass the query
Folks,
I just committed LUCENE-3628 [1] which cuts over Norms to DocValues.
This is an index file format change and if you are using trunk you
need to reindex before updating.
happy indexing :)
simon
[1] https://issues.apache.org/jira/browse/LUCENE-3628
istener doesn't save any state or the version of
the index since it was last called, and assumes the index was just
optimized.
simon
>
> Thanks
> Oliver
>
ks if there is a single segment in the index and
rebuilds the index.
if this is the case, I think this is a bug... can you open a jira ticket?
simon
On Mon, Jan 2, 2012 at 8:36 PM, OliverS wrote:
> Hi
>
> Looks like they strip the -Text for the list. Whole message here:
> http:/
try *:* instead of *.*
simon
On Tue, Dec 13, 2011 at 5:03 PM, Kissue Kissue wrote:
> Hi,
>
> I have come across this query in the admin interface: *.*
> Is this meant to match all documents in my index?
>
> Currently when i run query with q= *.*, numFound is 130310 but the a
oops, didn't see all of the thread before I hit send. Good work, Erik
On Fri, Dec 2, 2011 at 5:21 PM, simon wrote:
> Take a look at https://issues.apache.org/jira/browse/SOLR-2703, which
> integrates the surround parser into Solr trunk. There's a dependency on a
> Lucene pat
ersions of Lucene
I'm not sure how easily this all would backport to Solr 3.1, but you
could try....
best
-Simon
On Tue, Nov 22, 2011 at 1:05 AM, Rahul Mehta wrote:
> Hello,
>
> I want to Run surround query .
>
>
> 1. Downloading from
>
can you give us some details about what filesystem you are using?
simon
On Wed, Nov 30, 2011 at 3:07 PM, Ruben Chadien wrote:
> Happened again….
>
> I got 3 directories in my index dir
>
> 4096 Nov 4 09:31 index.2004083156
> 4096 Nov 21 10:04 index.2021090440
> 40
I wonder if you have an explicitly configured merge policy? In Solr 1.4,
i.e. Lucene 2.9, LogMergePolicy was the default but in 3.5
TieredMergePolicy is used by default. This could explain the
differences segment-wise, since from what I understand you are indexing
the same data on 1.4 and 3.5?
simon
27 November 2011, Apache Solr™ 3.5.0 available
The Lucene PMC is pleased to announce the release of Apache Solr 3.5.0.
Solr is the popular, blazing fast open source enterprise search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, fa
ing on an older JVM you
could be affected by this bug and should either upgrade to a new JVM
or use -XX:+UseMembar to start your JVM.
In general it's a good idea to keep an eye on
http://wiki.apache.org/lucene-java/SunJavaBugs - we try to keep this
up-to-date
thanks,
Simon
On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer
wrote:
> Hey Roman,
>
> On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov
> wrote:
>> Hi everyone,
>>
>> I'm looking for some help with Solr indexing issues on a large scale.
>>
>> We are indexing fe
s via
IndexWriterConfig#setMaxBufferedDeleteTerms which are also applied
without blocking other threads. In trunk we hijack indexing threads to
do all that work concurrently so you get better cpu utilization and
due to concurrent flushing better and usually continuous IO
utilization.
hope that
On Fri, Oct 28, 2011 at 12:20 AM, Robert Muir wrote:
> On Thu, Oct 27, 2011 at 6:00 PM, Simon Willnauer
> wrote:
>> we are not actively removing norms. if you set omitNorms=true and
>> index documents they won't have norms for this field. Yet, other
>> segment st
ue it will be true for other segments eventually.
If you optimize your index you should see that norms go away.
simon
On Thu, Oct 27, 2011 at 11:17 PM, Marc Sturlese wrote:
> As far as I know there's no issue about this. You have to reindex and that's
> it.
> In which kind of fi
simon
On Thu, Oct 27, 2011 at 4:54 PM, Gustavo Falco
wrote:
> Hi guys,
>
> I'm new to Solr (as you may guess for the subject). I'd like to force the
> threshold for fuzzy queries to, say, 0.7. I've read that fuzzy queries are
> expensive, but limiting its thre
A commit is basically flushing the segment you have in memory
(IndexWriter memory) to disk. compression ratio can be up to 30% of
the ram cost or even more depending on your data. The actual commit
doesn't need a notable amount of memory.
hope this helps
simon
On Mon, Oct 24, 2011 at 7:38 PM,
large set of document formats
(http://tika.apache.org/0.10/formats.html). Hope this helps here?!
>
> 2. How much is estimated cost of incidents per year for Solr ?
I have to admit I don't know what you are asking for. Can you
elaborate on this a bit? What is an incident in this context?
s
he query parser? Maybe even in
Lucene. Can you bring this to the dev list?
simon
>
> Regards Bernd
>
On Fri, Oct 21, 2011 at 4:37 PM, Michael McCandless
wrote:
> Well... the limitation of DocValues is that it cannot handle more than
> one value per document (which UnInvertedField can).
you can pack this into one byte[] or use more than one field? I don't
see a real limitation h