Extended characters

2017-10-29 Thread Robert Brown
Hi, I have a text field in my index containing extended characters, which I'd like to match against when searching without the extended characters. e.g. field contains "Ensō" which I want to match when searching for just "enso". My current config for that field (type) is given below:
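The folding asked about here can be illustrated outside Solr. This is a minimal Python sketch of the idea only (decompose each character, drop the combining marks, lowercase); it is not Solr's implementation, and the function name `fold_to_ascii` is made up for the example. In Solr itself, this kind of folding is normally done by an analysis filter such as ASCIIFoldingFilterFactory applied at both index and query time.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Lowercase the text and strip combining marks such as the macron in 'ō'."""
    # NFKD splits an accented character into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


indexed_term = fold_to_ascii("Ensō")   # what would be stored in the index
query_term = fold_to_ascii("enso")     # what the user typed
```

When the same folding runs on both sides, the indexed term and the query term compare equal, which is the behaviour the question is after.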

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
of them. And check whether your swap device is working. With a working swap disk, maybe your system would just slow down instead of crashing. No, sorry, your swap _is_ working, and java is mostly swapped out. It must be slow. Cheers -- Rick On May 26, 2017 1:25:55 PM EDT, Robert Brown wrote

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
Thanks Shawn, It's more inquisitiveness now more than anything. http://web.lavoco.com/top.png (forgot to mention mariadb on there too :) On 26/05/17 16:20, Shawn Heisey wrote: On 5/26/2017 11:01 AM, Robert Brown wrote: Let's assume I can't get more RAM - why would an i

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
s and filtering, about 20 fields in total. Currently just 500 docs. On 26/05/17 15:43, Erick Erickson wrote: Or get more physical memory? Solr _likes_ memory, you won't be able to do much with only 2G physical memory.. On Fri, May 26, 2017 at 2:00 AM, Robert Brown wrote: Thanks Rick,

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
ually, every one). cheers -- Rick On 2017-05-25 06:55 PM, Robert Brown wrote: Hi, I'm currently running 6.5.1 with a tiny index, less than 1MB. When I restart another app on the same server as Solr, Solr occasionally dies, but no solr_oom_killer.log file. Heap size is 256MB (~30MB used)

(Tiny Index) Solr dies but not OOM

2017-05-25 Thread Robert Brown
Hi, I'm currently running 6.5.1 with a tiny index, less than 1MB. When I restart another app on the same server as Solr, Solr occasionally dies, but no solr_oom_killer.log file. Heap size is 256MB (~30MB used), Physical RAM 2GB, typically using 1.5GB. How else can I debug what's causing it?

Grouping performance with MLT

2016-07-05 Thread Robert Brown
Hi All, I have an index with 10m documents. When performing an MLT query and grouping by a field, response times are roughly 20s. The group field is currently populated with unique values, as we now start to manually group documents (hence using MLT). The group field has docValues turned o

Re: Alternate Port Not Working for Solr 6.0.0

2016-06-02 Thread Robert Brown
In addition to a separate proxy you could use iptables, I use this technique for another app (running on port 5000 but requests come in port 80)... *nat :PREROUTING ACCEPT [0:0] :POSTROUTING ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 5000

Re: [E] Re: Faceting Question(s)

2016-06-02 Thread Robert Brown
MaryJo, I think you've misunderstood. The counts are different simply because the 2nd query contains a filter of a facet value from the 1st query - that's completely expected. The issue is how to get the original facet counts (with no filters but same q) in the same call as also filtering b

MongoDB and Solr - Massive re-indexing

2016-06-02 Thread Robert Brown
Hi, Currently we import data-sets from various sources (csv, xml, json, etc.) and POST to Solr, after some pre-processing to get it into a consistent format, and some other transformations. We currently dump out to a json file in batches of 1,000 documents and POST that file to Solr. Rough
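The batching step described above could look roughly like this. It is a sketch under assumptions: the helper names `batched` and `batch_payloads` are invented for the example, and each payload is a JSON array of documents, which is one of the shapes Solr's JSON update handler accepts. The actual POST is left out.

```python
import json
from itertools import islice
from typing import Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` documents."""
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def batch_payloads(docs: Iterable[dict], size: int = 1000) -> Iterator[str]:
    """Serialise each batch as a JSON array, ready to be POSTed as one update."""
    for batch in batched(docs, size):
        yield json.dumps(batch)


# Hypothetical pre-processed documents; real ones would carry the full field set.
sample_docs = [{"id": str(i)} for i in range(2500)]
payloads = list(batch_payloads(sample_docs))
```

Working from an iterator means the whole data-set never has to sit in memory at once, which matters more for a massive re-index than for the 2,500 sample documents used here.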

Re: Idle timeout expired: 50000/50000 ms

2016-04-29 Thread Robert Brown
, Shawn Heisey wrote: On 4/28/2016 3:13 PM, Robert Brown wrote: I operate several collections (about 7-8) all using the same 5-node ZooKeeper cluster. They've been in production for 3 months, with only 2 previous issues where a Solr node went down. Tonight, during several updates to the vario

Idle timeout expired: 50000/50000 ms

2016-04-28 Thread Robert Brown
Hi, I operate several collections (about 7-8) all using the same 5-node ZooKeeper cluster. They've been in production for 3 months, with only 2 previous issues where a Solr node went down. Tonight, during several updates to the various collections, a handful failed due to the below error.

HTTP Client Only

2016-04-14 Thread Robert Brown
Hi, I have a collection with 2 shards, 1 replica each. When I send updates, I currently /admin/ping each of the nodes, and then pick one at random. I'm guessing it makes more sense to only send updates to one of the leaders, so I'm contemplating getting the collection status instead, and fi
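Picking leaders out of a collection status response might be sketched as below. The sample dictionary is hypothetical and only modelled on the general shape of a CLUSTERSTATUS response (shards, then replicas, each with `base_url`, `core`, `state` and a `leader` flag); check it against the output of your own cluster before relying on any key name.

```python
def leader_urls(cluster_status: dict, collection: str) -> dict:
    """Map each shard name to the core URL of its active leader replica."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    leaders = {}
    for shard_name, shard in shards.items():
        for replica in shard["replicas"].values():
            # Assumption: the leader flag is the string "true", as in the sample below.
            if replica.get("leader") == "true" and replica.get("state") == "active":
                leaders[shard_name] = replica["base_url"] + "/" + replica["core"]
    return leaders


# Hypothetical response fragment; hosts and core names are made up.
sample_status = {
    "cluster": {"collections": {"products": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"base_url": "http://host1:8983/solr",
                           "core": "products_shard1_replica1",
                           "state": "active", "leader": "true"},
            "core_node2": {"base_url": "http://host2:8983/solr",
                           "core": "products_shard1_replica2",
                           "state": "active"},
        }},
        "shard2": {"replicas": {
            "core_node3": {"base_url": "http://host3:8983/solr",
                           "core": "products_shard2_replica1",
                           "state": "down", "leader": "true"},
        }},
    }}}}
}

leaders = leader_urls(sample_status, "products")
```

A shard whose leader is not active is simply absent from the result, so the caller can decide whether to retry, wait, or fall back to another node.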

Commiting with no updates

2016-04-13 Thread Robert Brown
Hi, My autoSoftCommit is set to 1 minute. Does this actually affect things if no documents have actually been updated/created? Will this also affect the clearing of any caches? Is this also the same for hard commits, either with autoCommit or making an explicit http request to commit. Th

Bad Request

2016-04-12 Thread Robert Brown
Hi, My collection had issues earlier, 1 shard showed as Down, the other only replica was Gone. Both were actually still up and running, no disk or CPU issues. This occurred during updates. The server since recovered after a reboot. Upon trying to update the index again, I'm now getting cons

Re: Range filters: inclusive?

2016-04-11 Thread Robert Brown
It's a string field, ean... http://paste.scsys.co.uk/510132 On 04/11/2016 06:00 PM, Yonik Seeley wrote: On Mon, Apr 11, 2016 at 12:52 PM, Robert Brown wrote: Hi, When I perform a range query of ['' TO *] to filter out docs where a particular field has a value, this does wha

Range filters: inclusive?

2016-04-11 Thread Robert Brown
Hi, When I perform a range query of ['' TO *] to filter out docs where a particular field has a value, this does what I want, but I thought using the square brackets was inclusive, so empty-string values should actually be included? The JSON I post to Solr has empty values, not null/undefine

Re: Delete by query, including negative filters

2016-04-09 Thread Robert Brown
excellent writeup on this subtlety: https://lucidworks.com/blog/2011/12/28/why-not-and-or-and-not/ Best, Erick On Sat, Apr 9, 2016 at 3:51 AM, Robert Brown wrote: Hi, I have this delete query: "*partner:pg AND market:us AND last_seen:[* TO 2016-04-09T02:01:06Z]*" And would like to

Delete by query, including negative filters

2016-04-09 Thread Robert Brown
Hi, I have this delete query: "*partner:pg AND market:us AND last_seen:[* TO 2016-04-09T02:01:06Z]*" And would like to add "AND merchant_id != 12345 AND merchant_id != 98765" Would this be done by including "*AND -merchant_id:12345 AND -merchant_id:98765*" ? Thanks, Rob
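The query proposed in this message can be assembled programmatically. This sketch only builds the string and a JSON delete-by-query body; the helper name `build_delete_query` is invented for the example, and nothing is sent to Solr. Whether the negative clauses behave as intended is the subtlety the linked write-up in the reply discusses, so test against a copy of the index first.

```python
import json


def build_delete_query(required: dict, last_seen_before: str,
                       excluded_merchants: list) -> str:
    """Join positive clauses, a date range and negated merchant ids with AND."""
    clauses = [f"{field}:{value}" for field, value in required.items()]
    clauses.append(f"last_seen:[* TO {last_seen_before}]")
    # A leading "-" negates a clause in the standard query syntax.
    clauses += [f"-merchant_id:{merchant}" for merchant in excluded_merchants]
    return " AND ".join(clauses)


query = build_delete_query(
    {"partner": "pg", "market": "us"},
    "2016-04-09T02:01:06Z",
    [12345, 98765],
)
# Body for a JSON update request that deletes by query.
payload = json.dumps({"delete": {"query": query}})
```

Running the same string as an ordinary `q` with `rows=0` before deleting is a cheap way to check how many documents it would match.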

Re: Update Speed: QTime 1,000 - 5,000

2016-04-05 Thread Robert Brown
ents will make a difference too - so the comparison to 300 - 500 on other cloud setups may or may not be comparing apples to oranges... Are the "new" documents actually new or are you overwriting existing solr doc ID's? If you are overwriting, you may want to optimize and see if that

Update Speed: QTime 1,000 - 5,000

2016-04-05 Thread Robert Brown
Hi, I'm currently posting updates via cURL, in batches of 1,000 docs in JSON files. My setup consists of 2 shards, 1 replica each, 50m docs in total. These updates are hitting a node at random, from a server across the Internet. Apart from the obvious delay, I'm also seeing QTime's of 1,00

Re: Parallel Updates

2016-04-04 Thread Robert Brown
was really important you could have two or more "indexing" servers and fire multiple threads at each one... You probably already know this, but the key is how often you "commit" and force the indexing to occur... On Mon, Apr 4, 2016 at 3:33 PM, Robert Brown wrote: Hi, Does So

Parallel Updates

2016-04-04 Thread Robert Brown
Hi, Does Solr have any sort of limit when attempting multiple updates, from separate clients? Are there any safe thresholds one should try to stay within? I have an index of around 60m documents that gets updated at key points during the day from ~200 downloaded files - I'd like to fork off

Re: Facet by truncated date

2016-03-31 Thread Robert Brown
et use facet.range.gap to set how dates are "truncated". Regards, Emir On 31.03.2016 10:52, Robert Brown wrote: > Hi, > > Is it possible to facet by a date (solr.TrieDateField) but truncated > to the day, or even the hour? > > If not, are there any other options apart from

Facet by truncated date

2016-03-31 Thread Robert Brown
Hi, Is it possible to facet by a date (solr.TrieDateField) but truncated to the day, or even the hour? If not, are there any other options apart from storing that truncated data in another (string?) field? Thanks, Rob
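One way to get per-day or per-hour buckets without a second field is range faceting, where the gap sets the bucket size. This sketch only builds the request parameters; the field name `created_at` and the 30-day window are assumptions for illustration, and the helper name is made up.

```python
from urllib.parse import urlencode


def date_facet_params(field: str, start: str, end: str, gap: str) -> dict:
    """Parameters for a range facet over a date field, one bucket per gap."""
    return {
        "q": "*:*",
        "rows": 0,                  # only the facet counts are wanted
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": gap,     # "+1DAY" for daily buckets, "+1HOUR" for hourly
    }


# Hypothetical field name and window.
params = date_facet_params("created_at", "NOW/DAY-30DAYS", "NOW/DAY+1DAY", "+1DAY")
query_string = urlencode(params)
```

The plus sign in the gap has to be URL-encoded when sent over HTTP, which `urlencode` takes care of; a raw `+` would otherwise be read as a space.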

Re: Index not fitting in memory (file-cache)

2016-03-24 Thread Robert Brown
olve those random slowdowns - or at least rule it out. On 03/24/2016 01:44 PM, Shawn Heisey wrote: On 3/24/2016 4:02 AM, Robert Brown wrote: If my index data directory size is 70G, and I don't have 70G (plus heap, etc) in the system, this will occasionally affect search speed right? When Sol

Index not fitting in memory (file-cache)

2016-03-24 Thread Robert Brown
Hi, If my index data directory size is 70G, and I don't have 70G (plus heap, etc) in the system, this will occasionally affect search speed right? When Solr has to resort to reading from disk? Before I go out and throw more RAM into the system, in the above example, what would you recommend

Re: Creating new cluster with existing config in zookeeper

2016-03-23 Thread Robert Brown
which is probably sounding confused. Cheers, Rob On 03/23/2016 04:03 PM, Tom Evans wrote: On Wed, Mar 23, 2016 at 3:43 PM, Robert Brown wrote: So I setup a new solr server to point to my existing ZK configs. When going to the admin UI on this new server I can see the shards/replica

Re: Creating new cluster with existing config in zookeeper

2016-03-23 Thread Robert Brown
ote: The whole _point_ of configsets is to re-use them in multiple collections, so please do! Best, Erick On Tue, Mar 22, 2016 at 5:38 AM, Robert Brown wrote: Hi, Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster

Re: Delete by query using JSON?

2016-03-22 Thread Robert Brown
"why do you care? just do this ..." I see this a lot on mailing lists these days, it's usually a learning curve/task/question. I know I fall into these types of questions/tasks regularly. Which usually leads to "don't tell me my approach is wrong, just explain what's going on, and why", or

Re: Creating new cluster with existing config in zookeeper

2016-03-22 Thread Robert Brown
sn't it? (I added it to ZK as per the docs), just a bit confusing to see some files/directories from ZK, and some not. Thanks for any more insight. On 03/22/2016 04:57 PM, Shawn Heisey wrote: On 3/22/2016 6:38 AM, Robert Brown wrote: Is it safe to create a new cluster but use an exist

Creating new cluster with existing config in zookeeper

2016-03-22 Thread Robert Brown
Hi, Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too? I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting. Thanks, Rob

Re: Boosts for relevancy (shopping products)

2016-03-20 Thread Robert Brown
e used to boost individual products for specific keywords - I'm beginning to think this is actually our best hope? e.g. A multi-valued field containing keywords that resulted in a click on that product. On 03/18/2016 04:14 PM, Robert Brown wrote: That does sound rather useful! We c

Re: Boosts for relevancy (shopping products)

2016-03-19 Thread Robert Brown
rity boost can also be useful: you can measure it by sales or by number of clicks. I use a combination of both, and store those values using partial updates. Hope it helps, John On 17/03/16 09:36, Robert Brown wrote: Hi, I currently have an index of ~50m docs representing shopping products: name, de

Re: Shard splitting for immediate performance boost?

2016-03-19 Thread Robert Brown
about how much memory you have on your machine, how much RAM you're allocating to Solr and the like so it's hard to say much other than generalities Best, Erick On Sat, Mar 19, 2016 at 10:41 AM, Shawn Heisey wrote: On 3/19/2016 11:12 AM, Robert Brown wrote: I have an index o

Re: Boosts for relevancy (shopping products)

2016-03-19 Thread Robert Brown
l. On Mar 18, 2016 10:40 AM, "Robert Brown" wrote: Thanks for the added input. I'll certainly look into the machine learning aspect, will be good to put some basic knowledge I have into practice. I'd been led to believe the tie parameter didn't actually do a lot. :-/

Shard splitting for immediate performance boost?

2016-03-19 Thread Robert Brown
Hi, I have an index of 60m docs split across 2 shards (each with a replica). When load testing queries (picking random keywords I know exist), and randomly requesting facets too, 95% of my responses are under 0.5s. However, during some random manual tests, sometimes I see searches taking bet

Re: Boosts for relevancy (shopping products)

2016-03-19 Thread Robert Brown
osted. Even if you haven't seen them at all. Cheers On Fri, Mar 18, 2016 at 4:21 PM, Robert Brown wrote: It's also worth mentioning that our platform contains shopping products in every single category, and will be searched by absolutely anyone, via an API made available to various websi

Re: Boosts for relevancy (shopping products)

2016-03-19 Thread Robert Brown
als, and you need to carefully tune the features of your interest. But the results could be surprising . [1] https://issues.apache.org/jira/browse/SOLR-8542 [2] Learning to Rank in Solr <https://www.youtube.com/watch?v=M7BKwJoh96s> Cheers On Thu, Mar 17, 2016 at 10:15 AM, Robert Brown wrot

Boosts for relevancy (shopping products)

2016-03-19 Thread Robert Brown
Hi, I currently have an index of ~50m docs representing shopping products: name, description, brand, category, etc. Our "qf" is currently setup as: name^5 brand^2 category^3 merchant^2 description^1 mm: 100% ps: 5 I'm getting complaints from the business concerning relevancy, and was hopin

Relevancy for "tablet"

2016-03-09 Thread Robert Brown
Hi, I'm looking for some advice and possible options for dealing with our relevancy when searching through shopping products. A search for "tablet" returns pills, when the user would expect electronic devices. Without any extra criteria (like category), how would/could you manage this situ

Different scores depending on cloud node

2016-03-08 Thread Robert Brown
Hi, I have 2 shards, each with 1 replica. When sending the same request to the cluster, I'm seeing the same results, but ordered differently, and with different scores. Does this highlight an issue with my index, or is this an accepted anomaly? Example of 8 results: 1st call: 160.2047 160.

Re: Disk Usage anomoly across shards/replicas

2016-03-06 Thread Robert Brown
ithin the shard directory there should be multiple directories - "tlog" "index." . Do you see multiple "index.*" directories in there for the shard which has more data on disk? On Sat, Mar 5, 2016 at 6:39 PM, Robert Brown wrote: Hi, I have an index with 65m docs

Re: Disk Usage anomoly across shards/replicas

2016-03-05 Thread Robert Brown
Thanks Shawn, I'm just about to remove that node and rebuild it, at least there won't be any actual downtime. On 05/03/16 14:44, Shawn Heisey wrote: On 3/5/2016 6:09 AM, Robert Brown wrote: I have an index with 65m docs spread across 2 shards, each with 1 replica. The replica1

Re: Disk Usage anomoly across shards/replicas

2016-03-05 Thread Robert Brown
Nope, we never run optimise. Would there be some tell-tale files in the index dir to indicate if someone else had run an optimise? On 05/03/16 13:11, Binoy Dalal wrote: Have you executed an optimize across that particular shard? On Sat, 5 Mar 2016, 18:39 Robert Brown, wrote: Hi, I

Disk Usage anomoly across shards/replicas

2016-03-05 Thread Robert Brown
Hi, I have an index with 65m docs spread across 2 shards, each with 1 replica. The replica1 of shard2 is using up nearly double the amount of disk space as the other shards/replicas. Could there be a reason/fix for this? /home/s123/solr/data/de_shard1_replica1 = 72G numDocs:34,786,026 maxD

SolrCloud, Best performance directly from C

2016-02-22 Thread Robert Brown
Hi, As a pure C user, without wishing to use Java, what's my best approach for managing the SolrCloud environment? I operate a FastCGI environment, so I have the persistence to cache the state of the "cloud". So far I see good utilisation of the collections API being my best bet? Any other

Re: MLT Component only returns ID and score

2016-01-31 Thread Robert Brown
erhaps it will work better for you. Upayavira On Sun, Jan 31, 2016, at 06:31 PM, Robert Brown wrote: Hi, I've had to switch to using the MLT component, rather than the handler, since I'm running on Solrcloud (5.4) and if I hit a node without the starting document, I get nothing back.

MLT Component only returns ID and score

2016-01-31 Thread Robert Brown
Hi, I've had to switch to using the MLT component, rather than the handler, since I'm running on Solrcloud (5.4) and if I hit a node without the starting document, I get nothing back. When I perform a MLT query, I only get back the ID and score for the similar documents, yet my fl=*,score.

Query cache with grouping

2016-01-28 Thread Robert Brown
Hi, During some testing, I've found that the queryResultCache is not used when I use grouping. Is there another cache that is being used in this scenario, if so, which, and how can I ensure they're providing a real benefit? Thanks, Rob

Leader Election Time

2016-01-15 Thread Robert Brown
Hi, I have 2 shards, 1 leader and 1 replica in each. I've just removed a leader from one of the shards but the replica hasn't become a leader yet. How quickly should this normally happen? tickTime=2000 dataDir=/home/rob/zoodata clientPort=2181 initLimit=5 syncLimit=2 Thanks, Rob

Re: Querying only replica's

2016-01-11 Thread Robert Brown
hat I was after. On 01/11/2016 05:16 PM, Alessandro Benedetti wrote: mmm i think there is a misconception here : On 10 January 2016 at 19:00, Robert Brown wrote: I'm thinking more about how the external load-balancer will know if a node is down, as to take it out the pool of active ser

Re: Querying only replica's

2016-01-10 Thread Robert Brown
care to Or just let Zookeeper do that for you. One of the tasks of Zookeeper is pinging all the machines with all the replicas and, if any of them are unreachable, telling the rest of the cluster that that machine is down. Best, Erick On Sun, Jan 10, 2016 at 5:19 AM, Robert Brown wrote: T

Re: Querying only replica's

2016-01-10 Thread Robert Brown
to start the 6.0 release process, so it's up in the air. On Sat, Jan 9, 2016 at 12:04 PM, Robert Brown wrote: Hi, (btw, when is 5.5 due? I see the docs reference it, but not the download page) Anyway, I index and query Solr over HTTP (no SolrJ, etc.) - is it best/good to get the CL

Querying only replica's

2016-01-09 Thread Robert Brown
Hi, (btw, when is 5.5 due? I see the docs reference it, but not the download page) Anyway, I index and query Solr over HTTP (no SolrJ, etc.) - is it best/good to get the CLUSTERSTATUS via the collection API and explicitly send queries to a replica to ensure I don't send queries to the leade

Re: SolrCloud: Setting/finding node names for deleting replicas

2016-01-08 Thread Robert Brown
k0.example.com:2181/myapp -c collection1 --node node1.example.com --slice shard2 I mention this tool every now and then on this list because I like it, but I’m the author, so take that with a pretty big grain of salt. Feedback is very welcome. On 1/8/16, 1:18 PM, "Robert Brown"

SolrCloud: Setting/finding node names for deleting replicas

2016-01-08 Thread Robert Brown
Hi, I'm having trouble identifying a replica to delete... I've created a 3-shard cluster, all 3 created on a single host, then added a replica for shard2 onto another host, no problem so far. Now I want to delete the original shard, but got this error when trying a *replica* param value I th

Re: Which Tokeniser (and/or filter)

2012-02-08 Thread Robert Brown
stion for your product manager > > Best > Erick > > On Wed, Feb 8, 2012 at 9:23 AM, Robert Brown wrote: >> Thanks Erick, >> >> I didn't get confused with multiple tokens vs multiValued  :) >> >> Before I go ahead and re-index 4m docs, and belie

Re: Which Tokeniser (and/or filter)

2012-02-08 Thread Robert Brown
Thanks Erick, I didn't get confused with multiple tokens vs multiValued :) Before I go ahead and re-index 4m docs, and believe me I'm using the analysis page like a mad-man! What do I need to configure to have the following both indexed with and without the dots... .net sales manager. £12.50

Re: Which Tokeniser (and/or filter)

2012-02-07 Thread Robert Brown
This all seems a bit too much work for such a real-world scenario? --- IntelCompute Web Design & Local Online Marketing http://www.intelcompute.com On Tue, 7 Feb 2012 05:11:01 -0800 (PST), Ahmet Arslan wrote: >> I'm still finding matches across >> newlines >> >> index... >> >> i am fluent >>

Re: Which Tokeniser (and/or filter)

2012-02-07 Thread Robert Brown
I'm still finding matches across newlines index... i am fluent german racing search... "fluent german" Any suggestions? I've currently got this in wdftypes.txt for WordDelimiterfilterfactory \u000A => ALPHANUM \u000B => ALPHANUM \u000C => ALPHANUM \u000D => ALPHANUM # \u000D\u000A => ALPHA

Re: Which Tokeniser (and/or filter)

2012-02-06 Thread Robert Brown
mapping dots to spaces. I don't think that's workable anyway since ".net" would cause issues. Trying out the wdftypes now... --- IntelCompute Web Design & Local Online Marketing http://www.intelcompute.com On Mon, 6 Feb 2012 04:10:18 -0800 (PST), Ahmet Arslan wrote: >> My fear is what will t

Re: Which Tokeniser (and/or filter)

2012-02-06 Thread Robert Brown
My fear is what will then happen with highlighting if I use re-mapping? On Mon, 6 Feb 2012 03:33:03 -0800 (PST), Ahmet Arslan wrote: >> I need to tokenise on whitespace, full-stop, and comma >> ONLY. >> >> Currently using solr.WhitespaceTokenizerFactory with >> WordDelimiterFilterFactory but th

Symbols in synonyms

2012-02-06 Thread Robert Brown
is it good practice, common, or even possible to put symbols in my list of synonyms? I'm having trouble indexing and searching for "A&E", with it being split on the &. we already convert .net to dotnet, but don't want to store every combination of 2 letters, A&E, M&E, etc. -- IntelComp

Which Tokeniser (and/or filter)

2012-02-06 Thread Robert Brown
Hi, I need to tokenise on whitespace, full-stop, and comma ONLY. Currently using solr.WhitespaceTokenizerFactory with WordDelimiterFilterFactory but this is also splitting on &, /, new-line, etc. It seems such a simple setup, what am I doing wrong? what do you use for such "normal searchin

"sage 200" not matching "... sage 200."

2012-01-30 Thread Robert Brown
The trailing full-stop above is not being matched when searching for "sage 200" for the below field type... Do I need the WordDelimiterFilterFactory for this to work as expected? I don't see any mention of periods being discussed in the docs. positionIncrementGap="100">

edismax phrase matching with a non-word char inbetween

2011-12-13 Thread Robert Brown
I have a field which is indexed and queried as follows: ignoreCase="true" expand="true"/> words="stopwords.txt" enablePositionIncrements="true" /> generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/> protected="protwords.txt"/> When search

Highlighting to include stop words

2011-12-08 Thread Robert Brown
I have a text field, using stopwords... Index and query analysers setup as follows: SynonymFilterFactory StopFilterFactory WordDelimiterFilterFactory LowerCaseFilterFactory SnowballPorterFilterFactory Searching for "front of house" brings back perfect matches, but doesn't highlight the "of".

highlight 1 field twice

2011-12-06 Thread Robert Brown
When searching against 1 field, is it possible to have highlighting returned 2 different ways? We'd like the full field returned with keywords highlighted, but then also returned as snippets. Any possible approaches? -- IntelCompute Web Design & Local Online Marketing http://www.intelcomp

lower score for synonyms

2011-12-06 Thread Robert Brown
is it possible to lower the score for synonym matches? we setup... admin => administration but if someone searches specifically for "admin", we want those specific matches to rank higher than matches for "administration" -- IntelCompute Web Design & Local Online Marketing http://www.inte

Re: overriding qf in q affecting boosts

2011-12-05 Thread Robert Brown
se, the boost and fields in the "qf" parameter won't be > considered for the search. With this query Solr will search for documents > with the terms "this" and/or (depending on your default operator) "that" in > the field1 and the term "other"

Re: overriding qf in q affecting boosts

2011-12-05 Thread Robert Brown
l not be considered and if > you use LuceneQP the qf are not considered and it is going to use the > default search field for the term "other" and no boost. > > You can see this very easily turning on the "debugQuery". > > Regards, > > Tomás > >

overriding qf in q affecting boosts

2011-12-05 Thread Robert Brown
If I have a set list in solrconfig for my "qf" along with their boosts, and I then specify field names directly in q (where I could also override the boosts), are the boosts left in place, or reset to 1? this^3 that^2 other^9 ie q=field1:+(this that) +(other) -- IntelCompute We

switching on hl.requireFieldMatch reducing highlighted fields returned

2011-12-01 Thread Robert Brown
I have a query which is highlighting 3 snippets in 1 field, and 1 snippet in another field. By enabling hl.requireFieldMatch, only the latter highlighted field is returned. from this... plc Whetstone Temporary [hl-on]Sales[hl-off] Assistant Customer service Cashier work 08 and custom

Re: Don't snowball depending on terms

2011-11-30 Thread Robert Brown
nt this capability? I'd strongly advise that you > just forget about > this feature unless and until there's a demonstrated need. Here's a > blog I made at > Lucid. Long-winded, but I'm like that sometimes > > http://www.lucidimagination.com/blog/2011/11/03/sto

Re: Don't snowball depending on terms

2011-11-30 Thread Robert Brown
Boosts can be included there too can't they? so this is valid? q=+(stemmed^2:perl or stemmed^3:java) +unstemmed^5:"development manager" is it possible to have different boosts on the same field btw? We currently search across 5 fields anyway, so my queries are gonna start getting messy. :-/

Don't snowball depending on terms

2011-11-29 Thread Robert Brown
Is it possible to search a field but not be affected by the snowball filter? ie, searching for "manage" is matching "management", but a user may want to restrict results to only containing "manage". I was hoping that simply quoting the term would do this, but it doesn't appear to make any di

Highlighting too much, indexing not seeing commas?

2011-11-23 Thread Robert Brown
Solr 3.3.0 I have a field/type indexed as below. For a particular document the content of this field is 'FreeBSD,Perl,Linux,Unix,SQL,MySQL,Exim,Postgresql,Apache,Exim' Using eDismax, mm=1 When I query for... +perl +(apache sql) +(linux unix) Strangely, the highlighting is being returned as

Re: Always return total number of documents

2011-10-28 Thread Robert Brown
011 11:43:11 +0200, Michael Kuhlmann wrote: > Am 28.10.2011 11:16, schrieb Robert Brown: >> Is there no way to return the total number of docs as part of a search? > > No, it isn't. Usually this information is of absolutely no value to the > end user. > > A workar

Always return total number of documents

2011-10-28 Thread Robert Brown
Currently I'm making 2 calls to Solr to be able to state "matched 20 out of 200 documents". Is there no way to return the total number of docs as part of a search? -- IntelCompute Web Design & Local Online Marketing http://www.intelcompute.com

Limit by score? sort by other field

2011-10-27 Thread Robert Brown
When we display search results to our users we include a percentage score. Top result being 100%, then all others normalised based on the maxScore, calculated outside of Solr. We now want to limit returned docs with a percentage score higher than say, 50%. e.g. We want to search but only r
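The client-side normalisation described above, with the cut-off applied afterwards, might look like this. It is a sketch of the calculation only, done outside Solr as in the message; the function names and sample documents are invented, and it assumes each document carries a `score` and that `maxScore` comes from the same response.

```python
def with_percentages(docs: list, max_score: float) -> list:
    """Attach a percentage score to each doc, relative to the top score."""
    if max_score <= 0:
        return []
    return [dict(doc, percent=100.0 * doc["score"] / max_score) for doc in docs]


def above_threshold(docs: list, max_score: float, min_percent: float = 50.0) -> list:
    """Keep only the docs whose percentage score is higher than min_percent."""
    return [doc for doc in with_percentages(docs, max_score)
            if doc["percent"] > min_percent]


# Hypothetical result set.
results = [
    {"id": "a", "score": 8.0},
    {"id": "b", "score": 5.0},
    {"id": "c", "score": 3.0},
]
kept = above_threshold(results, max_score=8.0)
```

Because the cut-off depends on the top score of each individual query, the percentages are only comparable within one result set, not across different searches.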

Getting single documents by fq on unique field, performance

2011-10-21 Thread Robert Brown
Hi, We do regular searches against documents, with highlighting on. To then view a document in more detail, we re-do the search but using fq=id:12345 to return the single document of interest, but still want highlighting on, so sending the q param back again. Is there anything you would rec

Re: Multi CPU Cores

2011-10-17 Thread Robert Brown
ene ecosystem search :: http://search-lucene.com/ > > > > >> >>From: Robert Brown >>To: solr-user@lucene.apache.org >>Sent: Monday, October 17, 2011 4:01 AM >>Subject: Re: Multi CPU Cores >> >>Where exactly do you set this up? 

Re: Multi CPU Cores

2011-10-17 Thread Robert Brown
Where exactly do you set this up? We're running Solr 3.4 under tomcat, OpenJDK 1.6.0.20 btw, is the JRE just a different name for the VM? Apologies for such a newbie Java question. On Sun, 16 Oct 2011 12:51:44 -0400, Johannes Goll wrote: > we use the following in production > > java -ser

Re: what is the recommended way to store locations?

2011-10-06 Thread Robert Brown
Expanding CA to California sounds like a use for a synonyms config file? you can then do that translation at index and query time, if needed. On Thu, 6 Oct 2011 12:01:33 -0400, Jason Toy wrote: > Hi Otis, > Thanks for the response. So just to make sure I understand clearly, so I > would store

Re: negative boosts for docs with common field value

2011-10-06 Thread Robert Brown
We don't want to limit the number of results coming back, so unfortunately grouping doesn't quite fix it, plus it would, by nature, group docs by a particular Author together which might not necessarily be adjacent. On Thu, 6 Oct 2011 07:16:48 -0700 (PDT), Ahmet Arslan wrote: >> For the sake of

negative boosts for docs with common field value

2011-10-06 Thread Robert Brown
Hi, For the sake of simplicity, I have an index with docs containing the following fields: Title Description Author Some searches will obviously be saturated by docs from any given author if they've simply written more. I'd like to give a negative boost to these matches, there-by making s