Re: Exception using distributed field-collapsing

2012-06-20 Thread Martijn v Groningen
Hi Bryan, What is the fieldtype of the groupField? You can only group by a field of type string, as described in the wiki: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters When you group by another field type, an HTTP 400 should be returned instead of this error. At least that

Re: Issue with field collapsing in solr 4 while performing distributed search

2012-06-11 Thread Martijn v Groningen
The ngroups returns the number of groups that have matched the query. However, if you want ngroups to be correct in a distributed environment you need to put documents belonging to the same group into the same shard. Groups can't cross shard boundaries. I guess you need to do some manual documen
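A minimal sketch of the manual partitioning mentioned above, assuming documents are routed on the client side before indexing. The function name `shard_for_group`, the `brand` field and the shard count are hypothetical; the only point is that the shard is derived from the group value, so a group never crosses a shard boundary.

```python
import hashlib


def shard_for_group(group_value: str, num_shards: int) -> int:
    """Pick a shard index from the group value (hypothetical helper).

    A stable hash is used so the same group value always maps to the
    same shard, regardless of process or machine.
    """
    digest = hashlib.md5(group_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Hypothetical documents, grouped on the "brand" field.
docs = [
    {"id": "1", "brand": "acme"},
    {"id": "2", "brand": "globex"},
    {"id": "3", "brand": "acme"},
]

NUM_SHARDS = 2  # assumption for this sketch
shards = {i: [] for i in range(NUM_SHARDS)}
for doc in docs:
    shards[shard_for_group(doc["brand"], NUM_SHARDS)].append(doc["id"])
```

With this routing, documents "1" and "3" always end up in the same shard list because they share the same `brand` value.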

Re: Grouping ngroups count

2012-05-02 Thread Martijn v Groningen
Hi Francois, The issue you describe looks like a similar issue we have fixed before with matches count. Open an issue and we can look into it. Martijn On 1 May 2012 20:14, Francois Perron wrote: > Thanks for your response Cody, > >  First, I used distributed grouping on 2 shards and I'm sure th

Re: Solr Cloud vs sharding vs grouping

2012-04-20 Thread Martijn v Groningen
Hi Jean-Sebastien, For some grouping features (like total group count and grouped faceting), the distributed grouping requires you to partition your documents into the right shard. Basically groups can't cross shards. Otherwise the group counts or grouped facet counts may not be correct. If you us

Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread Martijn v Groningen
The group.facet option only works for field facets (facet.field). Other facet types (query, range and pivot) aren't supported yet. The group.facet works for both single and multivalued fields specified in the facet.field parameter. Martijn On 9 April 2012 20:58, danjfoley wrote: > I am using

Re: Distributed grouping issue

2012-04-02 Thread Martijn v Groningen
; > 4 > > Let me know if you have any luck reproducing. > > Thanks, > Cody > > -Original Message- > From: martijn.is.h...@gmail.com [mailto:martijn.is.h...@gmail.com] On > Behalf Of Martijn v Groningen > Sent: Monday, April 02, 2012 1:48 PM > To: solr-user

Re: Distributed grouping issue

2012-04-02 Thread Martijn v Groningen
> > All documents of a group exist on a single shard, there are no cross-shard > groups. > You only have to partition documents by group when the groupCount and some other features need to be accurate. For the "matches" this is not necessary. The matches are summed up during merging the shard resp
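To illustrate why "matches" is safe to sum while group counts are not, here is a small sketch. It assumes the merge step simply adds up per-shard numbers; the dictionaries are made-up stand-ins for shard responses, not Solr's actual response format.

```python
# Made-up per-shard results: how many documents matched and which
# group values were seen on that shard.
shard_responses = [
    {"matches": 3, "groups": {"acme", "globex"}},
    {"matches": 2, "groups": {"acme", "initech"}},  # "acme" is on both shards
]

# Every document lives on exactly one shard, so summing is always correct.
total_matches = sum(r["matches"] for r in shard_responses)

# Summing per-shard group counts counts "acme" twice ...
summed_ngroups = sum(len(r["groups"]) for r in shard_responses)

# ... while the real number of unique groups is smaller.
true_ngroups = len(set().union(*(r["groups"] for r in shard_responses)))
```

If all documents of a group are kept on one shard, `summed_ngroups` and `true_ngroups` agree, which is the reason for the partitioning requirement.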

Re: Distributed grouping issue

2012-04-02 Thread Martijn v Groningen
The "matches" element in the response should return the number of documents that matched the query and not the number of groups. Did you also encounter this issue with other Solr versions (3.5 or another nightly build)? Martijn On 2 April 2012 09:41, fbrisbart wrote: > Hi, > > when you w

Re: Grouping queries

2012-03-23 Thread Martijn v Groningen
> > Where is Join documented? I looked at > http://wiki.apache.org/solr/Join and see no reference to "fromIndex". > Also does this work in a distributed environment? > The "fromIndex" isn't documented in the wiki. It is mentioned in the issue and you can find it in the Solr code: https://issues.ap

Re: Grouping queries

2012-03-23 Thread Martijn v Groningen
On 22 March 2012 03:10, Jamie Johnson wrote: > I need to apologize I believe that in my example I have too grossly > over simplified the problem and it's not clear what I am trying to do, > so I'll try again. > > I have a situation where I have a set of access controls say user, > super user and

Re: Solr group witch minimum count in each group

2012-03-21 Thread Martijn v Groningen
Filtering results based on group count isn't supported yet. There is already an issue created for this feature: https://issues.apache.org/jira/browse/SOLR-3152 Martijn On 21 March 2012 11:52, ViruS wrote: > Hi, > > I try to get all duplicated documents in my index. > I have signature field and

Re: Grouping queries

2012-03-21 Thread Martijn v Groningen
I'm not sure if grouping is the right feature to use for your requirements... Grouping does have an impact on performance which you need to take into account. Depending on what grouping features you're going to use (grouped facets, ngroups), grouping performs well on large indices if you use filter

Re: To truncate or not to truncate (group.truncate vs. facet)

2012-03-19 Thread Martijn v Groningen
Hi Rasmus, You might want to use the group.facet parameter: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters I think that will give you the right facet counts with faceting. The parameter is not available in Solr 3.x, so you'll need to use a 4.0 nightly build. Martijn On 19 March

Re: JoinQuery and document score problem

2012-03-06 Thread Martijn v Groningen
Hi Stefan, The score isn't "moved" from the "from" side to the "to" side and as far as I know there isn't a way to configure the scoring of the joined documents. The Solr join query isn't a real join (like in SQL) and should be used as a filtering mechanism. The best way to achieve that is to put

Re: Multiple Sort for Group/Folding

2012-01-11 Thread Martijn v Groningen
Hi Mauro, During the first pass search the sort param is used to determine the top N groups. Then during the second pass search the documents inside the top N groups are sorted using the group.sort parameter. The group.sort doesn't change how the groups themselves are sorted. Martijn On 11 Januar
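As a sketch of how the two parameters combine in one request, the snippet below builds a query string with both `sort` and `group.sort`. The field names `category`, `price` and `popularity` are hypothetical; only the parameter names come from the message above.

```python
from urllib.parse import urlencode

params = {
    "q": "*:*",
    "group": "true",
    "group.field": "category",        # hypothetical grouping field
    "sort": "price asc",              # first pass: orders the groups
    "group.sort": "popularity desc",  # second pass: orders docs inside each group
}

# urlencode escapes spaces and special characters for use in a URL.
query_string = urlencode(params)
```

The resulting string can be appended to a select URL such as `/solr/select?`; changing `group.sort` alone leaves the order of the groups untouched.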

Re: Setting group.ngroups=true considerable slows down queries

2011-12-12 Thread Martijn v Groningen
e a lot of unique groups. Martijn On 12 December 2011 14:32, Michael Jakl wrote: > Hi! > > On Mon, Dec 12, 2011 at 13:57, Martijn v Groningen > wrote: >> As as I know currently there isn't another way. Unfortunately the >> performance degrades badly when having a lo

Re: Setting group.ngroups=true considerable slows down queries

2011-12-12 Thread Martijn v Groningen
uires quite some heap space (also without group.ngroups=true). Martijn On 9 December 2011 23:08, Michael Jakl wrote: > Hi! > > On Fri, Dec 9, 2011 at 17:41, Martijn v Groningen > wrote: >> On what field type are you grouping and what version of Solr are you >> using

Re: Setting group.ngroups=true considerable slows down queries

2011-12-09 Thread Martijn v Groningen
Hi Michael, On what field type are you grouping and what version of Solr are you using? Grouping by a string field is faster. Martijn On 9 December 2011 12:46, Michael Jakl wrote: > Hi, I'm using the grouping feature of Solr to return a list of unique > documents together with a count of the dupl

Re: Field collapsing results caching

2011-12-09 Thread Martijn v Groningen
There is no cross query cache for result grouping. The only caching option out there is the group.cache.percent option: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters Martijn On 8 December 2011 14:29, Kissue Kissue wrote: > Hi, > > I was just testing field collapsing in my solr a

Re: Group.ngroup parameter memory consumption

2011-11-12 Thread Martijn v Groningen
BTW this applies for 4.0-dev. In 3x the String instance from a StringIndex is directly used, this is then put into a list. So there is no extra object instance created per group matching the query. Martijn On 12 November 2011 08:49, Rafał Kuć wrote: > Hello! > > Thanks, that's what I was looking

Re: Group.ngroup parameter memory consumption

2011-11-11 Thread Martijn v Groningen
The ngroups option collects, per search, the unique groups matching the query. Based on the collected groups it returns the count. So it depends on the number of groups matching the query. In more detail: per unique group a BytesRef instance is created to represent a group and this put
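The idea can be sketched in a few lines: one entry is kept per unique group value seen among the matching documents, so memory grows with the number of distinct groups rather than with the number of hits. This is only an illustration of the principle, not Solr's collector code; `count_groups` and the sample documents are hypothetical.

```python
def count_groups(matching_docs, group_field):
    """Count unique group values among the matching documents.

    One set entry is kept per distinct group, which is why memory use
    scales with the number of groups matching the query.
    """
    seen = set()
    for doc in matching_docs:
        # Documents without a value share a single "no value" group (None).
        seen.add(doc.get(group_field))
    return len(seen)


matching_docs = [
    {"id": "1", "signature": "aaa"},
    {"id": "2", "signature": "bbb"},
    {"id": "3", "signature": "aaa"},
    {"id": "4"},  # no signature value
]
ngroups = count_groups(matching_docs, "signature")
```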

Re: UnInvertedField vs FieldCache for facets for single-token text fields

2011-11-03 Thread Martijn v Groningen
Hi Micheal, The FieldCache is a simpler data structure and easier to create, so I also expect it to be faster. Unfortunately for TextField UnInvertedField is always used even if you have one token per document. I think overriding the multiValuedFieldCache method and returning false would work. If yo

Re: facet with group by (or field collapsing)

2011-11-03 Thread Martijn v Groningen
collapse.facet=after doesn't exist in Solr 3.3. This parameter exists in the SOLR-236 patches and is implemented differently in the released versions of Solr. From Solr 3.4 you can use group.truncate. The facet counts are then computed based on the most relevant documents per group. Martijn On

Re: Selective Result Grouping

2011-11-03 Thread Martijn v Groningen
open an issue for this? Martijn On 1 November 2011 19:58, entdeveloper wrote: > > Martijn v Groningen-2 wrote: >> >> When using the group.field option values must be the same otherwise >> they don't get grouped together. Maybe fuzzy grouping would be nice. >>

Re: Solr 3.4 group.truncate does not work with facet queries

2011-10-28 Thread Martijn v Groningen
Hi Ian, I think this is a bug. After looking into the code the facet.query feature doesn't take into account the group.truncate option. This needs to be fixed. You can open a new issue in Jira if you want to. Martijn On 28 October 2011 12:09, Ian Grainger wrote: > Hi, I'm using Grouping with gr

Re: joins and filter queries effecting scoring

2011-10-28 Thread Martijn v Groningen
Have you tried using the join in the fq instead of the q? Like this (assuming user_id_i is a field in the post document type and self_id_i a field in the user document type): q=posts_text:"hello"&fq={!join from=self_id_i to=user_id_i}is_active_boolean:true In this example the fq produces a docset
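The query above contains braces, quotes and spaces, so it needs URL encoding before it is sent. A small sketch using only the Python standard library, with the field names taken from the example in the message:

```python
from urllib.parse import parse_qs, urlencode

params = {
    # Main query: contributes to the score.
    "q": 'posts_text:"hello"',
    # Join used purely as a filter: restricts the result set, no scoring.
    "fq": "{!join from=self_id_i to=user_id_i}is_active_boolean:true",
}

query_string = urlencode(params)

# Decoding again gives back the original parameter values.
decoded = parse_qs(query_string)
```

Keeping the join in `fq` means the main query alone determines the scores, while the join only decides which documents are allowed through.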

Re: About the indexing process

2011-10-25 Thread Martijn v Groningen
Hi Amos, How are you currently indexing files? Are you indexing Solr input documents or just regular files? You can use Solr cell to index binary files: http://wiki.apache.org/solr/ExtractingRequestHandler Martijn On 25 October 2011 10:21, 刘浪 wrote: > Hi, >     I appreciate you can help me. Wh

Re: Selective Result Grouping

2011-10-23 Thread Martijn v Groningen
> The current grouping functionality using group.field is basically > all-or-nothing: all documents will be grouped by the field value or none > will. So there would be no way to, for example, collapse just the videos or > images like they do in google. When using the group.field option values must

Re: Field Collapsing and Record Filtering

2011-10-07 Thread Martijn v Groningen
I don't think this is possible in only one search with what Solr currently has to offer. I guess the only way to support this is by post-processing your results on the client side. So for each group you display, you query what the latest version is. If that doesn't match then you omit the result from r

Re: Selective Result Grouping

2011-10-07 Thread Martijn v Groningen
So if you look at the old SOLR-236 fieldcollapsing (http://wiki.apache.org/solr/FieldCollapsingUncommitted) you mean collapse.type=adjacent? I think we shouldn't change the group.query parameter, since it serves a different purpose. I think it is better to have a new parameter for this different way of g

Re: Faceted query performance problem when group.truncate set to true

2011-10-07 Thread Martijn v Groningen
Hi Dmitry, What is the fieldtype of field objid? Grouping works much slower on non string fields. Post grouped faceting in general slows down your search time, b/c it is an expensive operation to compute the grouped docset. However a qtime that is 100 times slower is a lot. I have noticed that if

Re: Solr 3.4 Grouping group.main=true results in java.lang.NoClassDefFound

2011-09-28 Thread Martijn v Groningen
Hi Frank, How is Solr deployed? And how did you upgrade? The commons-lang library (containing ArrayUtils) is included in the Solr war file. Martijn On 28 September 2011 09:16, Frank Romweber wrote: > I use drupal for accessing the solr search engine. After updating an > creating my new index ev

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Martijn v Groningen
> On Mon, Sep 12, 2011 at 5:45 PM, Martijn v Groningen > wrote: >> Also the error you described when wt=xml and using SolrJ is also fixed >> in 3.4 (and in trunk / branch3x). >> You can wait for the 3.4 release or use a nightly 3x build. >> >> Mart

Re: Nested documents

2011-09-12 Thread Martijn v Groningen
To support this, we also need to implement indexing block of documents in Solr. Basically the UpdateHandler should also use this method: IndexWriter#addDocuments(Collection documents) On 12 September 2011 01:01, Michael McCandless wrote: > Even if it applies, this is for Lucene.  I don't think we

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Martijn v Groningen
The error you described when using wt=xml with SolrJ is also fixed in 3.4 (and in trunk / branch3x). You can wait for the 3.4 release or use a nightly 3x build. Martijn On 12 September 2011 12:41, Sanal K Stephen wrote: > Kirill, > >         Parsing the grouped result using SolrJ is not releas

Re: TermsComponent from deleted document

2011-09-10 Thread Martijn v Groningen
I'd use the suggester: http://wiki.apache.org/solr/Suggester The suggester can give a collation. The TermsComponent can't do that. The suggester builds on top of the spellchecking infrastructure, so should be easy to use if you're familiar with that. Martijn On 10 September 2011 08:37, Manish Ba

Re: Sorting groups by numFound group size

2011-09-10 Thread Martijn v Groningen
Not yet. If you want you can create an issue for sorting groups by numFound. On 9 September 2011 18:49, O. Klein wrote: > I am also looking for way to sort on numFound. > > Has an issue been created? > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Sorting-groups-by-nu

Re: Sorting groups by numFound group size

2011-09-08 Thread Martijn v Groningen
No, as far as I know sorting by group count isn't planned. You can create an issue in Jira where future development of this feature can be tracked. On 7 September 2011 23:54, bobsolr wrote: > Hi Martijn, > > Thanks for the reply. Unfortunately I can't reference the group size using > a > functio

Re: Sorting groups by numFound group size

2011-09-07 Thread Martijn v Groningen
Sorting groups by numFound isn't possible. You can sort groups by specifying a function or a field (from your schema) in the sort parameter. The numFound isn't a field so that is why you can't sort on it. Martijn On 7 September 2011 08:17, bobsolr wrote: > Hi, > > I'm using this sample query to
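Since the server cannot sort on numFound, one workaround is to reorder the groups on the client after the response arrives. The sketch below assumes the JSON layout of a grouped response (`grouped` → field name → `groups`, each with a `doclist` carrying `numFound`); the field name `signature` and the sample data are made up. Note that this only reorders the groups that were actually returned, so it is not a true global sort by group size.

```python
def sort_groups_by_size(grouped_response, field):
    """Return the returned groups ordered by numFound, largest first."""
    groups = grouped_response["grouped"][field]["groups"]
    return sorted(groups, key=lambda g: g["doclist"]["numFound"], reverse=True)


# Made-up response fragment in the grouped JSON layout.
sample = {
    "grouped": {
        "signature": {
            "matches": 9,
            "groups": [
                {"groupValue": "aaa", "doclist": {"numFound": 2, "docs": []}},
                {"groupValue": "bbb", "doclist": {"numFound": 6, "docs": []}},
                {"groupValue": "ccc", "doclist": {"numFound": 1, "docs": []}},
            ],
        }
    }
}

ordered = [g["groupValue"] for g in sort_groups_by_size(sample, "signature")]
```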

Re: Reading results from FieldCollapsing

2011-08-31 Thread Martijn v Groningen
The CollapseComponent was never committed. This class exists in the SOLR-236 patches. You don't need to change the configuration in order to use grouping. The blog you mentioned is based on the SOLR-236 patches. The current grouping in Solr 3.3 has superseded these patches. From Solr 3.4 (not yet

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
If you take this query from the wiki: http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&stats.twopass=true&rows=0&indent=true&stats.facet=inStock In this case you get stats about the popularity per inStock value (true / false). Replacing this values with we

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
Or if you don't care about grouped results you can also add the following option: stats.facet=gender On 25 August 2011 14:40, Martijn v Groningen wrote: > Hi Omri, > > I think you can achieve that with grouping and the Solr StatsComponent ( > http://wiki.apache.org/solr/StatsComp

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
Hi Omri, I think you can achieve that with grouping and the Solr StatsComponent ( http://wiki.apache.org/solr/StatsComponent). In order to compute statistics on groups you must set the option group.truncate=true An example query: q=*:*&group=true&group.field=gender&group.truncate=true&stats=true&s

Re: Results Group-By using SolrJ

2011-08-14 Thread Martijn v Groningen
Hi Omri, SOLR-2637 was concerned with adding grouped response parsing. There is no convenience method for grouping, but you can use the normal SolrQuery#set(...) methods to enable grouping. The following code should enable grouping via SolrJ api: SolrQuery query = new SolrQuery(); query.set(GroupP

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-13 Thread Martijn v Groningen
The first solution would make sense to me. Some kind of a strategy mechanism for this would allow anyone to define their own rules. Duplicating results would be confusing to me. On 13 August 2011 18:39, Michael Lackhoff wrote: > On 13.08.2011 18:03 Erick Erickson wrote: > > > The problem I've al

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-12 Thread Martijn v Groningen
Hi Johnny, Sorting on a multivalued field has never really worked in Solr. Solr versions <= 1.4.1 allowed it, but there was a chance that an error occurred and that the sorting might not be what you expect. From Solr 3.1 and up sorting on a multivalued field isn't allowed and an HTTP 400 is returned. D

Re: Minimum Score

2011-08-05 Thread Martijn v Groningen
As far as I know there is no built-in solution for this like there is for max score. An alternative approach to the one already mentioned is to send a second request with rows=1 and sort=score asc This will return the lowest scoring document and you can then retrieve the score from that document (i
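A sketch of that second request and of reading the score back, assuming the standard JSON response layout (`response` → `docs`) and that the score is requested through the `fl` parameter. The helper names and the `id` field are hypothetical.

```python
from urllib.parse import urlencode


def lowest_score_params(user_query: str) -> str:
    """Build the query string for the extra request (hypothetical helper)."""
    return urlencode({
        "q": user_query,
        "rows": 1,             # only the single lowest-scoring document
        "sort": "score asc",   # lowest score first
        "fl": "id,score",      # ask for the score pseudo-field explicitly
    })


def extract_score(response: dict):
    """Read the score of the first document, or None if nothing matched."""
    docs = response["response"]["docs"]
    return docs[0]["score"] if docs else None


# Made-up response for illustration.
sample_response = {
    "response": {"numFound": 42, "start": 0, "docs": [{"id": "doc-17", "score": 0.0123}]}
}
min_score = extract_score(sample_response)
```

The cost of this approach is one extra round trip per search, which may or may not be acceptable depending on query load.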

Re: SOLR Support for Lucene Nested Documents

2011-08-05 Thread Martijn v Groningen
Hi Josh, Solr doesn't expose this Lucene feature yet. To support this Solr needs to be able to index documents in a single block. Also the BlockJoinQuery needs to be exposed to Solr (this can easily happen via a QParserPlugin). Martijn On 5 August 2011 00:00, Joshua Harness wrote: > I noticed

Re: A rant about field collapsing

2011-08-04 Thread Martijn v Groningen
Well, the original page moved to: http://wiki.apache.org/solr/FieldCollapsingUncommitted Assuming that you're using Solr 3.3 you can't get the grouped result () with SolrJ. I added grouping support to SolrJ some time ago and it will be in Solr 3.4. You can use a nightly 3.x build to use the grouping

Re: A rant about field collapsing

2011-08-04 Thread Martijn v Groningen
The development of the field collapse feature is a long and confusing story. The main point is that SOLR-236 was never going to scale and the performance in general was bad. A new approach was needed. This was implemented in SOLR-1682 and added to the trunk (4.0-dev) around September last year. Lat

Re: ideas for versioning query?

2011-08-01 Thread Martijn v Groningen
Hi Mike, how many docs and groups do you have in your index? I think the group.sort option fits your requirements. If I remember correctly group.ngroups=true adds something like 30% extra time on top of the search request with grouping, but that was on my local test dataset (~30M docs, ~8000 groups

Re: SolrJ and class versions

2011-07-26 Thread Martijn v Groningen
> On 07/26/2011 09:26 AM, Martijn v Groningen wrote: > > Were you upgrading from Solr 1.4? > Yep. > > SolrJ uses by default for querying the javabin format (wt parameter). > > The javabin format is not compatible between 1.4 and 3.1 and above. > > So If your clients wher

Re: SolrJ and class versions

2011-07-26 Thread Martijn v Groningen
Were you upgrading from Solr 1.4? SolrJ uses the javabin format by default for querying (wt parameter). The javabin format is not compatible between 1.4 and 3.1 and above. So if your clients were running with SolrJ 1.4 versions I would expect errors to occur. Martijn On 25 July 2011 12:15, Tarj

Re: Possible bug in Solr 3.3 grouping

2011-07-12 Thread Martijn v Groningen
Hi Nikhil, Thanks for raising this issue. I checked this particular issue in a test case and I ran into the same error, so this is indeed a bug. I've fixed this issue for 3x in revision 1145748. So checking out the latest 3x branch and building Solr yourself should give you this bug fix. Or you ca

Re: SolrJ and Range Faceting

2011-06-13 Thread Martijn v Groningen
ote: > Martjin, > > I had not considered doing something like > > > manufacturedate_dt:[2007-02-13T15:26:37Z TO 2007-02-13T15:26:37Z+1YEAR] > > does this work? If so that completely eliminates the need to use the date > math parsers right? > > On Sun, Jun 12, 2011
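The filter query quoted above can be assembled from a range facet's start value and gap by plain string concatenation, leaving the date math (`+1YEAR`) for Solr to evaluate. This is only a sketch of that idea; `range_filter` is a hypothetical helper, not part of SolrJ or of the patch discussed in the thread.

```python
def range_filter(field: str, start: str, gap: str) -> str:
    """Build a range filter query from a facet start value and gap.

    The upper bound is written as "<start><gap>", e.g.
    "2007-02-13T15:26:37Z+1YEAR", so no client-side date math is needed.
    """
    return f"{field}:[{start} TO {start}{gap}]"


fq = range_filter("manufacturedate_dt", "2007-02-13T15:26:37Z", "+1YEAR")
```

One thing to keep in mind is that square brackets make both ends of the range inclusive, so adjacent ranges built this way share their boundary value.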

Re: SolrJ and Range Faceting

2011-06-12 Thread Martijn v Groningen
version. Again thanks! > > On Sat, Jun 11, 2011 at 8:15 AM, Martijn v Groningen < > martijn.is.h...@gmail.com> wrote: > > > Hi James, > > > > Good idea! I'll add a getAsFilterQuery method to the patch. > > > > Martijn > > > > On 6

Re: SolrJ and Range Faceting

2011-06-11 Thread Martijn v Groningen
w > > SimpleDateFormat("-MM-dd'T'HH:mm:ss"); > > > > parser.setNow(dateCount.getStart()); > > Date end = parser.parseMath(dateCount.getGap()); > > String startStr = sdf.format(dateCount.getStart()) + "Z"; &

Re: SolrJ and Range Faceting

2011-06-03 Thread Martijn v Groningen
Hi Jamie, I don't know why range facets didn't make it into SolrJ. But I've recently opened an issue for this: https://issues.apache.org/jira/browse/SOLR-2523 I hope this will be committed soon. Check the patch out and see if you like it. Martijn On 2 June 2011 18:22, Jamie Johnson wrote: > C

Re: Result Grouping always returns grouped output

2011-06-02 Thread Martijn v Groningen
Hi Karel, group.main=true should do the trick. When that is set to true the group.format is always simple. Martijn On 27 May 2011 19:13, kare...@gmail.com wrote: > Hello, > > I am using the latest nightly build of Solr 4.0 and I would like to > use grouping/field collapsing while maintaining c

Re: [POLL] How do you (like to) do logging with Solr

2011-05-16 Thread Martijn v Groningen
[ ] I always use the JDK logging as bundled in solr.war, that's perfect [ ] I sometimes use log4j or another framework and am happy with re-packaging solr.war [X] Give me solr.war WITHOUT an slf4j logger binding, so I can choose at deploy time [ ] Let me choose whether to bundle a binding or no

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Martijn v Groningen
[] ASF Mirrors (linked in our release announcements or via the Lucene website) [ X ] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [ X ] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors them internally or via a downstream project)

Re: Field Collapse question

2010-07-04 Thread Martijn v Groningen
Hi Ken, Not collapsing on null field values is not possible in the patch. However, if you want, you can fix this in the patch; it is a really small change. Assuming that you're using the default collapsing algorithm you can add the following piece of code in the NonAdjacentDocumentCollapser.java f

Re: SOLR-236 Patch

2010-06-25 Thread Martijn v Groningen
Hi Sam, It seems that the patch is out of sync again with the trunk. Can you try patching with revision 955615? I'll update the patch shortly. Martijn On 24 June 2010 09:49, Amdebirhan, Samson, VF-Group wrote: > Hi > > > > Trying to apply the SOLR-236 patch to a trunk i get what follows. Can >

Re: collapse exception

2010-06-23 Thread Martijn v Groningen
t actually being in core.  I > haven't looked at the specifics, but I imagine we could get the core stuff > adjusted to suit this plugin. > >        Erik > > On Jun 22, 2010, at 5:24 PM, Martijn v Groningen wrote: > >> I checked your stacktrace and I can't remem

Re: collapse exception

2010-06-22 Thread Martijn v Groningen
n't know because it's patched by someone else but I can't get his > help. When this component become a contrib? Using patch is so annoying > > 2010/6/22 Martijn v Groningen : >> What version of Solr and which patch are you using? >> >> On 21 June 2010 1

Re: Field Collapsing SOLR-236

2010-06-22 Thread Martijn v Groningen
hread (i.e. rev 955615) but cannot find it in the repository. (it has > revision 955569 followed by revision 955785). > > Any pointers?? > Regards > Raakhi > > On Tue, Jun 22, 2010 at 2:03 AM, Martijn v Groningen < > martijn.is.h...@gmail.com> wrote: > >> O

Re: collapse exception

2010-06-21 Thread Martijn v Groningen
What version of Solr and which patch are you using? On 21 June 2010 11:46, Li Li wrote: > it says  "Either filter or filterList may be set in the QueryCommand, > but not both." I am newbie of solr and have no idea of the exception. > What's wrong with it? thank you. > > java.lang.IllegalArgumentE

Re: Field Collapsing SOLR-236

2010-06-21 Thread Martijn v Groningen
t then i am not able to sort on any > other field. is there any workaround to support this feature?? > > Regards, > Raakhi > > On Fri, Jun 18, 2010 at 6:14 PM, Martijn v Groningen < > martijn.is.h...@gmail.com> wrote: > >> Hi Rakhi, >> >> The patch is

Re: Field Collapsing SOLR-236

2010-06-18 Thread Martijn v Groningen
> Regards, > Raakhi > > > On Fri, Jun 18, 2010 at 1:24 AM, Moazzam Khan wrote: > >> I knew it wasn't me! :) >> >> I found the patch just before I read this and applied it to the trunk >> and it works! >> >> Thanks Mark and martijn for all

Re: Field Collapsing SOLR-236

2010-06-17 Thread Martijn v Groningen
I've added a new patch to the issue, so building the trunk (rev 955615) with the latest patch should not be a problem. Due to recent changes in the Lucene trunk the patch was not compatible. On 17 June 2010 20:20, Erik Hatcher wrote: > > On Jun 16, 2010, at 7:31 PM, Mark Diggory wrote: >> >> p.s.

Re: question about the fieldCollapseCache

2010-06-09 Thread Martijn v Groningen
I agree. I'll add this information to the wiki. On 9 June 2010 14:32, Jean-Sebastien Vachon wrote: > ok great. > > I believe this should be mentioned in the wiki. > > Later > > On 2010-06-09, at 4:06 AM, Martijn v Groningen wrote: > >> The fieldCollapseCache

Re: question about the fieldCollapseCache

2010-06-09 Thread Martijn v Groningen
The fieldCollapseCache should not be used as it is now, it uses too much memory. It stores any information relevant for a field collapse search. Like document collapse counts, collapsed document ids / fields, collapsed docset and uncollapsed docset (everything per unique search). So the memory usag

Re: Applying collapse patch

2010-05-28 Thread Martijn v Groningen
Have you executed: "ant example" after building? (Assuming that this is the example solr) On 28 May 2010 12:17, Sophie M. wrote: > > It is ok for applying the patch, thanks Martin. When I start Solr I get this > logs in my console : > > C:\Users\Sophie\workspace\lucene-solr\lucene-solr\solr\solr>

Re: Applying collapse patch

2010-05-28 Thread Martijn v Groningen
The trunk should work with the latest patch (SOLR-236-trunk.patch). Did patching succeed? What compilation errors do you get? On 28 May 2010 11:10, Sophie M. wrote: > > Ok I will have a look on the comments and I will post if necessary. > > Thanks ^^ > -- > View this message in context: > htt

Re: Field Collapsing SOLR-236

2010-03-25 Thread Martijn v Groningen
Hi Blargy, The latest patch is not compatible with 1.4. I believe that the latest field-collapse-5.patch file is compatible with 1.4. The file should at least compile with 1.4 trunk. I'm not sure how the performance is. Martijn On 25 March 2010 01:49, Dennis Gearon wrote: > Boy, I hope that fiel

Re: Dedupe of document results at query-time

2010-01-23 Thread Martijn v Groningen
This manner of detecting duplicates at query time does really match with what field collapsing does. So I suggest you look into that. As far as I know there isn't any function query that does something you have described in your example. Cheers, Martijn On 23 January 2010 12:31, Peter S wrote:

Re: Solr 1.4 Field collapsing - What are the steps for applying the SOLR-236 patch?

2010-01-12 Thread Martijn v Groningen
I wouldn't use the patches of the sub issues right now as they are under development (they are currently a POC). I also think that the latest patch in SOLR-236 is currently the best option. There are some memory related problems with the patch that have to do with caching. The fieldCollaps

Re: Field Collapsing - disable cache

2009-12-23 Thread Martijn v Groningen
werWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) > at > org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685) > at java.lang.Thread.run(Thread.java:636) > > > > > > > > On Wed 23/12/09 13:26 , r...@intelcompute.com wrote:

Re: Field Collapsing - disable cache

2009-12-23 Thread Martijn v Groningen
instead? >> can't remember what i did last time to get field-collapse-5.patch >> working successfully. >> On Tue 22/12/09 22:43 , Lance Norskog  wrote: >> > To avoid this possible bug, you could change the cache to only >> have a >> > few e

Re: Field Collapsing - disable cache

2009-12-22 Thread Martijn v Groningen
In the latest patch some changes were made on the configuration side, but if you add the CollapseComponent to the conf no field collapse cache should be enabled. If not let me know. Martijn 2009/12/22 : > > > > > > On Tue 22/12/09 12:28 , Martijn v Groningen wrote: > >

Re: Field Collapsing - disable cache

2009-12-22 Thread Martijn v Groningen
Hi Rob, What patch are you actually using from SOLR-236? Martijn 2009/12/22 : > I've tried both, the whole fieldCollapsing tag, and just the > fieldCollapseCache tag inside it. >        both cause error. >        I guess I can just set size, initialSize, and autowarmCount to 0 ?? > On Tue 22/12

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-20 Thread Martijn v Groningen
" /> >>> > >>> > In solrconfig.xml, this is how I am enabling field collapsing: >>> >    >> > class="org.apache.solr.handler.component.CollapseComponent"/> >>> > >>> > Apart from this, I made no changes in so

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-13 Thread Martijn v Groningen
Solr 1.4 build. I am not using the latest > solr nightly build. Can that cause any problem? > > -- > Thanks > Varun Gupta > > > On Fri, Dec 11, 2009 at 3:44 AM, Martijn v Groningen < > martijn.is.h...@gmail.com> wrote: > >> I tried to reproduce a similar si

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-10 Thread Martijn v Groningen
core&start=0&q=weight+loss&facet.field=ctype&qt=contentsearch > > I append "&fq=ctype:1" to the above queries when trying to get results for a > particular category. > > -- > Thanks > Varun Gupta > > > On Thu, Dec 10, 2009 at 5:58 PM, Martijn v G

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

2009-12-10 Thread Martijn v Groningen
Hi Varun, Can you send the whole requests (with params), that you send to Solr for both queries? In your situation the collapse parameters only have to be used for the first query and not the second query. Martijn 2009/12/10 Varun Gupta : > Hi, > > I have documents under 6 different categories.

Re: Grouping

2009-12-06 Thread Martijn v Groningen
Field collapsing has some aggregation functions like sum() and avg(), but the statistics are computed based on collapse groups instead of all documents with the same field value. A collapse group contains documents that were not relevant enough to end up (collapsed documents) in the search result a

Re: Deduplication in 1.4

2009-11-26 Thread Martijn v Groningen
Message ---- > >> From: Martijn v Groningen >> To: solr-user@lucene.apache.org >> Sent: Thu, November 26, 2009 3:19:40 AM >> Subject: Re: Deduplication in 1.4 >> >> Field collapsing has been used by many in their production >> environment. > > Got any po

Re: Deduplication in 1.4

2009-11-26 Thread Martijn v Groningen
Field collapsing has been used by many in their production environments. Over the last few months the stability of the patch has improved as quite a few bugs were fixed. The only big feature currently missing is caching of the collapsing algorithm. I'm currently working on that and I will put it in a new patch in

Re: field collapse using 'adjacent' & 'includeCollapsedDocs' + 'sort' query field

2009-11-15 Thread Martijn v Groningen
Hi Micheal, What you are saying seems logical, but that is currently not the case with the collapsedDocs functionality. This functionality was built with computing aggregated statistics in mind and not really to have a separate collapse group search result. Although the collapsed documents are col

Re: question about collapse.type = adjacent

2009-11-02 Thread Martijn v Groningen
Hi Micheal, Field collapsing is basically done in two steps. The first step is to get the uncollapsed documents, sorted (whether by score or by a field value), and the second step is to apply the collapse algorithm to those uncollapsed documents. So yes, when specifying collapse.type=adjacent the docume
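The two steps described in this reply can be sketched in a few lines of Python. This is only a simplified model and not the code of the SOLR-236 patch: it assumes that collapse.type=adjacent merges only consecutive documents sharing a field value, it assumes a threshold of 1, and the document fields (`id`, `brand`, `score`) and the `collapse_count` key are made up for illustration.

```python
from itertools import groupby

def collapse_adjacent(sorted_docs, field):
    """Step 2: keep the first document of each run of equal field values.

    Only neighbouring documents are merged, so the same value can still
    appear again further down the result list.
    """
    collapsed = []
    for _, run in groupby(sorted_docs, key=lambda d: d[field]):
        run = list(run)
        head = dict(run[0])
        # Number of documents hidden behind the head document (run size - 1).
        head["collapse_count"] = len(run) - 1
        collapsed.append(head)
    return collapsed

# Hypothetical documents; the field names are assumptions for this sketch.
docs = [
    {"id": "c", "brand": "y", "score": 0.7},
    {"id": "a", "brand": "x", "score": 0.9},
    {"id": "d", "brand": "x", "score": 0.6},
    {"id": "b", "brand": "x", "score": 0.8},
]

# Step 1: obtain the uncollapsed, sorted result (here: by score, descending).
uncollapsed = sorted(docs, key=lambda d: d["score"], reverse=True)

# Step 2: apply the collapse algorithm to the sorted documents.
result = collapse_adjacent(uncollapsed, "brand")
```

Because the collapse runs on the already sorted list, changing the sort order changes which documents end up next to each other and therefore which ones are collapsed.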

Re: weird problem with letters S and T

2009-10-28 Thread Martijn v Groningen
I think that is not a problem, because you are only storing one character per field. There are other text field types that do not have the stop word filter, so give your first letter field that field type. In this way the stop word filter is only disabled for searches on the first letter field

Re: field collapsing exception

2009-10-27 Thread Martijn v Groningen
I have attached a new patch to SOLR-236 that fixes this issue. Cheers, Martijn 2009/10/27 Martijn v Groningen : > Hi Joe, > > Ok that is not good. A took a look at it and this would mean that the > variable docHeadCollapseGroupAssoc (declared a

Re: field collapsing exception

2009-10-27 Thread Martijn v Groningen
Hi Joe, Ok that is not good. I took a look at it and this would mean that the variable docHeadCollapseGroupAssoc (declared at line 80 in FieldValueCountCollapseCollectorFactory) is null. I always thought that the variable was set, but that is not the case when no documents are being collapsed. I w

Re: field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-25 Thread Martijn v Groningen
/10/25 Martijn v Groningen : > Hi Joe, > > Can you give a bit more context info? Like the exact search and the > field types you are using for example. Also are you doing a lot of > frequent updates to the index? > > Cheers, > > Martijn > > 2009/10/23 Joe Calderon

Re: field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-25 Thread Martijn v Groningen
Hi Joe, Can you give a bit more context info? Like the exact search and the field types you are using for example. Also are you doing a lot of frequent updates to the index? Cheers, Martijn 2009/10/23 Joe Calderon : > seems to happen when sort on anything besides strictly score, even > score de

Re: Collapse with multiple fields

2009-10-23 Thread Martijn v Groningen
No, this is actually not supported at the moment. If you really need to collapse on two different fields, you can concatenate the two fields together in another field while indexing and then collapse on that field. Martijn 2009/10/23 Thijs : > I haven't had time to actually ask this on the list my self
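The workaround suggested here, building a combined field at index time, could look roughly like the sketch below. The field names (`brand`, `color`, `collapse_key`) and the separator are assumptions chosen for illustration; how the documents are then sent to Solr is left out.

```python
def add_collapse_key(doc, field_a, field_b, target="collapse_key", sep="|"):
    """Return a copy of doc with one extra field combining two field values."""
    combined = dict(doc)
    # The separator keeps ("ab", "c") distinct from ("a", "bc").
    combined[target] = f"{doc[field_a]}{sep}{doc[field_b]}"
    return combined

# Hypothetical documents prepared for indexing.
docs = [
    {"id": "1", "brand": "acme", "color": "red"},
    {"id": "2", "brand": "acme", "color": "blue"},
    {"id": "3", "brand": "acme", "color": "red"},
]

to_index = [add_collapse_key(d, "brand", "color") for d in docs]
```

At query time the combined field would then be used as the single collapse field, for example collapse.field=collapse_key, so documents 1 and 3 fall into the same group while document 2 stays separate.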

Re: field collapsing sums

2009-10-02 Thread Martijn v Groningen
Well that is odd. How have you configured field collapsing with the dismax request handler? The collapse counts should be X - 1 (if collapse.threshold=1). Martijn 2009/10/1 Joe Calderon : > thx for the reply, i just want the number of dupes in the query > result, but it seems i dont get the correct

Re: JVM OOM when using field collapse component

2009-10-02 Thread Martijn v Groningen
No, I have not encountered an OOM exception yet with the current field collapse patch. How large is your configured JVM heap space (-Xmx)? Field collapsing requires more memory than regular searches. Does Solr run out of memory during the first search(es) or does it run out of memory after a while when

Re: field collapsing sums

2009-10-01 Thread Martijn v Groningen
ike only dupes of records on the first page are returned > > or is tehre a a setting im missing? currently im only sending, > collapse.field=brand and collapse.includeCollapseDocs.fl=num_in_stock > > --joe > > On Thu, Oct 1, 2009 at 1:14 AM, Martijn v Groningen > wrote: >>

Re: field collapsing sums

2009-10-01 Thread Martijn v Groningen
Hi Joe, Currently the patch does not do that, but you can do something else that might help you in getting your summed stock. In the latest patch you can include fields of collapsed documents in the result per distinct field value. If you specify collapse.includeCollapseDocs.fl=num_in_stock in t
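Once the num_in_stock values of the collapsed documents are returned, the sum has to be computed on the client side. The sketch below shows one way to do that. The response layout of the patch is not shown in this thread, so the dictionary shape used here (a head document plus a list of collapsed documents per brand) is purely an assumption, as are the brand names and numbers.

```python
def total_stock(head_doc, collapsed_docs, field="num_in_stock"):
    """Sum a numeric field over the head document and its collapsed documents."""
    return head_doc.get(field, 0) + sum(d.get(field, 0) for d in collapsed_docs)

# Hypothetical, already parsed result: one entry per distinct brand value.
groups = {
    "acme": {
        "head": {"id": "1", "num_in_stock": 4},
        "collapsed": [{"num_in_stock": 2}, {"num_in_stock": 5}],
    },
    "zenith": {
        "head": {"id": "7", "num_in_stock": 3},
        "collapsed": [],
    },
}

stock_per_brand = {
    brand: total_stock(group["head"], group["collapsed"])
    for brand, group in groups.items()
}
```

A brand with no collapsed documents simply contributes the stock of its head document.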
