refiltering search results

2012-08-28 Thread Johannes . Schwendinger
Hello,

I'm trying to develop a search component to filter the search results again
with current data so that the user only sees results he is permitted to
see.

Can someone give me a hint where to start and how to do this? Is a Search 
Component the right place to do this?

Regards
Johannes

Reply: Re: refiltering search results

2012-08-28 Thread Johannes . Schwendinger
The main idea is to filter results as much as possible with Solr and then
check this result again.
To do this I have to read some information from some fields of the
documents in the result.
At the moment I am trying to do this in the process method of a Search
Component. But I don't even know how to get access to the search results
or the indexed fields of the documents.
I have thought of ResponseBuilder.getResults(), but once I have the
DocListAndSet object I get stuck.

I know the search time will increase, but security has priority.
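
The two-stage idea described above (let Solr narrow the result set, then re-check every hit against current permission data) can be sketched independently of Solr's plugin API. This is only a minimal illustration, not a SearchComponent: the `acl` field, the `user_groups` set and the `user_may_see` check are assumptions made up for the example.

```python
from typing import Callable, Iterable, Iterator, Mapping

Doc = Mapping[str, object]


def refilter(docs: Iterable[Doc],
             is_permitted: Callable[[Doc], bool],
             rows: int) -> Iterator[Doc]:
    """Yield at most `rows` documents that pass the permission check."""
    emitted = 0
    for doc in docs:
        if emitted >= rows:
            break
        if is_permitted(doc):
            emitted += 1
            yield doc


# Hypothetical hits: each one carries an "acl" field listing allowed groups.
hits = [
    {"id": "1", "acl": ["sales"]},
    {"id": "2", "acl": ["hr"]},
    {"id": "3", "acl": ["sales", "hr"]},
]

# Assumed to be fetched from a current, external source at query time.
user_groups = {"sales"}


def user_may_see(doc: Doc) -> bool:
    # A hit is visible if the user shares at least one group with it.
    return bool(user_groups & set(doc["acl"]))


visible = list(refilter(hits, user_may_see, rows=10))
```

Because hits are dropped after the query has run, a page can come back with fewer than `rows` entries; fetching more hits than needed from Solr is one way to compensate.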

Regards,
Johannes



From: Alexandre Rafalovitch
To: solr-user@lucene.apache.org
Date: 28.08.2012 16:48
Subject: Re: refiltering search results



I think there was a JOIN example (for version 4) somewhere with the
permission restrictions. Or, if you have very broad categories, you
can use different search handlers with restriction queries baked in.

These might be enough. Otherwise, you have to send the list of IDs
back and forth and it could be expensive.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


Reply: Re: Reply: Re: refiltering search results

2012-08-29 Thread Johannes . Schwendinger
From: Ahmet Arslan
To: solr-user@lucene.apache.org
Date: 29.08.2012 10:50
Subject: Re: Reply: Re: refiltering search results


Thanks for the answer. 

My next question: how can I filter the result, or how can I replace the old
ResponseBuilder result with a new one?


--- On Wed, 8/29/12, johannes.schwendin...@blum.com wrote:

> From: johannes.schwendin...@blum.com
> Subject: Reply: Re: refiltering search results
> To: solr-user@lucene.apache.org
> Date: Wednesday, August 29, 2012, 8:22 AM
> I have thought of ResponseBuilder.getResults() but after I have the
> DocListAndSet object I get stuck.


You can read information from some fields of the documents in a
DocListAndSet with the

org.apache.solr.util.SolrPluginUtils#docListToSolrDocumentList

method.



LateBinding

2012-08-29 Thread Johannes . Schwendinger
Hello,

Has anyone ever implemented the security feature called late binding?

I am trying this, but I am very new to Solr and would be very glad to
get some hints on this.

Regards,
Johannes

Query during a query

2012-08-30 Thread Johannes . Schwendinger
Hi list,

I want to get the distinct values of a single Solr field whenever a user
starts a search query.

How can I do this?
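
One common way to get the distinct values of a field is a facet request with `rows=0`. The following is a sketch under assumptions: the base URL `http://localhost:8983/solr/select` and the field name `owner` are placeholders, and the parser expects the default flat JSON facet layout (a list alternating value and count). The sample response is hand-written, not real output.

```python
import json
from urllib.parse import urlencode


def distinct_values_url(base_url: str, field: str) -> str:
    """Build a facet-only request that lists every indexed value of `field`."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),             # no documents, only facet counts
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", "-1"),     # no cap on the number of values
        ("facet.mincount", "1"),   # skip values that match no document
    ]
    return base_url + "?" + urlencode(params)


def parse_distinct_values(response_text: str, field: str) -> list:
    """Extract the values from a flat [value, count, value, count, ...] list."""
    body = json.loads(response_text)
    flat = body["facet_counts"]["facet_fields"][field]
    return flat[0::2]


# Placeholder URL and field name, for illustration only.
url = distinct_values_url("http://localhost:8983/solr/select", "owner")

# Hand-written sample in the flat facet layout.
sample = '{"facet_counts": {"facet_fields": {"owner": ["alice", 3, "bob", 1]}}}'
values = parse_distinct_values(sample, "owner")
```

Facet values are indexed terms, so on an analyzed text field this returns tokens rather than the original strings; a string-typed field is usually the better target.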

Regards,
Johannes

Reply: Re: Query during a query

2012-08-30 Thread Johannes . Schwendinger
Thanks for the answer, but I want to know how I can do a separate query
before the main query.
I only want this data in my program; the user won't see it.
I need the values from one field to get some information from an external
source while the main query is executed.

pravesh wrote on 31.08.2012 07:42:48:

> From: pravesh
> To: solr-user@lucene.apache.org
> Date: 31.08.2012 07:43
> Subject: Re: Query during a query
>
> Did you check Solr Field Collapsing/Grouping?
> http://wiki.apache.org/solr/FieldCollapsing
> See if this is what you are looking for.
> 
> 
> Thanx
> Pravesh
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Query-during-a-query-tp4004624p4004631.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Reply: Re: Reply: Re: Query during a query

2012-09-02 Thread Johannes . Schwendinger
The problem is that I don't know how to do this. :P

My sequence: the user enters his search words. These are sent to Solr.
There I need to make another query first to get metadata from the index.
With this metadata I have to connect to an external source to get some
information about the user. With this information and the original search
words I then query the Solr index to get the search result.
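
The sequence described above can be written as three steps in whatever code sits in front of the main query. This sketch only shows the control flow: `fetch_metadata`, `lookup_user_info` and `run_search` are hypothetical callables standing in for the pre-query, the external lookup and the main query, and the `acl` field used in the filter query is an assumption.

```python
from typing import Callable, Dict, List


def search_with_prequery(
    user: str,
    words: str,
    fetch_metadata: Callable[[], List[str]],
    lookup_user_info: Callable[[str, List[str]], List[str]],
    run_search: Callable[[Dict[str, str]], List[dict]],
) -> List[dict]:
    # Step 1: pre-query the index for metadata (e.g. distinct field values).
    metadata = fetch_metadata()
    # Step 2: ask the external source what this user is allowed to see.
    allowed = lookup_user_info(user, metadata)
    if not allowed:
        return []  # nothing permitted, so skip the main query entirely
    # Step 3: main query, restricted by a filter query built from step 2.
    # Values are assumed to be plain tokens without quotes or backslashes.
    fq = "acl:(" + " OR ".join('"%s"' % value for value in allowed) + ")"
    return run_search({"q": words, "fq": fq})


# Stand-ins for the real calls, so the flow can be exercised without Solr.
captured = {}


def fake_search(params: Dict[str, str]) -> List[dict]:
    captured.update(params)
    return [{"id": "42"}]


results = search_with_prequery(
    "johannes",
    "bucket",
    fetch_metadata=lambda: ["sales", "hr"],
    lookup_user_info=lambda user, groups: [g for g in groups if g == "sales"],
    run_search=fake_search,
)
```

Passing the restriction as `fq` rather than appending it to `q` keeps the user's search words untouched and lets Solr cache the filter separately.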

I hope it is now clear where my problem is and what I want to do.

Regards,
Johannes



From: "Jack Krupansky"
To:
Date: 31.08.2012 15:03
Subject: Re: Reply: Re: Query during a query



So, just do another query before doing the main query. What's the problem? 

Be more specific. Walk us through the sequence of processing that you 
need.

-- Jack Krupansky


Solr Cell Questions

2012-09-24 Thread Johannes . Schwendinger
Hi,

I'm currently experimenting with Solr Cell to index files into Solr. While
doing so, some questions came up.

1. Is it possible (and wise) to connect to Solr Cell with multiple threads
at the same time to index several documents in parallel?
This question came up because my program takes about 6 hours to index
around 35000 docs. (No production environment, only the example Solr and a
small desktop machine, but I think it's very slow, and I know Solr isn't
the bottleneck (yet).)

2. If 1 is possible, how many threads should do this and how much memory
does Solr need? I've tried it but I ran into an out-of-memory exception.
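
As a rough baseline, 35000 documents in 6 hours is about 1.6 documents per second. One way to try several concurrent senders while limiting how many files are in flight at once is a fixed-size thread pool. This is a minimal sketch: `send_file` is a hypothetical stand-in for whatever actually posts one file to Solr, and the worker count is an arbitrary starting point to tune, not a recommendation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, List, Tuple


def index_concurrently(
    paths: Iterable[str],
    send_file: Callable[[str], None],
    workers: int = 4,
) -> Tuple[List[str], List[Tuple[str, Exception]]]:
    """Send files with at most `workers` uploads running at the same time."""
    done: List[str] = []
    failed: List[Tuple[str, Exception]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Only paths are queued; file contents are read inside send_file.
        futures = {pool.submit(send_file, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                done.append(path)
            except Exception as exc:  # keep going, report failures at the end
                failed.append((path, exc))
    return done, failed


# Baseline from the numbers above: 35000 docs in 6 hours.
docs_per_second = 35000 / (6 * 3600)


# Hypothetical sender used only to exercise the pool.
def fake_send(path: str) -> None:
    if path.endswith(".bad"):
        raise IOError("cannot parse " + path)


done, failed = index_concurrently(["a.pdf", "b.bad", "c.doc"], fake_send, workers=2)
```

Since each worker holds at most one file at a time, client memory use grows with `workers` rather than with the total number of files.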

Thanks in advance

Best Regards
Johannes

Reply: Re: Solr Cell Questions

2012-09-25 Thread Johannes . Schwendinger
Thank you, Erick, for your response.

I've already tried what you suggested and got some out-of-memory
exceptions. Because of this I like the solution with Solr Cell, where I can
send the file directly to Solr as a stream and don't have to collect the
files in my memory.

Another question that came to my mind: how many documents per minute,
second, or whatever can I put into Solr? Say XML format, from 100 KB to
100 MB.
Is there a number, or is it too dependent on hardware and settings?


Best
Johannes

Erick Erickson wrote on 25.09.2012 00:22:26:

> From: Erick Erickson
> To: solr-user@lucene.apache.org
> Date: 25.09.2012 00:23
> Subject: Re: Solr Cell Questions
> 
> If you're concerned about throughput, consider moving all the
> SolrCell (Tika) processing off the server. SolrCell is way cool
> for showing what can be done, but its downside is you're
> moving all the processing of the structured documents to the
> same machine doing the indexing. Pretty soon, especially
> with significant size files, you're spending all your CPU cycles
> parsing the files...
> 
> Happens there's a blog about this:
> http://searchhub.org/dev/2012/02/14/indexing-with-solrj/
> 
> By moving the indexing to N clients, you can increase
> throughput until you make Solr work hard to do the indexing
> 
> Best
> Erick
> 


Reply: Re: Re: Solr Cell Questions

2012-09-25 Thread Johannes . Schwendinger
The difference with Solr Cell is that I am sending every single document
to Solr Cell and don't collect them until I have a couple of them in my
memory.
I am mainly using the code from here:
http://wiki.apache.org/solr/ExtractingRequestHandler#SolrJ


Erick Erickson wrote on 25.09.2012 15:47:34:

> From: Erick Erickson
> To: solr-user@lucene.apache.org
> Date: 25.09.2012 15:48
> Subject: Re: Re: Solr Cell Questions
> 
> bq: how many documents per minute, second, what ever can i put into solr
> 
> Too many variables to say. I've seen several thousand truly simple
> docs/sec. But since you're doing the Tika processing that's probably
> going to be your limiting factor. And it'll be many fewer...
> 
> I don't understand your OOM issue when running Tika on the client. Or,
> rather, why you think using SolrCell makes this different. SolrCell also
> uses Tika. So my suspicion is that your client-side process simply isn't
> allocating much memory to the JVM. Did you try bumping the memory
> on your client?
> 
> Best
> Erick
> 


Reply: RE: Group.query

2012-09-26 Thread Johannes . Schwendinger
I think what you need is faceting, or is this something else?
http://searchhub.org/dev/2009/09/02/faceted-search-with-solr/
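
Faceting on a multi-valued group field returns counts per group, not the product lists themselves. If the lists are needed, one option is to run a single query and invert the returned documents on the client. A sketch, assuming each document has a `name` field and a multi-valued `group` field (both field names are assumptions; the sample data follows the bucket example quoted below):

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def products_by_group(docs: Iterable[dict]) -> Dict[str, List[str]]:
    """Invert product documents into a mapping of group -> product names."""
    groups = defaultdict(list)
    for doc in docs:
        # A product appears once under every group it belongs to.
        for group in doc.get("group", []):
            groups[group].append(doc["name"])
    return dict(groups)


docs = [
    {"name": "Castle bucket",
     "group": ["Children_sand_toys", "Boys_toys", "Girls_toys"]},
    {"name": "Plain bucket", "group": ["Children_sand_toys"]},
    {"name": "Truck bucket", "group": ["Boys_toys"]},
    {"name": "Large Pony bucket", "group": ["Girls_toys"]},
]

grouped = products_by_group(docs)
```

This only works when the whole result set is fetched in one go, so it suits small result sets better than paged ones.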

Peter Kirk wrote on 26.09.2012 12:18:32:

> From: Peter Kirk
> To: "solr-user@lucene.apache.org"
> Date: 26.09.2012 12:19
> Subject: RE: Group.query
> 
> Thanks. Yes I can do this - but doesn't it mean I need to execute a 
> query per group?
> 
> What I really want to do (and I'm sorry I'm not so good at 
> explaining) is to execute one query for products, and receive 
> results grouped by the groups - but where a particular product may 
> be found in several groups.
> 
> For example, I'd like to execute a query for all products which 
> match "bucket".
> There are several products which are "buckets", each of which can 
> belong to several groups.
> Would it be possible to generate a query which would return the 
> groups, each with a list of the buckets?
> 
> Example result, with 3 groups, and several products (which may occur
> in several groups).
> 
> Children_sand_toys
>   Castle bucket
>   Plain bucket
> 
> Boys_toys
>   Castle bucket
>   Truck bucket
> 
> Girls_toys
>   Castle bucket
>   Large Pony bucket
> 
> Thanks,
> Peter
> 
> -Original Message-
> From: Ingar Hov [mailto:ingar@gmail.com] 
> Sent: 26. september 2012 11:57
> To: solr-user@lucene.apache.org
> Subject: Re: Group.query
> 
> I hope I understood the question, if so this may be a solution:
> 
> Why don't you make the field group for product multiple?
> 
> Example:
> 
>  multiValued="true"/>
> 
> If the product is a member of group1 and group2, just add both for
> the product document so that each product has an array of group.
> Then you can easily get all products for group1 by doing query:
> group:group1
> 
> Regards,
> Ingar
> 
> 
> 
> On Wed, Sep 26, 2012 at 10:48 AM, Peter Kirk wrote:
> > Thanks. Yes, the only solution I could think of was to execute
> > several queries.
> > I would like it to be a single query if at all possible. If anyone
> > has ideas I could look into that would be great.
> > Thanks,
> > Peter
> >
> >
> > -Original Message-
> > From: Aditya [mailto:findbestopensou...@gmail.com]
> > Sent: 26. september 2012 10:41
> > To: solr-user@lucene.apache.org
> > Subject: Re: Group.query
> >
> > Hi
> >
> > You are doing AND search, so you are getting results prod1 and
> > prod2. I guess, you should query only for group1 and another query for
> > group2.
> >
> > Regards
> > Aditya
> > www.findbestopensource.com
> >
> >
> >
> > On Wed, Sep 26, 2012 at 12:26 PM, Peter Kirk wrote:
> >
> >> Hi
> >>
> >> I have "products" which belong to one or more "groups".
> >> Products are documents in Solr, while the groups are fields (eg.
> >> group_1_bool:true).
> >>
> >> For example:
> >>
> >> Prod1 => group1, group2
> >> Prod2 => group1, group2
> >> Prod3 => group1
> >> Prod4 => group2
> >>
> >> I would like to execute a query which results in the groups with 
> >> their products. That is, the result should be something like:
> >>
> >> Group1 => Prod1, Prod2, Prod3
> >> Group2 => Prod1, Prod2, Prod4
> >>
> >> How can I do this?
> >>
> >> I've been looking at group.query, but I don't think this is what I
> >> want.
> >>
> >> For example,
> >> "q=*:*&group.query=group_1_bool:true+AND+group_2_bool:true"
> >> Results in 1 group called "group_1_bool:true AND group_2_bool:true", 
> >> which contains prod1 and prod2.
> >>
> >>
> >> Thanks,
> >> Peter
> >>
> >>
> >
> 
>