Re: SolrJ Tutorial

2011-01-22 Thread Bing Li
I got the solution. A complete sample I made is attached below.

Thanks,
LB

package com.greatfree.Solr;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.client.solrj.beans.Field;

import java.net.MalformedURLException;

public class SolrJExample
{
    public static void main(String[] args) throws MalformedURLException, SolrServerException
    {
        SolrServer solr = new CommonsHttpSolrServer("http://192.168.210.195:8080/solr/CategorizedHub");

        SolrQuery query = new SolrQuery();
        query.setQuery("*:*");
        QueryResponse rsp = solr.query(query);
        SolrDocumentList docs = rsp.getResults();
        System.out.println(docs.getNumFound());

        try
        {
            SolrServer solrScore = new CommonsHttpSolrServer("http://192.168.210.195:8080/solr/score");
            Score score = new Score();
            score.id = "4";
            score.type = "modern";
            score.name = "iphone";
            score.score = 97;
            solrScore.addBean(score);
            solrScore.commit();
        }
        catch (Exception e)
        {
            System.out.println(e.toString());
        }
    }
}
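The Score bean that addBean() refers to is not included in the sample. A minimal sketch of what it presumably looks like, with field names inferred from the assignments above (@Field is SolrJ's bean-mapping annotation, already imported in the sample):

```java
package com.greatfree.Solr;

import org.apache.solr.client.solrj.beans.Field;

// Minimal POJO for solrScore.addBean(score); each annotated field
// must match a field name in the "score" core's schema.
public class Score
{
    @Field
    public String id;

    @Field
    public String type;

    @Field
    public String name;

    @Field
    public int score;
}
```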


On Sat, Jan 22, 2011 at 3:58 PM, Lance Norskog  wrote:

> The unit tests are simple and show the steps.
>
> Lance
>
> On Fri, Jan 21, 2011 at 10:41 PM, Bing Li  wrote:
> > Hi, all,
> >
> > In the past, I always used SolrNet to interact with Solr. It works great.
> > Now, I need to use SolrJ. I think it should be easier to use than
> > SolrNet, since Solr and SolrJ should be homogeneous. But I cannot find a
> > tutorial that is easy to follow. No tutorials explain SolrJ
> > programming step by step. No complete samples are to be found. Could anybody
> > offer me some online resources to learn SolrJ?
> >
> > I also noticed Solr Cell and SolrJ POJOs. Do you have detailed resources
> > on them?
> >
> > Thanks so much!
> > LB
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-22 Thread Sami Siren
> Where do you get your Lucene/Solr downloads from?
>
> [] ASF Mirrors (linked in our release announcements or via the Lucene website)
>
> [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
>
> [X] I/we build them from source via an SVN/Git checkout.
>
> [] Other (someone in your company mirrors them internally or via a downstream 
> project)

--
 Sami Siren


Re: Is solr 4.0 ready for prime time? (or other ways to use geo distance in search)

2011-01-22 Thread Robert Muir
On Fri, Jan 21, 2011 at 11:53 PM, Lance Norskog  wrote:
> The Solr 4 branch is nowhere near ready for prime time. For example,
> within the past week code was added that forces you to completely
> reindex all of the documents you had. Solr 4 is really the "trunk".
> The low-level stuff is being massively changed to allow very big
> performance improvements and new features.

Changing the index format is not a sign of instability, we did this to
improve performance. So, changing the index format is in no way a bad
sign, nor indicative of whether or not the trunk is good for
production use.

You aren't forced to re-index all your documents if you are riding
trunk -- it's your decision to make that tradeoff when you type 'svn
update'. If you want stability you can take a snapshot (e.g. nightly
build), and just stick with it.


Re: Is solr 4.0 ready for prime time? (or other ways to use geo distance in search)

2011-01-22 Thread Estrada Groups
I tried to build yesterday's svn trunk of 4.0 and got massive failures... The 
Hudson-zipped version seems to work without any issues. Has anyone else seen 
this build issue on the Mac? I guess this also has to do with Grant's recent 
poll...

Adam


On Jan 22, 2011, at 6:34 AM, Robert Muir  wrote:

> On Fri, Jan 21, 2011 at 11:53 PM, Lance Norskog  wrote:
>> The Solr 4 branch is nowhere near ready for prime time. For example,
>> within the past week code was added that forces you to completely
>> reindex all of the documents you had. Solr 4 is really the "trunk".
>> The low-level stuff is being massively changed to allow very big
>> performance improvements and new features.
> 
> Changing the index format is not a sign of instability, we did this to
> improve performance. So, changing the index format is in no way a bad
> sign, nor indicative of whether or not the trunk is good for
> production use.
> 
> You aren't forced to re-index all your documents if you are riding
> trunk -- its your decision to make that tradeoff when you type 'svn
> update'. If you want stability you can take a snapshot (e.g. nightly
> build), and just stick with it.


Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-22 Thread Jan-Olav Eide

> 
> [] ASF Mirrors (linked in our release announcements or via the Lucene website)
> 
> [x] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
> 
> [] I/we build them from source via an SVN/Git checkout.
> 
> [] Other (someone in your company mirrors them internally or via a downstream 
> project)
> 
> Please put an X in the box that applies to you.  Multiple selections are OK 
> (for instance, if one project uses a mirror and another uses Maven)
> 
> Please do not turn this thread into a discussion on Maven and its 
> (de)merits, I simply want to know, informally, where people get their JARs 
> from.  In other words, no discussion is necessary (we already have that going 
> on d...@lucene.apache.org which you are welcome to join.)
> 
> Thanks,
> Grant


SolrCloud Questions for MultiCore Setup

2011-01-22 Thread Em

Hello list,

I want to experiment with the new SolrCloud feature. So far I have no
experience with distributed search in Solr.
However, some things remain unclear to me:

1 ) What is the usecase of a collection?
As far as I understand, a collection is the same as a core, but in a
distributed sense: it contains a set of cores on one or multiple machines.
It would make sense for all the cores in a collection to share the same schema
and solrconfig - right?
Can someone tell me if I understood the concept of a collection correctly?

2 ) The wiki says this will cause an update:
-Durl=http://localhost:8983/solr/collection1/update
However, as far as I know this causes an update to a CORE named "collection1"
at localhost:8983, not to the full collection. Am I correct here?
So *I* have to take care of consistency between the different replicas inside
my cloud?

3 ) If I have replicas of the same shard inside a collection, how does
SolrCloud determine that two documents in a result set are equal? Is it
necessary to define a unique key? Is it random which of the two documents
is picked for the final result set?
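(Not an answer from the thread, but for reference: Solr's distributed search does rely on a uniqueKey declared in schema.xml to merge and deduplicate results across shards; the field name below is just the conventional example.)

```xml
<!-- schema.xml: results from different shards are deduplicated by this key -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>
```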

---
I think these are my most basic questions.
However, there is one more tricky thing:

If I understood the collection idea correctly: what happens if I create two
cores, each belonging to a different collection, and THEN do a SWAP?
Say: core1->collection1, core2->collection2 
SWAP core1,core2
Does core2 now map to collection1?

Thank you!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Questions-for-MultiCore-Setup-tp2309443p2309443.html
Sent from the Solr - User mailing list archive at Nabble.com.


api key filtering

2011-01-22 Thread Matt Mitchell
Just wanted to see if others are handling this in some special way, but I
think this is pretty simple.

We have a database of api keys that map to "allowed" db records. I'm
planning on indexing the db records into solr, along with their api keys in
an indexed, non-stored, multi-valued field. Then, to query for docs that
belong to a particular api key, they'll be queried using a filter query on
api_key.

My only concern is: what if we end up with 100k api_keys?
Would it be a problem to have 100k non-stored keys in each document? We have
about 500k documents total.

Matt
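A sketch of the filter-query approach described above, in plain Java (with SolrJ the same string would be passed to SolrQuery.addFilterQuery; the field name api_key is taken from the message, and the escaping shown is an assumption for illustration):

```java
public class ApiKeyFilter
{
    // Build an fq clause restricting results to documents that carry
    // the given API key. The value is quoted, and embedded quotes and
    // backslashes are escaped for the Lucene query syntax.
    public static String apiKeyFilter(String apiKey)
    {
        StringBuilder sb = new StringBuilder("api_key:\"");
        for (char c : apiKey.toCharArray())
        {
            if (c == '"' || c == '\\')
            {
                sb.append('\\'); // escape characters special inside quotes
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

At query time this yields something like fq=api_key:"abc123", so only documents indexed with that key match.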


Re: api key filtering

2011-01-22 Thread Dennis Gearon
The only way you would have that many api keys per record is if one of 
them represented 'public', right? 'public' is a ROLE. Your answer is to use 
RBAC-style techniques.


Here are some links that I have on the subject, and what I'm thinking of doing.
Sorry for the formatting; Firefox is freaking out. I cut and pasted these from an 
email in my sent box. I hope the links came out.


Part 1

http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/


Part2
Role-based access control in SQL, part 2 at Xaprb 





ACL/RBAC Bookmarks ALL

UserRbac - symfony - Trac 
A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
PHP Access Control - PHP5 CMS Framework Development | PHP Zone 
Linux file and directory permissions 
MySQL :: MySQL 5.0 Reference Manual :: C.5.4.1 How to Reset the Root Password 
per RECORD/Entity permissions? - symfony users | Google Groups 
Special Topics: Authentication and Authorization | The Definitive Guide to Yii 
| 
Yii Framework 

att.net Mail (gear...@sbcglobal.net) 
Solr - User - Modelling Access Control 
PHP Generic Access Control Lists 
Row-level Model Access Control for CakePHP « some flot, some jet 
Row-level Model Access Control for CakePHP « some flot, some jet 
Yahoo! GeoCities: Get a web site with easy-to-use site building tools. 
Class that acts as a client to a JSON service : JSON « GWT « Java 
Juozas Kaziukėnas devBlog 
Re: [symfony-users] Implementing an existing ACL API in symfony 
php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow 
W3C ACL System 
makeAclTables.sql 
SchemaWeb - Classes And Properties - ACL Schema 
Reardon's Ruminations: Spring Security ACL Schema for Oracle 
trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla 
Acl.php - kohana-mptt - Project Hosting on Google Code 
Asynchronous JavaScript Technology and XML (Ajax) With the Java Platform 
The page cannot be found 
 

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Matt Mitchell 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 11:48:22 AM
Subject: api key filtering

Just wanted to see if others are handling this in some special way, but I
think this is pretty simple.

We have a database of api keys that map to "allowed" db records. I'm
planning on indexing the db records into solr, along with their api keys in
an indexed, non-stored, multi-valued field. Then, to query for docs that
belong to a particular api key, they'll be queried using a filter query on
api_key.

The only concern of mine is that, what if we end up with 100k api_keys?
Would it be a problem to have 100k non-stored keys in each document? We have
about 500k documents total.

Matt



Re: api key filtering

2011-01-22 Thread Matt Mitchell
Hey, thanks, I'll definitely have a read. The only problem with this, though,
is that our api is a thin layer of app code with solr only (no db); we
index data from our sql db into solr and push the index off for
consumption.

The only other idea I had was to send a list of the allowed document ids
along with every solr query, but then I'm sure I'd run into a filter-query
limit. Each key could be associated with up to 2k documents, so that's 2k
values in an fq, which would probably be too many for lucene (I think its
limit is 1024).

Matt
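For reference, the 1024 limit mentioned is Lucene's maxBooleanClauses, which Solr exposes as <maxBooleanClauses> in solrconfig.xml so it can be raised. A plain-Java sketch of building such an ID filter, which shows why 2k allowed IDs means 2k boolean clauses (the field name id is illustrative):

```java
import java.util.List;

public class IdFilterBuilder
{
    // Join allowed document IDs into a single fq clause, e.g.
    // id:(1 OR 2 OR 3). Each ID is one boolean clause, so the clause
    // count equals ids.size() and must stay within Lucene's
    // maxBooleanClauses (1024 by default).
    public static String idFilter(List<String> ids)
    {
        return "id:(" + String.join(" OR ", ids) + ")";
    }
}
```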

On Sat, Jan 22, 2011 at 3:40 PM, Dennis Gearon wrote:

> The only way that you would have that many api keys per record, is if one
> of
> them represented 'public', right? 'public' is a ROLE. Your answer is to use
> RBAC
> style techniques.
>
>
> Here are some links that I have on the subject. What I'm thinking of doing
> is:
> Sorry for formatting, Firefox is freaking out. I cut and pasted these from
> an
> email from my sent box. I hope the links came out.
>
>
> Part 1
>
>
> http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/
>
>
> Part2
> Role-based access control in SQL, part 2 at Xaprb
>
>
>
>
>
> ACL/RBAC Bookmarks ALL
>
> UserRbac - symfony - Trac
> A Role-Based Access Control (RBAC) system for PHP
> Appendix C: Task-Field Access
> Role-based access control in SQL, part 2 at Xaprb
> PHP Access Control - PHP5 CMS Framework Development | PHP Zone
> Linux file and directory permissions
> MySQL :: MySQL 5.0 Reference Manual :: C.5.4.1 How to Reset the Root
> Password
> per RECORD/Entity permissions? - symfony users | Google Groups
> Special Topics: Authentication and Authorization | The Definitive Guide to
> Yii |
> Yii Framework
>
> att.net Mail (gear...@sbcglobal.net)
> Solr - User - Modelling Access Control
> PHP Generic Access Control Lists
> Row-level Model Access Control for CakePHP « some flot, some jet
> Row-level Model Access Control for CakePHP « some flot, some jet
> Yahoo! GeoCities: Get a web site with easy-to-use site building tools.
> Class that acts as a client to a JSON service : JSON « GWT « Java
> Juozas Kaziukėnas devBlog
> Re: [symfony-users] Implementing an existing ACL API in symfony
> php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow
> W3C ACL System
> makeAclTables.sql
> SchemaWeb - Classes And Properties - ACL Schema
> Reardon's Ruminations: Spring Security ACL Schema for Oracle
> trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla
> Acl.php - kohana-mptt - Project Hosting on Google Code
> Asynchronous JavaScript Technology and XML (Ajax) With the Java Platform
> The page cannot be found
>
>
>  Dennis Gearon
>
>
> Signature Warning
> 
> It is always a good idea to learn from your own mistakes. It is usually a
> better
> idea to learn from others’ mistakes, so you do not have to make them
> yourself.
> from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
>
>
> EARTH has a Right To Life,
> otherwise we all die.
>
>
>
> - Original Message 
> From: Matt Mitchell 
> To: solr-user@lucene.apache.org
> Sent: Sat, January 22, 2011 11:48:22 AM
> Subject: api key filtering
>
> Just wanted to see if others are handling this in some special way, but I
> think this is pretty simple.
>
> We have a database of api keys that map to "allowed" db records. I'm
> planning on indexing the db records into solr, along with their api keys in
> an indexed, non-stored, multi-valued field. Then, to query for docs that
> belong to a particular api key, they'll be queried using a filter query on
> api_key.
>
> The only concern of mine is that, what if we end up with 100k api_keys?
> Would it be a problem to have 100k non-stored keys in each document? We
> have
> about 500k documents total.
>
> Matt
>
>


Re: Solr with many indexes

2011-01-22 Thread Erick Erickson
See below.

On Wed, Jan 19, 2011 at 7:26 PM, Joscha Feth  wrote:

> Hello Erick,
>
> Thanks for your answer!
>
> But I question why you *require* many different indexes. [...] including
> > isolating one
> > users'
> > data from all others, [...]
>
>
> Yes, thats exactly what I am after - I need to make sure that indexes don't
> mix, as every user shall only be able to query his own data (index).
>

Well, this can also be handled by simply appending the equivalent of
+user:theuser
to each query. This solution does have some "interesting" side effects,
though.
In particular, if you autosuggest based on combined documents, users will see
terms NOT in documents they own.
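The appending Erick describes can be sketched in plain Java as follows (the field name user is illustrative; with SolrJ one would more likely pass the user clause as a separate filter query):

```java
public class UserScopedQuery
{
    // Wrap the raw user query and AND in a required ownership clause,
    // the equivalent of appending +user:theuser to every query.
    public static String scopeToUser(String rawQuery, String user)
    {
        return "+(" + rawQuery + ") +user:" + user;
    }
}
```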


>
> And even using lots of cores can be made to work if you don't pre-warm
> > newly-opened
> > cores, assuming that the response time when using "cold searchers" is
> > adequate.
> >
>
> Could you explain that further or point me to some documentation? Are you
> talking about: http://wiki.apache.org/solr/CoreAdmin#UNLOAD? if yes, LOAD
> does not seem to be implemented, yet. Or has this something to do with
> http://wiki.apache.org/solr/SolrCaching#autowarmCount only? About what
> time
> per X documents are we talking here for delay if auto warming is disabled?
> Is there more documentation about this setting?
>
>
It's the autowarmCount parameter. When you open a core, the first few queries
that run on it will pay some penalty for filling caches etc. If your cores are
small enough, this penalty may not be noticeable to your users, in which case
you can just not bother autowarming (see the cache entries in solrconfig.xml).
You might also be able to get away with having very small caches; it mostly
depends on your usage patterns. If your pattern is that a user signs on, makes
one search and signs off, there may not be much good in having large caches.
On the other hand, if users sign on and search for hours continually, their
experience may be enhanced by having significant caches. It all depends.

Hope that helps
Erick


> Kind regards,
> Joscha
>
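A sketch of the no-autowarm setup Erick describes, as it might look in solrconfig.xml for small per-user cores (the cache classes are standard; the sizes are placeholders, not recommendations):

```xml
<!-- small caches, no autowarming: the first queries on a newly opened
     core pay a small penalty, which may be acceptable for tiny cores -->
<filterCache class="solr.FastLRUCache" size="64" initialSize="0" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="64" initialSize="0" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="64" initialSize="0"/>
```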


Re: api key filtering

2011-01-22 Thread Dennis Gearon
The links didn't work, so here they are again, NOT from a sent folder:

PHP Access Control - PHP5 CMS Framework Development | PHP Zone
 A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
PHP Access Control - PHP5 CMS Framework Development | PHP Zone
 UserRbac - symfony - Trac 
A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
UserRbac - symfony - Trac 
Acl.php - kohana-mptt - Project Hosting on Google Code 
CANDIDATE-PHP Generic Access Control Lists 
http://dev.w3.org/perl/modules/W3C/Rnodes/bin/makeAclTables.sql 
makeAclTables.sql 
php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow 
PHP Generic Access Control Lists 
Reardon's Ruminations: Spring Security ACL Schema for Oracle 
Re: [symfony-users] Implementing an existing ACL API in symfony 
SchemaWeb - Classes And Properties - ACL Schema 
trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla 
Using Zend_Acl with a database backend - Zend Framework Wiki 
W3C ACL System 

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Matt Mitchell 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 12:50:24 PM
Subject: Re: api key filtering

Hey thanks I'll definitely have a read. The only problem with this though,
is that our api is a thin layer of app-code, with solr only (no db), we
index data from our sql db into solr, and push the index off for
consumption.

The only other idea I had was to send a list of the allowed document ids
along with every solr query, but then I'm sure I'd run into a filter query
limit. Each key could be associated with up to 2k documents, so that's 2k
values in an fq which would probably be too many for lucene (I think its
limit 1024).

Matt

On Sat, Jan 22, 2011 at 3:40 PM, Dennis Gearon wrote:

> The only way that you would have that many api keys per record, is if one
> of
> them represented 'public', right? 'public' is a ROLE. Your answer is to use
> RBAC
> style techniques.
>
>
> Here are some links that I have on the subject. What I'm thinking of doing
> is:
> Sorry for formatting, Firefox is freaking out. I cut and pasted these from
> an
> email from my sent box. I hope the links came out.
>
>
> Part 1
>
>
>http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/
>
>
> Part2
> Role-based access control in SQL, part 2 at Xaprb
>
>
>
>
>
> ACL/RBAC Bookmarks ALL
>
> UserRbac - symfony - Trac
> A Role-Based Access Control (RBAC) system for PHP
> Appendix C: Task-Field Access
> Role-based access control in SQL, part 2 at Xaprb
> PHP Access Control - PHP5 CMS Framework Development | PHP Zone
> Linux file and directory permissions
> MySQL :: MySQL 5.0 Reference Manual :: C.5.4.1 How to Reset the Root
> Password
> per RECORD/Entity permissions? - symfony users | Google Groups
> Special Topics: Authentication and Authorization | The Definitive Guide to
> Yii |
> Yii Framework
>
> att.net Mail (gear...@sbcglobal.net)
> Solr - User - Modelling Access Control
> PHP Generic Access Control Lists
> Row-level Model Access Control for CakePHP « some flot, some jet
> Row-level Model Access Control for CakePHP « some flot, some jet
> Yahoo! GeoCities: Get a web site with easy-to-use site building tools.
> Class that acts as a client to a JSON service : JSON « GWT « Java
> Juozas Kaziukėnas devBlog
> Re: [symfony-users] Implementing an existing ACL API in symfony
> php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow
> W3C ACL System
> makeAclTables.sql
> SchemaWeb - Classes And Properties - ACL Schema
> Reardon's Ruminations: Spring Security ACL Schema for Oracle
> trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla
> Acl.php - kohana-mptt - Project Hosting on Google Code
> Asynchronous JavaScript Technology and XML (Ajax) With the Java Platform
> The page cannot be found
>
>
>  Dennis Gearon
>
>
> Signature Warning
> 
> It is always a good idea to learn from your own mistakes. It is usually a
> better
> idea to learn from others’ mistakes, so you do not have to make them
> yourself.
> from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
>
>
> EARTH has a Right To Life,
> otherwise we all die.
>
>
>
> - Original Message 
> From: Matt Mitchell 
> To: solr-user@lucene.apache.org
> Sent: Sat, January 22, 2011 11:48:22 AM
> Subject: api key filtering
>
> Just wanted to see if others are handling this in some special way, but I
> think this is pretty simple.
>
> We have a database of api keys that map to "allowed" db records. I'm
> planning 

Re: solrconfig.xml settings question

2011-01-22 Thread Erick Erickson
Yep, that's about it. By far the main constraint is memory and the caches
are what eats it up. So by minimizing the caches on the master (since they
are filled by searching) you speed that part up.

By maximizing the cache settings on the servers, you make them go as fast
as possible.

RamBufferSize is irrelevant on the searcher. It governs how much data
is stored in RAM when *indexing* before flushing to disk. This usually
gets to diminishing returns at 128M BTW.
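Erick's split might look like this in solrconfig.xml terms (numbers are placeholders, not recommendations):

```xml
<!-- master (indexing only): generous indexing buffer, minimal caches -->
<ramBufferSizeMB>128</ramBufferSizeMB>
<filterCache class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>

<!-- searchers (query only): large, autowarmed caches;
     ramBufferSizeMB is irrelevant here -->
<filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
```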

Oh, there is one other thing on the searchers that can really hurt:
too-frequent polling of the master. If the master is "furiously indexing",
polling too often can lead to thrashing if the time it takes to autowarm is
longer than the polling interval, which you can only figure out by measuring...

Best
Erick

On Thu, Jan 20, 2011 at 8:34 AM, kenf_nc  wrote:

>
> Is that it? Of all the strange, esoteric, little understood configuration
> settings available in solrconfig.xml, the only thing that affects Index
> Speed vs Query Speed is turning on/off the Query Cache and RamBufferSize?
> And for the latter, why wouldn't RamBufferSize be the same for both...that
> is, as high as you can make it?
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/solrconfig-xml-settings-question-tp2271594p2294668.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: api key filtering

2011-01-22 Thread Dennis Gearon
Dang! There were hot, clickable links in the webmail I put them in. I guess 
you can search for those strings on Google and find them. Sorry.

 


- Original Message 
From: Dennis Gearon 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 1:09:26 PM
Subject: Re: api key filtering

The links didn't work, so here they are again, NOT from a sent folder:

PHP Access Control - PHP5 CMS Framework Development | PHP Zone
A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
PHP Access Control - PHP5 CMS Framework Development | PHP Zone
UserRbac - symfony - Trac 
A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
UserRbac - symfony - Trac 
Acl.php - kohana-mptt - Project Hosting on Google Code 
CANDIDATE-PHP Generic Access Control Lists 
http://dev.w3.org/perl/modules/W3C/Rnodes/bin/makeAclTables.sql 
makeAclTables.sql 
php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow 
PHP Generic Access Control Lists 
Reardon's Ruminations: Spring Security ACL Schema for Oracle 
Re: [symfony-users] Implementing an existing ACL API in symfony 
SchemaWeb - Classes And Properties - ACL Schema 
trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla 
Using Zend_Acl with a database backend - Zend Framework Wiki 
W3C ACL System 

Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 

idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Matt Mitchell 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 12:50:24 PM
Subject: Re: api key filtering

Hey thanks I'll definitely have a read. The only problem with this though,
is that our api is a thin layer of app-code, with solr only (no db), we
index data from our sql db into solr, and push the index off for
consumption.

The only other idea I had was to send a list of the allowed document ids
along with every solr query, but then I'm sure I'd run into a filter query
limit. Each key could be associated with up to 2k documents, so that's 2k
values in an fq which would probably be too many for lucene (I think its
limit 1024).

Matt

On Sat, Jan 22, 2011 at 3:40 PM, Dennis Gearon wrote:

> The only way that you would have that many api keys per record, is if one
> of
> them represented 'public', right? 'public' is a ROLE. Your answer is to use
> RBAC
> style techniques.
>
>
> Here are some links that I have on the subject. What I'm thinking of doing
> is:
> Sorry for formatting, Firefox is freaking out. I cut and pasted these from
> an
> email from my sent box. I hope the links came out.
>
>
> Part 1
>
>
>http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/
>
>
>
> Part2
> Role-based access control in SQL, part 2 at Xaprb
>
>
>
>
>
> ACL/RBAC Bookmarks ALL
>
> UserRbac - symfony - Trac
> A Role-Based Access Control (RBAC) system for PHP
> Appendix C: Task-Field Access
> Role-based access control in SQL, part 2 at Xaprb
> PHP Access Control - PHP5 CMS Framework Development | PHP Zone
> Linux file and directory permissions
> MySQL :: MySQL 5.0 Reference Manual :: C.5.4.1 How to Reset the Root
> Password
> per RECORD/Entity permissions? - symfony users | Google Groups
> Special Topics: Authentication and Authorization | The Definitive Guide to
> Yii |
> Yii Framework
>
> att.net Mail (gear...@sbcglobal.net)
> Solr - User - Modelling Access Control
> PHP Generic Access Control Lists
> Row-level Model Access Control for CakePHP « some flot, some jet
> Row-level Model Access Control for CakePHP « some flot, some jet
> Yahoo! GeoCities: Get a web site with easy-to-use site building tools.
> Class that acts as a client to a JSON service : JSON « GWT « Java
> Juozas Kaziukėnas devBlog
> Re: [symfony-users] Implementing an existing ACL API in symfony
> php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow
> W3C ACL System
> makeAclTables.sql
> SchemaWeb - Classes And Properties - ACL Schema
> Reardon's Ruminations: Spring Security ACL Schema for Oracle
> trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla
> Acl.php - kohana-mptt - Project Hosting on Google Code
> Asynchronous JavaScript Technology and XML (Ajax) With the Java Platform
> The page cannot be found
>
>
>  Dennis Gearon
>
>
> Signature Warning
> 
> It is always a good idea to learn from your own mistakes. It is usually a
> better
> idea to learn from others’ mistakes, so you do not have to make them
> yourself.
> from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
>
>
> EARTH has a Right To Life,
> otherwise we all die.
>
>
>
> - Original Message 
> From: Matt Mi

Re: Indexing all permutations of words from the input

2011-01-22 Thread Erick Erickson
OK, an idea from left field off the top of my head, so don't take it as
gospel...

Create a second index where you send your data (each phrase is really a
"document") and query *that* index for your autosuggest. Perhaps this could be
a secondary core.
It could even be a set of *special* documents in your existing index that
had orthogonal
fields to the normal ones.

The idea is that you'd have a "document" consisting of one stored and
indexed field
that contained "abc xyz foo". Searching for "+abc +foo" (no quotes) would
return it,
as would searching for "+abc +xyz" or +abc or +foo... You could even do
some
interesting things with dismax if you required some rule like "at least two
terms
must match if there are three" I think...

You'd have to do something about duplicates here...

Best
Erick

On Thu, Jan 20, 2011 at 4:58 PM, Steven A Rowe  wrote:

> Hi Martin,
>
> The co-occurrence filter I'm working on at
> https://issues.apache.org/jira/browse/LUCENE-2749 would do what you want
> (among other things).  Still vaporware at this point, as I've only put a
> couple of hours into it, so don't hold your breath :)
>
> Steve
>
> > -Original Message-
> > From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
> > Sent: Thursday, January 20, 2011 4:46 PM
> > To: Martin Jansen
> > Cc: solr-user@lucene.apache.org
> > Subject: Re: Indexing all permutations of words from the input
> >
> > Aha, I have no idea if there actually is a better way of achieving that,
> > auto-completion with Solr is always tricky and I personally have not
> > been happy with any of the designs I've seen suggested for it.  But I'm
> > also not entirely sure your design will actually work, but neither am I
> > sure it won't!
> >
> > I am thinking maybe for that auto-complete use, you will actually need
> > your field to be NOT tokenized, so you won't want to use the WhiteSpace
> > tokenizer after all (I think!) -- unless maybe there's another filter
> > you can put at the end of the chain that will take all the tokens and
> > join them back together, separated by a single space, as a single
> > token.  But I do think you'll need the whole multi-word string to be a
> > single token in order to use terms.prefix how you want.
> >
> > If you can't make ShingleFilter do it though, I don't think there is any
> > built in analyzers that will do the transformation you want. You could
> > write your own in Java, perhaps based on ShingleFilter -- or it might be
> > easier to have your own software make the transformations you want and
> > then simply send the pre-transformed strings to Solr when indexing. Then
> > you could simply send them to a 'string' type field that won't tokenize.
> >
> > On 1/20/2011 4:40 PM, Martin Jansen wrote:
> > > On 20.01.11 22:19, Jonathan Rochkind wrote:
> > >> On 1/20/2011 4:03 PM, Martin Jansen wrote:
> > >>> I'm looking for an analyzer configuration for Solr 1.4 that
> > >>> accomplishes the following:
> > >>>
> > >>> Given the input "abc xyz foo" I would like to add at least the
> > following
> > >>> token combinations to the index:
> > >>>
> > >>>  abc
> > >>>  abc xyz
> > >>>  abc xyz foo
> > >>>  abc foo
> > >>>  xyz
> > >>>  xyz foo
> > >>>  foo
> > >>>
> > >> Why do you want to do this, what is it meant to accomplish?  There
> > might be a better way to accomplish what you are trying to do; I can't
> > think of anything (which doesn't mean it doesn't exist) that would
> > actually require this setup.  What sorts of queries do you intend to
> > serve with it?
> > > I'm in the process of setting up an index for term suggestion. In my
> use
> > > case people should get the suggestion "abc foo" for the search query
> > > "abc fo" and under the assumption that "abc xyz foo" has been submitted
> > > to the index.
> > >
> > > My current plan is to use TermsComponent with the terms.prefix=
> > > parameter for this, because it seems to be pretty efficient and I get
> > > things like correct sorting for free.
> > >
> > > I assume there is a better way of achieving this, then?
> > >
> > > - Martin
>


Re: Indexing same data in multiple fields with different filters

2011-01-22 Thread Erick Erickson
I'm assuming that this is just one example of many different
kinds of transformations you could do. It *seems* like a variant
of a synonym analyzer, so you could write a custom analyzer
(it's not actually hard) to create a bunch of synonyms
for your "special" terms at index time. Or you could use the
synonyms at query time (query time is more flexible).

Best
Erick
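One way to sketch this, since copyField can't normalize the copies differently, is to expand the variants in client code before sending documents to Solr and index both values in a multi-valued field. The class and mapping below are illustrative assumptions, not from the thread:

```java
import java.util.List;

public class VariantExpander {
    // Expand U+A732/U+A733 (LATIN LETTER AA) into both accepted
    // normalizations, producing one value per variant to index.
    static List<String> expand(String raw) {
        String lower = raw.replace('\uA732', '\uA733').toLowerCase();
        return List.of(
                lower.replace("\uA733", "å"),    // e.g. "ålborg"
                lower.replace("\uA733", "aa"));  // e.g. "aalborg"
    }

    public static void main(String[] args) {
        System.out.println(VariantExpander.expand("\uA732lborg"));
    }
}
```

Each returned value would then be added to the same multi-valued phrase field at index time.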

On Thu, Jan 20, 2011 at 5:38 AM, shm  wrote:

> Hi, I have a little problem regarding indexing that I don't know
> how to solve: I need to index the same data in different ways
> into the same field. The problem is a normalization problem, and
> here is an example:
>
> I have a special character \uA732, which I need to normalize in
> two different ways for phrase searching. So if I encounter this
> character in, for example, the title field, I would like it to result
> in these two phrase values:
>
>raw data = "\uA732lborg"
>phrase.title= "ålborg"
>phrase.title= "aalborg"
>
> Because both ways are valid representations of the phrase.
>
> I can copy the field from the raw data, but then I cannot
> normalize the copies differently, so I am at a loss.
>
> Does anyone have a solution or a good idea?
>
> Regards
>   shm
>
>


Re: Multicore Relaod Theoretical Question

2011-01-22 Thread Erick Erickson
This seems far too complex to me. Why not just optimize on the master
and let replication do all the rest for you?

Best
Erick

On Fri, Jan 21, 2011 at 1:07 PM, Em  wrote:

>
> Hi,
>
> are there no experiences or thoughts?
> How would you solve this at Lucene-Level?
>
> Regards
>
>
> Em wrote:
> >
> > Hello list,
> >
> > I got a theoretical question about a Multicore-Situation:
> >
> > I got two cores: active, inactive
> >
> > The active core serves all the queries.
> >
> > The inactive core is the tricky thing:
> > I create an optimized index outside the environment and want to insert
> > that optimized index 1 to 1 into the inactive core, which means replacing
> > everything inside the index-directory.
> > After this is done, I would like to reload the inactive core, so that it
> > is ready for a core-swap and ready for serving queries on top of the new
> > inserted optimized index.
> >
> > Is it possible to handle such a situation?
> >
> > Thank you.
> >
> >
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Multicore-Relaod-Theoretical-Question-tp2293999p2303585.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: api key filtering

2011-01-22 Thread Erick Erickson
1024 is the default; it can be increased via maxBooleanClauses
in solrconfig.xml.

This shouldn't be a problem with 2K clauses, but expanding it to tens of
thousands is probably a mistake (but test to be sure).

Best
Erick
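For reference, the relevant setting in the query section of solrconfig.xml looks like this (the value shown is illustrative):

```xml
<!-- raise the cap on boolean clauses per query; the default is 1024 -->
<maxBooleanClauses>4096</maxBooleanClauses>
```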

On Sat, Jan 22, 2011 at 3:50 PM, Matt Mitchell  wrote:

> Hey thanks I'll definitely have a read. The only problem with this though,
> is that our api is a thin layer of app-code, with solr only (no db), we
> index data from our sql db into solr, and push the index off for
> consumption.
>
> The only other idea I had was to send a list of the allowed document ids
> along with every solr query, but then I'm sure I'd run into a filter query
> limit. Each key could be associated with up to 2k documents, so that's 2k
> values in an fq which would probably be too many for lucene (I think its
> limit 1024).
>
> Matt
>
> On Sat, Jan 22, 2011 at 3:40 PM, Dennis Gearon  >wrote:
>
> > The only way that you would have that many api keys per record, is if one
> > of
> > them represented 'public', right? 'public' is a ROLE. Your answer is to
> use
> > RBAC
> > style techniques.
> >
> >
> > Here are some links that I have on the subject. What I'm thinking of
> doing
> > is:
> > Sorry for formatting, Firefox is freaking out. I cut and pasted these
> from
> > an
> > email from my sent box. I hope the links came out.
> >
> >
> > Part 1
> >
> >
> >
> http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/
> >
> >
> > Part2
> > Role-based access control in SQL, part 2 at Xaprb
> >
> >
> >
> >
> >
> > ACL/RBAC Bookmarks ALL
> >
> > UserRbac - symfony - Trac
> > A Role-Based Access Control (RBAC) system for PHP
> > Appendix C: Task-Field Access
> > Role-based access control in SQL, part 2 at Xaprb
> > PHP Access Control - PHP5 CMS Framework Development | PHP Zone
> > Linux file and directory permissions
> > MySQL :: MySQL 5.0 Reference Manual :: C.5.4.1 How to Reset the Root
> > Password
> > per RECORD/Entity permissions? - symfony users | Google Groups
> > Special Topics: Authentication and Authorization | The Definitive Guide
> to
> > Yii |
> > Yii Framework
> >
> > att.net Mail (gear...@sbcglobal.net)
> > Solr - User - Modelling Access Control
> > PHP Generic Access Control Lists
> > Row-level Model Access Control for CakePHP « some flot, some jet
> > Row-level Model Access Control for CakePHP « some flot, some jet
> > Yahoo! GeoCities: Get a web site with easy-to-use site building tools.
> > Class that acts as a client to a JSON service : JSON « GWT « Java
> > Juozas Kaziukėnas devBlog
> > Re: [symfony-users] Implementing an existing ACL API in symfony
> > php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow
> > W3C ACL System
> > makeAclTables.sql
> > SchemaWeb - Classes And Properties - ACL Schema
> > Reardon's Ruminations: Spring Security ACL Schema for Oracle
> > trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla
> > Acl.php - kohana-mptt - Project Hosting on Google Code
> > Asynchronous JavaScript Technology and XML (Ajax) With the Java Platform
> > The page cannot be found
> >
> >
> >  Dennis Gearon
> >
> >
> > Signature Warning
> > 
> > It is always a good idea to learn from your own mistakes. It is usually a
> > better
> > idea to learn from others’ mistakes, so you do not have to make them
> > yourself.
> > from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
> >
> >
> > EARTH has a Right To Life,
> > otherwise we all die.
> >
> >
> >
> > - Original Message 
> > From: Matt Mitchell 
> > To: solr-user@lucene.apache.org
> > Sent: Sat, January 22, 2011 11:48:22 AM
> > Subject: api key filtering
> >
> > Just wanted to see if others are handling this in some special way, but I
> > think this is pretty simple.
> >
> > We have a database of api keys that map to "allowed" db records. I'm
> > planning on indexing the db records into solr, along with their api keys
> in
> > an indexed, non-stored, multi-valued field. Then, to query for docs that
> > belong to a particular api key, they'll be queried using a filter query
> on
> > api_key.
> >
> > The only concern of mine is that, what if we end up with 100k api_keys?
> > Would it be a problem to have 100k non-stored keys in each document? We
> > have
> > about 500k documents total.
> >
> > Matt
> >
> >
>


Re: Multicore Relaod Theoretical Question

2011-01-22 Thread Em

Hi Erick,

thanks for your response.

Yes, it's really not that easy.

However, the target is to avoid any kind of master-slave-setup.

The most recent idea I have is to create a new core with a data-dir pointing
to an already existing directory with a fully optimized index.

Regards,
Em
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Multicore-Relaod-Theoretical-Question-tp2293999p2310709.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: api key filtering

2011-01-22 Thread Dennis Gearon
Got it, here are the links that I have on RBAC/ACL/Access Control. Some of 
these 
are specific to Solr.

http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/
 
http://www.xaprb.com/blog/2006/08/18/role-based-access-control-in-sql-part-2/ 


http://php.dzone.com/articles/php-access-control?page=0,1 
http://www.tonymarston.net/php-mysql/role-based-access-control.html 
http://www.tonymarston.net/php-mysql/menuguide/appendixc.html 
http://php.dzone.com/articles/php-access-control?page=0,1 
http://trac.symfony-project.org/wiki/UserRbac 
http://www.tonymarston.net/php-mysql/role-based-access-control.html 
http://www.tonymarston.net/php-mysql/menuguide/appendixc.html 
http://trac.symfony-project.org/wiki/UserRbac
http://code.google.com/p/kohana-mptt/source/browse/trunk/acl/libraries/Acl.php?r=82
 
http://www.oracle.com/technetwork/articles/javaee/ajax-135201.html 
http://phpgacl.sourceforge.net/ 
http://www.java2s.com/Code/Java/GWT/ClassthatactsasaclienttoaJSONservice.htm 
http://dev.w3.org/perl/modules/W3C/Rnodes/bin/makeAclTables.sql 
http://dev.juokaz.com/ 
http://dev.w3.org/perl/modules/W3C/Rnodes/bin/makeAclTables.sql 
http://stackoverflow.com/questions/54230/cakephp-acl-database-setup-aro-aco-structure
 
http://phpgacl.sourceforge.net/ 
http://blog.reardonsoftware.com/2010/07/spring-security-acl-schema-for-oracle.html
 
http://www.mail-archive.com/symfony-users@googlegroups.com/msg29537.html 
http://www.schemaweb.info/schema/SchemaInfo.aspx?id=167 
http://www.assembla.com/code/backendpro/subversion/nodes/trunk/modules/auth/libraries/Khacl.php?rev=169
 
http://framework.zend.com/wiki/display/ZFUSER/Using+Zend_Acl+with+a+database+backend
 
http://www.w3.org/2001/04/20-ACLs#Structure
http://lucene.472066.n3.nabble.com/Modelling-Access-Control-td1756817.html#a1759372
 
http://www.tonymarston.net/php-mysql/role-based-access-control.html 
http://phpgacl.sourceforge.net/ 
http://jmcneese.wordpress.com/2009/04/05/row-level-model-access-control-for-cakephp/#comment-112
 
http://jmcneese.wordpress.com/2009/04/05/row-level-model-access-control-for-cakephp/
 
http://www.xaprb.com/blog/2006/08/18/role-based-access-control-in-sql-part-2/ 
http://php.dzone.com/articles/php-access-control?page=0,1 
https://issues.apache.org/jira/browse/SOLR-1834 
http://www.tonymarston.net/php-mysql/role-based-access-control.html 
http://php.dzone.com/articles/php-access-control?page=0,1 
http://www.yiiframework.com/doc/guide/1.1/en/topics.auth#role-based-access-control
 
http://lucene.472066.n3.nabble.com/Modelling-Access-Control-td1756817.html#a1759372
 
http://phpgacl.sourceforge.net/ 
http://jmcneese.wordpress.com/2009/04/05/row-level-model-access-control-for-cakephp/#comment-112
 
http://jmcneese.wordpress.com/2009/04/05/row-level-model-access-control-for-cakephp/
 
http://www.yiiframework.com/doc/guide/topics.auth#role-based-access-control

 


- Original Message 
From: Dennis Gearon 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 1:22:04 PM
Subject: Re: api key filtering

Dang! There were hot, clickable links in the web mail I put them in. I guess
you guys can search for those strings on Google and find them. Sorry.




- Original Message 
From: Dennis Gearon 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 1:09:26 PM
Subject: Re: api key filtering

The links didn't work, so here they are again, NOT from a sent folder:

PHP Access Control - PHP5 CMS Framework Development | PHP Zone
A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
PHP Access Control - PHP5 CMS Framework Development | PHP Zone
UserRbac - symfony - Trac 
A Role-Based Access Control (RBAC) system for PHP 
Appendix C: Task-Field Access 
Role-based access control in SQL, part 2 at Xaprb 
UserRbac - symfony - Trac 
Acl.php - kohana-mptt - Project Hosting on Google Code 
CANDIDATE-PHP Generic Access Control Lists 
http://dev.w3.org/perl/modules/W3C/Rnodes/bin/makeAclTables.sql 
makeAclTables.sql 
php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow 
PHP Generic Access Control Lists 
Reardon's Ruminations: Spring Security ACL Schema for Oracle 
Re: [symfony-users] Implementing an existing ACL API in symfony 
SchemaWeb - Classes And Properties - ACL Schema 
trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla 
Using Zend_Acl with a database backend - Zend Framework Wiki 
W3C ACL System 

Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a
better idea to learn from others’ mistakes, so you do not have to make them
yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Matt Mitchell 
To: solr-user@lucene.apache.org
Sent: Sat, January 22, 2011 12:50:24 PM
Subject: Re: api key filtering

Hey thanks I'll defi

Re: Multicore Relaod Theoretical Question

2011-01-22 Thread Alexander Kanarsky
Em,

yes, you can replace the index outside Solr (get the new one into a separate
folder like index.new and then rename it to the index folder), then just
make the HTTP call to reload the core.

Note that the old index files may still be in use (continue to serve
the queries while reloading), even if the old index folder is deleted
- that is on Linux filesystems, not sure about NTFS.
That means the space on disk will be freed only when the old files are
no longer referenced by the Solr searcher.

-Alexander
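The swap-and-reload sequence Alexander describes might look roughly like this (paths and core name are illustrative; RELOAD is the CoreAdmin action):

```
mv /var/solrhome/inactive/data/index /var/solrhome/inactive/data/index.old
mv /var/solrhome/inactive/data/index.new /var/solrhome/inactive/data/index
curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=inactive'
```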

On Sat, Jan 22, 2011 at 1:51 PM, Em  wrote:
>
> Hi Erick,
>
> thanks for your response.
>
> Yes, it's really not that easy.
>
> However, the target is to avoid any kind of master-slave-setup.
>
> The most recent idea I have is to create a new core with a data-dir pointing
> to an already existing directory with a fully optimized index.
>
> Regards,
> Em
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Multicore-Relaod-Theoretical-Question-tp2293999p2310709.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: old index files not deleted on slave

2011-01-22 Thread feedly team
The file system checked out, I also tried creating a slave on a
different machine and could reproduce the issue. I logged SOLR-2329.

On Sat, Dec 18, 2010 at 8:01 PM, Lance Norskog  wrote:
> This could be a quirk of the native locking feature. What's the file
> system? Can you fsck it?
>
> If this error keeps happening, please file this. It should not happen.
> Add the text above and also your solrconfigs if you can.
>
> One thing you could try is to change from the native locking policy to
> the simple locking policy - but only on the child.
>
> On Sat, Dec 18, 2010 at 4:44 PM, feedly team  wrote:
>> I have set up index replication (triggered on optimize). The problem I
>> am having is the old index files are not being deleted on the slave.
>> After each replication, I can see the old files still hanging around
>> as well as the files that have just been pulled. This causes the data
>> directory size to increase by the index size every replication until
>> the disk fills up.
>>
>> Checking the logs, I see the following error:
>>
>> SEVERE: SnapPull failed
>> org.apache.solr.common.SolrException: Index fetch failed :
>>        at 
>> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
>>        at 
>> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:265)
>>        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>>        at 
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>        at 
>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>        at 
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>        at 
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>>        at 
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>>        at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:619)
>> Caused by: org.apache.lucene.store.LockObtainFailedException: Lock
>> obtain timed out:
>> NativeFSLock@/var/solrhome/data/index/lucene-cdaa80c0fefe1a7dfc7aab89298c614c-write.lock
>>        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
>>        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1065)
>>        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:954)
>>        at 
>> org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:192)
>>        at 
>> org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:99)
>>        at 
>> org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
>>        at 
>> org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376)
>>        at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:471)
>>        at 
>> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
>>        ... 11 more
>>
>> lsof reveals that the file is still opened from the java process.
>>
>> I am running 4.0 rev 993367 with patch SOLR-1316. Otherwise, the setup
>> is pretty vanilla. The OS is linux, the indexes are on local
>> directories, write permissions look ok, nothing unusual in the config
>> (default deletion policy, etc.). Contents of the index data dir:
>>
>> master:
>> -rw-rw-r-- 1 feeddo feeddo  191 Dec 14 01:06 _1lg.fnm
>> -rw-rw-r-- 1 feeddo feeddo  26M Dec 14 01:07 _1lg.fdx
>> -rw-rw-r-- 1 feeddo feeddo 1.9G Dec 14 01:07 _1lg.fdt
>> -rw-rw-r-- 1 feeddo feeddo 474M Dec 14 01:12 _1lg.tis
>> -rw-rw-r-- 1 feeddo feeddo  15M Dec 14 01:12 _1lg.tii
>> -rw-rw-r-- 1 feeddo feeddo 144M Dec 14 01:12 _1lg.prx
>> -rw-rw-r-- 1 feeddo feeddo 277M Dec 14 01:12 _1lg.frq
>> -rw-rw-r-- 1 feeddo feeddo  311 Dec 14 01:12 segments_1ji
>> -rw-rw-r-- 1 feeddo feeddo  23M Dec 14 01:12 _1lg.nrm
>> -rw-rw-r-- 1 feeddo feeddo  191 Dec 18 01:11 _24e.fnm
>> -rw-rw-r-- 1 feeddo feeddo  26M Dec 18 01:12 _24e.fdx
>> -rw-rw-r-- 1 feeddo feeddo 1.9G Dec 18 01:12 _24e.fdt
>> -rw-rw-r-- 1 feeddo feeddo 483M Dec 18 01:23 _24e.tis
>> -rw-rw-r-- 1 feeddo feeddo  15M Dec 18 01:23 _24e.tii
>> -rw-rw-r-- 1 feeddo feeddo 146M Dec 18 01:23 _24e.prx
>> -rw-rw-r-- 1 feeddo feeddo 283M Dec 18 01:23 _24e.frq
>> -rw-rw-r-- 1 feeddo feeddo  311 Dec 18 01:24 segments_1xz
>> -rw-rw-r-- 1 feeddo feeddo  23M Dec 18 01:24 _24e.nrm
>> -rw-rw-r-- 1 feeddo feeddo  191 Dec 18 13:15 _25z.fnm
>> -rw-rw-r-- 1 feeddo feeddo  26M Dec 18 13:16 _25z.fdx
>> -rw-rw-r-- 1 feeddo feeddo 1.9G Dec 18 13:16 _25z.fdt
>> -rw-rw-r-- 1 feeddo feeddo 484M Dec 18 13:35 _25z.tis
>> -rw-rw-r-- 1 feeddo feeddo  15M Dec 18 13:35 _25z.tii
>> -rw-rw-r-- 1 feeddo feeddo 146M 

Re: SolrCloud Questions for MultiCore Setup

2011-01-22 Thread Lance Norskog
A "collection" is your data, like newspaper articles or movie titles.
It is a user-level concept, not really a Solr design concept.

A "core" is a Solr/Lucene index. It is addressable as
solr/collection-name on one machine.

You can use a core to store a collection, or you can break it up among
multiple cores (usually for performance reasons). When you use a core
like this, it is called a "shard". All of the different shards of a
collection form the collection.

Solr has a feature called Distributed Search that presents the
separate shards as if they were one Solr collection. You should set up
Distributed Search first. It does not use SolrCloud, but shows you how
these ideas work. After that, Solr Cloud will make more sense.

Lance
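A plain Distributed Search query, as Lance suggests trying first, looks roughly like this (host names and core paths are illustrative):

```
http://host1:8983/solr/select?q=*:*&shards=host1:8983/solr,host2:8983/solr
```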

On Sat, Jan 22, 2011 at 9:35 AM, Em  wrote:
>
> Hello list,
>
> i want to experiment with the new SolrCloud feature. So far, I got
> absolutely no experience in distributed search with Solr.
> However, there are some things that remain unclear to me:
>
> 1 ) What is the usecase of a collection?
> As far as I understood: A collection is the same as a core but in a
> distributed sense. It contains a set of cores on one or multiple machines.
> It makes sense that all the cores in a collection got the same schema and
> solrconfig - right?
> Can someone tell me if I understood the concept of a collection correctly?
>
> 2 ) The wiki says this will cause an update
> -Durl=http://localhost:8983/solr/collection1/update
> However, as far as I know this cause an update to a CORE named "collection1"
> at localhost:8983, not to the full collection. Am I correct here?
> So *I* have to care about consistency between the different replicas inside
> my cloud?
>
> 3 ) If I got replicas of the same shard inside a collection, how does
> SolrCloud determine that two documents in a result set are equal? Is it
> neccessary to define a unique key? Is it random which of the two documents
> is picked into the final resultset?
>
> ---
> I think these are my most basic questions.
> However, there is one more tricky thing:
>
> If I understood the collection-idea correctly: What happens if I create two
> cores and each core belongs to a different collection and THEN I do a SWAP.
> Say: core1->collection1, core2->collection2
> SWAP core1,core2
> Does core2 now maps to collection1?
>
> Thank you!
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/SolrCloud-Questions-for-MultiCore-Setup-tp2309443p2309443.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goks...@gmail.com


Re: old index files not deleted on slave

2011-01-22 Thread Alexander Kanarsky
I see the file

-rw-rw-r-- 1 feeddo feeddo0 Dec 15 01:19
lucene-cdaa80c0fefe1a7dfc7aab89298c614c-write.lock

was created on Dec. 15. At the end of the replication, as far as I
remember, the SnapPuller tries to open the writer to ensure the old
files are deleted, and in your case it cannot obtain a lock on the index
folder on Dec 16, 17, and 18. Can you reproduce the problem if you delete
the lock file,
restart the slave
and try replication again? Do you have any other Writer(s) open for
this folder outside of this core?

-Alexander

On Sat, Jan 22, 2011 at 3:52 PM, feedly team  wrote:
> The file system checked out, I also tried creating a slave on a
> different machine and could reproduce the issue. I logged SOLR-2329.
>
> On Sat, Dec 18, 2010 at 8:01 PM, Lance Norskog  wrote:
>> This could be a quirk of the native locking feature. What's the file
>> system? Can you fsck it?
>>
>> If this error keeps happening, please file this. It should not happen.
>> Add the text above and also your solrconfigs if you can.
>>
>> One thing you could try is to change from the native locking policy to
>> the simple locking policy - but only on the child.
>>
>> On Sat, Dec 18, 2010 at 4:44 PM, feedly team  wrote:
>>> I have set up index replication (triggered on optimize). The problem I
>>> am having is the old index files are not being deleted on the slave.
>>> After each replication, I can see the old files still hanging around
>>> as well as the files that have just been pulled. This causes the data
>>> directory size to increase by the index size every replication until
>>> the disk fills up.
>>>
>>> Checking the logs, I see the following error:
>>>
>>> SEVERE: SnapPull failed
>>> org.apache.solr.common.SolrException: Index fetch failed :
>>>        at 
>>> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
>>>        at 
>>> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:265)
>>>        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>>>        at 
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>        at 
>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>>        at 
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>>        at 
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>>>        at 
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>>>        at 
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>        at 
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>        at java.lang.Thread.run(Thread.java:619)
>>> Caused by: org.apache.lucene.store.LockObtainFailedException: Lock
>>> obtain timed out:
>>> NativeFSLock@/var/solrhome/data/index/lucene-cdaa80c0fefe1a7dfc7aab89298c614c-write.lock
>>>        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
>>>        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1065)
>>>        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:954)
>>>        at 
>>> org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:192)
>>>        at 
>>> org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:99)
>>>        at 
>>> org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
>>>        at 
>>> org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376)
>>>        at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:471)
>>>        at 
>>> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
>>>        ... 11 more
>>>
>>> lsof reveals that the file is still opened from the java process.
>>>
>>> I am running 4.0 rev 993367 with patch SOLR-1316. Otherwise, the setup
>>> is pretty vanilla. The OS is linux, the indexes are on local
>>> directories, write permissions look ok, nothing unusual in the config
>>> (default deletion policy, etc.). Contents of the index data dir:
>>>
>>> master:
>>> -rw-rw-r-- 1 feeddo feeddo  191 Dec 14 01:06 _1lg.fnm
>>> -rw-rw-r-- 1 feeddo feeddo  26M Dec 14 01:07 _1lg.fdx
>>> -rw-rw-r-- 1 feeddo feeddo 1.9G Dec 14 01:07 _1lg.fdt
>>> -rw-rw-r-- 1 feeddo feeddo 474M Dec 14 01:12 _1lg.tis
>>> -rw-rw-r-- 1 feeddo feeddo  15M Dec 14 01:12 _1lg.tii
>>> -rw-rw-r-- 1 feeddo feeddo 144M Dec 14 01:12 _1lg.prx
>>> -rw-rw-r-- 1 feeddo feeddo 277M Dec 14 01:12 _1lg.frq
>>> -rw-rw-r-- 1 feeddo feeddo  311 Dec 14 01:12 segments_1ji
>>> -rw-rw-r-- 1 feeddo feeddo  23M Dec 14 01:12 _1lg.nrm
>>> -rw-rw-r-- 1 feeddo feeddo  191 Dec 18 01:11 _24e.fnm
>>> -rw-rw-r-- 1 feeddo feeddo  26M Dec 18 01:12 _24e.fdx
>>> -rw-rw-r-- 1 feeddo feeddo 1.9G Dec 18 01

Re: DIH with full-import and cleaning still keeps old index

2011-01-22 Thread Espen Amble Kolstad
You're not doing an optimize; I think an optimize would delete your old index.
Try it out with the additional parameter optimize=true.

- Espen
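The full-import request with that extra parameter would then be (host and handler path are illustrative):

```
http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true&optimize=true
```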

On Thu, Jan 20, 2011 at 11:30 AM, Bernd Fehling
 wrote:
> Hi list,
>
> after sending full-import=true&clean=true&commit=true
> Solr 4.x (apache-solr-4.0-2010-11-24_09-25-17) responds with:
> - DataImporter doFullImport
> - DirectUpdateHandler2 deleteAll
> ...
> - DocBuilder finish
> - SolrDeletionPolicy.onCommit: commits:num=2
> - SolrDeletionPolicy updateCommits
> - SolrIndexSearcher 
> - INFO: end_commit_flush
> - SolrIndexSearcher warm
> ...
> - QuerySenderListener newSearcher
> - SolrCore registerSearcher
> - SolrIndexSearcher close
> ...
>
> This all looks good to me but why is the old index not deleted?
>
> Am I missing a parameter?
>
> Regards,
> Bernd
>


RE: api key filtering

2011-01-22 Thread Jonathan Rochkind
If you COULD solve your problem by indexing 'public', or other tokens from a 
limited vocabulary of document roles, in a field -- then I'd definitely suggest 
you look into doing that, rather than doing odd things with Solr instead. If 
the only barrier is not currently having sufficient logic at the indexing stage 
to do that, then it is going to end up being a lot less of a headache in the 
long term to simply add a layer at the indexing stage to add that in, than 
trying to get Solr to do things outside of its, well, 'comfort zone'. 

Of course, depending on your requirements, it might not be possible to do that, 
maybe you can't express the semantics in terms of a limited set of roles 
applied to documents. And then maybe your best option really is sending an up 
to 2k element list (not exactly the same list every time, presumably) of 
acceptable documents to Solr with every query, and maybe you can get that to 
work reasonably.  Depending on how many different complete lists of documents 
you have, maybe there's a way to use Solr caches effectively in that situation, 
or maybe that's not even necessary since lookup by unique id should be pretty 
quick anyway, not really sure. 

But if the semantics are possible, much better to work with Solr rather than 
against it, it's going to take a lot less tinkering to get Solr to perform well 
if you can just send an fq=role:public or something, instead of a list of 
document IDs.  You won't need to worry about it, it'll just work, because you 
know you're having Solr do what it's built to do. Totally worth a bit of work 
to add a logic layer at the indexing stage. IMO. 
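For contrast, the two query shapes being compared might look like this (field names are illustrative):

```
# role-based filter: one small, highly cacheable clause
...&fq=role:public

# per-key document list: up to 2k clauses in every request
...&fq=id:(doc1 OR doc2 OR ... OR doc2000)
```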

From: Erick Erickson [erickerick...@gmail.com]
Sent: Saturday, January 22, 2011 4:50 PM
To: solr-user@lucene.apache.org
Subject: Re: api key filtering

1024 is the default; it can be increased via maxBooleanClauses
in solrconfig.xml.

This shouldn't be a problem with 2K clauses, but expanding it to tens of
thousands is probably a mistake (but test to be sure).

Best
Erick

On Sat, Jan 22, 2011 at 3:50 PM, Matt Mitchell  wrote:

> Hey thanks I'll definitely have a read. The only problem with this though,
> is that our api is a thin layer of app-code, with solr only (no db), we
> index data from our sql db into solr, and push the index off for
> consumption.
>
> The only other idea I had was to send a list of the allowed document ids
> along with every solr query, but then I'm sure I'd run into a filter query
> limit. Each key could be associated with up to 2k documents, so that's 2k
> values in an fq which would probably be too many for lucene (I think its
> limit 1024).
>
> Matt
>
> On Sat, Jan 22, 2011 at 3:40 PM, Dennis Gearon  >wrote:
>
> > The only way that you would have that many api keys per record, is if one
> > of
> > them represented 'public', right? 'public' is a ROLE. Your answer is to
> use
> > RBAC
> > style techniques.
> >
> >
> > Here are some links that I have on the subject. What I'm thinking of
> doing
> > is:
> > Sorry for formatting, Firefox is freaking out. I cut and pasted these
> from
> > an
> > email from my sent box. I hope the links came out.
> >
> >
> > Part 1
> >
> >
> >
> http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/
> >
> >
> > Part2
> > Role-based access control in SQL, part 2 at Xaprb
> >
> >
> >
> >
> >
> > ACL/RBAC Bookmarks ALL
> >
> > UserRbac - symfony - Trac
> > A Role-Based Access Control (RBAC) system for PHP
> > Appendix C: Task-Field Access
> > Role-based access control in SQL, part 2 at Xaprb
> > PHP Access Control - PHP5 CMS Framework Development | PHP Zone
> > Linux file and directory permissions
> > MySQL :: MySQL 5.0 Reference Manual :: C.5.4.1 How to Reset the Root Password
> > per RECORD/Entity permissions? - symfony users | Google Groups
> > Special Topics: Authentication and Authorization | The Definitive Guide to Yii | Yii Framework
> >
> > att.net Mail (gear...@sbcglobal.net)
> > Solr - User - Modelling Access Control
> > PHP Generic Access Control Lists
> > Row-level Model Access Control for CakePHP « some flot, some jet
> > Row-level Model Access Control for CakePHP « some flot, some jet
> > Yahoo! GeoCities: Get a web site with easy-to-use site building tools.
> > Class that acts as a client to a JSON service : JSON « GWT « Java
> > Juozas Kaziukėnas devBlog
> > Re: [symfony-users] Implementing an existing ACL API in symfony
> > php - CakePHP ACL Database Setup: ARO / ACO structure? - Stack Overflow
> > W3C ACL System
> > makeAclTables.sql
> > SchemaWeb - Classes And Properties - ACL Schema
> > Reardon's Ruminations: Spring Security ACL Schema for Oracle
> > trunk/modules/auth/libraries/Khacl.php | Source/SVN | Assembla
> > Acl.php - kohana-mptt - Project Hosting on Google Code
> > Asynchronous JavaScript Technology and XML (Ajax) With the Java Platform
> > The page cannot be found
> >
> >
> >  Dennis Gearon
> >
> >
> > Signature Warning
>

Re: api key filtering

2011-01-22 Thread Dennis Gearon
Totally agree, do it at indexing time, in the index.

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Jonathan Rochkind 
To: "solr-user@lucene.apache.org" 
Sent: Sat, January 22, 2011 5:28:50 PM
Subject: RE: api key filtering

If you COULD solve your problem by indexing 'public', or other tokens from a
limited vocabulary of document roles, in a field -- then I'd definitely suggest
you look into doing that, rather than doing odd things with Solr instead. If the
only barrier is not currently having sufficient logic at the indexing stage to
do that, then it is going to end up being a lot less of a headache in the long
term to simply add a layer at the indexing stage to add that in, than trying to
get Solr to do things outside of its, well, 'comfort zone'.


Of course, depending on your requirements, it might not be possible to do that;
maybe you can't express the semantics in terms of a limited set of roles applied
to documents. And then maybe your best option really is sending an up-to-2k
element list (not exactly the same list every time, presumably) of acceptable
documents to Solr with every query, and maybe you can get that to work
reasonably. Depending on how many different complete lists of documents you
have, maybe there's a way to use Solr caches effectively in that situation, or
maybe that's not even necessary, since lookup by unique id should be pretty
quick anyway; not really sure.


But if the semantics are possible, it's much better to work with Solr rather than
against it; it's going to take a lot less tinkering to get Solr to perform well
if you can just send an fq=role:public or something, instead of a list of
document IDs. You won't need to worry about it, it'll just work, because you
know you're having Solr do what it's built to do. Totally worth a bit of work to
add a logic layer at the indexing stage. IMO.
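To make the contrast concrete, here is a toy sketch (plain Java, not SolrJ; the document ids and role values are invented) of what index-time role tagging buys you: the filter becomes a single term match against a stored role field, rather than a per-query list of ids:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RoleFilterSketch {
    // Toy "index": each document id mapped to the value of its indexed role field.
    static final Map<String, String> ROLE_FIELD = Map.of(
            "doc1", "public",
            "doc2", "private",
            "doc3", "public");

    // The in-memory equivalent of fq=role:public -- one term match on the
    // indexed role, independent of how many documents carry that role.
    static List<String> filterByRole(String role) {
        return ROLE_FIELD.entrySet().stream()
                .filter(e -> e.getValue().equals(role))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(filterByRole("public")); // [doc1, doc3]
    }
}
```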


From: Erick Erickson [erickerick...@gmail.com]
Sent: Saturday, January 22, 2011 4:50 PM
To: solr-user@lucene.apache.org
Subject: Re: api key filtering

1024 is the default clause count; it can be increased via the maxBooleanClauses
setting in solrconfig.xml.
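For reference, the setting lives in the <query> section of solrconfig.xml; a sketch of raising it (the value shown is illustrative, not a recommendation):

```xml
<!-- solrconfig.xml: inside the <query> section -->
<!-- Default is 1024; raise only as far as actually needed, and test. -->
<maxBooleanClauses>4096</maxBooleanClauses>
```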

This shouldn't be a problem with 2K clauses, but expanding it to tens of
thousands is probably a mistake (but test to be sure).

Best
Erick


Re: api key filtering

2011-01-22 Thread Matt Mitchell
I think that indexing the access information is going to work nicely, and I
agree that sticking with the simplest/solr way is best. The constraint is
super simple... you can view this set of documents or you can't... based on
an api key: fq=api_key:xxx

Thanks for the feedback on this guys!
Matt
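The fq value above is easy to build in app code; a minimal plain-Java sketch, assuming the api_key is a single token (the escaping helper is illustrative, covering Lucene's special query characters so the key is treated as one literal term):

```java
public class ApiKeyFilter {
    // Build the filter-query value for fq=api_key:<key>. Special query
    // characters in the key are backslash-escaped so the whole key is
    // parsed as a single literal term.
    static String apiKeyFilter(String key) {
        String escaped = key.replaceAll("([+\\-!(){}\\[\\]^\"~*?:\\\\/]|&&|\\|\\|)", "\\\\$1");
        return "api_key:" + escaped;
    }

    public static void main(String[] args) {
        System.out.println(apiKeyFilter("abc123")); // api_key:abc123
        System.out.println(apiKeyFilter("k:1"));    // api_key:k\:1
    }
}
```

In SolrJ this string would then go into the query's filter-query parameter rather than being concatenated into q.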
