Query facet count and its matching documents

2012-07-14 Thread Gnanakumar
Hi,

We're running Apache Solr v3.1 and SolrJ is our client.

We're passing multiple arbitrary facet queries (facet.query) to get the
number of matching documents (the facet count) evaluated over the search
results in a *single* Solr query.  My use case also demands the actual matching
facet results/documents/fields along with the facet count.
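
For illustration, here is a minimal SolrJ sketch of what we are doing (the
Solr URL and the facet query strings are placeholders, not our real ones):

    import java.util.Map;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class FacetQueryCounts {
        public static void main(String[] args) throws Exception {
            // Placeholder URL; adjust to the real Solr instance.
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");

            SolrQuery query = new SolrQuery("ipod");
            query.setFacet(true);
            // Multiple arbitrary facet queries, all evaluated over the same result set.
            query.addFacetQuery("price:[0 TO 100]");
            query.addFacetQuery("price:[100 TO *]");

            QueryResponse rsp = server.query(query);
            // Only the per-query counts come back here.
            Map<String, Integer> facetCounts = rsp.getFacetQuery();
            System.out.println(facetCounts);
        }
    }

getFacetQuery() only exposes the counts; what we also need are the documents
that matched each facet.query.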

My question is: is it possible to get the facet-query matching results along
with the facet counts in a single Solr query call?

Regards,
Gnanam




Re: Help with user file searching with custom permissions

2012-07-14 Thread Ahmet Arslan
> Can anyone give me some general advice or pointers for
> setting up such an index that enforces user access
> permissions for this type of file/folder search?

Maybe you can make use of http://manifoldcf.apache.org/en_US/index.html


Query results vs. facets results

2012-07-14 Thread tudor
Hello,

I am new to Solr and I am running some tests with our data. We are
using version 3.6 and the data is imported from a DB2 database using Solr's
DIH. We have defined a single entity in db-data-config.xml, which is the
equivalent of the following query:



This might lead to some names appearing multiple times in the result set.
This is OK.

For the unique ID in the schema, we are using a solr.UUIDField:


http://localhost:8983/solr/db/select?indent=on&version=2.2&q=CITY:MILTON&fq=&start=0&rows=100&fl=*&wt=&explainOther=&hl.fl=&group=true&group.field=NAME&group.ngroups=true&group.truncate=true

yields 

134

as a result, which is exactly what we expect. 

On the other hand, running

http://localhost:8983/solr/db/select?indent=on&version=2.2&q=*&fq=&start=0&rows=10&fl=*&wt=&explainOther=&hl.fl=&group=true&group.field=NAME&group.truncate=true&facet=true&facet.field=CITY&group.ngroups=true

yields

103

I would expect to have the same number (134) in this facet result as well.
Could you please let me know why these two results are different?

Thank you,
Tudor





Re: Multivalued attribute grouping in SOLR

2012-07-14 Thread Ahmet Arslan
> I came across a problem where one of
> my columns is multivalued, e.g. values can
> be (11,22) (11,33) (11,55) , (22,44) , (22,99)
> 
> I want to perform a grouping operation that will yield:
> 
>     * 11 : count 3
>     * 22 : count 3
>     * 33 : 1
>     * 44 : 1
>     * 55 : 1
>     * 99 : 1

According to the wiki, "Support for grouping on a multi-valued field has not yet 
been implemented."

http://wiki.apache.org/solr/FieldCollapsing#Known_Limitations


Re: DIH include Fieldset in query

2012-07-14 Thread Ahmet Arslan


--- On Fri, 7/13/12, stockii  wrote:

> From: stockii 
> Subject: DIH include Fieldset in query
> To: solr-user@lucene.apache.org
> Date: Friday, July 13, 2012, 11:42 AM
> hello..
> 
> I have many big entities in my data-config.xml, and many of the entities
> contain the same query.
> The entities look like this:
> 
> <entity pk="id"
>   query="
>     SELECT
>       field AS fieldname,
>       IF(bla NOT NULL, 1, 0) AS blob,
>       fieldname,
>       fieldname AS field, ...
> 
> and more and more.
> 
> Is it possible to include text from a file, or something like this, in
> data-config.xml?

So you want to re-use the same SQL statement in many entities? 
I think you can do it with:

http://wiki.apache.org/solr/DataImportHandler#Custom_Functions
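
For example, here is a rough, untested sketch of such a custom function: an
Evaluator that inlines the contents of a file into the query attribute. The
class name, the file handling, and the simplified argument parsing are
illustrative assumptions; see the wiki page above for the exact registration
details.

    package com.example.dih;

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    import org.apache.solr.handler.dataimport.Context;
    import org.apache.solr.handler.dataimport.DataImportHandlerException;
    import org.apache.solr.handler.dataimport.Evaluator;

    /**
     * Hypothetical evaluator that inlines the contents of a SQL file.
     * Registered in data-config.xml as:
     *   <function name="includeSql" class="com.example.dih.IncludeSqlEvaluator"/>
     * and used inside an entity as:
     *   query="${dataimporter.functions.includeSql('/path/to/shared-select.sql')} WHERE ..."
     */
    public class IncludeSqlEvaluator extends Evaluator {

        @Override
        public String evaluate(String expression, Context context) {
            // Naive argument handling: strip the quotes around the single argument.
            String fileName = expression.trim().replaceAll("^'|'$", "");
            StringBuilder sql = new StringBuilder();
            BufferedReader reader = null;
            try {
                reader = new BufferedReader(
                    new InputStreamReader(new FileInputStream(fileName), "UTF-8"));
                String line;
                while ((line = reader.readLine()) != null) {
                    sql.append(line).append('\n');
                }
            } catch (IOException e) {
                throw new DataImportHandlerException(DataImportHandlerException.SEVERE,
                    "Could not read included SQL file: " + fileName, e);
            } finally {
                if (reader != null) {
                    try { reader.close(); } catch (IOException ignore) { /* best effort */ }
                }
            }
            return sql.toString();
        }
    }

The jar containing the class has to be on Solr's lib path, and I believe the
EvaluatorBag helpers in the same package do more robust argument parsing than
the one-liner above.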




Pb installation Solr/Tomcat6

2012-07-14 Thread Bruno Mannina

Dear Solr users,

I'm trying to run Solr with Tomcat, but I always get this error:
Can't find resource 'schema.xml' in classpath or 
'/home/solr/apache-solr-3.6.0/example/solr/./conf/', cwd='/var/lib/tomcat6


but schema.xml is inside the directory 
'/home/solr/apache-solr-3.6.0/example/solr/./conf/'


http://localhost:8080/manager/html => works fine; I can see the /solr 
application listed as running (True).


But when I click on /solr (http://localhost:8080/solr/) I get the error above.

Could you help me solve this problem? It's driving me crazy.

thanks a lot,
Bruno


Tomcat6
Ubuntu 12.04
Solr 3.6


Re: Help with user file searching with custom permissions

2012-07-14 Thread Jack Krupansky
If I recall properly, MCF had a proposed patch in this area that was never 
fully accepted by the community. The LucidWorks Enterprise product has 
support for "search filters for access control" as well:


http://lucidworks.lucidimagination.com/display/lweug20/Search+Filters+for+Access+Control

The basic concept is that each user needs a secure login, their user profile 
specifies which "roles" they are granted, and each document is then tagged 
with the roles that are permitted or denied access to that document. A 
custom search component then filters the documents based on that combination 
of user roles and access control filters.
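
A rough SolrJ sketch of that concept (the acl_allow field name and the role
values are assumptions for illustration, not part of either product):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrInputDocument;

    public class AclFilterSketch {

        /** Index time: tag each document with the roles allowed to see it. */
        static void indexFile(SolrServer server, String id, String title,
                              String... allowedRoles) throws Exception {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", id);
            doc.addField("title", title);
            for (String role : allowedRoles) {
                doc.addField("acl_allow", role); // assumed multi-valued string field
            }
            server.add(doc);
        }

        /** Query time: the user's roles (from their profile) become a filter query. */
        static QueryResponse search(SolrServer server, String userQuery,
                                    String... userRoles) throws Exception {
            SolrQuery q = new SolrQuery(userQuery);
            StringBuilder fq = new StringBuilder("acl_allow:(");
            for (int i = 0; i < userRoles.length; i++) {
                if (i > 0) fq.append(" OR ");
                fq.append('"').append(userRoles[i]).append('"');
            }
            fq.append(')');
            q.addFilterQuery(fq.toString());
            return server.query(q);
        }
    }

In a sketch like this, changing a user's roles only changes the filter query,
while changing a document's permissions means re-indexing that document with a
new acl_allow value.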


-- Jack Krupansky

-Original Message- 
From: Ahmet Arslan

Sent: Saturday, July 14, 2012 4:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Help with user file searching with custom permissions


Can anyone give me some general advice or pointers for
setting up such an index that enforces user access
permissions for this type of file/folder search?


Maybe you can make use of http://manifoldcf.apache.org/en_US/index.html



Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Bruno Mannina

I think I found the problem: it was a permission problem on schema.xml.

schema.xml was only readable by the solr user.

Now I have the same problem with the Solr index directory.

Le 14/07/2012 14:00, Bruno Mannina a écrit :

Dear Solr users,

I try to run solr/ with tomcat but I have always this error:
Can't find resource 'schema.xml' in classpath or 
'/home/solr/apache-solr-3.6.0/example/solr/./conf/', 
cwd='/var/lib/tomcat6


but schema.xml is inside the directory 
'/home/solr/apache-solr-3.6.0/example/solr/./conf/'


http://localhost:8080/manager/html => works fine, I see Applications 
/solr, fonctionnelle True


but when I click on solr/ (http://localhost:8080/solr/) I get this error.

Could you help me to solve this problem, it makes me crazy.

thanks a lot,
Bruno


Tomcat6
Ubuntu 12.04
Solr 3.6







Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Vadim Kisselmann
Same problem here: tomcat6 needs the right to read/write your index.
regards
vadim


2012/7/14 Bruno Mannina :
> I found the problem I think, It was a permission problem on the schema.xml
>
> schema.xml was only readable by the solr user.
>
> Now I have the same problem with the solr index directory
>


Re: Is it possible to alias a facet field?

2012-07-14 Thread Jamie Johnson
So this got me close

facet.field=testfield&facet.field=%7B!key=mylabel%7Dtestfield&f.mylabel.limit=1

but the limit on the alias didn't seem to work.  Is this expected?

On Sat, Jul 14, 2012 at 10:03 AM, Jamie Johnson  wrote:
> I am looking to facet on a field in more than one way.  My data is of the form
>
> 7abcdefgh
> 7abcdefgi
> 7abcdefgj
> 7abcdefgk
> 7bbcdefgl
> 7bbcdefgm
> 7bbcdefgn
>
> I want to get all of the counts for 7ab* and the max count for 7*.  I
> had thought I could do this with an alias but I don't see how to do
> this currently.  Is there a way to do this or should I just execute 2
> queries?


Re: Is it possible to alias a facet field?

2012-07-14 Thread Yonik Seeley
On Sat, Jul 14, 2012 at 10:12 AM, Jamie Johnson  wrote:
> So this got me close
>
> facet.field=testfield&facet.field=%7B!key=mylabel%7Dtestfield&f.mylabel.limit=1
>
> but the limit on the alias didn't seem to work.  Is this expected?

Per-field params don't currently look under the alias.  I believe
there's a JIRA open for this.

-Yonik
http://lucidimagination.com


Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Bruno Mannina
Yes, I actually made a backup of my index before changing/testing the chgrp 
action.

As I'm a newbie on Linux, are these the right commands?
sudo chgrp tomcat6 /solrindex
sudo chmod g+s /solrindex



Le 14/07/2012 15:31, Vadim Kisselmann a écrit :

same problem.
but here should tomcat6 have the right to read/write your index.
regards
vadim






Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Bruno Mannina

Le 14/07/2012 16:37, Bruno Mannina a écrit :

As I'm a newbie on Linux, is it the right commands:
sudo chgrp tomcat6 /solrindex 

If I run this command, nothing changes?!

For solrindex/:
Before, I have: solr:solr
After, I have: solr:solr

chgrp has no effect?!



Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Bruno Mannina

If I use Nautilus (with sudo),

I get the list of users and the list of groups, but when I choose tomcat6, 
the selection instantly changes back to "solr",

and I can't modify it.

Help please,
Bruno





Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Bruno Mannina
Hmm, it seems to be an NTFS problem, because it's an external HDD that I 
also use with Windows.




AjaxSolr + Solr + Nutch question

2012-07-14 Thread praful
I followed https://github.com/evolvingweb/ajax-solr/wiki/reuters-tutorial 
for the Ajax-Solr setup.

Ajax-Solr is running, but it only searches the Reuters data. If I want to
crawl the web using Nutch and integrate it with Solr, then I have to replace
Solr's schema.xml with Nutch's schema.xml, which will not match the Ajax-Solr
configuration. By replacing the schema.xml files, Ajax-Solr won't work
(correct me if I am wrong).

How would I integrate Solr with Nutch along with Ajax-Solr, so that Ajax-Solr
can also search other data from the web?

Thanks
Regards
Praful Bagai



SOLR 4 Alpha Out Of Mem Err

2012-07-14 Thread Nick Koton
I have been experiencing out of memory errors when indexing via solrj into a
4 alpha cluster.  It seems when I delegate commits to the server (either
auto commit or commit within) there is nothing to throttle the solrj clients
and the server struggles to fan out the work.  However, when I handle
commits entirely within the client, the indexing rate is very restricted.
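
To be clear about the commit strategies I am comparing, roughly (sketch only;
the server and document setup are assumed):

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CommitStrategies {

        // Which of these the client uses is the difference described above.
        static void index(SolrServer server, SolrInputDocument doc) throws Exception {
            // 1) Auto commit: commits are configured server-side in solrconfig.xml
            //    (autoCommit maxTime/maxDocs); the client only adds documents.
            server.add(doc);

            // 2) Commit-within: the client asks the server to commit within N milliseconds.
            server.add(doc, 30000);

            // 3) Explicit client commit: the client issues (and blocks on) the commit itself.
            server.add(doc);
            server.commit();
        }
    }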

Any suggestions would be appreciated

Nick Cotton
nick.ko...@gmail.com







Re: Help with user file searching with custom permissions

2012-07-14 Thread Matt Palermo
With the file manager system, the users don’t have an account on the server 
itself. They have user accounts stored in a MySQL database and all permissions 
are tied to the user id from the database. I don’t know if this complicates 
things at all, but I thought I’d throw that out there in case anyone has ideas 
for this.

Thanks,

Matt

From: Matt Palermo 
Sent: Saturday, July 14, 2012 2:45 AM
To: solr-user@lucene.apache.org 
Subject: Help with user file searching with custom permissions

I'm looking to get some advice on setting up a workflow and schema for a 
project I'm working on. Here is a bit of background on the project... The 
project is an online file management system. It has hundreds of users at the 
moment. There are 1,000,000+ files and folders already existing in the system 
and it's growing rapidly. I want all the files/folders to be indexed and 
searchable. I have no problem adding file data (i.e. file/folder name, creation 
date, author name, etc) to the index. The part that I'm struggling with is that 
the system implements file/folder permissions for all the users. So users won't 
be able to access all 1,000,000+ files/folders on the site. So the user's 
search results should only return things they can access.

I keep a MySQL database of users' permissions. On the site, they will see the 
initial list of all top-level folders they can access, then they can navigate 
into sub-folders and they will only see the sub-folders they can access. So the 
permissions for all the files and folders are set and working. I just need to 
enforce these access permissions when the user runs a search on the index.

I must also keep in mind that user permissions might be changed periodically. A 
user might be granted access to some newly created folders, or their access 
permission for existing folder might be revoked. So I need to keep this as 
"dynamic" as possible.

Can anyone give me some general advice or pointers for setting up such an index 
that enforces user access permissions for this type of file/folder search?

Thanks,

Matt


Re: SOLR 4 Alpha Out Of Mem Err

2012-07-14 Thread Mark Miller
Can you give more info? How much RAM are you giving Solr with Xmx? 

Can you be more specific about the behavior you are seeing with auto commit vs 
client commit? 

How often are you trying to commit? With the client? With auto commit?

Are you doing soft commits? Std commits? A mix?

What's the stack trace for the OOM?

What OS are you using?

Anything else you can add? 

-- 
Mark Miller



On Saturday, July 14, 2012 at 4:21 PM, Nick Koton wrote:

> I have been experiencing out of memory errors when indexing via solrj into a
> 4 alpha cluster. It seems when I delegate commits to the server (either
> auto commit or commit within) there is nothing to throttle the solrj clients
> and the server struggles to fan out the work. However, when I handle
> commits entirely within the client, the indexing rate is very restricted.
> 
> Any suggestions would be appreciated
> 
> Nick Cotton
> nick.ko...@gmail.com
> 
> 




SolrCloud survey

2012-07-14 Thread Mark Miller
I know there are a variety of people out there already using SolrCloud - some 
are testing and investigating, others are already in production. I don't know 
many details about these installs though.  

Could anyone that is able/willing share details about your SolrCloud 
experience/setup?

I'm very interested in things like: how many nodes are you using? How many 
shards? How many replicas? How many docs? What kind of indexing/search load? Or 
anything else you might add. 

Thanks, 

-- 
Mark Miller




RE: SOLR 4 Alpha Out Of Mem Err

2012-07-14 Thread Nick Koton
> Can you give more info? How much RAM are you giving Solr with Xmx?
RHEL 5.4
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
java  -Xmx16384m -Xms8192m
SOLR shards=6, each with a replica (i.e. total 12 JVMs)
SolrJ multi-threaded client sends several thousand docs/sec with auto commit or 
commit within
Same client with explicit client commit tops out around 1,000/sec

> How often are you trying to commit? With the client? With auto commit?
With auto commit or commit within, I've tried things in the range of 
10,000-60,000 ms.  I have tried both with and without soft commit.  When I 
tried soft, it was in the range of 500-5,000 ms.  With explicit client commit I 
have tried in the range of 10,000 to 200,000 documents.

>Can you be more specific about the behavior you are seeing with auto commit vs 
>client commit?
With auto commit, the client threads run without pausing and quickly ramp up to 
several thousand docs per second.  After a variable number of minutes, but seldom 
as long as an hour, the server(s) to which the client(s) are attached get 
exceptions.  I have included one stack trace below.  Clients continue to run 
without error.

With explicit client commit, the client will pause for a few seconds at the 
commit with a single thread and hit overall index rates around 500/sec.  With a 
second thread, the commit is around 5 seconds and overall rate is around 
1,000/sec.  Adding a third thread causes the commit time to increase to a 
minute or more.  Overall rate increases only slightly with the third and 
subsequent threads.

> Anything else you can add?
I have seen this with a configuration as small as a single shard and a replica, 
but I've always been working with SolrCloud.  When there is a single shard 
without a replica, I have not seen the problem.


INFO: [shipment] webapp=/solr path=/update 
params={update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=2 
Jul 14, 2012 9:20:57 PM org.apache.solr.core.SolrCore execute
INFO: [shipment] webapp=/solr path=/update 
params={update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=3 
Jul 14, 2012 9:20:57 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: unable to 
create new native thread
at 
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:456)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:284)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:954)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:952)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640

Re: Index version on slave incrementing to higher than master

2012-07-14 Thread Erick Erickson
Gotta admit it's a bit puzzling, and surely you want to move to the 3.x
versions...

But at a guess, things might be getting confused on the slaves given
you have a merge policy on them. There's no reason to have any
policies on the slaves; slaves should just be about copying the files
from the master, all the policies,commits,optimizes should be done on
the master. About all the slave does is copy the current state of the index
from the master.

So I'd try removing everything but the replication from the slaves, including
any autocommit stuff, and just let replication do its thing.

And I'd replicate after the optimize if you keep the optimize going. You should
end up with one segment in the index after that, on both the master and slave.
You can't get any more merged than that.

Of course you'll also copy the _entire_ index every time after you've
optimized...

Best
Erick

On Fri, Jul 13, 2012 at 12:31 AM, Andrew Davidoff  wrote:
> Hi,
>
> I am running solr 1.4.0+ds1-1ubuntu1. I have a master server that has a
> number of solr instances running on it (150 or so), and nightly most of
> them have documents written to them. The script that does these writes
> (adds) does a commit and an optimize on the indexes when it's entirely
> finished updating them, then initiates replication on the slave per
> instance. In this configuration, the index versions between master and
> slave remain in synch.
>
> The optimize portion, which, again, happens nightly, is taking a lot of
> time and I think it's unnecessary. I was hoping to stop doing this explicit
> optimize, and to let my merge policy handle that. However, if I don't do an
> optimize, and only do a commit before initiating slave replication, some
> hours later the slave is, for reasons that are unclear to me, incrementing
> its index version to 1 higher than the master.
>
> I am not really sure I understand the logs, but it looks like the
> incremented index version is the result of an optimize on the slave, but I
> am never issuing any commands against the slave aside from initiating
> replication, and I don't think there's anything in my solr configuration
> that would be initiating this. I do have autoCommit on with maxDocs of
> 1000, but since I am initiating slave replication after doing a commit on
> the master, I don't think there would ever be any uncommitted documents on
> the slave. I do have a merge policy configured, but it's not clear to me
> that it has anything to do with this. And if it did, I'd expect to see
> similar behavior on the master (right?).
>
> I have included a snippet from my slave logs that shows this issue. In this
> snippet, index version 1286065171264 is what the master has,
> and 1286065171265 is what the slave increments itself to, which is then out
> of synch with the master in terms of version numbers. Nothing that I know
> of is issuing any commands to the slave at this time. If I understand these
> logs (I might not), it looks like something issued an optimize that took
> 1023720ms? Any ideas?
>
> Thanks in advance.
>
> Andy
>
>
>
> Jul 12, 2012 12:21:14 PM org.apache.solr.update.SolrIndexWriter close
> FINE: Closing Writer DirectUpdateHandler2
> Jul 12, 2012 12:21:14 PM org.apache.solr.core.SolrDeletionPolicy onCommit
> INFO: SolrDeletionPolicy.onCommit: commits:num=2
>
> commit{dir=/var/lib/ontolo/solr/o_3952/index,segFN=segments_h8,version=1286065171264,generation=620,filenames=[_h6.fnm,
> _h5.nrm, segments_h8, _h4.nrm, _h5.tii, _h4
> .tii, _h5.tis, _h4.tis, _h4.fdx, _h5.fnm, _h6.tii, _h4.fdt, _h5.fdt,
> _h5.fdx, _h5.frq, _h4.fnm, _h6.frq, _h6.tis, _h4.prx, _h4.frq, _h6.nrm,
> _h5.prx, _h6.prx, _h6.fdt, _h6
> .fdx]
>
> commit{dir=/var/lib/ontolo/solr/o_3952/index,segFN=segments_h9,version=1286065171265,generation=621,filenames=[_h7.tis,
> _h7.fdx, _h7.fnm, _h7.fdt, _h7.prx, segment
> s_h9, _h7.nrm, _h7.tii, _h7.frq]
> Jul 12, 2012 12:21:14 PM org.apache.solr.core.SolrDeletionPolicy
> updateCommits
> INFO: newest commit = 1286065171265
> Jul 12, 2012 12:21:14 PM org.apache.solr.search.SolrIndexSearcher 
> INFO: Opening Searcher@4ac62082 main
> Jul 12, 2012 12:21:14 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Jul 12, 2012 12:21:14 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming Searcher@4ac62082 main from Searcher@48d901f7 main
>
> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative
> _inserts=0,cumulative_evictions=0}
> Jul 12, 2012 12:21:14 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for Searcher@4ac62082 main
>
> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Jul 12, 2012 12:21:14 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming Searcher@4ac62082 main

Re: Solr facet multiple constraint

2012-07-14 Thread Erick Erickson
What do you get when you attach &debugQuery=on? That should show you the parsed
query. You might paste that.

Erick

On Fri, Jul 13, 2012 at 1:14 AM, davidbougearel
 wrote:
> OK, I know about the complexity that I can put into fq with AND and OR
> conditions, but at the moment, when I put fq=user:10 with facet.field=user, the
> query returns me all the facets without taking the fq=user:10 into account;
> that's the problem.
>


Re: Computed fields - can I put a function in fl?

2012-07-14 Thread Erick Erickson
I think in 4.0 you can, but not in 3.x, as I remember. Your example has
the fl as part
of the highlight though; is that a typo?

Best
Erick

On Fri, Jul 13, 2012 at 5:21 AM, maurizio1976
 wrote:
> Hi,
> I have 2 fields, one containing a string (product) and another containing a
> boolean (show_product).
>
> Is there a way of returning the product field with a value of null when the
> show_product field is false?
>
> I can make another field (product_computed) and index that with null where I
> need but I would like to understand if there is a better approach like
> putting a function query in the fl and make a computed field.
>
> something like:
> q=*:*&fq=&start=0&rows=10&fl=&qt=&wt=&explainOther=&hl.fl=/*product:(if(show_product:true,
> product, "")*/
>
> that obviously doesn't work.
>
> thanks for any help
>
> Maurizio
>


Re: Groups count in distributed grouping is wrong in some case

2012-07-14 Thread Erick Erickson
what version of Solr are you using? There's been quite a bit of work
on this lately,
I'm not even sure how much has made it into 3.6. You might try searching the
JIRA list, Martijn van Groningen has done a bunch of work lately, look for
his name. Fortunately, it's not likely to get a bunch of false hits ..

Best
Erick

On Fri, Jul 13, 2012 at 7:50 AM, Agnieszka Kukałowicz
 wrote:
> Hi,
>
> I have a problem with facet counts in distributed grouping. It appears only
> when I make a query that returns almost all of the documents.
>
> My Solr setup has 4 shards and my queries look like:
>
> http://host:port
> /select/?q=*:*&shards=shard1,shard2,shard3,shard4&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> With query like above I get strange counts for field category1.
> The counts for values are very big:
> 9659
> 7015
> 5676
> 1180
> 1105
> 979
> 770
> 701
> 612
> 422
> 358
>
> When I make query to narrow the results adding to query
> fq=category1:"val1", etc. I get different counts than facet category1 shows
> for a few first values:
>
> fq=category1:"val1" - counts: 22
> fq=category1:"val2" - counts: 22
> fq=category1:"val3" - counts: 21
> fq=category1:"val4" - counts: 19
> fq=category1:"val5" - counts: 19
> fq=category1:"val6" - counts: 20
> fq=category1:"val7" - counts: 20
> fq=category1:"val8" - counts: 25
> fq=category1:"val9" - counts: 422
> fq=category1:"val10" - counts: 358
>
> From val9 the count is ok.
>
> First I thought that for some values in facet "category1" groups count does
> not work and it returns counts of all documents not group by field id.
> But the number of all documents matches query  fq=category1:"val1" is
> 45468. So the numbers are not the same.
>
> I check the queries on each shard for val1 and the results are:
>
> shard1:
> query:
> http://shard1/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> 
> 11
>
> query:
> http://shard1/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1&fq=category1
> :"val1"
>
> shard 2:
> query:
> http://shard2/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> there is no value "val1" in category1 facet.
>
> query:
> http://shard2/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1&fq=category1
> :"val1"
>
> 7
>
> shard3:
> query:
> http://shard3/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> there is no value val1 in category1 facet
>
> query:
> http://shard3/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1&fq=category1
> :"val1"
>
> 4
>
> So it looks like the detailed query with fq=category1:"val1" returns the relevant
> results, but Solr has a problem with facet counts when one of the shards
> does not return a facet value (in this scenario "val1") that exists on
> other shards.
>
> I checked shards for "val10" and I got:
>
> shard1: count for val10 - 142
> shard2: count for val10 - 131
> shard3: count for val10 -  149
> sum of counts 422 - ok.
>
> I'm not sure how to resolve that situation. For sure the counts of val1 to
> val9 should be different and they should not be on the top of the category1
> facet because this is very confusing. Do you have any idea how to fix this
> problem?
>
> Best regards
> Agnieszka


Re: edismax not working in a core

2012-07-14 Thread Erick Erickson
Really hard to say. Try executing your query on the cores with
&debugQuery=on and compare the parsed results (for this you
can probably just ignore the explain bits of the output, concentrate
on the parsed query).

Best
Erick

On Fri, Jul 13, 2012 at 11:25 AM, Richard Frovarp  wrote:
> I'm having trouble with edismax not working in one of my cores. I have three
> cores up and running, including the demo in Solr 3.6 on Tomcat 7.0.27 on
> Java 1.6.
>
> I can't get edismax to work on one of those cores, and it's configured very
> similarly to the demo, which does work. I have different fields, but overall
> I'm not doing much different. I'm testing using a query with "OR" in it to
> try to get a union. On two of the cores, I get the union, on my third one I
> get a much smaller set than either term should return. If I tell the
> misbehaving core to have a defType of lucene, that does honor the "OR".
>
> What could I be possibly missing?
>
> Thanks,
> Richard


Re: Sort by date field = outofmemory?

2012-07-14 Thread Lance Norskog
Sorting requires an array of 4-byte ints, one for each document. If
the field is a number or date, this is the only overhead. 80M docs * 4
bytes = 320 MB for each sorted field. If it is something else, like
a string, Lucene also creates an array with one entry for every unique value.

If your query result sets are small, you can sort on a function. This
does not create these large arrays.

On Thu, Jul 12, 2012 at 8:09 AM, Erick Erickson  wrote:
> Bruno:
>
> You can also reduce your memory requirements by storing fewer unique values.
> All the _unique_ values for a field in the index are read in for
> sorting. People often
> store timestamps in milliseconds, which essentially means that every
> document has
> a unique value.
>
> Storing your timestamps in the coarsest granularity that suits your use-case 
> is
> always a good idea, see the date math:
> http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apache/solr/util/DateMathParser.html
>
> Best
> Erick
>
> On Wed, Jul 11, 2012 at 12:44 PM, Yury Kats  wrote:
>> This solves the problem by allocating memory up front, instead of at some
>> point later when JVM needs it. At that later point in time there may not
>> be enough free memory left on the system to allocate.
>>
>> On 7/11/2012 11:04 AM, Michael Della Bitta wrote:
>>> There is a school of thought that suggests you should always set Xms
>>> and Xmx to the same thing if you expect your heap to hit Xms. This
>>> results in your process only needing to allocate the memory once,
>>> rather in a series of little allocations as the heap expands.
>>>
>>> I can't explain how this fixed your problem, but just a datapoint that
>>> might suggest that doing what you did is not such a bad thing.
>>>
>>> Michael Della Bitta
>>>
>>> 
>>> Appinions, Inc. -- Where Influence Isn’t a Game.
>>> http://www.appinions.com
>>>
>>>
>>> On Wed, Jul 11, 2012 at 4:05 AM, Bruno Mannina  wrote:
 Hi, some news this morning...

 I added -Xms1024m option and now it works?! no outofmemory ?!

 java -jar -Xms1024m -Xmx2048m start.jar

 Le 11/07/2012 09:55, Bruno Mannina a écrit :

> Hi Yury,
>
> Thanks for your anwer.
>
> OK to increase memory, but I have a problem with that:
> I have 8 GB on my computer but the JVM accepts only 2 GB max with the option
> -Xmx.
> Is that normal?
>
> Thanks,
> Bruno
>
> Le 11/07/2012 03:42, Yury Kats a écrit :
>>
>> Sorting is a memory-intensive operation indeed.
>> Not sure what you are asking, but it may very well be that your
>> only option is to give JVM more memory.
>>
>> On 7/10/2012 8:25 AM, Bruno Mannina wrote:
>>>
>>> Dear Solr Users,
>>>
>>> Each time I try to do a request with &sort=pubdate+desc
>>>
>>> I get:
>>> GRAVE: java.lang.OutOfMemoryError: Java heap space
>>>
>>> I use Solr3.6, I have around 80M docs and my request gets around 160
>>> results.
>>>
>>> Actually for my test, i use jetty
>>>
>>> java -jar -Xmx2g start.jar
>>>
>>> PS: If I write 3g I get an error; I have 8 GB of RAM
>>>
>>> Thanks a lot for your help,
>>> Bruno
>>>
>>>
>>
>>
>>
>
>
>
>


>>>
>>
>>



-- 
Lance Norskog
goks...@gmail.com


Facet on all the dynamic fields with *_s feature

2012-07-14 Thread Rajani Maski
Hi All,

   Is this issue fixed in Solr 3.6 or 4.0: faceting on all dynamic fields
with facet.field=*_s?

   Link  :  https://issues.apache.org/jira/browse/SOLR-247



  If it is not fixed, any suggestions on how I can achieve this?


My requirement is just same as this one :
http://lucene.472066.n3.nabble.com/Dynamic-facet-field-tc2979407.html#none


Regards
Rajani


Custom Hit Collector

2012-07-14 Thread Mike Schultz
As far as I can tell, using field collapsing prevents the
queryResultCache from being checked. It's important for our application to
have both.  There are threads on incorporating custom hit collectors which
seems like it could be a way to implement the simplified collapsing I need
(just deduping based on the fieldCache value) but still consult the
queryResultCache.

Does anyone know the state of being able to incorporate a custom hit collector,
say, in 4.0?  Or, probably better, how to get caching to work with field
collapsing?

Mike
