field range (min and max term)

2009-02-02 Thread Ben Incani
Hi Solr users,

Is there a method of retrieving a field's range, i.e. the min and max
values of that field's term enum?

For example, I would like to know the first and last date entry across N
documents.

Regards,

-Ben
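One relevant observation: Lucene keeps a field's terms in lexicographic order, so the first and last entries of the term enum give the range directly, and for ISO-8601 date strings lexicographic order matches chronological order. A minimal self-contained sketch of that idea (the dates are made up for illustration; this is not the Solr or Lucene API):

```java
import java.util.TreeSet;

public class TermRange {
    public static void main(String[] args) {
        // Hypothetical indexed date terms; like Lucene's term enum,
        // a TreeSet keeps them in lexicographic order, so for ISO-8601
        // dates the first and last entries are the min and max values.
        TreeSet<String> terms = new TreeSet<>();
        terms.add("2008-11-30T00:00:00Z");
        terms.add("2009-01-15T00:00:00Z");
        terms.add("2007-06-01T00:00:00Z");
        System.out.println(terms.first()); // 2007-06-01T00:00:00Z (min)
        System.out.println(terms.last());  // 2009-01-15T00:00:00Z (max)
    }
}
```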


lowercase text/strings to be used in list box

2007-10-19 Thread Ben Incani
I have a field which will only ever contain one of several values (values
that include spaces).

I want to display a list box with all possible values by browsing the
lucene terms.

I have set up a field in the schema.xml file (the field and analyzer
definitions were lost from the archive).

I also tried a second variant (also lost from the archive).

This allows me to browse all the values without a problem, but when it
comes to searching the documents I have to use Lucene's
org.apache.lucene.analysis.KeywordAnalyzer, when I would rather use
org.apache.lucene.analysis.standard.StandardAnalyzer and the power of
the default query parser to perform a phrase query such as my_field:(the
value) or my_field:"the value", neither of which currently works.

So is there a way to prevent tokenisation of a field using the
StandardAnalyzer, without implementing your own TokenizerFactory?

Regards

Ben
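The mismatch described above can be illustrated without Lucene at all: a KeywordAnalyzer-style chain keeps the whole value as a single (here lowercased) token, while StandardAnalyzer-style analysis splits it into word tokens, so the phrase query tokens never match the single indexed term. A simplified, self-contained sketch (not the real analyzer API):

```java
import java.util.Arrays;
import java.util.Locale;

public class AnalyzerContrast {
    public static void main(String[] args) {
        String value = "The Value";

        // KeywordAnalyzer-style: the entire value becomes one token
        // (lowercased here, as wanted for the list box)
        String keywordToken = value.toLowerCase(Locale.ROOT);
        System.out.println(keywordToken); // the value

        // StandardAnalyzer-style (simplified): split into word tokens,
        // which is why the query no longer matches the single indexed
        // term "the value"
        String[] standardTokens = value.toLowerCase(Locale.ROOT).split("\\W+");
        System.out.println(Arrays.toString(standardTokens)); // [the, value]
    }
}
```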


RE: lowercase text/strings to be used in list box

2007-10-21 Thread Ben Incani
sorry - this should have been posted on the Lucene user list.

...the solution is to use Lucene's PerFieldAnalyzerWrapper: add the
field with the KeywordAnalyzer, then pass the PerFieldAnalyzerWrapper to
the QueryParser.

-Ben
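The idea behind PerFieldAnalyzerWrapper (route each field name to its own analysis chain, with a default for everything else) can be sketched in plain Java; the field name and the toy "analyzers" here are stand-ins, not the Lucene API:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class PerFieldSketch {
    public static void main(String[] args) {
        // one whole-value token, as KeywordAnalyzer would produce
        Function<String, String[]> keyword =
            v -> new String[] { v.toLowerCase(Locale.ROOT) };
        // word tokens, roughly what StandardAnalyzer would produce
        Function<String, String[]> standard =
            v -> v.toLowerCase(Locale.ROOT).split("\\W+");

        Map<String, Function<String, String[]>> perField = new HashMap<>();
        perField.put("my_field", keyword); // the untokenised list-box field

        // any field without an explicit entry falls back to the default
        String[] tokens = perField.getOrDefault("my_field", standard)
                                  .apply("The Value");
        System.out.println(tokens.length); // 1: the whole value is one term
    }
}
```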

> -Original Message-
> From: Ben Incani [mailto:[EMAIL PROTECTED] 
> Sent: Friday, 19 October 2007 5:52 PM
> To: solr-user@lucene.apache.org
> Subject: lowercase text/strings to be used in list box


retrieve lucene "doc id"

2007-12-16 Thread Ben Incani
how do I retrieve the lucene "doc id" in a query?

-Ben


RE: retrieve lucene "doc id"

2007-12-16 Thread Ben Incani
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf 
> Of Yonik Seeley
> Sent: Monday, 17 December 2007 4:44 PM
> To: solr-user@lucene.apache.org
> Subject: Re: retrieve lucene "doc id"
> 
> On Dec 16, 2007 11:40 PM, Ben Incani 
> <[EMAIL PROTECTED]> wrote:
> > how do I retrieve the lucene "doc id" in a query?
> 
> Currently that's not doable... if it was though, it would be 
> a slightly dangerous feature since internal ids are transient.
> Can you explain a little more about what you are trying to do?
> 
> -Yonik
> 

Hi Yonik,

I have converted to using the Solr search interface and am trying to
retrieve documents from a list of search results (previously I used the
doc id directly from the Lucene query results), but the Solr id I
currently have indexed is unfortunately configured not to be unique!

I do realise that Lucene internal ids are transient, but for read-only
requests (that are not cached) they should be OK.

I have hacked org.apache.solr.request.XMLWriter.writeDoc to do a
writeInt("docId", docId).

--- code snippet ---

SolrServer server = new CommonsHttpSolrServer(solrURL);
Map<String, String> params = new HashMap<String, String>();
params.put("q", searchQuery);
params.put("rows", "20");

MapSolrParams solrParams = new MapSolrParams(params);
QueryResponse response = server.query(solrParams);

SolrDocumentList docs = response.getResults();
List<Map<String, Object>> hitsList = new ArrayList<Map<String, Object>>();

// iterate over the returned page (docs.size()), not the total hit
// count (docs.getNumFound()), which can be larger than rows=20
for (int i = 0; i < docs.size(); i++) {
    Map<String, Object> resultMap = new HashMap<String, Object>();
    SolrDocument doc = docs.get(i);
    resultMap.put("id", doc.getFieldValue("docId"));
    // the inner field-copying loop was truncated in the archive; a
    // plausible reconstruction over the stored fields:
    for (String field : doc.getFieldNames()) {
        resultMap.put(field, doc.getFieldValue(field));
    }
    hitsList.add(resultMap);
}

solr web admin

2007-12-19 Thread Ben Incani
why does the web admin append "core=null" to all the requests?

e.g. admin/get-file.jsp?core=null&file=schema.xml


base64 support & containers

2006-07-04 Thread Ben Incani
Hi Solr users,
 
Does Solr support (or will it in the future support) base64-encoded
fields in XML documents, so that binary blobs can be added to the index?

I have been using this Solr client by Darren Vengroff successfully.  It
plugs easily into the Solr package and could also use the binary
functions in org.apache.solr.util.XML:
http://issues.apache.org/jira/browse/SOLR-20

So far I have been storing binary data in the Lucene index.  I realise
this is not an optimal solution, but I have not yet found a Java
container system to manage documents.  Can anyone recommend one?

Regards,
 
Ben


RE: base64 support & containers

2006-07-05 Thread Ben Incani
 

> -Original Message-
> From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, 6 July 2006 9:52 AM
> To: solr-user@lucene.apache.org
> Subject: Re: base64 support & containers
> 
> 
> : Does Solr support/or will in the future base64 encoded XML documents
so
> : that binary blobs can be added to the index?
> 
> I'm not sure if I'm understanding your question completely 
> ... if you have binary data that you want to shove into a 
> stored field, you should certainly be able to base64 encode 
> it in your client and shove it into Solr using a "string" 
> field type -- but your use of the phrase "base64 encoded XML 
> documents" has me thinking that there is more to your question 
> involving an "advanced" use of XML that I'm not familiar with 
> -- can you elaborate?
> 
> 
> 
> -Hoss
> 

No - no advanced use of XML has been implemented.
One of the fields in the add request would contain the original binary
document encoded in base64; this would then preferably be decoded to
binary and placed into a Lucene binary field, which would need to be
defined in Solr.
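The client-side half of that flow can be shown with the modern java.util.Base64 API (Java 8+, used here purely for illustration; a 2006-era client would have needed a separate codec library):

```java
import java.util.Arrays;
import java.util.Base64;

public class BlobRoundTrip {
    public static void main(String[] args) {
        byte[] blob = {0x00, (byte) 0xFF, 0x10, 0x20}; // stand-in binary data

        // client side: encode before placing the value in the <add> XML
        String encoded = Base64.getEncoder().encodeToString(blob);
        System.out.println(encoded); // AP8QIA==

        // retrieval side: decode the stored string back to the bytes
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(Arrays.equals(blob, decoded)); // true
    }
}
```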

Thanks
Ben


separate log files

2007-01-15 Thread Ben Incani
Hi Solr users,

I'm running multiple instances of Solr, which all load from the same
war file.

Below is an example of the servlet context file used for each
application (the XML was lost from the archive).
Hence each application is using the same
WEB-INF/classes/logging.properties file to configure logging.

I would like each instance to log to a separate log file, such as:
app1-solr.yyyy-mm-dd.log
app2-solr.yyyy-mm-dd.log
...

Is there an easy way to prepend the context path to
org.apache.juli.FileHandler.prefix?
E.g.
org.apache.juli.FileHandler.prefix = ${catalina.context}-solr.
 
Or would this require a code change?
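As far as I know, juli's property substitution covers variables like ${catalina.base} but has no per-context variable, so a common workaround is to give each webapp its own WEB-INF/classes/logging.properties with a hard-coded prefix (file layout and property values assumed for illustration):

```properties
# app1's WEB-INF/classes/logging.properties
handlers = org.apache.juli.FileHandler
org.apache.juli.FileHandler.directory = ${catalina.base}/logs
org.apache.juli.FileHandler.prefix = app1-solr.
```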

Regards

-Ben


detecting duplicates using the field type 'text'

2007-02-14 Thread Ben Incani
Hi Solr users,

I have the following fields set in my 'schema.xml'.

*** schema.xml ***
(the field definitions were lost from the archive; the surviving values
are "id" and "document_title", presumably the uniqueKey and default
search field)
*** schema.xml ***

When I add a document with a duplicate title, it gets duplicated (not
sure why):

(two add commands, each containing a document whose title field is
"duplicate"; the XML markup was lost from the archive)

When I add a document with a duplicate title that is numeric only, it
does not get duplicated:

(two add commands, each containing a document whose title field is
"123")

I can ensure duplicates DO NOT get added when using the field type
'string'.  And I can also ensure that they DO get added when using
[setting lost from the archive].

Why is there a disparity detecting duplicates when using the field type
'text'?

Is this merely a documentation issue or have I missed something here...
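For what it's worth, overwrite detection keys off the uniqueKey field's indexed terms, so a tokenised 'text' key can behave inconsistently while a 'string' key is compared verbatim. If the aim is for document_title to control overwrites, it would need to be the uniqueKey and is safest as an untokenised type; a hedged sketch (field names taken from this thread, and the behaviour is my reading, not a confirmed answer from the list):

```xml
<!-- schema.xml: keep the uniqueKey field untokenised -->
<field name="document_title" type="string" indexed="true" stored="true"/>
<uniqueKey>document_title</uniqueKey>
```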

Regards,

Ben