cleanup after OutOfMemoryError

2013-09-04 Thread Ryan McKinley
I have an application where I am calling DirectUpdateHandler2 directly with:

  update.addDoc(cmd);

This will sometimes hit:

java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.UnicodeUtil.UTF16toUTF8(UnicodeUtil.java:248)
at org.apache.lucene.store.DataOutput.writeString(DataOutput.java:234)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.writeField(CompressingStoredFieldsWriter.java:273)
at org.apache.lucene.index.StoredFieldsProcessor.finishDocument(StoredFieldsProcessor.java:126)
at org.apache.lucene.index.TwoStoredFieldsConsumers.finishDocument(TwoStoredFieldsConsumers.java:65)
at org.apache.lucene.index.DocFieldProcessor.finishDocument(DocFieldProcessor.java:264)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:283)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:432)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1513)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:212)
at voyager.index.zmq.IndexingRunner.apply(IndexingRunner.java:303)

and then a little while later:

auto commit error...:java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2726)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2897)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2872)
at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:549)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)


Is there anything I can/should do to clean up after the OOME?  At a minimum
I do not want any new requests using the same IndexWriter.  Should I use:


  catch(OutOfMemoryError ex) {
    update.getCommitTracker().cancelPendingCommit();
    update.newIndexWriter(false);
    ...
  }

or perhaps 'true' for rollback?

Thanks
Ryan


Best approach to Intersect results with big Set?

2011-09-01 Thread Ryan McKinley
I have an application where I need to return all results that are not
in a Set<String> (the Set is managed by hazelcast... but that is
not relevant)

As a first approach, I have a SearchComponent that injects a BooleanQuery:

  BooleanQuery bq = new BooleanQuery(true);
  for( String id : ids ) {
    bq.add(new BooleanClause(new TermQuery(new Term("id", id)), Occur.MUST_NOT));
  }

This works, but I'm concerned about how many terms we could end up
with as the Set grows.

Another possibility could be a Filter that iterates through the FieldCache
and checks whether each value is in the Set.
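
A rough sketch of that Filter idea (a sketch only, assuming Lucene 3.x
APIs; 'excluded' stands in for the hazelcast-backed Set, and "id" is the
Solr uniqueKey field):

  import java.io.IOException;
  import java.util.Set;

  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.search.DocIdSet;
  import org.apache.lucene.search.FieldCache;
  import org.apache.lucene.search.Filter;
  import org.apache.lucene.util.OpenBitSet;

  public class ExcludeIdsFilter extends Filter {
    private final Set<String> excluded;

    public ExcludeIdsFilter(Set<String> excluded) {
      this.excluded = excluded;
    }

    @Override
    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
      // one value per document, uninverted through the FieldCache
      String[] ids = FieldCache.DEFAULT.getStrings(reader, "id");
      OpenBitSet bits = new OpenBitSet(reader.maxDoc());
      for (int doc = 0; doc < ids.length; doc++) {
        // keep only documents whose id is NOT in the excluded Set
        if (ids[doc] != null && !excluded.contains(ids[doc])) {
          bits.set(doc);
        }
      }
      return bits; // OpenBitSet implements DocIdSet
    }
  }

This avoids the per-term BooleanQuery entirely: the cost is one FieldCache
entry plus a hash lookup per document, independent of the Set size.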

Any thoughts/directions on things to look at?

thanks
ryan


Re: REST calls

2010-06-30 Thread Ryan McKinley
If there is a real desire/need to make things "restful" in the
official sense, it is worth looking at using a REST framework as the
controller rather than the current solution.  Perhaps:

http://www.restlet.org/
https://jersey.dev.java.net/

These would be cool since they encapsulate a lot of the request
plumbing work; it would be better to leverage more widely used
approaches than to maintain our own.

That said, what we have is functional and powerful -- if you are
concerned about people editing the index (with GET/POST or whatever)
there are plenty of ways to solve this.

ryan


On Wed, Jun 30, 2010 at 5:31 PM, Lance Norskog  wrote:
> I've looked at the problem. It's fairly involved. It probably would
> take several iterations. (But not as many as field collapsing :)
>
> On Wed, Jun 30, 2010 at 2:11 PM, Yonik Seeley
>  wrote:
>> On Wed, Jun 30, 2010 at 4:55 PM, Lance Norskog  wrote:
>>>  Apparently this is not ReStFuL. It is IMVHO insane.
>>
>> Patches welcome...
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Sort by index order desc?

2010-07-23 Thread Ryan McKinley
Any pointers on how to sort by reverse index order?
http://search.lucidimagination.com/search/document/4a59ded3966271ca/sort_by_index_order_desc

it seems like it should be easy to do with the function query stuff,
but i'm not sure what to sort by (unless I add a new field for indexed
time)


Any pointers?

Thanks
Ryan


Re: Sort by index order desc?

2010-07-23 Thread Ryan McKinley
Looks like you can sort by _docid_ to get things in index order or
reverse index order.

?sort=_docid_ asc   (or '?sort=_docid_ desc' for reverse index order)

thank you solr!


On Fri, Jul 23, 2010 at 2:23 PM, Ryan McKinley  wrote:
> Any pointers on how to sort by reverse index order?
> http://search.lucidimagination.com/search/document/4a59ded3966271ca/sort_by_index_order_desc
>
> it seems like it should be easy to do with the function query stuff,
> but i'm not sure what to sort by (unless I add a new field for indexed
> time)
>
>
> Any pointers?
>
> Thanks
> Ryan
>


help refactoring from 3.x to 4.x

2010-08-22 Thread Ryan McKinley
I have a function that works well in 3.x, but when I tried to
re-implement in 4.x it runs very very slow (~20ms vs 45s on an index w
~100K items).

Big picture, I am trying to calculate a bounding box for items that
match the query.  To calculate this, I have two fields bboxNS, and
bboxEW that get filled with the min and max values for that doc.  To
get the bounding box, I just need the first matching term in the index
and the last matching term.

In 3.x the code looked like this:

public class FirstLastMatchingTerm
{
  String first = null;
  String last = null;

  public static FirstLastMatchingTerm read(SolrIndexSearcher searcher,
String field, DocSet docs) throws IOException
  {
FirstLastMatchingTerm firstLast = new FirstLastMatchingTerm();
if( docs.size() > 0 ) {
  IndexReader reader = searcher.getReader();
  TermEnum te = reader.terms(new Term(field,""));
  do {
Term t = te.term();
if( null == t || !t.field().equals(field) ) {
  break;
}

if( searcher.numDocs(new TermQuery(t), docs) > 0 ) {
  firstLast.last = t.text();
  if( firstLast.first == null ) {
firstLast.first = firstLast.last;
  }
}
  }
  while( te.next() );
}
return firstLast;
  }
}


In 4.x, I tried:

public class FirstLastMatchingTerm
{
  String first = null;
  String last = null;

  public static FirstLastMatchingTerm read(SolrIndexSearcher searcher,
String field, DocSet docs) throws IOException
  {
FirstLastMatchingTerm firstLast = new FirstLastMatchingTerm();
if( docs.size() > 0 ) {
  IndexReader reader = searcher.getReader();

  Terms terms = MultiFields.getTerms(reader, field);
  TermsEnum te = terms.iterator();
  BytesRef term = te.next();
  while( term != null ) {
if( searcher.numDocs(new TermQuery(new Term(field,term)), docs) > 0 ) {
  firstLast.last = term.utf8ToString();
  if( firstLast.first == null ) {
firstLast.first = firstLast.last;
  }
}
term = te.next();
  }
}
return firstLast;
  }
}

but the results are slow (and incorrect).  I tried some variations of
using ReaderUtil.Gather(), but the real hit seems to come from
  if( searcher.numDocs(new TermQuery(new Term(field,term)), docs) > 0 )

Any ideas?  I'm not tied to the approach or indexing strategy, so if
anyone has other suggestions that would be great.  Looking at it
again, it seems crazy that you have to run a query for each term, but
that is how the 3.x version worked as well.

thanks
ryan


Re: help refactoring from 3.x to 4.x

2010-08-23 Thread Ryan McKinley
On Mon, Aug 23, 2010 at 7:00 AM, Michael McCandless
 wrote:
> Spooky that you see incorrect results!  The code looks correct.  What
> are the specifics on when it produces an invalid result?

Figured this out -- the above code is not invalid; however, I tried
versions that moved the utf8ToString() to the end, and the BytesRef
reuse made those results inaccurate.

no need to get spooked here -- user error.

>
> Also spooky that you see it running slower -- how much slower?

much slower -- this component took ~30-100ms in 3.x and 30-45 sec in 4.x

> Did you rebuild the index in 4.x (if not, you are using the preflex
> codec)?  And is the index otherwise identical?

I have tried both:
3.x index loaded into 4.x then run optimize
rebuild 3.x index in 4.x

these have the same performance. (bad)

re preflex codec?  How could I tell?  Do I need to do anything explicit?



>
> You could improve perf by not using SolrIndexSearcher.numDocs?  Ie you
> don't need the count; you just need to know if it's > 0.  So you could
> make your own loop that breaks out on the first docID in common.  You
> could also stick w/ BytesRef the whole time (only do .utf8ToString()
> in the end on the first/last), though this is presumably a net/nets
> tiny cost.
>

Ah yes -- this helps a lot!

The following code gets similar performance to the 3.x version.  I
kept the 'utf8ToString' in the loop since the alternative was to copy
it out anyway to avoid reuse.

  public static FirstLastMatchingTerm read(final SolrIndexSearcher
searcher, final String field, final DocSet docs) throws IOException
  {
FirstLastMatchingTerm firstLast = new FirstLastMatchingTerm();
if( docs.size() > 0 ) {
  IndexReader reader = searcher.getReader();

  DocsEnum denum = null;
  Terms terms = MultiFields.getTerms(reader, field);
  TermsEnum te = terms.iterator();
  BytesRef bytes = te.next();
  while( bytes != null ) {
denum = terms.docs(null, bytes, denum);
if( denum != null ) {
  // find if any doc matches our result set
  while( denum.nextDoc() != DocsEnum.NO_MORE_DOCS ) {
if( docs.exists( denum.docID() ) ) {
  String v = bytes.utf8ToString();
  if( v.length() > 0 ) {
firstLast.last = v;
if( firstLast.first == null ) {
  firstLast.first = v;
}
break;
  }
}
  }
}
bytes = te.next();
  }
}
return firstLast;
  }




> But, we should still dig down on why numDocs is slower in 4.x; that's
> unexpected; Yonik any ideas?  I'm not familiar with this part of
> Solr...
>
> Mike
>
> On Mon, Aug 23, 2010 at 2:38 AM, Ryan McKinley  wrote:
>> I have a function that works well in 3.x, but when I tried to
>> re-implement in 4.x it runs very very slow (~20ms vs 45s on an index w
>> ~100K items).
>>
>> Big picture, I am trying to calculate a bounding box for items that
>> match the query.  To calculate this, I have two fields bboxNS, and
>> bboxEW that get filled with the min and max values for that doc.  To
>> get the bounding box, I just need the first matching term in the index
>> and the last matching term.
>>
>> In 3.x the code looked like this:
>>
>> public class FirstLastMatchingTerm
>> {
>>  String first = null;
>>  String last = null;
>>
>>  public static FirstLastMatchingTerm read(SolrIndexSearcher searcher,
>> String field, DocSet docs) throws IOException
>>  {
>>    FirstLastMatchingTerm firstLast = new FirstLastMatchingTerm();
>>    if( docs.size() > 0 ) {
>>      IndexReader reader = searcher.getReader();
>>      TermEnum te = reader.terms(new Term(field,""));
>>      do {
>>        Term t = te.term();
>>        if( null == t || !t.field().equals(field) ) {
>>          break;
>>        }
>>
>>        if( searcher.numDocs(new TermQuery(t), docs) > 0 ) {
>>          firstLast.last = t.text();
>>          if( firstLast.first == null ) {
>>            firstLast.first = firstLast.last;
>>          }
>>        }
>>      }
>>      while( te.next() );
>>    }
>>    return firstLast;
>>  }
>> }
>>
>>
>> In 4.x, I tried:
>>
>> public class FirstLastMatchingTerm
>> {
>>  String first = null;
>>  String last = null;
>>
>>  public static FirstLastMatchingTerm read(SolrIndexSearcher searcher,
>> String field, DocSet docs) throws IOException
>>  {
>>    FirstLastMatchingTerm firstLast = new FirstLastMatchingTerm();
>>    if( docs.size() > 0 ) {
>>      IndexReader reader = searche

Re: Problem in setting the request writer in SolrJ (wiki page wrong?)

2010-08-23 Thread Ryan McKinley
Note that the 'setRequestWriter' is not part of the SolrServer API, it
is on the CommonsHttpSolrServer:
http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/CommonsHttpSolrServer.html#setRequestWriter%28org.apache.solr.client.solrj.request.RequestWriter%29

If you are using EmbeddedSolrServer, the params are not serialized via
RequestWriter, so you don't have any options there.
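
For example, a minimal sketch (SolrJ 1.4-era APIs):

  CommonsHttpSolrServer server =
      new CommonsHttpSolrServer("http://localhost:8983/solr");
  server.setRequestWriter(new BinaryRequestWriter()); // send updates as javabin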

ryan


On Mon, Aug 23, 2010 at 9:24 AM, Constantijn Visinescu
 wrote:
> Hello,
>
> I'm using an embedded solrserver in my Java webapp, but as far as i
> can tell it's defaulting to sending updates in XML, which seems like a
> huge waste compared to sending it in Java binary format.
>
> According to this page:
> http://wiki.apache.org/solr/Solrj#Setting_the_RequestWriter
>
> I'm supposed to be able to set the requestwriter like so:
> server.setRequestWriter(new BinaryRequestWriter());
>
> However this method doesn't seem to exists in the SolrServer class of
> SolrJ 1.4.1 ?
>
> How do i set it to process updates in the java binary format?
>
> Thanks in advance,
> Constantijn Visinescu
>
> P.S.
> I'm creating my SolrServer instance like this:
>        private SolrServer solrServer;
>        CoreContainer container = new CoreContainer.Initializer().initialize();
>        solrServer = new EmbeddedSolrServer(container, "");
>
> this solrServer wont let me set a request writer.
>


Re: Logic behind Solr creating files in .../data/index path.

2010-09-07 Thread Ryan McKinley
Check:
http://lucene.apache.org/java/3_0_2/fileformats.html


On Tue, Sep 7, 2010 at 3:16 AM, rajini maski  wrote:
> All,
>
> While we post data to Solr... The data get stored in   "//data/index"  path
> in some multiple files with different file extensions...
> Not worrying about the extensions, I want to know how are these number of
> files created ?
> Does anyone know on what logic are these multiple index files  created in
> data/index  path ... ? If we do an optimize , The number of files get
> reduced...
> Else, say some N number of files are  created.. Based on what parameter it
> creates? And how are the sizes of file varies there?
>
>
> Hope I am clear about the doubt I have...
>


Re: No more trunk support for 2.9 indexes

2010-09-12 Thread Ryan McKinley
> I suppose an index 'remaker' might be something like a DIH reader for
> a Solr index - streams everything out of the existing index, writing
> it into the new one?

This works fine if all fields are stored (and copy field does not go
to a stored field), otherwise you would need/want to start with the
original source.

ryan


Re: Field names

2010-09-13 Thread Ryan McKinley
check:
http://wiki.apache.org/solr/LukeRequestHandler



On Mon, Sep 13, 2010 at 7:00 PM, Peter A. Kirk  wrote:
> Hi
>
> is it possible to issue a query to solr, to get a list which contains all the 
> field names in the index?
>
> What about to get a list of the freqency of individual words in each field?
>
> thanks,
> Peter
>


Re: is indexing single-threaded?

2010-09-22 Thread Ryan McKinley
Multiple threads work well.

If you are using solrj, check the StreamingUpdateSolrServer for an
implementation that will keep X number of threads busy.

Your mileage will vary, but in general I find a reasonable thread
count is ~ (number of cores)+1


On Wed, Sep 22, 2010 at 5:52 AM, Andy  wrote:
> Does Solr index data in a single thread or can data be indexed concurrently 
> in multiple threads?
>
> Thanks
> Andy
>
>
>
>


Re: How can I delete the entire contents of the index?

2010-09-22 Thread Ryan McKinley
<delete><query>*:*</query></delete>

will leave you a fresh index
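
For example, post this to the update handler (a sketch -- adjust the URL
for your core), followed by a commit so searchers see the empty index:

  <delete><query>*:*</query></delete>
  <commit/>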


On Thu, Sep 23, 2010 at 12:50 AM, xu cheng  wrote:
> the query that fetch the data you wanna
> delete
> I did like this to delete my data
> best regards
>
> 2010/9/23 Igor Chudov 
>
>> Let's say that I added a number of elements to Solr (I use
>> Webservice::Solr as the interface to do so).
>>
>> Then I change my mind and want to delete them all.
>>
>> How can I delete all contents of the database, but leave the database
>> itself, just empty?
>>
>> Thanks
>>
>> i
>>
>


Re: API for using Multi cores with SolrJ

2010-10-18 Thread Ryan McKinley
On Mon, Oct 18, 2010 at 10:12 AM, Tharindu Mathew  wrote:
> Thanks Peter. That helps a lot. It's weird that this not documented anywhere. 
> :(

Feel free to edit the wiki :)


Re: how can i use solrj binary format for indexing?

2010-10-18 Thread Ryan McKinley
Do you already have the files as solr XML?  If so, I don't think you need solrj

If you need to build SolrInputDocuments from your existing structure,
solrj is a good choice.  If you are indexing lots of stuff, check the
StreamingUpdateSolrServer:
http://lucene.apache.org/solr/api/solrj/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html
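
A hedged sketch of that path (SolrJ 1.4-era APIs; the field names are
made up):

  import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
  import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  // queue size 20, 4 background threads -- tune for your hardware
  StreamingUpdateSolrServer server =
      new StreamingUpdateSolrServer("http://localhost:8983/solr", 20, 4);
  server.setRequestWriter(new BinaryRequestWriter()); // updates go as javabin, not XML

  SolrInputDocument doc = new SolrInputDocument();
  doc.addField("id", "doc-1");
  doc.addField("title", "hello");
  server.add(doc);
  server.commit();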


On Sun, Oct 17, 2010 at 11:01 PM, Jason, Kim  wrote:
>
> Hi all
> I have a huge amount of xml files for indexing.
> I want to index using solrj binary format to get performance gain.
> Because I heard that using xml files to index is quite slow.
> But I don't know how to use index through solrj binary format and can't find
> examples.
> Please give some help.
> Thanks,
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1722612.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


query pending commits?

2010-10-18 Thread Ryan McKinley
I have an indexing pipeline that occasionally needs to check if a
document is already in the index (even if not commited yet).

Any suggestions on how to do this without calling <commit/> before each check?

I have a list of document ids and need to know which ones are in the
index (actually I need to know which ones are not in the index)  I
figured I would write a custome RequestHandler that would check the
main Reader and the UpdateHander reader, but it now looks like
'update' is handled directly within IndexWriter.

Any ideas?

thanks
ryan


NRT persistent flags?

2013-03-13 Thread Ryan McKinley
I'm looking for a way to quickly flag/unflag documents.

This could be one at a time or by query (even *:*)

I have hacked together something based on ExternalFileField that is
essentially a FST holding all the ids (solr not lucene).  Like the
FieldCache, it holds a WeakHashMap where the
OpenBitSet is loaded by iterating the FST on the reader (just like
ExternalFileField)

This seems to work OK, but there *must* be something better!

Any ideas on the right approach for something like this?  This feels like
it should be related to DocValues or the FieldCache

Thanks for any pointers!

ryan


Re: Improving performance for SOLR geo queries?

2012-02-08 Thread Ryan McKinley
Hi Matthias-

I'm trying to understand how you have your data indexed so we can give
reasonable direction.

What field type are you using for your locations?  Is it using the
solr spatial field types?  What do you see when you look at the debug
information from &debugQuery=true?

From my experience, there is no single best practice for spatial
queries -- it will depend on your data density and distribution.

You may also want to look at:
http://code.google.com/p/lucene-spatial-playground/
but note this is off lucene trunk -- the geohash queries are super fast though

ryan




2012/2/8 Matthias Käppler :
> Hi Erick,
>
> if we're not doing geo searches, we filter by "location tags" that we
> attach to places. This is simply a hierachical regional id, which is
> simple to filter for, but much less flexible. We use that on Web a
> lot, but not on mobile, where we want to performance searches in
> arbitrary radii around arbitrary positions. For those location tag
> kind of queries, the average time spent in SOLR is 43msec (I'm looking
> at the New Relic snapshot of the last 12 hours). I have disabled our
> "optimization" again just yesterday, so for the bbox queries we're now
> at an avg of 220ms (same time window). That's a 5 fold increase in
> response time, and in peak hours it's worse than that.
>
> I've also found a blog post from 3 years ago which outlines the inner
> workings of the SOLR spatial indexing and searching:
> http://www.searchworkings.org/blog/-/blogs/23842
> From that it seems as if SOLR already performs a similar optimization
> we had in mind during the index step, so if I understand correctly, it
> doesn't even search over all records, only those that were mapped to
> the grid box identified during indexing.
>
> What I would love to see is what the suggested way is to perform a geo
> query on SOLR, considering that they're so difficult to cache and
> expensive to run. Is the best approach to restrict the candidate set
> as much as possible using cheap filter queries, so that SOLR merely
> has to do the geo search against these subsets? How does the query
> planner work here? I see there's a cost attached to a filter query,
> but one can only set it when cache is set to false? Are cached geo
> queries executed last when there are cheaper filter queries to cut
> down on documents? If you have a real world practical setup to share,
> one that performs well in a production environment that serves
> requests in the Millions per day, that would be great.
>
> I'd love to contribute documentation by the way, if you knew me you'd
> know I'm an avid open source contributor and actually run several open
> source projects myself. But tell me, how can I possibly contribute
> answer to questions I don't have an answer to? That's why I'm here,
> remember :) So please, these kinds of snippy replies are not helping
> anyone.
>
> Thanks
> -Matthias
>
> On Tue, Feb 7, 2012 at 3:06 PM, Erick Erickson  
> wrote:
>> So the obvious question is "what is your
>> performance like without the distance filters?"
>>
>> Without that knowledge, we have no clue whether
>> the modifications you've made had any hope of
>> speeding up your response times
>>
>> As for the docs, any improvements you'd like to
>> contribute would be happily received
>>
>> Best
>> Erick
>>
>> 2012/2/6 Matthias Käppler :
>>> Hi,
>>>
>>> we need to perform fast geo lookups on an index of ~13M places, and
>>> were running into performance problems here with SOLR. We haven't done
>>> a lot of query optimization / SOLR tuning up until now so there's
>>> probably a lot of things we're missing. I was wondering if you could
>>> give me some feedback on the way we do things, whether they make
>>> sense, and especially why a supposed optimization we implemented
>>> recently seems to have no effect, when we actually thought it would
>>> help a lot.
>>>
>>> What we do is this: our API is built on a Rails stack and talks to
>>> SOLR via a Ruby wrapper. We have a few filters that almost always
>>> apply, which we put in filter queries. Filter cache hit rate is
>>> excellent, about 97%, and cache size caps at 10k filters (max size is
>>> 32k, but it never seems to reach that many, probably because we
>>> replicate / delta update every few minutes). Still, geo queries are
>>> slow, about 250-500msec on average. We send them with cache=false, so
>>> as to not flood the fq cache and cause undesirable evictions.
>>>
>>> Now our idea was this: while the actual geo queries are poorly
>>> cacheable, we could clearly identify geographical regions which are
>>> more often queried than others (naturally, since we're a user driven
>>> service). Therefore, we dynamically partition Earth into a static grid
>>> of overlapping boxes, where the grid size (the distance of the nodes)
>>> depends on the maximum allowed search radius. That way, for every user
>>> query, we would always be able to identify a single bounding box that
>>> covers it. This larger bounding box

Re: solr geospatial / spatial4j

2012-03-08 Thread Ryan McKinley
On Wed, Mar 7, 2012 at 7:25 AM, Matt Mitchell  wrote:
> Hi,
>
> I'm researching options for handling a better geospatial solution. I'm
> currently using Solr 3.5 for a read-only "database", and the
> point/radius searches work great. But I'd like to start doing point in
> polygon searches as well. I've skimmed through some of the geospatial
> jira issues, and read about spaitial4j, which is very interesting. I
> see on the github page that this will soon be part of lucene, can
> anyone confirm this?

perhaps -- see the discussion on:
https://issues.apache.org/jira/browse/LUCENE-3795

This will involve a few steps before it is actually integrated with
the lucene project -- and then a few more to be usable from solr

>
> I attempted to build the spatial4j demo but no luck. It had problems
> finding lucene 4.0-SNAPSHOT, which I guess is because there are no
> 4.0-SNAPSHOT nightly builds? If anyone knows how I can get around
> this, please let me know!
>

ya they are published -- you just have to specify where you want to
pull them from.  If you use the 'updateLucene' profile, it will pull
them from:  https://repository.apache.org/content/groups/snapshots/

use:  mvn clean install -P updateLucene


> Other than spatial4j, is there a way to do point in polgyon searches
> with solr 3.5.0 right now? Is there some tricky indexing/querying
> strategy that would allow this?
>

I don't know of anything else -- and note that polygon stuff has a
ways to go before it is generally ready for prime-time.

ryan


Re: SolrCloud Zookeeper view does not work on latest snapshot

2012-04-06 Thread Ryan McKinley
There have been a bunch of changes getting the zookeeper info and UI
looking good.  The info moved from being on the core to using a
servlet at the root level.

Note, it is not a request handler anymore, so the wt=XXX has no
effect.  It is always JSON

ryan


On Fri, Apr 6, 2012 at 7:01 AM, Jamie Johnson  wrote:
> I looked at our old system and indeed it used to make a call to
> /solr/zookeeper not /solr/corename/zookeeper.  I am making a change
> locally so I can run with this but is this a bug or did I much
> something up with my configuration?
>
> On Fri, Apr 6, 2012 at 9:33 AM, Jamie Johnson  wrote:
>> I just downloaded the latest snapshot and fired it up to take a look
>> around and I'm getting the following error when looking at the Cloud
>> view.
>>
>> Loading of undefined failed with HTTP-Status 404
>>
>> The request I see going out is as follows
>>
>> http://localhost:8501/solr/slice1_shard1/zookeeper?wt=json
>>
>> this doesn't work but this does
>>
>> http://localhost:8501/solr/zookeeper?wt=json
>>
>> Any thoughts why this would happen?


Re: 'No JSP support' error in embedded Jetty for solrCloud as of apache-solr-4.0-2012-04-02_11-54-55

2012-04-09 Thread Ryan McKinley
zookeeper.jsp was removed (along with all JSP stuff) in trunk

Take a look at the cloud tab in the UI, or check the /zookeeper
servlet for the JSON raw output

ryan


On Mon, Apr 9, 2012 at 6:42 AM, Benson Margulies  wrote:
> Starting the leader with:
>
>  java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=rnicloud
> -DzkRun -DnumShards=3 -Djetty.port=9167  -jar start.jar
>
> and browsing to
>
> http://localhost:9167/solr/rnicloud/admin/zookeeper.jsp
>
> I get:
>
> HTTP ERROR 500
>
> Problem accessing /solr/rnicloud/admin/zookeeper.jsp. Reason:
>
>    JSP support not configured
> Powered by Jetty://


Re: EmbeddedSolrServer and StreamingUpdateSolrServer

2012-04-26 Thread Ryan McKinley
In general -- i would not suggest mixing EmbeddedSolrServer with a
different style (unless the other instances are read only).  If you
have multiple instances writing to the same files on disk you are
asking for problems.

Have you tried just using StreamingUpdateSolrServer for daily update?
I would suspect that it would be faster then EmbeddedSolrServer
anyway.

ryan



On Wed, Apr 25, 2012 at 11:32 PM, pcrao  wrote:
> Hi,
>
> Any more thoughts??
>
> Thanks,
> PC Rao.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/EmbeddedSolrServer-and-StreamingUpdateSolrServer-tp3889073p3940383.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boosting fields in SOLR using Solrj

2012-04-26 Thread Ryan McKinley
I would suggest debugging with browser requests -- then switching to
Solrj after you are at 1st base.

In particular, try adding the &debugQuery=true parameter to the
request and see what solr thinks is happening.

The value that will "work" for the 'qt' parameter depends on what is
configured in solrconfig.xml -- I suspect you want to point to a
requestHandler that is configured to use edismax query parser.  This
can be configured by default with:
  <lst name="defaults">
    <str name="defType">edismax</str>
  </lst>
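
With a stock setup you can also pick the parser per request instead (a
sketch, using the standard defType parameter):

  SolrQuery query = new SolrQuery("apples oranges");
  query.set("defType", "edismax");
  query.set("qf", "title^10.0 content^1.0");
  QueryResponse rsp = m_Server.query(query);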
ryan


On Wed, Apr 25, 2012 at 3:57 PM, Joe  wrote:
> Hi,
>
> I'm using the solrj API to query my SOLR 3.6 index. I have multiple text
> fields, which I would like to weight differently. From what I've read, I
> should be able to do this using the dismax or edismax query types. I've
> tried the following:
>
> SolrQuery query = new SolrQuery();
> query.setQuery( "title:apples oranges content:apples oranges");
> query.setQueryType("edismax");
> query.set("qf", "title^10.0 content^1.0");
> QueryResponse rsp = m_Server.query( query );
>
> But this doesn't work. I've tried the following variations to set the query
> type, but it doesn't seem to make a difference.
>
> query.setQueryType("dismax");
> query.set("qt","dismax");
> query.set("type","edismax");
> query.set("qt","edismax");
> query.set("type","dismax");
>
> I'd like to retain the full Lucene query syntax, so I prefer ExtendedDisMax
> to DisMax. Boosting individual terms in the query (as shown below) does
> work, but is not a valid solution, since the queries are automatically
> generated and can get arbitrarily complex is syntax.
>
> query.setQuery( "title:apples^10.0 oranges^10.0 content:apples oranges");
>
> Any help would be much appreciated.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Boosting-fields-in-SOLR-using-Solrj-tp3939789p3939789.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Latest solr4 snapshot seems to be giving me a lot of unhappy logging about 'Log4j', should I be concerned?

2012-05-01 Thread Ryan McKinley
check a release since r1332752

If things still look problematic, post a comment on:
https://issues.apache.org/jira/browse/SOLR-3426

this should now have a less verbose message with an older SLF4j and with Log4j


On Tue, May 1, 2012 at 10:14 AM, Gopal Patwa  wrote:
> I have similar issue using log4j for logging with trunk build, the
> CoreConatainer class print big stack trace on our jboss 4.2.2 startup, I am
> using sjfj 1.5.2
>
> 10:07:45,918 WARN  [CoreContainer] Unable to read SLF4J version
> java.lang.NoSuchMethodError:
> org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
> at org.apache.solr.core.CoreContainer.load(CoreContainer.java:395)
> at org.apache.solr.core.CoreContainer.load(CoreContainer.java:355)
> at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:304)
> at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:101)
>
>
>
> On Tue, May 1, 2012 at 9:25 AM, Benson Margulies wrote:
>
>> On Tue, May 1, 2012 at 12:16 PM, Mark Miller 
>> wrote:
>> > There is a recent JIRA issue about keeping the last n logs to display in
>> the admin UI.
>> >
>> > That introduced a problem - and then the fix introduced a problem - and
>> then the fix mitigated the problem but left that ugly logging as a by
>> product.
>> >
>> > Don't remember the issue # offhand. I think there was a dispute about
>> what should be done with it.
>> >
>> > On May 1, 2012, at 11:14 AM, Benson Margulies wrote:
>> >
>> >> CoreContainer.java, in the method 'load', finds itself calling
>> >> loader.NewInstance with an 'fname' of Log4j of the slf4j backend is
>> >> 'Log4j'.
>>
>> Couldn't someone just fix the if statement to say, 'OK, if we're doing
>> log4j, we have no log watcher' and skip all the loud failing on the
>> way?
>>
>>
>>
>> >>
>> >> e.g.:
>> >>
>> >> 2012-05-01 10:40:32,367 org.apache.solr.core.CoreContainer  - Unable
>> >> to load LogWatcher
>> >> org.apache.solr.common.SolrException: Error loading class 'Log4j'
>> >>
>> >> What is it actually looking for? Have I misplaced something?
>> >
>> > - Mark Miller
>> > lucidimagination.com
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>


Re: Ampersand issue

2012-05-01 Thread Ryan McKinley
If your json value is & the proper xml value is &amp;

What is the value you are setting on the stored field?  is it & or &amp;?
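
To illustrate the escaping (the field name here is hypothetical):

  stored value:       AT&T
  wt=XML response:    <str name="body">AT&amp;T</str>
  wt=JSON response:   "body":"AT&T"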


On Mon, Apr 30, 2012 at 12:57 PM, William Bell  wrote:
> One idea was to wrap the field with CDATA. Or base64 encode it.
>
>
>
> On Fri, Apr 27, 2012 at 7:50 PM, Bill Bell  wrote:
>> We are indexing a simple XML field from SQL Server into Solr as a stored 
>> field. We have noticed that the & is output as &amp; when using 
>> wt=XML. When using wt=JSON we get the normal &. Is there a way to 
>> indicate that we don't want to encode the field since it is already XML when 
>> using wt=XML ?
>>
>> Bill Bell
>> Sent from mobile
>>
>
>
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076


Re: syntax for negative query OR something

2012-05-02 Thread Ryan McKinley
thanks!



On Wed, May 2, 2012 at 4:43 PM, Chris Hostetter
 wrote:
>
> : How do I search for things that have no value or a specified value?
>
> Things with no value...
>        (*:* -fieldName:[* TO *])
> Things with a specific value...
>        fieldName:A
> Things with no value or a specific value...
>        (*:* -fieldName:[* TO *]) fieldName:A
> ..."or" if you aren't using "OR" as your default op
>        (*:* -fieldName:[* TO *]) OR fieldName:A
>
> : I have a few variations of:
> : -fname:[* TO *] OR fname:(A B C)
>
> that is just syntactic sugar for...
>        -fname:[* TO *] fname:(A B C)
>
> which is an empty set.
>
> you need to be explicit that the "exclude docs with a value in this field"
> clause should applied to the "set of all documents"
>
>
> -Hoss


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Ryan McKinley
In 4.0, solr no longer uses JSP, so it is not enabled in the example setup.

You can enable JSP in your servlet container using whatever method
they provide.  For Jetty, using start.jar, you need to add the command
line: java -jar start.jar -OPTIONS=jsp

ryan



On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram  wrote:
> Hello,
>
> How do I enable JSP support in Solr 4.0 ?
>
> Thanks
> Naga


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Ryan McKinley
just use the admin UI -- look at the 'cloud' tab


On Tue, May 15, 2012 at 12:53 PM, Naga Vijayapuram  wrote:
> Alright; thanks.  Tried with "-OPTIONS=jsp" and am still seeing this on
> console ...
>
> 2012-05-15 12:47:08.837:INFO:solr:No JSP support.  Check that JSP jars are
> in lib/jsp and that the JSP option has been specified to start.jar
>
> I am trying to go after
> http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its
> equivalent in 4.0) after going through
> http://wiki.apache.org/solr/SolrCloud
>
> May I know the right zookeeper url in 4.0 please?
>
> Thanks
> Naga
>
>
> On 5/15/12 10:56 AM, "Ryan McKinley"  wrote:
>
>>In 4.0, solr no longer uses JSP, so it is not enabled in the example
>>setup.
>>
>>You can enable JSP in your servlet container using whatever method
>>they provide.  For Jetty, using start.jar, you need to add the command
>>line: java -jar start.jar -OPTIONS=jsp
>>
>>ryan
>>
>>
>>
>>On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram 
>>wrote:
>>> Hello,
>>>
>>> How do I enable JSP support in Solr 4.0 ?
>>>
>>> Thanks
>>> Naga
>


Re: ContentStreamUpdateRequest method addFile in 4.0 release.

2012-06-08 Thread Ryan McKinley
for the ExtractingRequestHandler, you can put anything into the
request contentType.

try:
addFile( file, "application/octet-stream" )

but anything should work

ryan




On Thu, Jun 7, 2012 at 2:32 PM, Koorosh Vakhshoori
 wrote:
> In latest 4.0 release, the addFile() method has a new argument 'contentType':
>
> addFile(File file, String contentType)
>
> In context of Solr Cell how should addFile() method be called? Specifically
> I refer to the Wiki example:
>
> ContentStreamUpdateRequest up = new
> ContentStreamUpdateRequest("/update/extract");
> up.addFile(new File("mailing_lists.pdf"));
> up.setParam("literal.id", "mailing_lists.pdf");
> up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
> result = server.request(up);
> assertNotNull("Couldn't upload mailing_lists.pdf", result);
> rsp = server.query( new SolrQuery( "*:*") );
> Assert.assertEquals( 1, rsp.getResults().getNumFound() );
>
> given at URL: http://wiki.apache.org/solr/ExtractingRequestHandler
>
> Since Solr Cell is calling Tika under the hood, isn't the file
> content-type already identified by Tika? Looking at the code, it seems
> passing NULL would do the job, is that correct? Also for Solr Cell, is the
> ContentStreamUpdateRequest class is the right one to use or there is a
> different class that is more appropriate here?
>
> Thanks
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/ContentStreamUpdateRequest-method-addFile-in-4-0-release-tp3988344.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different behavior for q=goo.com vs q=@goo.com in queries?

2010-12-31 Thread Ryan McKinley
also try &debugQuery=true and see why each result matched



On Thu, Dec 30, 2010 at 4:10 PM, mrw  wrote:
>
>
> Basically, just what you've suggested.  I did the field/query analysis piece
> with verbose output.  Not entirely sure how to interpret the results, of
> course.  Currently reading anything I can find on that.
>
>
> Thanks
>
>
> Erick Erickson wrote:
>>
>> What steps have you taken to figure out whether the
>> contents of your index are what you think? I suspect
>> that the fields you're indexing aren't being
>> analyzed/tokenized quite the way you expect either at
>> query time or index time (or maybe both!).
>>
>> Take a look at the admin/analysis page for the field you're indexing
>> the data into. If that doesn't shed any light on the problem,
>> please paste in the <fieldtype> definition for the field in question,
>> maybe another set of eyes can see the issue.
>>
>> Best
>> Erick
>>
>>
>>
>>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Different-behavior-for-q-goo-com-vs-q-goo-com-in-queries-tp2168935p2169478.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-18 Thread Ryan McKinley
>
> Where do you get your Lucene/Solr downloads from?
>
> [] ASF Mirrors (linked in our release announcements or via the Lucene website)
>
> [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
>
> [X] I/we build them from source via an SVN/Git checkout.
>


edismax with windows path input?

2011-02-10 Thread Ryan McKinley
I am using the edismax query parser -- its awesome!  works well for
standard dismax type queries, and allows explicit fields when
necessary.

I have hit a snag when people enter something that looks like a windows path:

 F:\path\to\a\file

this gets parsed as:
F:\path\to\a\file
F:\path\to\a\file
+()

Putting it in quotes makes the not-quite right query:
"F:\path\to\a\file"
"F:\path\to\a\file"

+DisjunctionMaxQuery((path:f:pathtoafile^4.0 | name:"f (pathtoafile
fpathtoafile)"^7.0)~0.01)


+(path_path:f:pathtoafile^4.0 | name:"f (pathtoafile fpathtoafile)"^7.0)~0.01


Telling people to escape the query:
q=F\:\\path\\to\\a\\file
is unrealistic, but gives the proper parsed query:

+DisjunctionMaxQuery((path_path:f:/path/to/a/file^4.0 | name:"f path
to a (file fpathtoafile)"^7.0)~0.01)

Any ideas on how to support this?  I could try looking for things like
paths in the app, and then modify the query, or maybe look at
extending edismax.  Perhaps when F: does not match a given field, it
could auto escape the rest of the word?

thanks
ryan


Re: edismax with windows path input?

2011-02-10 Thread Ryan McKinley
ah -- that makes sense.

Yonik... looks like you were assigned to it last week -- should I take
a look, or do you already have something in the works?


On Thu, Feb 10, 2011 at 2:52 PM, Chris Hostetter
 wrote:
>
> : extending edismax.  Perhaps when F: does not match a given field, it
> : could auto escape the rest of the word?
>
> that's actually what yonik initially said it was suppose to do, but when i
> tried to add a param to let you control which fields would be supported
> using the ":" syntax i discovered it didn't work but oculdn't figure out
> why ... details are in the SOLR-1553 comments
>
>
> -Hoss
>


Re: edismax with windows path input?

2011-02-10 Thread Ryan McKinley
>
> foo_s:foo\-bar
> is a valid lucene query (with only a dash between the foo and the
> bar), and presumably it should be treated the same in edismax.
> Treating it as foo_s:foo\\-bar (a backslash and a dash between foo and
> bar) might cause more problems than it's worth?
>

I don't think we should escape anything that has a valid field name.
If "foo_s" is a field, then foo_s:foo\-bar should be used as is.

If "foo_s" is not a field, I would want the whole thing escaped to:
foo_s\:foo\\-bar before getting passed to the rest of the dismax mojo.

Does that make sense?

marking edismax as experimental for 3.1 makes sense!

ryan


boosting results by a query?

2011-02-11 Thread Ryan McKinley
I have an odd need, and want to make sure I am not reinventing a wheel...

Similar to the QueryElevationComponent, I need to be able to move
documents to the top of a list that match a given query.

If there were no sort, then this could be implemented easily with
BooleanQuery (i think) but with sort it gets more complicated.  Seems
like I need:

  sortSpec.setSort( new Sort( new SortField[] {
    new SortField( something that only sorts results in the boost query ),
    new SortField( the regular sort )
  }));

Is there an existing FieldComparator I should look at?  Any other
pointers/ideas?

Thanks
ryan


Re: Monitor the QTime.

2011-02-11 Thread Ryan McKinley
You may want to check the stats via JMX.  For example,

http://localhost:8983/solr/core/admin/mbeans?stats=true&key=org.apache.solr.handler.StandardRequestHandler

shows some basic stats info for the handler.

If you are running nagios or similar, they have tools that can log
values from JMX.  this may be helpful:
http://wiki.apache.org/solr/SolrJmx
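
To expose those beans over JMX in the first place, solrconfig.xml just
needs the jmx element (assuming a default JMX agent):

  <jmx/>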

ryan



On Thu, Feb 10, 2011 at 5:10 PM, Stijn Vanhoorelbeke
 wrote:
> Hi,
>
> Is it possible to monitor the QTime of the queries.
> I know I could enable logging - but then all of my requests are logged,
> making big&nasty logs.
>
> I just want to log the QTime periodically, lets say once every minute.
> Is this possible using Solr or can this be set up in tomcat anyway?
>


Re: boosting results by a query?

2011-02-14 Thread Ryan McKinley
found something that works great!

in 3.1+ we can sort by a function query, so:

&sort=query({!lucene v='field:value'}) desc, score desc

will put everything that matches 'field:value' first, then order the
rest by score

check:
http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function
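
The same request from SolrJ (a sketch; 'field:value' is whatever boost
query you need):

  SolrQuery q = new SolrQuery("*:*");
  q.set("sort", "query({!lucene v='field:value'}) desc, score desc");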




On Fri, Feb 11, 2011 at 4:31 PM, Ryan McKinley  wrote:
> I have an odd need, and want to make sure I am not reinventing a wheel...
>
> Similar to the QueryElevationComponent, I need to be able to move
> documents to the top of a list that match a given query.
>
> If there were no sort, then this could be implemented easily with
> BooleanQuery (i think) but with sort it gets more complicated.  Seems
> like I need:
>
>  sortSpec.setSort( new Sort( new SortField[] {
>    new SortField( something that only sorts results in the boost query ),
>    new SortField( the regular sort )
>  }));
>
> Is there an existing FieldComparator I should look at?  Any other
> pointers/ideas?
>
> Thanks
> ryan
>


Re: Solr 4.0 trunk in production

2011-02-20 Thread Ryan McKinley
Not crazy -- but be aware of a few *key* caveats.

1. Do good testing on a stable snapshot.
2. Don't get surprised if you have to rebuild the index from scratch
to upgrade in the future.  The official releases will upgrade smoothly
-- but within dev builds, anything may happen.



On Sat, Feb 19, 2011 at 9:50 AM, Mark  wrote:
> Would I be crazy even to consider putting this in production? Thanks
>


Re: please make JSONWriter public

2011-03-01 Thread Ryan McKinley
You may have noticed the ResponseWriter code is pretty hairy!  Things
are package protected so that the API can change between minor release
without concern for back compatibility.

In 4.0 (/trunk) I hope to rework the whole ResponseWriter framework so
that it is more clean and hopefully stable enough that making parts
public is helpful.

For now, you can:
- copy the code
- put your class in the same package name
- make it public in your own distribution

ryan



On Mon, Feb 28, 2011 at 2:56 PM, Paul Libbrecht  wrote:
>
> Hello fellow SOLR experts,
>
> may I ask to make top-level and public the class
>    org.apache.solr.request.JSONWriter
> inside
>    org.apache.solr.request.JSONResponseWriter
> I am re-using it to output JSON search result to code that I wish not to 
> change on the client but the current visibility settings (JSONWriter is 
> package protected) makes it impossible for me without actually copying the 
> code (which is possible thanks to the good open-source nature).
>
> thanks in advance
>
> paul


Re: [WKT] Spatial Searching

2011-03-29 Thread Ryan McKinley
> Does anyone know of a patch or even when this functionality might be included 
> in to Solr4.0? I need to query for polygons ;-)

check:
http://code.google.com/p/lucene-spatial-playground/

This is my sketch / soon-to-be-proposal for what I think lucene
spatial should look like.  It includes a WKTField that can do complex
geometry queries:

https://lucene-spatial-playground.googlecode.com/svn/trunk/spatial-lucene/src/main/java/org/apache/lucene/spatial/search/jts/


ryan


Re: Solr: Images, Docs and Binary data

2011-04-06 Thread Ryan McKinley
You can store binary data using a binary field type -- then you need
to send the data base64 encoded.
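
A schema sketch (assuming Solr 3.x, which ships solr.BinaryField):

  <fieldType name="binary" class="solr.BinaryField"/>
  <field name="payload" type="binary" indexed="false" stored="true"/>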

I would strongly recommend against storing large binary files in solr
-- unless you really don't care about performance -- the file system
is a good option that springs to mind.

ryan




2011/4/6 Ezequiel Calderara :
> Another question that maybe is easier to answer, how can i store binary
> data? Any example schema?
>
> 2011/4/6 Ezequiel Calderara 
>
>> Hello everyone, i need to know if some has used solr for indexing and
>> storing images (upt to 16MB) or binary docs.
>>
>> How does solr behaves with this type of docs? How affects performance?
>>
>> Thanks Everyone
>>
>> --
>> __
>> Ezequiel.
>>
>> Http://www.ironicnet.com
>>
>
>
>
> --
> __
> Ezequiel.
>
> Http://www.ironicnet.com
>


JOIN, query on the parent?

2011-06-30 Thread Ryan McKinley
Hello-

I'm looking for a way to find all the links from a set of results.  Consider:

<doc>
 id:1
 type:X
 link:a
 link:b
</doc>

<doc>
 id:2
 type:X
 link:a
 link:c
</doc>

<doc>
 id:3
 type:Y
 link:a
</doc>

Is there a way to search for all the links from stuff of type X -- in
this case (a,b,c)

If I'm understanding the {!join stuff, it lets you search on the
children, but i don't really see how to limit the parent values.

Am I missing something, or is this a further extension to the JoinQParser?


thanks
ryan


Re: JOIN, query on the parent?

2011-07-01 Thread Ryan McKinley
On Fri, Jul 1, 2011 at 9:06 AM, Yonik Seeley  wrote:
> On Thu, Jun 30, 2011 at 6:19 PM, Ryan McKinley  wrote:
>> Hello-
>>
>> I'm looking for a way to find all the links from a set of results.  Consider:
>>
>> <doc>
>>  id:1
>>  type:X
>>  link:a
>>  link:b
>> </doc>
>>
>> <doc>
>>  id:2
>>  type:X
>>  link:a
>>  link:c
>> </doc>
>>
>> <doc>
>>  id:3
>>  type:Y
>>  link:a
>> </doc>
>>
>> Is there a way to search for all the links from stuff of type X -- in
>> this case (a,b,c)
>
> Do the links point to other documents somehow?
> Let's assume that there are documents with ids of a,b,c
>
> fq={!join from=link to=id}type:X
>
> Basically, you start with the set of documents that match type:X, then
> follow from "link" to "id" to arrive at the new set of documents.
>

Yup -- that works.  Thank you!

ryan


Re: Using FieldCache in SolrIndexSearcher - crazy idea?

2011-07-05 Thread Ryan McKinley
>
> Ah, thanks Hoss - I had meant to respond to the original email, but
> then I lost track of it.
>
> Via pseudo-fields, we actually already have the ability to retrieve
> values via FieldCache.
> fl=id:{!func}id
>
> But using CSF would probably be better here - no memory overhead for
> the FieldCache entry.
>

Not sure if this is related, but we should also consider using the
memory codec for id field
https://issues.apache.org/jira/browse/LUCENE-3209


Re: Is solrj 3.3.0 ready for field collapsing?

2011-07-05 Thread Ryan McKinley
patches are always welcome!


On Tue, Jul 5, 2011 at 3:04 PM, Yonik Seeley  wrote:
> On Mon, Jul 4, 2011 at 11:54 AM, Per Newgro  wrote:
>> i've tried to add the params for group=true and group.field=myfield by using
>> the SolrQuery.
>> But the result is null. Do i have to configure something? In wiki part for
>> field collapsing i couldn't
>> find anything.
>
> No specific (type-safe) support for grouping is in SolrJ currently.
> But you should still have access to the complete generic solr response
> via SolrJ regardless (i.e. use getResponse())
>
> -Yonik
> http://www.lucidimagination.com
>


Re: Solr-4.0.0-Beta Bug with "Load Term Info" in Schema Browser

2012-08-25 Thread Ryan McKinley
If you optimize the index, are the results the same?

maybe it is showing counts for deleted docs (i think it does... and
this is expected)

ryan


On Sat, Aug 25, 2012 at 9:57 AM, Fuad Efendi  wrote:
>
> This is a bug in the Solr 4.0.0-Beta Schema Browser: "Load Term Info" shows "9682
> News", but direct query shows 3577.
>
> /solr/core0/select?q=channel:News&facet=true&facet.field=channel&rows=0
>
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">1</int>
>     <lst name="params">
>       <str name="facet">true</str>
>       <str name="q">channel:News</str>
>       <str name="facet.field">channel</str>
>       <str name="rows">0</str>
>     </lst>
>   </lst>
>   <result name="response" numFound="3577" start="0"/>
>   <lst name="facet_counts">
>     <lst name="facet_fields">
>       <lst name="channel">
>         <int name="News">3577</int>
>         <int name="...">0</int>
>         <int name="...">0</int>
>         <int name="...">0</int>
>       </lst>
>     </lst>
>   </lst>
> </response>
>
>
> -Original Message-
> Sent: August-24-12 11:29 PM
> To: solr-user@lucene.apache.org
> Cc: sole-...@lucene.apache.org
> Subject: RE: Solr-4.0.0-Beta Bug with "Load Term Info" in Schema Browser
> Importance: High
>
> Any news?
> CC: Dev
>
>
> -Original Message-
> Subject: Solr-4.0.0-Beta Bug with "Load Term Info" in Schema Browser
>
> Hi there,
>
> "Load term Info" shows 3650 for a specific term "MyTerm", and when I execute
> query "channel:MyTerm" it shows 650 documents found... possibly bug... it
> happens after I commit data too, nothing changes; and this field is
> single-valued non-tokenized string.
>
> -Fuad
>
> --
> Fuad Efendi
> 416-993-2060
> http://www.tokenizer.ca
>
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


edismax bq, ignore tf/idf?

2012-10-26 Thread Ryan McKinley
Hi-

I am trying to add a setting that will boost results based on
existence in different buckets.  Using edismax, I added the bq
parameter:

location:A^5 location:B^3

I want this to put everything in location A above everything in
location B.  This mostly works, BUT depending on the number of matches
for each location, location:B can get a higher final score.

Is there a way to ignore tf/idf when boosting this location?

location is from a field type:
  <fieldType ... class="solr.StrField" omitNorms="true"/>


Thanks for any pointers!

ryan


Re: edismax bq, ignore tf/idf?

2012-10-26 Thread Ryan McKinley
thanks!


On Fri, Oct 26, 2012 at 4:20 PM, Chris Hostetter
 wrote:
> : How about a boost function, "bf" or "boost"?
> :
> : bf=if(exists(query(location:A)),5,if(exists(query(location:B)),3,0))
>
> Right ... assuming you only want to ignore tf/idf on these fields in this
> specifc context, function queries are the way to go -- otherwise you could
> just use a per-field similarity to ignore tf/idf.
>
> I would suggest however that instead of using the "exists(query())"
> consider the "tf()" function ...
>
> bf=if(tf(location,A),5,0)&bf=if(tf(location,B),3,0)
>
> s/bf/boost/g && s/0/1/g if you want multiplicative boosts.
>
>
> -Hoss


Re: Posting data in JSON

2009-07-30 Thread Ryan McKinley

check:
https://issues.apache.org/jira/browse/SOLR-945

this will not likely make it into 1.4



On Jul 30, 2009, at 1:41 PM, Jérôme Etévé wrote:


Hi,

 Nope, I'm not using solrj (my client code is in Perl), and I'm with  
solr 1.3.


J.

2009/7/30 Shalin Shekhar Mangar :
On Thu, Jul 30, 2009 at 8:31 PM, Jérôme Etévé wrote:


Hi All,

I'm wondering if it's possible to post documents to solr in JSON  
format.


JSON is much faster than XML to get the queries results, so I think
it'd be great to be able to post data in JSON to speed up the  
indexing

and lower the network load.


If you are using Java/Solrj on 1.4 (trunk), you can use the binary format
which is extremely compact and efficient. Note that with Solr/Solrj 1.3,
binary became the default response format for Solrj clients.

binary became the default response format for Solrj clients.

--
Regards,
Shalin Shekhar Mangar.





--
Jerome Eteve.

Chat with me live at http://www.eteve.net

jer...@eteve.net




Re: Solr-773 (GEO Module) question

2009-08-19 Thread Ryan McKinley


On Aug 19, 2009, at 6:45 AM, johan.sjob...@findwise.se wrote:


Hi,


we're glancing at the GEO search module known from the jira issue 773
(http://issues.apache.org/jira/browse/SOLR-773).


It seems to us that the issue is still open and not yet included in the
nightly builds.


correct



Is there a release plan for the nightly builds, and is this module
considered core or contrib?



activity on the nightly builds is winding down as we gear up for the  
1.4 release.


After 1.4 is out, I expect progress on the geo stuff.  It will be in  
contrib (not core) and will likely be marked "experimental" for a  
while.  That is, stuff will be added without the expectation that the  
interfaces will be set in stone.


best
ryan


Re: ${solr.abortOnConfigurationError:false} - does it defaults to false

2009-08-26 Thread Ryan McKinley


On Aug 26, 2009, at 3:33 PM, djain101 wrote:



I have one quick question...

If in solrconfig.xml, if it says ...

<abortOnConfigurationError>${solr.abortOnConfigurationError:false}</abortOnConfigurationError>


does it mean <abortOnConfigurationError> defaults to false if it is not set
as a system property?



correct



Re: Why isn't this working?

2009-08-27 Thread Ryan McKinley


On Aug 27, 2009, at 10:35 PM, Paul Tomblin wrote:


Yesterday or the day before, I asked specifically if I would need to
restart the Solr server if somebody else loaded data into the Solr
index using the EmbeddedServer, and I was told confidently that no,
the Solr server would see the new data as soon as it was committed.
So today I fired up the Solr server (and after making
apache-tomcat-6.0.20/solr/data a symlink to where the Solr data really
lives and restarting the web server), and did some queries.  Then I
ran a program that loaded a bunch of data and committed it.  Then I
did the queries again.  And the new data is NOT showing.  Using Luke,
I can see 10022 documents in the index, but the Solr statistics page
(http://localhost:8080/solrChunk/admin/stats.jsp) is still showing
8677, which is how many there were before I reloaded the data.

So am I doing something wrong, or was the assurance I got yesterday
that this is possible wrong?



I did not follow the advice from yesterday... but...

the "commit" word can be a but misleading, it could also be called  
"reload"


Say you have an embedded solr server and an http solr server pointed  
to the same location.

1. make sure one is read only!  otherwise you can make a mess.
2. calling commit on the embedded solr instance will not have any
effect on the http instance UNTIL you call commit (reload) on the http
instance.


ryan


Re: If field A is empty take field B. Functionality available?

2009-08-28 Thread Ryan McKinley

can you just add a new field that has the real or avg price?
Just populate that field at index time... make it indexed but not
stored


If you want the real or average price to be treated the same in  
faceting, you are really going to want them in the same field.
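
A sketch of doing that on the client side at index time (field names are
hypothetical, SolrJ):

  SolrInputDocument doc = new SolrInputDocument();
  if (realprice != null) {
    doc.addField("realprice", realprice);
  }
  doc.addField("avgprice", avgprice);
  // one 'price' field used for filtering, sorting and faceting
  doc.addField("price", realprice != null ? realprice : avgprice);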



On Aug 28, 2009, at 1:16 PM, Britske wrote:



I have 2 fields:
realprice
avgprice

I'd like to be able to take the contents of avgprice if realprice is  
not

available.
due to design the average price cannot be encoded in the 'realprice' field.


Since I need to be able to filter, sort and facet on these fields,  
it would
be really nice to be able to do that just on something like a  
virtual-field
called 'price' or something. That field should contain the  
conditional logic

to know from which actual field to take the contents from.

I was looking at using functionqueries, but to me knowledge these  
can't be

used to filter and facet on.

Would creating a custom field work for this or does a field know  
nothing
from its sibling-fields? What would performance impact be like,  
since this

is really important in this instance.

Any better ways? Subclassing StandardRequestHandler and hacking it all
together seems rather ugly to me, but if it's needed...

Thanks,
Geert-Jan






Re: Solr SVN build problem

2009-09-12 Thread Ryan McKinley

Should be fixed in trunk.  Try updating and see if it works for you

See:
https://issues.apache.org/jira/browse/SOLR-1424



On Sep 9, 2009, at 8:12 PM, Allahbaksh Asadullah wrote:


Hi ,
I am building Solr from source. During building it from source I am  
getting

following error.

generate-maven-artifacts:
    [mkdir] Created dir: c:\Downloads\solr_trunk\build\maven
    [mkdir] Created dir: c:\Downloads\solr_trunk\dist\maven
    [copy] Copying 1 file to c:\Downloads\solr_trunk\build\maven\c:\Downloads\solr_trunk\src\maven

BUILD FAILED
c:\Downloads\solr_trunk\build.xml:741: The following error occurred while executing this line:
c:\Downloads\solr_trunk\common-build.xml:261: Failed to copy
c:\Downloads\solr_trunk\src\maven\solr-parent-pom.xml.template to
c:\Downloads\solr_trunk\build\maven\c:\Downloads\solr_trunk\src\maven\solr-parent-pom.xml.template
due to java.io.FileNotFoundException:
c:\Downloads\solr_trunk\build\maven\c:\Downloads\solr_trunk\src\maven\solr-parent-pom.xml.template
(The filename, directory name, or volume label syntax is incorrect)

Regards,
Allahbaksh




Re: Solrj possible deadlock

2009-09-23 Thread Ryan McKinley

do you have anything custom going on?

The fact that the lock is in java2d seems suspicious...


On Sep 23, 2009, at 7:01 PM, pof wrote:



I had the same problem again yesterday except the process halted  
after about

20mins this time.


pof wrote:


Hello, I was running a batch index the other day using the Solrj EmbeddedSolrServer when the process abruptly froze in its tracks after running for about 4-5 hours and indexing ~400K documents. There were no document locks, so it would seem likely that there was some kind of thread deadlock. I was hoping someone might be able to tell me some information about the following thread dump taken at the time:

Full thread dump OpenJDK Client VM (1.6.0-b09 mixed mode):

"DestroyJavaVM" prio=10 tid=0x9322a800 nid=0xcef waiting on condition [0x..0x0018a044]
   java.lang.Thread.State: RUNNABLE

"Java2D Disposer" daemon prio=10 tid=0x0a28cc00 nid=0xf1c in Object.wait() [0x0311d000..0x0311def4]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x97a96840> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x97a96840> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at sun.java2d.Disposer.run(Disposer.java:143)
    at java.lang.Thread.run(Thread.java:636)

"pool-1-thread-1" prio=10 tid=0x93a26c00 nid=0xcf7 waiting on condition [0x08a6a000..0x08a6b074]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x967acfd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1978)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:386)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

"Low Memory Detector" daemon prio=10 tid=0x93a00c00 nid=0xcf5 runnable [0x..0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x09fe9800 nid=0xcf4 waiting on condition [0x..0x096a7af4]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x09fe8800 nid=0xcf3 waiting on condition [0x..0x]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x09fd7000 nid=0xcf2 in Object.wait() [0x005ca000..0x005caef4]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x966e6d40> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x966e6d40> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x09fd2c00 nid=0xcf1 in Object.wait() [0x00579000..0x00579d74]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x966e6dc8> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x966e6dc8> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x09fcf800 nid=0xcf0 runnable

"VM Periodic Task Thread" prio=10 tid=0x93a02400 nid=0xcf6 waiting on condition

JNI global references: 1072

Heap
 def new generation   total 36288K, used 23695K [0x93f1, 0x9667, 0x9667)
  eden space 32256K,  73% used [0x93f1, 0x95633f60, 0x95e9)
  from space 4032K,   0% used [0x95e9, 0x95e9, 0x9628)
  to   space 4032K,   0% used [0x9628, 0x9628, 0x9667)
 tenured generation   total 483968K, used 72129K [0x9667, 0xb3f1, 0xb3f1)
   the space 483968K,  14% used [0x9667, 0x9ace04b8, 0x9ace0600, 0xb3f1)
 compacting perm gen  total 23040K, used 22983K [0xb3f1, 0xb559, 0xb7f1)
   the space 23040K,  99% used [0xb3f1, 0xb5581ff8, 0xb5582000, 0xb559)
No shared spaces configured.

Cheers. Brett.








releasing memory?

2009-10-08 Thread Ryan McKinley

Hello-

I have an application that can run in the background on a user desktop -- it will go through phases of being used and not being used.  I want to be able to free as many system resources as possible when it is not in use.

Currently I have a timer that waits for 10 mins of inactivity and then releases a bunch of memory (unrelated to lucene/solr).  Any suggestions on the best way to do this in lucene/solr?  perhaps reload a core?


thanks for any pointers
ryan


Re: (Solr 1.4 dev) Why solr.common.* packages are in solrj-*.jar ?

2009-10-14 Thread Ryan McKinley


I wonder why the common classes are in the solrj JAR?
Is the solrj JAR not just for the clients?


the solr server uses solrj for distributed search.  This makes solrj  
the general way to talk to solr (even from within solr)






Re: Programmatically configuring SLF4J for Solr 1.4?

2009-11-01 Thread Ryan McKinley
I'm sure it is possible to configure JDK logging (java.util.logging) programmatically... but I have never had much luck with it.

It is very easy to configure log4j programmatically, and this works great with solr.

To use log4j rather than JDK logging, simply add slf4j-log4j12-1.5.8.jar (from http://www.slf4j.org/download.html) to your classpath


ryan
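
For example, a rough sketch of configuring log4j in code (the file name and pattern are placeholders); once slf4j-log4j12 is on the classpath, solr's logging follows this configuration too:

  import org.apache.log4j.FileAppender;
  import org.apache.log4j.Level;
  import org.apache.log4j.Logger;
  import org.apache.log4j.PatternLayout;

  public class LogSetup {
    public static void configure(String logfile) throws java.io.IOException {
      Logger root = Logger.getRootLogger();
      root.removeAllAppenders();  // drop the default console appender
      root.addAppender(new FileAppender(
          new PatternLayout("%d %-5p [%c] %m%n"), logfile, true));
      root.setLevel(Level.INFO);
    }
  }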



On Nov 1, 2009, at 11:05 PM, Don Werve wrote:

So, I've spent a bit of the day banging my head against this, and  
can't get

it sorted.  I'm using a DirectSolrConnection embedded in a JRuby
application, and everything works great, except I can't seem to get  
it to do

anything except log to the console.  I've tried pointing
'java.util.logging.config.file' to a properties file, as well as  
specifying
a logfile as part of the constructor for DirectSolrConnection, but  
so far,

nothing has really worked.

What I'd like to do is programmatically direct the Solr logs to a  
logfile,
so that I can have my app start up, parse its config, and throw the  
Solr

logs where they need to go based on that.

So, I don't suppose anybody has a code snippet (in Java) that sets  
up SLF4J
for Solr logging (and that doesn't reference an external properties  
file)?


Using the latest (1 Nov 2009) nightly build of Solr 1.4.0-dev




Re: Problems downloading lucene 2.9.1

2009-11-02 Thread Ryan McKinley


On Nov 2, 2009, at 8:29 AM, Grant Ingersoll wrote:



On Nov 2, 2009, at 12:12 AM, Licinio Fernández Maurelo wrote:


Hi folks,

as we are using a snapshot dependency on solr 1.4, today we are getting problems when maven tries to download lucene 2.9.1 (there isn't any 2.9.1 there).

Which repository can i use to download it?


They won't be there until 2.9.1 is officially released.  We are  
trying to speed up the Solr release by piggybacking on the Lucene  
release, but this little bit is the one downside.


Until then, you can add a repo to:

http://people.apache.org/~mikemccand/staging-area/rc3_lucene2.9.1/maven/




Re: add XML/HTML documents using SolrJ, without bypassing HTML char filter

2009-11-11 Thread Ryan McKinley
The HTMLStripCharFilter will strip the html for the *indexed* terms; it does not affect the *stored* field.

If you don't want html in the stored field, can you just strip it out before passing to solr?
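
A naive sketch of that client-side stripping; the bare regex ignores comments, CDATA and scripts, and the "content" field name is a placeholder:

  import org.apache.solr.common.SolrInputDocument;

  public class StripBeforeSend {
    static SolrInputDocument build(String id, String html) {
      // crude tag removal so the *stored* value is plain text
      String plain = html.replaceAll("<[^>]+>", " ").replaceAll("\\s+", " ").trim();
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", id);
      doc.addField("content", plain);
      return doc;
    }
  }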



On Nov 11, 2009, at 8:07 PM, aseem cheema wrote:


Hey Guys,
How do I add HTML/XML documents using SolrJ such that it does not bypass the HTML char filter?

SolrJ escapes the HTML/XML value of a field, and that makes it bypass the HTML char filter. For example, <center>content</center>, if added to a field with HTMLStripCharFilter on the field using SolrJ, is not stripped of center tags. But if checked in analysis.jsp, it does get stripped. When I look at the SolrJ XML feed, the value is escaped, e.g.:
http://haha.com
&lt;center&gt;content&lt;/center&gt;
Any help is highly appreciated. Thanks. -- Aseem

Re: The status of Local/Geo/Spatial/Distance Solr

2009-11-13 Thread Ryan McKinley

It looks like solr+spatial will get some attention in 1.5, check:
https://issues.apache.org/jira/browse/SOLR-1561

Depending on your needs, that may be enough.  More robust/scalable solutions will hopefully work their way into 1.5 (any help is always appreciated!)



On Nov 13, 2009, at 11:12 AM, Bertie Shen wrote:


Hey,

  I am interested in using LocalSolr to do Local/Geo/Spatial/Distance search. But the wiki of LocalSolr (http://wiki.apache.org/solr/LocalSolr) points to pretty old documentation. Is there a better document I can refer to for setting up LocalSolr, and some performance analysis?

  Just sync-ed the Solr codebase and found LocalSolr is still NOT in the contrib package. Do we have a plan to incorporate it? I downloaded a LocalSolr lib, localsolr-1.5.jar, from http://developer.k-int.com/m2snapshots/localsolr/localsolr/1.5/ and noticed that the namespace is com.pjaol.search. blah blah, while the LocalLucene package is in the Lucene codebase and the package name is org.apache.lucene.spatial blah blah.

  But localsolr-1.5.jar from http://developer.k-int.com/m2snapshots/localsolr/localsolr/1.5/ does not work with the lucene-spatial-3.0-dev.jar I built from the Lucene codebase directly. After I restart tomcat, I could not load the solr admin page. The error is as follows. It looks like solr is still looking for the old class names.

 Thanks.

HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError> in null
-
java.lang.NoClassDefFoundError: com/pjaol/search/geo/utils/DistanceFilter
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:357)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
    at org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:833)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:551)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
    at org.apache.catalina.core.ContainerBase.access$0(ContainerBase.java:744)
    at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:144)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:738)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
    at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:626)
    at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488)
    at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1138)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
    at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
    at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
    at org.apache.catalina.core.StandardService.start(StandardService.java:448)
    at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j

Re: The status of Local/Geo/Spatial/Distance Solr

2009-11-13 Thread Ryan McKinley

Also:
https://issues.apache.org/jira/browse/SOLR-1302



Re: Missing slf4j jar in solr 1.4.0 distribution?

2009-11-18 Thread Ryan McKinley
Solr includes slf4j-jdk14-1.5.5.jar, if you want to use the nop (or  
log4j, or loopback) impl you will need to include that in your own  
project.


Solr uses slf4j so that each user can decide their logging  
implementation, it includes the jdk version so that something works  
off-the-shelf, but if you want more control, then you can switch in  
whatever you want.


ryan


On Nov 18, 2009, at 1:22 AM, Per Halvor Tryggeseth wrote:

Thanks. I see. It seems that slf4j-nop-1.5.5.jar is the only jar  
file missing in solrj-lib, so I suggest that it should be included  
in the next release.


Per Halvor





-Opprinnelig melding-
Fra: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sendt: 17. november 2009 20:51
Til: 'solr-user@lucene.apache.org'
Emne: Re: Missing slf4j jar in solr 1.4.0 distribution?


: I downloaded solr 1.4.0 but discovered when using solrj 1.4 that a
: required slf4j jar was missing in the distribution (i.e.
: apache-solr-1.4.0/dist). I got a java.lang.NoClassDefFoundError:
: org/slf4j/impl/StaticLoggerBinder when using solrj
   ...
: Have I overlooked something or are not all necessary classes  
required

: for using solrj in solr 1.4.0 included in the distribution?

Regrettably, Solr releases aren't particularly consistent about where third-party libraries can be found.

If you use the pre-built war, the 'main' dependencies are already bundled into it.  If you want to roll your own, you need to look at the "./lib" directory -- "./dist" is only *supposed* to contain the artifacts built from solr source (but that solrj-lib directory can be confusing)...


hoss...@brunner:apache-solr-1.4.0$ ls ./lib/slf4j-*
lib/slf4j-api-1.5.5.jar lib/slf4j-jdk14-1.5.5.jar

-Hoss





Re: logger in embedded solr

2009-11-19 Thread Ryan McKinley

check:
http://wiki.apache.org/solr/SolrLogging

if you are using 1.4 you want to drop in the slf4j-log4j jar file and  
then it should read your log4j configs



On Nov 19, 2009, at 2:15 PM, Harsch, Timothy J. (ARC-TI)[PEROT  
SYSTEMS] wrote:



Hi all,
I have a J2EE application using embedded solr via SolrJ.  It seems the logging that SOLR produces has a mind of its own, and is not changeable via my log4j.properties.  In fact I know this because I wired a Log4J config listener into my web.xml and redirected all my logs to a custom location.  Which works, but now all my messages go to the custom location and all the embedded SOLR messages are still going into catalina.out.  How can I get access to the logger of the embedded SOLR?


Thanks,
Tim Harsch
Sr. Software Engineer
Perot Systems





Re: several tokenizers in one field type

2008-06-24 Thread Ryan McKinley


On Jun 24, 2008, at 12:07 AM, Norberto Meijome wrote:

hi all,
( I'm using 1.3 nightly build from 15th June 08.)

Is there some documentation about how analysers + tokenizers are  
applied in

fields ?  In particular, my question :



best docs are here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters


- If I define 2 tokenizers in a fieldtype, only the first one is  
applied, the

other is ignored. Is that because the 2nd tokenizer would have to work
recursively on the tokens generated from the previous one? Would I  
have to
create my custom tokenizer to perform the job of 2 existing  
tokenizers in one ?


if you define two tokenizers, solr should throw an error: the second one can't do anything.


The tokenizer breaks the input stream into a stream of tokens, then  
token filters can modify these tokens.


ryan




Re: How to debug ?

2008-06-24 Thread Ryan McKinley

also, check the LukeRequestHandler

if there is a document you think *should* match, you can see what  
tokens it has actually indexed...



On Jun 24, 2008, at 7:12 PM, Norberto Meijome wrote:

hi,
I'm trying to understand why a search on a field tokenized with the  
nGram
tokenizer, with minGramSize=n and maxGramSize=m doesn't find any  
matches for

queries of length (in characters) of n+1..m (n works fine).

analysis.jsp shows that it SHOULD match, but /select doesn't bring anything back. (For details on these queries, please see my previous post to this list over the last day or so.)

So I figure there is some difference between what analysis.jsp does and the actual search executed, or what lucene indexes - I imagine analysis.jsp only parses the input in the page with solr's tokenizers/filters but doesn't actually do lucene's part of the job.

And I'd like to look into this... What is the suggested approach for this? Attach a debugger to jetty's web app? Are there some pointers on how to debug at this level? Preferably in Eclipse, but beggars can't be choosers ;)

thanks!!
B
_
{Beto|Norberto|Numard} Meijome

"Always do right.  This will gratify some and astonish the rest."
 Mark Twain

I speak for myself, not my employer. Contents may be hot. Slippery  
when wet.
Reading disclaimers makes you go blind. Writing them is worse. You  
have been

Warned.




negative boosting / analysis?

2008-07-01 Thread Ryan McKinley

Hi-

I'm working on a case where we have review text that may include words  
that describe what the item is *not*.


Given the text "the kitten is not clean", searching for "clean" should  
not include (at least at the top) the kitten.


The approach I am considering is to copy the text to a negation field  
and do simple heuristic analysis in a TokenFilter.  This analysis  
would only keep tokens for words that follow "not", then we could add  
a negative boost for this field:

  title^2 content^1 negation^0.1

Does this seem like a reasonable approach?  Any other ideas /  
suggestions / pointers?


thanks
ryan
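
A minimal sketch of such a TokenFilter, written against the attribute-based TokenStream API (newer than the API this thread predates); the single-word "not" trigger is the entire heuristic:

  import java.io.IOException;
  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

  public final class NegationFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private boolean lastWasNot = false;

    public NegationFilter(TokenStream input) {
      super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
      while (input.incrementToken()) {
        boolean keep = lastWasNot;  // emit only words that directly follow "not"
        lastWasNot = "not".equals(termAtt.toString());
        if (keep) {
          return true;
        }
      }
      return false;
    }

    @Override
    public void reset() throws IOException {
      super.reset();
      lastWasNot = false;
    }
  }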


Re: First version of solr javascript client to review

2008-07-03 Thread Ryan McKinley


Any thoughts / ideas on how to make formatting and laying out custom  
results less obtuse?


$sj('<li/>').html(item.id).appendTo(this.target);

seems ok for simple things -- like a list -- but not very designer  
friendly.


ryan


On Jul 1, 2008, at 3:00 AM, Matthias Epheser wrote:

Hi community,

as described here http://www.nabble.com/Announcement-of-Solr-Javascript-Client-to17462581.html 
 I started to work on a javascript widget library for solr.


I've now finished the basic framework work:

- creating a jquery environment
- creating helpers for jquery inheritance
- testing the json communication between client and solr
- creating a base class "AbstractWidget"

After that, I implemented a SimpleFacet widget and 2 ResultViews to  
show how these things will work in the new jquery environment.


A new wiki page documents this progress:
http://wiki.apache.org/solr/SolrJS

Check out the trunk at:
http://solrstuff.org/svn/solrjs/trunk/


Feedback about the implementation, the quality of code  
(documentation, integration into html, customizing the widgets) as  
well as ideas for future widgets etc. is very welcome.


regards
matthias




Re: Stupid update question

2008-07-03 Thread Ryan McKinley

Not sure exactly what you are asking for -- I'll answer a few versions:

Do you have an existing index and want to change the field "A" to  
"duck" for every document?  If so, there is no way to do that off the  
shelf -- check SOLR-139 for an option (but the current patch will not  
work)


Do you want to set the field "A" to "duck" at index time automatically?
- the easiest option is to manage that from your client indexing  
code.  just send the field "A" along with the document
- otherwise look at UpdateRequestProcessorFactory (1.3-dev) and, in the add callback, add the field to the document.


not sure if this helps
ryan


On Jul 3, 2008, at 1:12 PM, Alexander Ramos Jardim wrote:

Pals,

I want to set the field A from my index on all its documents to a  
given

value. How do I do that?

--
Alexander Ramos Jardim




Re: implementing a random result request handler - solr 1.2

2008-07-07 Thread Ryan McKinley
The random sort field in solr 1.3 relies on the field name and dynamic fields for ordering.  Check the example schema.xml in 1.3:

   <dynamicField name="rand_*" type="random" />

to get random results, try various field names:
 &sort=rand_123 asc
 &sort=rand_xyz asc
 &sort=rand_{generate your random number on the client} asc

This is good because you will get the same results for the same query  
string, and will get a new set of random results for a new URL.


ryan


On Jul 7, 2008, at 1:40 PM, Sean Laval wrote:
With the RandomSortField in 1.3, each time you then issue a query you get the same random sort order, right? That is to say, the randomness is implemented at index time rather than search time?


Thanks,

--
From: "Yonik Seeley" <[EMAIL PROTECTED]>
Sent: Monday, July 07, 2008 6:22 PM
To: 
Subject: Re: implementing a random result request handler - solr 1.2


If it's just a random ordering you are looking for, it's implemented
in the latest Solr 1.3
Solr 1.3 should be out soon, so if you are just starting development,
I'd start with the latest Solr version.

If you really need to stick with 1.2 (even after 1.3 is out?)  then
RandomSortField should be easy to backport to 1.2

-Yonik

On Mon, Jul 7, 2008 at 1:15 PM, Sean Laval <[EMAIL PROTECTED]>  
wrote:
Well it's simply a business requirement from my perspective. I am not sure I can say more than that. I could maybe implement a request handler that did an initial search to work out how many hits there are resulting from the query, and then did as many more queries as were required, fetching just 1 document starting at a given random number... would that work? Sounds a bit kludgy to me even as I say it.

Sean



--
From: "Walter Underwood" <[EMAIL PROTECTED]>
Sent: Monday, July 07, 2008 5:06 PM
To: 
Subject: Re: implementing a random result request handler - solr 1.2


Why do you want random hits? If we know more about the bigger
problem, we can probably make better suggestions.

Fundamentally, Lucene is designed to quickly return the best
hits for a query. Returning random hits from the entire
matched set is likely to be very slow. It just isn't what
Lucene is designed to do.

wunder

On 7/7/08 8:58 AM, "Sean Laval" <[EMAIL PROTECTED]> wrote:

I have seen various posts about implementing random sorting  
relating to

the
1.3 code base but I am trying to do this in 1.2. Does anyone  
have any

suggestions? The approach I have considered is to implement my own
request
handler that picks random documents from a larger result list. I
therefore
need to be able to create a DocList and add documents to it but  
can't

seem to
do this. Does anyone have any advice they could offer please?

Regards,

Sean









Re: Automated Index Creation

2008-07-08 Thread Ryan McKinley
nothing to automatically create a new index, but check the multicore  
stuff to see how you could implement this:

http://wiki.apache.org/solr/MultiCore


On Jul 8, 2008, at 10:25 AM, Willie Wong wrote:

Hi,

Sorry if this question sounds daft but I was wondering if there was anything built into Solr that allows you to automate the creation of new indexes once they reach a certain size or point in time.  I looked briefly at the documentation on CollectionDistribution, but it seems more geared towards replicating to other production servers... I'm looking for something that is more along the lines of archiving indexes for later use...


Thanks,

Willie





Re: Pre-processor for stored fields

2008-07-08 Thread Ryan McKinley
If all you are doing is stripping text from HTML, the best option is  
probably to just do that on the client *before* you send it to solr.


If you need to do something more complex -- or that needs to rely on  
other solr configurations you can consider using an  
UpdateRequestProcessor.  Likely you would override the processAdd  
function and augment/modify the document coming in.


An example of this is in the locallucene project, check:
https://locallucene.svn.sourceforge.net/svnroot/locallucene/trunk/localsolr/src/com/pjaol/search/solr/update/LocalUpdateProcessorFactory.java

ryan
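
For reference, a rough sketch of the processAdd override; the "content" field name and the regex are placeholders, and recent Solr versions expose the document via getSolrInputDocument():

  import java.io.IOException;
  import org.apache.solr.common.SolrInputDocument;
  import org.apache.solr.update.AddUpdateCommand;
  import org.apache.solr.update.processor.UpdateRequestProcessor;

  public class StripHtmlProcessor extends UpdateRequestProcessor {
    public StripHtmlProcessor(UpdateRequestProcessor next) {
      super(next);
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {
      SolrInputDocument doc = cmd.getSolrInputDocument();
      Object raw = doc.getFieldValue("content");
      if (raw != null) {
        // crude markup removal before the value is indexed/stored
        doc.setField("content", raw.toString().replaceAll("<[^>]+>", " "));
      }
      super.processAdd(cmd);  // pass the document down the chain
    }
  }

A matching UpdateRequestProcessorFactory would register it in solrconfig.xml.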



On Jul 8, 2008, at 9:20 AM, Hugo Barauna wrote:

Hi,

I have already asked this, but I didn't get any good answer, so I will try again. I need to pre-process a stored field before it is saved, just like a field that is going to be indexed. It would be good to apply an analyzer to this stored field.

My problem is that I have to send solr html documents and use an HTML filter to remove the HTML tags. But that doesn't work for the stored representation of that field.

I found some possible solutions to my problem, but I would like to know if there is something better.

Thanks!

--
Hugo Pessoa de Baraúna

"Se vc faz tudo igual a todo mundo, não pode esperar resultados  
diferentes."


http://hugobarauna.blogspot.com/




Re: Automated Index Creation

2008-07-08 Thread Ryan McKinley

re-reading your post...

Shalin is correct, just use the snapshooter script to create a point- 
in-time snapshot of the index.  The multicore stuff will not help with  
this.


ryan


On Jul 8, 2008, at 11:09 AM, Shalin Shekhar Mangar wrote:

Hi Willie,

If you want to have backups (point-in-time snapshots) then you'd need
something similar to the snapshooter script used in replication. I  
believe

it creates hard links to files of the current index in a new directory
marked with the timestamp. You can either use snapshooter itself or  
create

your own script by modifying snapshooter to create copies instead of
hardlinks if you want. You can use the RunExecutableListener to run  
your

script on every commit or optimize and use the snapshots for backup
purposes.

On Tue, Jul 8, 2008 at 7:55 PM, Willie Wong <[EMAIL PROTECTED] 
>

wrote:


Hi,

Sorry if this question sounds daft but I was wondering if there  
was
anything built into Solr that allows you to automate the creation  
of new
indexes once they reach a certain size or point in time.  I looked  
briefly
at the documentation on CollectionDestribution, but it seems more  
geared
to towards replicatting to other production servers...I'm  
looking for

something that is more along the lines of archiving indexes for later
use...


Thanks,

Willie





--
Regards,
Shalin Shekhar Mangar.




Re: Automated Index Creation

2008-07-08 Thread Ryan McKinley


<delete><query>*:*</query></delete>

will wipe all data in the index...


On Jul 8, 2008, at 12:05 PM, Willie Wong wrote:

Thanks Shalin and Ryan for your posts...

I think the snapshooter will work fine for creating the indexes, and then I can use the multicore capabilities to make them available to users.  One final question though: after the snapshot has been created, is there a way to totally clear out the contents of the master index - or have solr recreate the data directory?



Thanks,

Willie












nagios scripts for solr? other monitoring links?

2008-07-09 Thread Ryan McKinley
Is anyone out there using nagios to monitor solr?

I remember some discussion of this in the past around exposing
response handler timing info so it could play nice with nagios... did
anyone get anywhere with this?  want to share :)

Any other pointers to solr monitoring tools would be good too.

thanks
ryan


Re: Duplicate content

2008-07-15 Thread Ryan McKinley


On Jul 15, 2008, at 2:45 AM, Sunil wrote:


Hi All,

I want to change the duplicate content behavior in solr. What I want  
to

do is:

1) I don't want duplicate content.
2) I don't want to overwrite old content with new one.

Meaning: if I add duplicate content to solr and the content already exists, the old content should not be overwritten.

Can anyone suggest how to achieve it?



Check the "allowDups" options for 
http://wiki.apache.org/solr/UpdateXmlMessages#head-3dfbf90fbc69f168ab6f3389daf68571ad614bef





Thanks,
Sunil






Re: Duplicate content

2008-07-15 Thread Ryan McKinley


On Jul 15, 2008, at 10:31 AM, Fuad Efendi wrote:


Thanks Ryan,

Is <uniqueKey> really unique if we allow duplicates? I had a similar problem...




if you allowDups, then uniqueKey may not be unique...

however, it is still used as the key for many items.




Quoting Ryan McKinley <[EMAIL PROTECTED]>:



On Jul 15, 2008, at 2:45 AM, Sunil wrote:


Hi All,

I want to change the duplicate content behavior in solr. What I  
want to

do is:

1) I don't want duplicate content.
2) I don't want to overwrite old content with new one.

Means, if I add duplicate content in solr and the content already
exists, the old content should not be overwritten.

Can anyone suggest how to achieve it?



Check the "allowDups" options for 
http://wiki.apache.org/solr/UpdateXmlMessages#head-3dfbf90fbc69f168ab6f3389daf68571ad614bef





Thanks,
Sunil










FileBasedSpellChecker behavior?

2008-07-15 Thread Ryan McKinley

Hi-

I'm messing with spellchecking and running into behavior that seems  
peculiar.  We have an index with many words including:

"swim" and "slim"

If I search for "slim", it returns "swim" as an option -- likewise, if  
I search for "slim" it returns "swim"


why does it check words that are in the dictionary?  This does not  
seem to be the behavior for IndexBasedSpellChecker.


- - - -

Perhaps the FileBasedSpellChecker should load the configs at startup.  It is too strange to have to call load each time the index starts.  It should just implement SolrCoreAware and then load the file at startup.


thanks
ryan


spellchecking multiple fields?

2008-07-15 Thread Ryan McKinley
I have a use case where I want to spellcheck the input query across  
multiple fields:

 Did you mean: location = washington
  vs
 Did you mean: person = washington

The current parameter / response structure for the spellcheck  
component does not support this kind of thing.  Any thoughts on how/if  
the component should handle this?  Perhaps it could be in a  
requestHandler where the params are passed in as json?


 spelling={ dictionary="location",  
onlyMorePopular=true}&spelling={ dictionary="person",  
onlyMorePopular=false }


Thoughts?
ryan


Re: spellchecking multiple fields?

2008-07-16 Thread Ryan McKinley
and the caveat that all fields would need to be declared in the solrconfig.xml (or get used for both fields)

this could work...  we would also need to augment the response with the name of the dictionary, or assert that something will be written all the time (so you would know the 2nd <lst> is for the 2nd configured dictionary).



On Jul 16, 2008, at 8:06 AM, Grant Ingersoll wrote:


Another thought that might work:

Declare two separate components, one for each field and then  
implement a QueryConverter that takes in the field and only extracts  
the tokens for the field or choice.


This is a definite workaround, but I think it might work.  Hmm,  
except we only have one QueryConverter


-Grant

On Jul 15, 2008, at 8:56 PM, Ryan McKinley wrote:

I have a use case where I want to spellcheck the input query across  
multiple fields:

Did you mean: location = washington
vs
Did you mean: person = washington

The current parameter / response structure for the spellcheck  
component does not support this kind of thing.  Any thoughts on how/ 
if the component should handle this?  Perhaps it could be in a  
requestHandler where the params are passed in as json?


spelling={ dictionary="location",  
onlyMorePopular=true}&spelling={ dictionary="person",  
onlyMorePopular=false }


Thoughts?
ryan







Re: Multiple query fields in DisMax handler

2008-07-16 Thread Ryan McKinley
(assuming you are using 1.3-dev), you could use the dismax query parser syntax for the fq param.  I think it is something like:

fq={!dismax}your query

I can't find the syntax now (Yonik?)

but I don't know how you could pull out the qf,pf,etc fields for the fq portion vs the q portion.




On Jul 16, 2008, at 10:29 AM, chris sleeman wrote:


Hi all,

Is there currently a way of specifying more than 1 user query field (q parameter) with the disMax request handler?
Basically my requirement is this - I have 2 user query fields - 'query' and 'location', each of which corresponds to multiple solr query fields. The 'location' entered by the user, for example, could be a city, state, address or a zip code, each of which is a separate solr field. Similarly, the user 'query' field corresponds to multiple solr fields with different boosts.
Is it possible to use dismax for this? Any ideas or workarounds ?

Thanks in advance,
Chris

--
Laurence J. Peter  - "Originality is the fine art of remembering  
what you

hear but forgetting where you heard it."




Re: Issue with wt=javabin and multicore

2008-07-18 Thread Ryan McKinley


I found that in org.apache.solr.servlet.SolrServlet.java, a PrintWriter object is always sent as the input parameter.

SolrServlet is deprecated.

If you are going to use new features like MultiCore, make sure you have the XmlUpdateRequestHandler registered at /update:

  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />




Re: IllegalArgumentException with Solrj DocumentObjectBinder

2008-07-19 Thread Ryan McKinley

committed in rev 678204

thanks Noble!


On Jul 19, 2008, at 2:40 PM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:



A patch is submitted in SOLR-536

On Sat, Jul 19, 2008 at 11:23 PM, Noble Paul നോബിള്‍  
नोब्ळ्

<[EMAIL PROTECTED]> wrote:

meanwhile, you can manage by making the field
List<String> categories;

On Sat, Jul 19, 2008 at 11:22 PM, Noble Paul നോബിള്‍  
नोब्ळ्

<[EMAIL PROTECTED]> wrote:

it is a bug . I'll post a new patch



On Sat, Jul 19, 2008 at 7:10 PM, chris sleeman <[EMAIL PROTECTED] 
> wrote:

Hi,

I have a multivalued Solr text field, called 'categories', which  
is mapped
to a String[] in my java bean. I am directly converting the  
search results

to this bean.
This works absolutely fine if the field has two or more values,  
but If the

field has exactly one value, I get the following exception -

*Caused by: java.lang.RuntimeException: Exception while setting value : [Ljava.lang.Object;@15b48b2 on private java.lang.String[] com.app.model.Unit.categories
    at org.apache.solr.client.solrj.beans.DocumentObjectBinder$DocField.set(DocumentObjectBinder.java:230)
    at org.apache.solr.client.solrj.beans.DocumentObjectBinder$DocField.inject(DocumentObjectBinder.java:199)
    at org.apache.solr.client.solrj.beans.DocumentObjectBinder.getBeans(DocumentObjectBinder.java:57)
    at org.apache.solr.client.solrj.response.QueryResponse.getBeans(QueryResponse.java:256)*


Is this a bug or am I missing something? I am using the latest  
1.3 build.


--
Regards,
Chris





--
--Noble Paul





--
--Noble Paul





--
--Noble Paul




Re: SolrJ + Spellcheck

2008-07-21 Thread Ryan McKinley
Currently, there are no helper functions to pick out spellcheck info.

But you can always use:
  NamedList getResponse()
to pick out the data contained in the response.

Adding spellcheck functions to QueryResponse would be a welcome contribution!
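
A hedged sketch of picking the data apart by hand; the "spellcheck"/"suggestions" keys below are what the SpellCheckComponent typically emits, but verify against your own configuration:

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.util.NamedList;

  public class SpellcheckSketch {
    static void printSuggestions(SolrServer server, String q) throws Exception {
      QueryResponse rsp = server.query(new SolrQuery(q));
      NamedList<Object> raw = rsp.getResponse();
      NamedList<?> spellcheck = (NamedList<?>) raw.get("spellcheck");
      if (spellcheck != null) {
        NamedList<?> suggestions = (NamedList<?>) spellcheck.get("suggestions");
        for (int i = 0; suggestions != null && i < suggestions.size(); i++) {
          System.out.println(suggestions.getName(i) + " => " + suggestions.getVal(i));
        }
      }
    }
  }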



On Jul 21, 2008, at 12:51 PM, Jon Baer wrote:


Hi,

I can't seem to locate any info on how to get SolrJ + Spellcheck working together.  I'd like to query the spellchecker if 0 items were matched; is SolrJ "generic" enough to pick apart added component results from the bottom of a query?


Thanks.

- Jon




Re: Vote on a new solr logo

2008-07-21 Thread Ryan McKinley

I can't figure out how to use the poll either...

here are a few others to check out:
http://lapnap.net/solr/
perhaps "a" and "f" could live together; you use 'a' if you need a background other than white



On Jul 21, 2008, at 2:14 PM, Mike Klaas wrote:


On 20-Jul-08, at 6:19 PM, Mark Miller wrote:


From the dev list:

Shalin Shekhar Mangar:

+1 for a new logo. It's a new release, let's have a new logo too!  
First step

is to decide which one of these is more Solr-ish.


I'm looking to improve the look of solr, so I am going to do my  
best to push this process along.
Not to keep shoving polls down everyones throat, but if you could,  
please go to the following site

and rate the solr logos that you love or hate: 
http://solrlogo.myhardshadow.com/solr-logo-vote/


I don't really understand how to use the poll.  I click on a logo,  
and am then taken to a page on which the stars are unclickable.   
Which stars should be clicked on?


-Mike




Re: Vote on a new solr logo

2008-07-21 Thread Ryan McKinley

nor does http://selectricity.org/

On Jul 21, 2008, at 2:28 PM, Shalin Shekhar Mangar wrote:

Too bad the polls created with Google docs don't support images in them (or at least I couldn't figure out how to do it)

On Mon, Jul 21, 2008 at 11:52 PM, Ryan McKinley <[EMAIL PROTECTED]>  
wrote:



I can't figure how to use the poll either...

here are a few others to check out:
http://lapnap.net/solr/
perhaps "a" and "f" could live together, you use 'a' if you need a
background other then white



On Jul 21, 2008, at 2:14 PM, Mike Klaas wrote:

On 20-Jul-08, at 6:19 PM, Mark Miller wrote:


From the dev list:


Shalin Shekhar Mangar:

+1 for a new logo. It's a new release, let's have a new logo too!  
First

step
is to decide which one of these is more Solr-ish.



I'm looking to improve the look of solr, so I am going to do my  
best to

push this process along.
Not to keep shoving polls down everyones throat, but if you  
could, please

go to the following site
and rate the solr logos that you love or hate:
http://solrlogo.myhardshadow.com/solr-logo-vote/



I don't really understand how to use the poll.  I click on a logo,  
and am
then taken to a page on which the stars are unclickable.  Which  
stars should

be clicked on?

-Mike







--
Regards,
Shalin Shekhar Mangar.




Re: Highlight component throws Nullpointer when using q.alt=*:*

2008-07-24 Thread Ryan McKinley

bug that needs to be fixed!  Can you file a jira ticket?


On Jul 24, 2008, at 12:50 PM, kalyan chakravarti wrote:

Forgot to mention, I am using the dismax query handler. I just tested this with the latest out-of-the-box nightly build and it throws the same error.


http://localhost:8982/select/?q.alt=*:*&qt=dismax

Any thoughts.

Regards
Kalyan



- Original Message 
From: kalyan chakravarti <[EMAIL PROTECTED]>
To: solr user 
Sent: Thursday, July 24, 2008 11:28:05 AM
Subject: Highlight component throws Nullpointer when using q.alt=*:*

Hi,
  When I try to use q.alt=*:* without the q= param, the HighlightComponent throws an NPE: responseBuilder.getHighlightQuery() is null in HighlightComponent.class


Here is the snippet of execption.

HTTP Status 500 - null java.lang.NullPointerException
    at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:77)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:156)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:128)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1025)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272) at


Do I need to configure anything specific for q.alt=*:*.

Regards
Kalyan






Re: Easiest way to get data dir from a plugin?

2008-07-25 Thread Ryan McKinley

core.getDataDir()

what kind of plugin?  If you don't have access to core, you can  
implement SolrCoreAware...
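
A tiny sketch of that approach; inform() is called once the core is ready:

  import org.apache.solr.core.SolrCore;
  import org.apache.solr.util.plugin.SolrCoreAware;

  public class DataDirAwarePlugin implements SolrCoreAware {
    private String dataDir;

    @Override
    public void inform(SolrCore core) {
      dataDir = core.getDataDir();  // e.g. the data/ dir under the instance dir
    }
  }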



On Jul 25, 2008, at 2:27 PM, Mark Miller wrote:

How do I get the solr / data dir from a plugin without using  
anything thats deprecated?


- Mark




Re: Best way to return ExternalFileField in the results

2008-07-28 Thread Ryan McKinley


In general though I'm wondering if stepping back a bit and modifying your request handler to use a SolrDocumentList where you've already flattened the ExternalFileField into each SolrDocument would be an easier approach -- then you wouldn't need to modify the ResponseWriter at all.




Consider using a search component at the end of the chain that adds  
fields to your document...  this way things work for any writer (json,  
xml, whatever)


We really should add an example of this... but in the meantime, a good example (though a bit complex) is in the locallucene project:

http://sourceforge.net/projects/locallucene/

this adds a calculated distance to each document before it gets passed  
to the writer
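
As a rough sketch of the last-in-chain idea (the exact SearchComponent method set varies by Solr version, and the "augmented" key is a placeholder):

  import java.io.IOException;
  import org.apache.solr.handler.component.ResponseBuilder;
  import org.apache.solr.handler.component.SearchComponent;

  public class AugmentComponent extends SearchComponent {
    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
      // here you could walk the result doc list, look up the external
      // value for each document, and attach it to the response
      rb.rsp.add("augmented", true);  // placeholder illustration
    }

    @Override
    public String getDescription() {
      return "adds computed values to each result";
    }

    @Override
    public String getSource() {
      return "";
    }
  }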


Re: Multicore DataDir

2008-08-01 Thread Ryan McKinley

Check: https://issues.apache.org/jira/browse/SOLR-646

hopefully that will solve your problems...


On Aug 1, 2008, at 4:35 PM, CameronL wrote:



The dataDir parameter specified in the <core> element in multicore.xml does not seem to point to the correct directory.  I commented out the <dataDir> element from solrconfig.xml in all of my cores.

I could explicitly set the <dataDir> for each core in all of the solrconfig.xml files, but I would rather only have to deal with a single solrconfig.xml and not have to configure its contents individually for each of my cores.  Plus, it's pretty convenient to only have to look at 1 file (multicore.xml) to configure the dataDirs.

Is this the expected behavior?  Is there another place I should be setting it?





Re: Solr Logo thought

2008-08-04 Thread Ryan McKinley


If there is still room for a new logo design for Solr and the community is open to it, then I can try to come up with some proposals. Doing the logo for Mahout was a really interesting experience.



In my opinion, yes  I'd love to see more effort put towards  the  
logo.  I have stayed out of this discussion since I don't really think  
any of the logos under consideration are complete.  (I begged some  
friends to do two of the three logos under consideration)  I would  
love to refine them, but time... oooh time.


ryan










Re: config reload JMX capabilities

2008-08-06 Thread Ryan McKinley

I don't know about JMX, but check the standard multi-core config...

If you are running things in multi-core mode, you can send a RELOAD  
command:

http://wiki.apache.org/solr/MultiCore#head-429a06cb83e1ce7b06857fd03c38d1200c4bcfc1


On Aug 5, 2008, at 2:39 PM, Kashyap, Raghu wrote:

One of the requirements we have is that when we deploy new data for  
solr

config (synonyms, dictionary etc) we should NOT be restarting the solr
instances for the changes to take effect.

Are there ConfigReload capabilities through JMX that can help us do
this?

Thanks in Advance



-Raghu





Re: SOLR 1.2 Multicore configuration

2008-08-13 Thread Ryan McKinley

Check: http://wiki.apache.org/solr/MultiCore

If you can wait a few days, there will likely be a 1.3 release  
candidate out soon.



On Aug 13, 2008, at 11:30 AM, McBride, John wrote:



Hi,

I am deploying an application across 3 geographies - and as a result
will be running multiple solr instances on one host.

I don't want to set up separate wars running on different ports as  
this

will cause an increased number of firewall requests and require more
management to track the set of ports we are using.

Is it possible to configure the server, such that it reads the country
in the url

Say
Uk/solr/admin
Fr/solr/admin
De/solr/admin

Or possibly have different domain names.

And uses solr home as uk/solrhome etc and passes on the request to
solr/admin handler using that for solrhome?

What is the approach here?  I am a Tomcat config newbie.


As an adjunct.

In order to simplify things, I am thinking of maintaining just one index for all countries and placing a country filter on the queries.  The implication would be throwing away stemming and having all stopwords in one file, which may not be desirable, but seems logistically simpler - any comments?

Thanks,
John




Re: multicore /solr/update

2008-08-13 Thread Ryan McKinley

check a recent version, this issue should have been fixed in:
https://issues.apache.org/jira/browse/SOLR-545


On Aug 13, 2008, at 2:22 PM, Doug Steigerwald wrote:

Yeah, that's the problem.  Not having the core in the URL you're  
posting to shouldn't update any core, but it does.


Doug

On Aug 13, 2008, at 2:10 PM, Alok K. Dhir wrote:


you need to add the core to your call -- post to 
http://localhost:8983/solr/coreX/update

On Aug 13, 2008, at 1:58 PM, Doug Steigerwald wrote:

I've got two cores (core{0|1}) both using the provided example  
schema (example/solr/conf/schema.xml).


Posting to http://localhost:8983/solr/update added the example  
docs to the last core loaded (core1).  Shouldn't this give you a  
400?


Doug



---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
[EMAIL PROTECTED]








Re: more multicore fun

2008-08-13 Thread Ryan McKinley

the dataDir is configured in solrconfig.xml

With multicore it is currently a bit wonky.  Currently, you need to configure it explicitly for each core, but it shares the same system variables: ${solr.data.dir}, so if you use properties, you end up pointing to the same place.


https://issues.apache.org/jira/browse/SOLR-545 is hoping to solve  
this...


Before 1.3 is released, you will either be able to:
1. set the dataDir from your solr.xml config, e.g.:
   <core name="core0" instanceDir="core0" dataDir="core0/data" />

or 2. set a system property in solr.xml and have solrconfig decide  
where the dataDir is...


for now -- if you remove the dataDir config from solrconfig.xml it  
will use the default directory for each instanceDir and will point to  
independent locations...


ryan



On Aug 13, 2008, at 2:52 PM, Doug Steigerwald wrote:

OK.  Last question for a while (hopefully), but something else with  
multicore seems to be wrong.



<multicore adminPath="/admin/multicore" persistent="false">
  <core name="core0" instanceDir="core0" />
  <core name="core1" instanceDir="core1" />
</multicore>


$ java -jar start.jar
...
INFO: [core0] Opening new SolrCore at solr/core0/, dataDir=./solr/data/
...
INFO: [core1] Opening new SolrCore at solr/core1/, dataDir=./solr/data/
...

The instanceDir seems to be fine, but the dataDir isn't being set correctly.  The dataDir is actually example/solr/data instead of example/solr/core{0|1}/data.


http://localhost:8983/solr/admin/multicore shows the exact same path  
to the index for both cores.  Am I missing something that the  
example multicore config doesn't use?


Thanks.
Doug




Re: multicore /solr/update

2008-08-13 Thread Ryan McKinley
aaah -- I see, we need the same error logic for SolrUpdateServlet as  
we added for SolrServlet.


I'll fix in one sec.

Thanks
ryan


On Aug 13, 2008, at 3:05 PM, Doug Steigerwald wrote:

I checked out the trunk about 2 hours ago.  Was the last commit on  
the 10th supposed to fix this (r684606)?












Re: multicore /solr/update

2008-08-13 Thread Ryan McKinley

check now.  Should be fixed in trunk


On Aug 13, 2008, at 3:05 PM, Doug Steigerwald wrote:

I checked out the trunk about 2 hours ago.  Was the last commit on  
the 10th supposed to fix this (r684606)?












Re: more multicore fun

2008-08-13 Thread Ryan McKinley


On Aug 13, 2008, at 3:29 PM, Andrew Nagy wrote:


Thanks for clarifing that Ryan - I was a bit confused too...


Before 1.3 is released, you will either be able to:
1. set the dataDir from your solr.xml config, e.g.:
   <core name="core0" instanceDir="core0" dataDir="core0/data" />



I have been perusing the multicore code and found that the "default"  
attribute was removed.  It also appears that the "dataDir" attribute  
was removed as well, is this true?




yes dataDir was removed before it was committed, but we are still  
debating its future:


either you will set dataDir via system params (configurable for each  
core) OR via re-introducing this variable.


ryan


Word Gram?

2008-08-13 Thread Ryan McKinley
I'm looking for a way to get common word groups within documents.   
That is, what are the top two, three, ... n word groups within the  
index.


I was messing with indexing adjacent words together (sorry about the  
earlier commit)... is this a reasonable approach?  Any other ideas for  
pulling out common phrases?  Any simple post processing?


ryan


Re: Word Gram?

2008-08-13 Thread Ryan McKinley

aaah

thanks for the vocabulary lesson:  shingles == token n-grams


On Aug 13, 2008, at 5:27 PM, Brendan Grainger wrote:


Hi Ryan,

We do basically the same thing, using a modified ShingleFilter (http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//contrib-analyzers/org/apache/lucene/analysis/shingle/ShingleFilter.html 
). I have it set up to build 'shingles' of size 2, 3, 4, 5 which I  
index into separate fields. If there is a better way of doing this  
sort of thing I'd love to know :-)


Brendan

On Aug 13, 2008, at 3:59 PM, Ryan McKinley wrote:

I'm looking for a way to get common word groups within documents.   
That is, what are the top two, three, ... n word groups within the  
index.


I was messing with indexing adjacent words together (sorry about  
the earlier commit)... is this a reasonable approach?  Any other  
ideas for pulling out common phrases?  Any simple post processing?


ryan
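
A small sketch of producing shingles (token n-grams) the way Brendan describes; constructors differ across Lucene versions, so treat the exact signatures as assumptions, and note that ShingleFilter also emits the unigrams by default:

  import java.io.StringReader;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.WhitespaceTokenizer;
  import org.apache.lucene.analysis.shingle.ShingleFilter;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

  public class ShingleDemo {
    public static void main(String[] args) throws Exception {
      TokenStream ts = new WhitespaceTokenizer(new StringReader("the quick brown fox"));
      ShingleFilter shingles = new ShingleFilter(ts, 2);  // emit word bigrams
      CharTermAttribute term = shingles.addAttribute(CharTermAttribute.class);
      shingles.reset();
      while (shingles.incrementToken()) {
        System.out.println(term.toString());  // "the", "the quick", "quick", ...
      }
      shingles.end();
      shingles.close();
    }
  }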





