What's the need for a complicated SolrTestCaseJ4.getClassName() ?

2011-05-23 Thread Gabriele Kahlout
Hello,

As long as I subclass SolrTestCaseJ4 I cannot do
this.getClass().getSimpleName(), and I don't understand why. I wonder if the
following complicated methods in SolrTestCaseJ4 have anything to do with it?

  protected static String getClassName() {
    StackTraceElement[] stack =
        new RuntimeException("WhoAmI").fillInStackTrace().getStackTrace();
    for (int i = stack.length - 1; i >= 0; i--) {
      StackTraceElement ste = stack[i];
      String cname = ste.getClassName();
      if (cname.indexOf(".lucene.") >= 0 || cname.indexOf(".solr.") >= 0) {
        return cname;
      }
    }
    return SolrTestCaseJ4.class.getName();
  }

  protected static String getSimpleClassName() {
    String cname = getClassName();
    return cname.substring(cname.lastIndexOf('.') + 1);
  }

-- 
Regards,
K. Gabriele



query parser and other query filters

2011-05-23 Thread Dmitry Kan
Hi all,

Can someone tell whether the query parser must always be the first step in
the processing chain, before any other filter / tokenizer hits the query?

The problem is that the user query can contain punctuation, and we would like
to clean it up similarly to what we do on the index side. Is there any way to
tell Solr to run the query parser last, after all the other filters have
processed the query?

-- 
Regards,

Dmitry Kan


Did anyone rewrite the solr indexing section?

2011-05-23 Thread LeoYuan88
Hi all,
 As we all know, Solr puts all index files in a single directory,
namely ${datadir}/index,
but performance gets slower as the index directory grows bigger and bigger,
so I want to split the single directory into several directories, e.g.
${datadir}/index1 and ${datadir}/index2.
Maybe I will put 1 users' info into the first one, and another 1
users' info into the second one; when searching, I will locate the index
directory directly as needed.
Has anyone done such a thing before?
Or does this refactoring of Solr make sense?
Any advice or suggestions would be highly appreciated.



Re: Did anyone rewrite the solr indexing section?

2011-05-23 Thread Gora Mohanty
On Mon, May 23, 2011 at 2:39 PM, LeoYuan88  wrote:
> Hi all,
>     As we all have known, solr put all index files in a single directory,
> namely ${datadir}/index,
> but the perfomance's getting slower when the size of index dir's getting
> bigger and bigger,
> so I wanna split the single dir into serveral dirs, e.g. ${datadir}/index1
> and ${datadir}/index2,
> maybe I will put 1 users' info into the first one, and put another 1
> users' info into
> the second one, when do searching, I will locate the index dir directly as I
> need,

Have you looked at multi-core Solr, http://wiki.apache.org/solr/CoreAdmin ,
with each core holding one shard of your index? Does that not meet your
needs?
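
For illustration, new cores can be created through the CoreAdmin handler
(the core names and instanceDir values below are hypothetical):

http://localhost:8983/solr/admin/cores?action=CREATE&name=users1&instanceDir=users1
http://localhost:8983/solr/admin/cores?action=CREATE&name=users2&instanceDir=users2

Each core is then searched at its own URL, e.g.
http://localhost:8983/solr/users1/select?q=...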

Regards,
Gora


Re: Sorting function with condition

2011-05-23 Thread Jan Høydahl
Hi,

Could be possible if we implement 
https://issues.apache.org/jira/browse/SOLR-2136

Until then, perhaps you can try to pre-calculate a sort field during indexing, 
based on input data?

You can also emulate an if() function using the map() function
http://wiki.apache.org/solr/FunctionQuery#map
sort=sum(value,map(age,10,10,$x,$y))+asc&x=2&y=3
Here map(age,10,10,$x,$y) returns $x when age falls in the range [10,10],
i.e. age equals 10, and $y otherwise.
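
A minimal SolrJ sketch of the same request (field names "timestamp", "value"
and "age" come from the example above; the Solr URL is an assumption):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SortByMappedFunction {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    SolrQuery q = new SolrQuery("timestamp:123454321");
    // sum(value, map(age,10,10,$x,$y)): use $x when age == 10, otherwise $y
    q.set("sort", "sum(value,map(age,10,10,$x,$y)) asc");
    q.set("x", "2");
    q.set("y", "3");
    QueryResponse rsp = server.query(q);
    System.out.println("hits: " + rsp.getResults().getNumFound());
  }
}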

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 21. mai 2011, at 23.07, ngaurav2005 wrote:

> Hi all,
> 
> I had to do sorting on search results based on some condition inside sort
> function. Let me put forward with example:
> Query: q=timestamp:123454321&sort=somefunction asc 
> 
> But this "Somefunction" has to have if condition inside it, like
> If age =10, sum(value,x) 
> Else sum(value,y)
> 
> Actually somefunction is sum of (value,x) or (value,y) based on age
> condition. 
> 
> How to handle with this situation in solr search. I was trying to do with
> solr math functions using sort parameter in query, but it donot supports if
> condition. Is there any way to do this programatically(java)/solr plugin. I
> am new to solr, so please help me complete set of details. 
> 
> Thanks in anticipation.
> 
> Gaurav
> 
> 



Re: Solr index as multiple separate index directories

2011-05-23 Thread LeoYuan88
Hi Jason,
I got the same question with you, I'm wondering that have you figured
that out?
Plz take a look of the topic below: 
http://lucene.472066.n3.nabble.com/Did-anyone-rewrite-the-solr-indexing-section-td2974539.html
 
Thanks in advance!



Re: Did anyone rewrite the solr indexing section?

2011-05-23 Thread LeoYuan88
Hi Gora,
 Thanks for your quick response, I will take a look at that page.
 Thanks again!



Re: Extracting contents of zipped files with Tika and Solr 1.4.1 (now Solr 3.1)

2011-05-23 Thread Gary Taylor

Jayendra,

I cleared out my local repository and replayed all of my steps from 
Friday, and now it works.  The only difference (or the only one that's 
obvious to me) was that I applied the patch before doing a full 
compile/test/dist.  But I assumed that, given I was seeing my new log 
entries (from ExtractingDocumentLoader.java), I was running the correct 
code anyway.


However, I'm very pleased that it's working now - I get the full 
contents of the zipped files indexed and not just the file names.


Thank you again for your assistance, and the patch!

Kind regards,
Gary.


On 21/05/2011 03:12, Jayendra Patil wrote:

Hi Gary,

I tried the patch on the 3.1 source code (@
http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/)
as well and it worked fine.
@Patch - https://issues.apache.org/jira/browse/SOLR-2416, which deals
with the Solr Cell module.

You may want to verify the contents from the results by enabling the
stored attribute on the text field.

e.g. curl 
"http://localhost:8983/solr/update/extract?stream.file=C:/Test.zip&literal.id=777045&literal.title=Test&commit=true";

Let me know if it works. I would be happy to share the generated
artifact you can test on.

Regards,
Jayendra




missing content stream on csv upload

2011-05-23 Thread Robin Palotai
Hi,

When uploading csv using
curl http://localhost:8983/solr/update/csv --data-binary @books.csv -H
'Content-type:text/plain; charset=utf-8'

as mentioned on the Wiki, you can get a "missing content stream" error
on the Jetty console. The problem is that on Windows the -H parameter
should be put inside double quotes instead of apostrophes, or curl sends an
incorrect header (I guess it includes the apostrophes too).
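
For reference, the corrected Windows invocation would then be (same books.csv
example as above):

curl http://localhost:8983/solr/update/csv --data-binary @books.csv -H "Content-type:text/plain; charset=utf-8"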

Hope it helps; perhaps the wiki can be corrected as well.

BR,
Robin


Too many Boolean Clause and Filter Query

2011-05-23 Thread Sujatha Arun
Hello,

We were initially doing a query with multiple "OR"s, as follows:

 AND id:(1 2 3  4 5 ...n number )

which gave a maxBooleanClauses exception on exceeding 1024 boolean clauses.

Then we changed this to a filter query as follows, as a filter query is not
supposed to result in the max boolean clauses exception:

 &fq=id:(1 2 3 ...n number)

But this also gives the same "too many boolean clauses" exception.


Can anybody throw any light on this?

Regards
Sujatha


Re: Too many Boolean Clause and Filter Query

2011-05-23 Thread Stefan Matheis
Sujatha,

On Mon, May 23, 2011 at 1:05 PM, Sujatha Arun  wrote:
> Then we changed this to a filter query as follows , as filter Query is not
> supposed to result in MaxBoolean Clause Exception

Where did you read this? I'm pretty sure that both are based on the
same check for the maximum number of clauses.

Regards
Stefan


Re: Too many Boolean Clause and Filter Query

2011-05-23 Thread Ahmet Arslan
> But this also gives the same TOO many Boolean Caluses
> Exception

You can adjust that parameter in solrconfig.xml:
<maxBooleanClauses>3024</maxBooleanClauses>


Re: Too many Boolean Clause and Filter Query

2011-05-23 Thread Sujatha Arun
I got the info from this article:

http://www.lucidimagination.com/blog/2009/06/08/bringing-the-highlighter-back-to-wildcard-queries-in-solr-14/

Yes, I know this can be configured, but this will not scale to "n". What is
the performance implication of this with "n" boolean clauses?

Regards
Sujatha


On Mon, May 23, 2011 at 4:45 PM, Ahmet Arslan  wrote:

> > But this also gives the same TOO many Boolean Caluses
> > Exception
>
> You can adjust that parameter in solrconfig.xml:
> <maxBooleanClauses>3024</maxBooleanClauses>
>


Re: chinese SOLR query parser

2011-05-23 Thread Michael McCandless
Not that I know of... and I'm no expert on it!!  I know there are at
least two possibilities -- ChineseAnalyzer / CJKAnalyzer (from trunk's
modules/analysis), but I don't know the tradeoffs of each.

Hopefully others will chime in here?

However, once you do figure out a good schema, could you please post
back?  I'd like to add it to Solr's example schema as an example field
type (text_example_zh?).

Mike

http://blog.mikemccandless.com

On Sat, May 21, 2011 at 7:20 PM, Andy  wrote:
> Is there any example schema for Chinese that I could use as a guide right now?
>
> Thanks
>
>
> --- On Sat, 5/21/11, Michael McCandless  wrote:
>
>> From: Michael McCandless 
>> Subject: Re: chinese SOLR query parser
>> To: solr-user@lucene.apache.org
>> Date: Saturday, May 21, 2011, 6:14 PM
>> Unfortunately, Solr's defaults
>> (example schema) are unusable for
>> non-whitespace languages... see:
>>
>>     http://markmail.org/thread/ww6mhfi3rfpngmc5
>>
>> So it could be you need to turn off
>> autoGeneratePhraseQueries in your
>> fieldType?  We are working towards fixing the example
>> schema (for
>> 3.2/4.0) in https://issues.apache.org/jira/browse/SOLR-2519 ...
>>
>> Also, it could be your web/app server is not using UTF8
>> character
>> encoding, eg Tomcat defaults to ISO-8859-1 -- see
>> http://wiki.apache.org/tomcat/FAQ/CharacterEncoding
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Sat, May 21, 2011 at 3:30 PM, Pradeep Pujari 
>> wrote:
>> > Hi,
>> >
>> > I made changes to schema.xml with CJKAnalyzer. Does
>> naything else required to change in solrconfig.xml for query
>> parser component. Because, I do not get any result back
>> while searching? Looks like the chinese characters are being
>> encoded unable to match in the index. Any help is highly
>> appriciated.
>> >
>> > Thanks
>> > Pradeep.
>> >
>>
>


Re: very slow commits and overlapping commits

2011-05-23 Thread Bill Au
You can use the postCommit event listener as a callback mechanism to let
you know that a commit has happened.
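
A minimal sketch of such a listener (assuming the SolrEventListener interface
as in Solr 1.4/3.1; the class and package names are hypothetical, and it would
be registered under <updateHandler> in solrconfig.xml with
<listener event="postCommit" class="com.example.CheckpointListener"/>):

package com.example;

import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.SolrEventListener;
import org.apache.solr.search.SolrIndexSearcher;

public class CheckpointListener implements SolrEventListener {

  public void init(NamedList args) {}

  public void postCommit() {
    // called after each commit completes; record the checkpoint here
    System.out.println("commit completed at " + System.currentTimeMillis());
  }

  public void newSearcher(SolrIndexSearcher newSearcher,
                          SolrIndexSearcher currentSearcher) {
    // not needed for checkpointing
  }
}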

Bill

On Sun, May 22, 2011 at 9:31 PM, Jeff Crump  wrote:

> I don't have an answer to this but only another question:  I don't think I
> can use auto-commit in my application, as I have to "checkpoint" my index
> submissions and I don't know of any callback mechanism that would let me
> know a commit has happened.  Is there one?
>
> 2011/5/21 Erick Erickson 
>
> > Well, committing less often is a possibility. Here's what's probably
> > happening: when you pass certain thresholds, segments are merged, which
> > can take quite some time. How are you triggering commits? If it's
> > external, think about using auto commit instead.
> >
> > Best
> > Erick
> > On May 20, 2011 6:04 PM, "Bill Au"  wrote:
> > > On my Solr 1.4.1 master I am doing commits regularly at a fixed
> interval.
> > I
> > > noticed that from time to time commit will take longer than the commit
> > > interval, causing commits to overlap. Then things will get worse as
> > commit
> > > will take longer and longer. Here is the logs for a long commit:
> > >
> > >
> > > [2011-05-18 23:47:30.071] start
> > >
> >
> >
> commit(optimize=false,waitFlush=false,waitSearcher=false,expungeDeletes=false)
> > > [2011-05-18 23:49:48.119] SolrDeletionPolicy.onCommit: commits:num=2
> > > [2011-05-18 23:49:48.119]
> > >
> >
> >
> commit{dir=/var/opt/resin3/5062/solr/data/index,segFN=segments_5cpa,version=1247782702272,generation=249742,filenames=[_4dqu_2g.del,
> > > _4e66.tis, _4e3r.tis, _4e59.nrm, _4e68_1.del, _4e4n.prx, _4e4n.fnm,
> > > _4e67.fnm, _4e3r.frq, _4e3r.tii, _4e6d.fnm, _4e6c.prx, _4e68.fdx,
> > _4e68.nrm,
> > > _4e6a.frq, _4e68.fdt, _4dqu.fnm, _4e4n.tii, _4e69.fdx, _4e69.fdt,
> > _4e0e.nrm,
> > > _4e4n.tis, _4e6e.fnm, _4e3r.prx, _4e66.fnm, _4e3r.nrm, _4e0e.prx,
> > _4e4c.fdx,
> > > _4dx1.prx, _4e5v.frq, _4e3r.fdt, _4e4c.tis, _4e41_6.del, _4e6b.tis,
> > > _4e6b_1.del, _4e4y_3.del, _4e6b.tii, _4e3r.fdx, _4dx1.nrm, _4e4y.frq,
> > > _4e4c.fdt, _4e4c.tii, _4e6d.fdt, _4e5k.fnm, _4e41.fnm, _4e69.fnm,
> > _4e67.fdt,
> > > _4e0e.tii, _4dty_h.del, _4e6b.fnm, _4e0e_h.del, _4e6d.fdx, _4e67.fdx,
> > > _4e0e.tis, _4e5v.nrm, _4dx1.fnm, _4e5v.tii, _4dqu.fdt, segments_5cpa,
> > > _4e5v.prx, _4dqu.fdx, _4e59.fnm, _4e6d.prx, _4e59_5.del, _4e4c.prx,
> > > _4e4c.nrm, _4e5k.prx, _4e66.fdx, _4dty.frq, _4e6c.frq, _4e5v.tis,
> > _4e6e.tii,
> > > _4e66.fdt, _4e6b.fdx, _4e68.prx, _4e59.fdx, _4e6e.fdt, _4e41.prx,
> > _4dx1.tii,
> > > _4dx1.fdt, _4e6b.fdt, _4e5v_4.del, _4e4n.fdt, _4e6e.fdx, _4dx1.fdx,
> > > _4e41.nrm, _4e4n.fdx, _4e6e.tis, _4e66.tii, _4e4c.fnm, _4e6b.prx,
> > _4e67.prx,
> > > _4e0e.fnm, _4e4n.nrm, _4e67.nrm, _4e5k.nrm, _4e6a.prx, _4e68.fnm,
> > > _4e4c_4.del, _4dx1.tis, _4e6e.nrm, _4e59.tii, _4e68.tis, _4e67.frq,
> > > _4e3r.fnm, _4dty.nrm, _4e4y.prx, _4e6e.prx, _4dty.tis, _4e4y.tis,
> > _4e6b.nrm,
> > > _4e6a.fdt, _4e4n.frq, _4e6d.frq, _4e59.fdt, _4e6a.fdx, _4e6a.fnm,
> > _4dqu.tii,
> > > _4e41.tii, _4e67_1.del, _4e41.tis, _4dty.fdt, _4e69.tis, _4dqu.frq,
> > > _4dty.fdx, _4dx1.frq, _4e6e.frq, _4e66_1.del, _4e69.prx, _4e6d.tii,
> > > _4e5k.tii, _4e0e.fdt, _4dqu.tis, _4e6d.tis, _4e69.nrm, _4dqu.prx,
> > _4e4y.fnm,
> > > _4e67.tis, _4e69_1.del, _4e6d.nrm, _4e6c.tis, _4e0e.fdx, _4e6c.tii,
> > > _4dx1_n.del, _4e5v.fnm, _4e5k.tis, _4e59.tis, _4e67.tii, _4dqu.nrm,
> > > _4e5k_8.del, _4e6c.fdx, _4e6c.fdt, _4e41.frq, _4e4y.fdx, _4e69.frq,
> > > _4e6a.tis, _4dty.prx, _4e66.frq, _4e5k.frq, _4e6a.tii, _4e69.tii,
> > _4e6c.nrm,
> > > _4dty.fnm, _4e59.prx, _4e59.frq, _4e66.prx, _4e68.frq, _4e5k.fdx,
> > _4e4y.tii,
> > > _4e6c.fnm, _4e0e.frq, _4e6b.frq, _4e41.fdt, _4e4n_2.del, _4dty.tii,
> > > _4e4y.fdt, _4e66.nrm, _4e4c.frq, _4e6a.nrm, _4e5k.fdt, _4e3r_i.del,
> > > _4e5v.fdt, _4e4y.nrm, _4e68.tii, _4e5v.fdx, _4e41.fdx]
> > > [2011-05-18 23:49:48.119]
> > >
> >
> >
> commit{dir=/var/opt/resin3/5062/solr/data/index,segFN=segments_5cpb,version=1247782702273,generation=249743,filenames=[_4dqu_2g.del,
> > > _4e66.tis, _4e59.nrm, _4e3r.tis, _4e4n.fnm, _4e67.fnm, _4e3r.tii,
> > _4e6d.fnm,
> > > _4e68.fdx, _4e68.fdt, _4dqu.fnm, _4e4n.tii, _4e69.fdx, _4e69.fdt,
> > _4e4n.tis,
> > > _4e6e.fnm, _4e0e.prx, _4e4c.tis, _4e5v.frq, _4e4y_3.del, _4e6b_1.del,
> > > _4e4c.tii, _4e6f.fnm, _4e5k.fnm, _4e6c_1.del, _4e41.fnm, _4dx1.fnm,
> > > _4e5v.nrm, _4e5v.tii, _4e5v.prx, _4e5k.prx, _4e4c.nrm, _4dty.frq,
> > _4e66.fdx,
> > > _4e5v.tis, _4e66.fdt, _4e6e.tii, _4e59.fdx, _4e6b.fdx, _4e41.prx,
> > _4e6b.fdt,
> > > _4e41.nrm, _4e6e.tis, _4e4c.fnm, _4e66.tii, _4e6b.prx, _4e0e.fnm,
> > _4e5k.nrm,
> > > _4e6a.prx, _4e6e.nrm, _4e59.tii, _4e67.frq, _4dty.nrm, _4e4y.tis,
> > _4e6a.fdt,
> > > _4e6b.nrm, _4e59.fdt, _4e6a.fdx, _4e41.tii, _4e41.tis, _4e67_1.del,
> > > _4dty.fdt, _4dty.fdx, _4e69.tis, _4e66_1.del, _4e6e.frq, _4e5k.tii,
> > > _4dqu.prx, _4e67.tis, _4e69_1.del, _4e6c.tis, _4e6c.tii, _4e5v.fnm,
> > > _4e5k.tis, _4e59.tis, _4e67.tii, _4e6c.fdx, _4e4y.fdx, _4e41.frq,
> > _4e6c.fdt,
> 

Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-23 Thread stockii
Okay, I didn't find the problem =(
It's still the same.

I cannot convert dates of the form 0000-00-00 to
"yyyy-MM-dd'T'hh:mm:ss'Z'" with the DateTimeFormatter.

i put my date fields into another entity:




 
   

Solr throws an exception like the one above.
Why can't the transformer transform this correctly?

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-23 Thread Gora Mohanty
On Mon, May 23, 2011 at 6:51 PM, stockii  wrote:
> okay. i didn find the problem =(
> it sstill the same shit.
>
> i cannot conver with DateTimeFormater dates form -00-00 =>
> "-MM-dd'T'hh:mm:ss'Z'"
[...]

Please post the exact Solr exception.

Have you tried the last suggestion in the DIH FAQ?
http://wiki.apache.org/solr/DataImportHandlerFaq#Invalid_dates_.28e.g._.22-00-00.22.29_in_my_MySQL_database_cause_my_import_to_abort

Regards,
Gora


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-23 Thread stockii
yes. i put the &zeroDateTimeBehavior=convertToNull to my url like: 
url="jdbc:mysql://localhost/databaseName?zeroDateTimeBehavoir=convertToNull"

Exception:

May 23, 2011 3:30:22 PM org.apache.solr.handler.dataimport.DataImporter
doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading
data from database Processing Document # 1
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:319)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$700(JdbcDataSource.java:226)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:264)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:258)
at
org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:76)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:233)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:579)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:605)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:260)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:184)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:392)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:373)
Caused by: java.sql.SQLException: Value 'XXX
2011-01-07
-00-00
2011-01-21
-00-0030311414501210open2011-01-07 15:10:47
10.1.0.1212011-01-07 15:10:472011-01-07 15:10:47' can not be represented
as java.sql.Date
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926)
at com.mysql.jdbc.ResultSetRow.getDateFast(ResultSetRow.java:140)
at com.mysql.jdbc.BufferRow.getDateFast(BufferRow.java:706)
at com.mysql.jdbc.ResultSetImpl.getDate(ResultSetImpl.java:2174)
at com.mysql.jdbc.ResultSetImpl.getDate(ResultSetImpl.java:2127)
at com.mysql.jdbc.ResultSetImpl.getObject(ResultSetImpl.java:4956)
at com.mysql.jdbc.ResultSetImpl.getObject(ResultSetImpl.java:5012)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:284)
... 13 more



XXX are data which nobody should see ;-)



Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-23 Thread stockii
The problem is not the empty 0000-00-00 values; the problem is the missing
timestamp at the end of the string!



Indexing documents with "complex multivalued fields"

2011-05-23 Thread anass talby
Hi,

I'm new to Solr and would like to index documents that have complex
multivalued fields. I want to do something like:


   1
   
 1
 red
   
   
 2
 green
   
   ...

 ...


How can i do this with solr

thanks in advance.

-- 
   Anass


Re: Pivot with Stats (or Stats with Pivot)

2011-05-23 Thread Eduardo
I would like to request this feature. How can I do that?

Thanks.



I only want to return a fields value in certain cases, how is this done

2011-05-23 Thread bryan rasmussen
Let us say I have 3 fields I index
f1, f2, f3.

f1 and f2 are copied to f4.
f4 is the default searched field.


There is a value that is found in f2 and f3.

When I am searching in f3 I want to return only f3 and none  other.
when I am searching in f4 I do not want to return f3.
I only want to return f1 if it has the value that is found in the search.


Is this doable? Can you show me an example?

Thanks,
Bryan Rasmussen


SOLR Install

2011-05-23 Thread Roger Shah
Hi,

I am a new user and I have installed SOLR 3.1.0 and running Tomcat 7.0.
I was able to run the example which shows the SOLR Admin screen.  Also posted 
an XML file by this command from dos prompt:  java -jar post.jar solr.xml.

How can I get SOLR to search web sites and also search through other types of 
files, databases, etc?

Instead of running the example that comes with SOLR, How do I create my own?

Also can you point me to a SOLR Guide or documentation?  I did not see any 
detailed documentation.

Please show me where can I post messages on the SOLR web site.

Thanks,
Raj




Re: Indexing documents with "complex multivalued fields"

2011-05-23 Thread Stefan Matheis
Anass,

what about combining them both into one? so to say:
1|red
2|green

"synchronized" multivalued fields are not possible, afaik.

Regards
Stefan

On Mon, May 23, 2011 at 3:40 PM, anass talby  wrote:
> Hi,
>
> I'm new in solr and would like to index documents that have complex
> multivalued fields. I do want to do something like:
>
> 
>   1
>   
>     1
>     red
>   
>   
>     2
>     green
>   
>   ...
> 
>  ...
>
>
> How can i do this with solr
>
> thanks in advance.
>
> --
>       Anass
>


Re: Indexing documents with "complex multivalued fields"

2011-05-23 Thread Renaud Delbru

Hi,

you could look at this recent thread [1], it is similar to your problem.

[1] 
http://search.lucidimagination.com/search/document/33ec1a98d3f93217/search_across_related_correlated_multivalue_fields_in_solr#1f66876c782c78d5

--
Renaud Delbru

On 23/05/11 14:40, anass talby wrote:

Hi,

I'm new in solr and would like to index documents that have complex
multivalued fields. I do want to do something like:


1

  1
  red


  2
  green

...

  ...


How can i do this with solr

thanks in advance.





Re: I only want to return a fields value in certain cases, how is this done

2011-05-23 Thread Anuj Kumar
Hi,

On Mon, May 23, 2011 at 7:27 PM, bryan rasmussen
wrote:

> Let us say I have 3 fields I index
> f1, f2, f3.
>
> f1 and f2 are copied to f4.
> f4 is the default searched field.
>
>
> There is a value that is found in f2 and f3.
>
> When I am searching in f3 I want to return only f3 and none  other.
>

You can explicitly specify the field f3:keyword and also use the 'fl'
parameter to restrict the fields that you want in the response.


> when I am searching in f4 I do not want to return f3.
> I only want to return f1 if it has the value that is found in the search.
>

In this case, restrict f3 in the response again using the 'fl' parameter.

>
> Is this doable? Can you show me an example?
>

You can take a look at the 'fl' parameter to control the fields that are
returned in the response. Details:
http://wiki.apache.org/solr/CommonQueryParameters#fl
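
For example (with "foo" standing in for the actual search term):

q=f3:foo&fl=f3        searches only f3 and returns only f3
q=foo&fl=f1,f2        searches the default field f4 and keeps f3 out of the response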

Alternatively, you can
write your own handlers for custom operations.

Regards,
Anuj

>
> Thanks,
> Bryan Rasmussen
>


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-23 Thread Gora Mohanty
On Mon, May 23, 2011 at 7:03 PM, stockii  wrote:
> yes. i put the &zeroDateTimeBehavior=convertToNull to my url like:
> url="jdbc:mysql://localhost/databaseName?zeroDateTimeBehavoir=convertToNull"

  I presume that
this is just a typo here, i.e., "Behavoir" instead of "Behavior".

[...]
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:373)
> Caused by: java.sql.SQLException: Value ' XXX
> 2011-01-07
> -00-00
> 2011-01-21
> -00-00 30 31 1 4 14 5 0 12 1 0 open 2011-01-07 15:10:47
> 10.1.0.121 2011-01-07 15:10:47 2011-01-07 15:10:47' can not be represented
> as java.sql.Date
>        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
>        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
>        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:926)
>        at com.mysql.jdbc.ResultSetRow.getDateFast(ResultSetRow.java:140)
>        at com.mysql.jdbc.BufferRow.getDateFast(BufferRow.java:706)
>        at com.mysql.jdbc.ResultSetImpl.getDate(ResultSetImpl.java:2174)
>        at com.mysql.jdbc.ResultSetImpl.getDate(ResultSetImpl.java:2127)
>        at com.mysql.jdbc.ResultSetImpl.getObject(ResultSetImpl.java:4956)
>        at com.mysql.jdbc.ResultSetImpl.getObject(ResultSetImpl.java:5012)
[...]

Sorry, I am on a bad network connection and cannot test your particular
edge case of "0000-00-00" not converting properly with the specified
dateTimeFormat, but maybe you can use a workaround: a ScriptTransformer
applied before the DateTimeTransformer that converts "0000-00-00" to
something more reasonable, like the beginning of the UNIX epoch, 1970-01-01.

Will check the "0000-00-00" issue tomorrow.

Regards,
Gora


spellcheck.collate returning all results

2011-05-23 Thread Richard Hodsdon
Hi,

I have been trying to set up spellchecking on our system using the
SpellCheckComponent.

According to the wiki, when using spellcheck.collate, any fq parameters that
are passed with the original query are applied when the collation is re-run,
so the collation should return results. So far this has not been happening:
I am getting a collation returned, but if I re-run the query passing the
collated q param, it finds nothing.

My initial query is as follows:
http://127.0.0.1:8983/solr/select?q=reeed%20bulll&spellcheck=true&spellcheck.collate=true&fq=content_type:post

and I get back in the spellcheck lst



reeed: numFound=1, startOffset=0, endOffset=5, suggestion="red"
bulll: numFound=1, startOffset=6, endOffset=11, suggestion="bull"
collation: "red bull"



The issue is that if I run the query again using the 'correct' query

http://127.0.0.1:8983/solr/select?q=red%20bull&spellcheck=true&spellcheck.collate=true&fq=content_type:post&wt=json

I get no responses returned. This is because of my content_type:post filter,
which is filtering correctly.

I have also run spellcheck.build=true 

I have set up my solrconfig.xml as follows.


textgen

  solr.IndexBasedSpellChecker
  ./spellchecker
  name
  true
  true

  


 
   explicit
   10
 
 
spellcheck
 


My scheme.xml declares textgen fieldsType and name field


  




  
  





  


Thanks

Richard





Re: SOLR Install

2011-05-23 Thread Gora Mohanty
On Mon, May 23, 2011 at 7:40 PM, Roger Shah  wrote:
> Hi,
>
> I am a new user and I have installed SOLR 3.1.0 and running Tomcat 7.0.
> I was able to run the example which shows the SOLR Admin screen.  Also posted 
> an XML file by this command from dos prompt:  java -jar post.jar solr.xml.

Great.

> How can I get SOLR to search web sites and also search through other types of 
> files, databases, etc?

Solr does not crawl websites. You probably want Nutch, or some other
crawler. Files and databases are possible.

> Instead of running the example that comes with SOLR, How do I create my own?

Um, start by modifying the examples, maybe? There are more examples
that cover files, DB, etc. Please do ask here if you run into issues.

> Also can you point me to a SOLR Guide or documentation?  I did not see any 
> detailed documentation.

Er, what? Solr is probably among the best-documented FOSS projects, and
I can honestly say that because I have done none of the aforesaid
documentation :-). The Solr wiki is fantastic:
* Complete list: http://wiki.apache.org/solr/FrontPage
* Initial tutorial: http://lucene.apache.org/solr/tutorial.html
* For easy data import from a database, you could consider
 using the DataImportHandler:
 http://wiki.apache.org/solr/DataImportHandler

> Please show me where can I post messages on the SOLR web site.

Not sure what that means.

Regards,
Gora


Including Score in Solr POJO

2011-05-23 Thread Kissue Kissue
Hi,

I am currently using Solr and indexing/reading my documents as POJOs. The
question I have is: how can I include the score in the POJO for each document
found in the index?

Thanks.


Re: Including Score in Solr POJO

2011-05-23 Thread Anuj Kumar
Hi,

If you mean SolrJ (as I understand from your description of POJOs), you can
add the score by setting the property includeScore to true. For example:

SolrQuery query = new SolrQuery()
    .setQuery(keyword)
    .setIncludeScore(true);

Regards,
Anuj

On Mon, May 23, 2011 at 8:31 PM, Kissue Kissue  wrote:

> Hi,
>
> I am currently using Solr and indexing/reading my documents as POJO. The
> question i have is how can i include the score in the POJO for each
> document
> found in the index?
>
> Thanks.
>


Re: Indexing documents with "complex multivalued fields"

2011-05-23 Thread anass talby
Thank you Renaud.

I appreciate your help

On Mon, May 23, 2011 at 4:47 PM, Renaud Delbru wrote:

> Hi,
>
> you could look at this recent thread [1], it is similar to your problem.
>
> [1]
> http://search.lucidimagination.com/search/document/33ec1a98d3f93217/search_across_related_correlated_multivalue_fields_in_solr#1f66876c782c78d5
> --
> Renaud Delbru
>
>
> On 23/05/11 14:40, anass talby wrote:
>
>> Hi,
>>
>> I'm new in solr and would like to index documents that have complex
>> multivalued fields. I do want to do something like:
>>
>> 
>>1
>>
>>  1
>>  red
>>
>>
>>  2
>>  green
>>
>>...
>> 
>>  ...
>>
>>
>> How can i do this with solr
>>
>> thanks in advance.
>>
>>
>


-- 
   Anass


Re: Including Score in Solr POJO

2011-05-23 Thread Kissue Kissue
Thanks Anuj for your reply. Would it then include it as a field in my POJO?
How do I define such a field? I have a POJO with the @Field annotation which
is mapped to fields in my schema.

Thanks.

On Mon, May 23, 2011 at 4:10 PM, Anuj Kumar  wrote:

> Hi,
>
> If you mean SolrJ (as I understand by your description of POJOs), you can
> add the score by setting the property IncludeScore to true. For example-
>
> SolrQuery query = new SolrQuery().
>setQuery(keyword).
>  *setIncludeScore(true);*
>
> Regards,
> Anuj
>
> On Mon, May 23, 2011 at 8:31 PM, Kissue Kissue 
> wrote:
>
> > Hi,
> >
> > I am currently using Solr and indexing/reading my documents as POJO. The
> > question i have is how can i include the score in the POJO for each
> > document
> > found in the index?
> >
> > Thanks.
> >
>


Re: Indexing documents with "complex multivalued fields"

2011-05-23 Thread anass talby
Thank you very much

On Mon, May 23, 2011 at 4:27 PM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Anass,
>
> what about combining them both into one? so to say:
> 1|red
> 2|green
>
> "synchronized" multivalued fields are not possible, afaik.
>
> Regards
> Stefan
>
> On Mon, May 23, 2011 at 3:40 PM, anass talby 
> wrote:
> > Hi,
> >
> > I'm new in solr and would like to index documents that have complex
> > multivalued fields. I do want to do something like:
> >
> > 
> >   1
> >   
> > 1
> > red
> >   
> >   
> > 2
> > green
> >   
> >   ...
> > 
> >  ...
> >
> >
> > How can i do this with solr
> >
> > thanks in advance.
> >
> > --
> >   Anass
> >
>



-- 
   Anass


correctlySpelled and onlyMorePopular in 3.1

2011-05-23 Thread Markus Jelsma
Hi,

I know about the behaviour of the onlyMorePopular setting: it can return
suggestions while the actual query is correctly spelled. There is, in my
opinion, some bad behaviour; consider the following query, which is correctly
spelled, yields results, and never gets suggestions:

q=test&spellcheck.onlyMorePopular=false
  correctlySpelled = true

q=test&spellcheck.onlyMorePopular=true
  correctlySpelled = false

Now, also consider the following scenario with onlyMorePopular enabled. Both 
term_a and term_b are correctly spelled and in the index.

q=term_a
  correctlySpelled = true
  suggestion: term_b

q=term_b
  correctlySpelled = false

The value of correctlySpelled can be very counter-intuitive when
onlyMorePopular is enabled, can't it? File an issue, or live with it?

Cheers,

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Re: Including Score in Solr POJO

2011-05-23 Thread Anuj Kumar
Hi,

On Mon, May 23, 2011 at 8:52 PM, Kissue Kissue  wrote:

> Thanks Anuj for your reply. Would it then include it as a field in my POJO?
>

I meant the score given by Solr in response to the search query. Is it an
application specific score that you want to include?


> How do i define such field? I have a POJO with the @Field annotation which
> is mapped to fields in my schema.
>

At the time of indexing, you need not specify the score. The score is
calculated based on the relevance of the query against the matched
documents. If you have an application-specific score or weight that you want
to add, you can add it as a separate field, but what I understand from your
question is that you want the score that Solr gives to each search result. In
that case, just setting the property includeScore to true while constructing
the query object (as shown in the example I gave earlier) will suffice.

From the query response, you can then query for the maximum score, as well
as each document's score. For example:

// get the response
QueryResponse results = getSearchServer().query(query);
// get the documents
SolrDocumentList resultDocs = results.getResults();
// get the maximum score
float maxScore = resultDocs.getMaxScore();
// iterate through the documents to see the results
for(SolrDocument doc : resultDocs){
// get the score
Object score = doc.get("score");
}

Hope that helps.
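
If the score should also end up inside the POJO itself, a sketch like the
following might work; this assumes the binder maps the returned "score"
pseudo-field like any other field, which is worth verifying (the bean and
field names are hypothetical):

import org.apache.solr.client.solrj.beans.Field;

public class Product {
  @Field
  private String id;

  // "score" is not a schema field; it is the pseudo-field added to each
  // document when setIncludeScore(true) is used on the query
  @Field("score")
  private float score;

  public float getScore() { return score; }
}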

Regards,
Anuj

>
> Thanks.
>
> On Mon, May 23, 2011 at 4:10 PM, Anuj Kumar  wrote:
>
> > Hi,
> >
> > If you mean SolrJ (as I understand by your description of POJOs), you can
> > add the score by setting the property IncludeScore to true. For example-
> >
> > SolrQuery query = new SolrQuery().
> >setQuery(keyword).
> >  *setIncludeScore(true);*
> >
> > Regards,
> > Anuj
> >
> > On Mon, May 23, 2011 at 8:31 PM, Kissue Kissue 
> > wrote:
> >
> > > Hi,
> > >
> > > I am currently using Solr and indexing/reading my documents as POJO.
> The
> > > question i have is how can i include the score in the POJO for each
> > > document
> > > found in the index?
> > >
> > > Thanks.
> > >
> >
>


RE: spellcheck.collate returning all results

2011-05-23 Thread Dyer, James
Richard,

To enable the guarantee you need to specify "spellcheck.maxCollationTries" with 
a value other than zero (which is default).  There is cost involved with 
verifying beforehand if the collations will return hits so this feature is 
"off" by default.  Also, you may want to enable extended collations with 
"spellcheck.collateExtendedResults" to know beforehand how many hits you'll 
get.  It also will detail exactly which correction was subbed in for which 
original misspelled word.
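
For example, building on the earlier query from this thread (the parameter
values are only illustrative):

http://127.0.0.1:8983/solr/select?q=reeed%20bulll&spellcheck=true&spellcheck.collate=true&spellcheck.maxCollationTries=5&spellcheck.collateExtendedResults=true&fq=content_type:post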

Two things you might want to be aware of:
- This is new functionality for 3.1 so it doesn't work on 1.4 without a patch 
(see SOLR-2010 in jira).

- There is a critical bug in the spell check collate functionality that affects 
any use of "spellcheck.collate=true" in 3.1 and Trunk (4.x).  If using collate 
(even *without* "spellcheck.maxCollationTries") you should apply SOLR-2462 
first (see https://issues.apache.org/jira/browse/SOLR-2462 for information & a 
patch).  It is likely this (or a similar fix) will eventually get committed and 
included in the next bug-fix release, should there be one.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Richard Hodsdon [mailto:hodsdon.rich...@gmail.com] 
Sent: Monday, May 23, 2011 9:54 AM
To: solr-user@lucene.apache.org
Subject: spellcheck.collate returning all results

Hi,

I have been trying to set up spellchecking on our system using the
SpellCheckComponent.

According to the wiki by using spellcheck.collate any fq parameters that are
passed through to the original query while doing spellcheck will return
results if the collation is re-run. So far this has not been happening.
I am getting results returned but if I re-run the query passing through the
collated q param it finds nothing.

My initial Query i as follows:
http://127.0.0.1:8983/solr/select?q=reeed%20bulll&spellcheck=true&spellcheck.collate=true&fq=content_type:post

and I get back in the spellcheck lst



1
0
5

red



1
6
11

bull


red bull



The issue is if I run the query again using the 'correct' query 

http://127.0.0.1:8983/solr/select?q=red%20bull&spellcheck=true&spellcheck.collate=true&fq=content_type:post&wt=json

I get no reponses returned. This is because of my content_type:post, which
is filtering correctly. 

I have also run spellcheck.build=true 

I have set up my solrconfig.xml as follows.


textgen

  solr.IndexBasedSpellChecker
  ./spellchecker
  name
  true
  true

  


 
   explicit
   10
 
 
spellcheck
 


My scheme.xml declares textgen fieldsType and name field


  




  
  





  


Thanks

Richard





Similarity

2011-05-23 Thread Brian Lamb
Hi all,

I'm having trouble getting the basic similarity example to work. If you
notice at the bottom of the schema.xml file, there is a line there that is
commented out:

<!-- <similarity class="org.apache.lucene.search.DefaultSimilarity"/> -->

I uncomment that line and replace it with the following:

<similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>

Which comes natively with lucene. However, the scores before and after
making this change are the same. I did a full import both times but that
didn't seem to help.

I ran svn up on both my solr directory and my lucene directory. Actually, my
lucene directory was not previously under svn so I removed everything in
there and did svn co
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/

So why isn't my installation taking the SweetSpot Similarity change?

Thanks,

Brian Lamb


Re: Similarity

2011-05-23 Thread Markus Jelsma
As far as I know, SweetSpotSimilarity needs to be configured. I did use it
once, but wrapped a factory around it to configure the sweet spot. It worked
just as expected and as explained in the paper about the subject.

If you use a custom similarity that, for example, caps tf at 1, does it then
work?



> Hi all,
> 
> I'm having trouble getting the basic similarity example to work. If you
> notice at the bottom of the schema.xml file, there is a line there that is
> commented out:
> 
> 
> 
> I uncomment that line and replace it with the following:
> 
> 
> 
> Which comes natively with lucene. However, the scores before and after
> making this change are the same. I did a full import both times but that
> didn't seem to help.
> 
> I ran svn up on both my solr directory and my lucene directory. Actually,
> my lucene directory was not previously under svn so I removed everything
> in there and did svn co
> http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/
> 
> So why isn't my installation taking the SweetSpot Similarity change?
> 
> Thanks,
> 
> Brian Lamb


Including phonetic search in text field

2011-05-23 Thread Jamie Johnson
I am new to solr and am trying to determine the best way to take the text
field type (the one in the example) and add phonetic searches to it.
Currently I have done the following:


  










  
  








  


which seems to work.  Is this appropriate or is there a better way of doing
this?  I had previously defined a custom phonetic field but that would mean
for each field which I wanted to support a phonetic style search I would
need to add an additional field.  Adding it to the text seemed much more
elegant since it would work for all text fields.  Is there a reason not to
do this (i.e. performance, index size, etc)?  Any insight/guidance would be
greatly appreciated.

Also if anyone could point me to what exactly filters do (docs) I would
appreciate it.  My assumption is that they inject additional tokens based on
the specific filter class.  Am I correct?


Re: Including phonetic search in text field

2011-05-23 Thread Paul Libbrecht
Jamie,

the problem with that is that you cannot do exact matching anymore.
For this reason, it is good style to have two fields, to use a query expander 
such as dismax (prefer exact matches, and less phonetic matches), and to only 
use that when you sort by score.

hope it helps

paul


Le 23 mai 2011 à 21:43, Jamie Johnson a écrit :

> I am new to solr and am trying to determine the best way to take the text
> field type (the one in the example) and add phonetic searches to it.
> Currently I have done the following:
> 
> autoGeneratePhraseQueries="true">
>  
>
>
>
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
> 
>  
>  
>
>
> ignoreCase="true" expand="true"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>
> 
> which seems to work.  Is this appropriate or is there a better way of doing
> this?  I had previously defined a custom phonetic field but that would mean
> for each field which I wanted to support a phonetic style search I would
> need to add an additional field.  Adding it to the text seemed much more
> elegant since it would work for all text fields.  Is there a reason not to
> do this (i.e. performance, index size, etc)?  Any insight/guidance would be
> greatly appreciated.
> 
> Also if anyone could point me to what exactly filters do (docs) I would
> appreciate it.  My assumption is that they inject additional tokens based on
> the specific filter class.  Am I correct?



Re: Including phonetic search in text field

2011-05-23 Thread Jamie Johnson
Ah, yes very helpful thanks Paul.  I knew there would be something that I
broke :).  I will need to go back and consider the use cases and see which
will and will not require exact matches.  Thanks again!


I have never heard of DisMax so this is new to me as well but have found
some posts about it.  I am sure this will generate other questions :)  Again
thanks.

On Mon, May 23, 2011 at 3:56 PM, Paul Libbrecht  wrote:

> Jamie,
>
> the problem with that is that you cannot do exact matching anymore.
> For this reason, it is good style to have two fields, to use a query
> expander such as dismax (prefer exact matches, and less phonetic matches),
> and to only use that when you sort by score.
>
> hope it helps
>
> paul
>
>
> Le 23 mai 2011 à 21:43, Jamie Johnson a écrit :
>
> > I am new to solr and am trying to determine the best way to take the text
> > field type (the one in the example) and add phonetic searches to it.
> > Currently I have done the following:
> >
> > positionIncrementGap="100"
> > autoGeneratePhraseQueries="true">
> >  
> >
> >
> >
> >
> > >ignoreCase="true"
> >words="stopwords.txt"
> >enablePositionIncrements="true"
> >/>
> > > generateWordParts="1" generateNumberParts="1" catenateWords="1"
> > catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >
> > > protected="protwords.txt"/>
> >
> >
> >  
> >  
> >
> >
> > > ignoreCase="true" expand="true"/>
> > >ignoreCase="true"
> >words="stopwords.txt"
> >enablePositionIncrements="true"
> >/>
> > > generateWordParts="1" generateNumberParts="1" catenateWords="0"
> > catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >
> > > protected="protwords.txt"/>
> >
> >  
> >
> >
> > which seems to work.  Is this appropriate or is there a better way of
> doing
> > this?  I had previously defined a custom phonetic field but that would
> mean
> > for each field which I wanted to support a phonetic style search I would
> > need to add an additional field.  Adding it to the text seemed much more
> > elegant since it would work for all text fields.  Is there a reason not
> to
> > do this (i.e. performance, index size, etc)?  Any insight/guidance would
> be
> > greatly appreciated.
> >
> > Also if anyone could point me to what exactly filters do (docs) I would
> > appreciate it.  My assumption is that they inject additional tokens based
> on
> > the specific filter class.  Am I correct?
>
>


Re: Including phonetic search in text field

2011-05-23 Thread Jamie Johnson
Paul,

Do you have an example of how to enable this in the solr config on the
default request handler?  Is it as simple as adding


<str name="defType">edismax</str>
<str name="qf">
   text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>

to the requestHandler named search?

On Mon, May 23, 2011 at 4:18 PM, Jamie Johnson  wrote:

> Ah, yes very helpful thanks Paul.  I knew there would be something that I
> broke :).  I will need to go back and consider the use cases and see which
> will and will not require exact matches.  Thanks again!
>
>
> I have never heard of DisMax so this is new to me as well but have found
> some posts about it.  I am sure this will generate other questions :)  Again
> thanks.
>
>
> On Mon, May 23, 2011 at 3:56 PM, Paul Libbrecht  wrote:
>
>> Jamie,
>>
>> the problem with that is that you cannot do exact matching anymore.
>> For this reason, it is good style to have two fields, to use a query
>> expander such as dismax (prefer exact matches, and less phonetic matches),
>> and to only use that when you sort by score.
>>
>> hope it helps
>>
>> paul
>>
>>
>> Le 23 mai 2011 à 21:43, Jamie Johnson a écrit :
>>
>> > I am new to solr and am trying to determine the best way to take the
>> text
>> > field type (the one in the example) and add phonetic searches to it.
>> > Currently I have done the following:
>> >
>> >> positionIncrementGap="100"
>> > autoGeneratePhraseQueries="true">
>> >  
>> >
>> >
>> >
>> >
>> >> >ignoreCase="true"
>> >words="stopwords.txt"
>> >enablePositionIncrements="true"
>> >/>
>> >> > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> > catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>> >
>> >> > protected="protwords.txt"/>
>> >
>> >
>> >  
>> >  
>> >
>> >
>> >> > ignoreCase="true" expand="true"/>
>> >> >ignoreCase="true"
>> >words="stopwords.txt"
>> >enablePositionIncrements="true"
>> >/>
>> >> > generateWordParts="1" generateNumberParts="1" catenateWords="0"
>> > catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>> >
>> >> > protected="protwords.txt"/>
>> >
>> >  
>> >
>> >
>> > which seems to work.  Is this appropriate or is there a better way of
>> doing
>> > this?  I had previously defined a custom phonetic field but that would
>> mean
>> > for each field which I wanted to support a phonetic style search I would
>> > need to add an additional field.  Adding it to the text seemed much more
>> > elegant since it would work for all text fields.  Is there a reason not
>> to
>> > do this (i.e. performance, index size, etc)?  Any insight/guidance would
>> be
>> > greatly appreciated.
>> >
>> > Also if anyone could point me to what exactly filters do (docs) I would
>> > appreciate it.  My assumption is that they inject additional tokens
>> based on
>> > the specific filter class.  Am I correct?
>>
>>
>


Re: Similarity

2011-05-23 Thread Brian Lamb
Okay well this is encouraging. I changed SweetSpotSimilarity to
MyClassSimilarity. I created this class in:

lucene/contrib/misc/src/java/org/apache/lucene/misc/

I am getting a ClassNotFoundException when I try to start solr.

Here is the contents of the MyClassSimilarity file:

package org.apache.lucene.misc;
import org.apache.lucene.search.DefaultSimilarity;

public class MyClassSimilarity extends DefaultSimilarity {
  public MyClassSimilarity() { super(); }
  public float idf(int a1, int a2) { return 1; }
}

So then this raises two questions. Why am I getting a classNotFoundException
and how can I go about fixing it?

Thanks,

Brian Lamb

On Mon, May 23, 2011 at 3:41 PM, Markus Jelsma
wrote:

> As far as i know, SweetSpotSimilarty needs be configured. I did use it once
> but
> wrapped a factory around it to configure the sweet spot. It worked just as
> expected and explained in that paper about the subject.
>
> If you use a custom similarity that , for example, caps tf to 1. Does it
> then
> work?
>
>
>
> > Hi all,
> >
> > I'm having trouble getting the basic similarity example to work. If you
> > notice at the bottom of the schema.xml file, there is a line there that
> is
> > commented out:
> >
> > 
> >
> > I uncomment that line and replace it with the following:
> >
> > 
> >
> > Which comes natively with lucene. However, the scores before and after
> > making this change are the same. I did a full import both times but that
> > didn't seem to help.
> >
> > I ran svn up on both my solr directory and my lucene directory. Actually,
> > my lucene directory was not previously under svn so I removed everything
> > in there and did svn co
> > http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/
> >
> > So why isn't my installation taking the SweetSpot Similarity change?
> >
> > Thanks,
> >
> > Brian Lamb
>


Re: Similarity

2011-05-23 Thread Markus Jelsma
Hmm. I don't add code to Apache packages but create my own packages and
namespaces, build a jar, and add it to the lib directory as specified in
solrconfig. Then you can use the FQCN in the similarity config to point to
the class.

Maybe it can work when messing inside the Apache namespace, but then you have
to build Lucene as well.
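
A sketch of that approach (the package and class names are hypothetical): the
jar built from it goes into Solr's lib directory, and schema.xml references it
with <similarity class="com.example.search.FlatIdfSimilarity"/>.

package com.example.search;

import org.apache.lucene.search.DefaultSimilarity;

public class FlatIdfSimilarity extends DefaultSimilarity {
  @Override
  public float idf(int docFreq, int numDocs) {
    return 1.0f;  // ignore inverse document frequency entirely
  }
}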


> Okay well this is encouraging. I changed SweetSpotSimilarity to
> MyClassSimilarity. I created this class in:
> 
> lucene/contrib/misc/src/java/org/apache/lucene/misc/
> 
> I am getting a ClassNotFoundException when I try to start solr.
> 
> Here is the contents of the MyClassSimilarity file:
> 
> package org.apache.lucene.misc;
> import org.apache.lucene.search.DefaultSimilarity;
> 
> public class MyClassSimilarity extends DefaultSimilarity {
>   public MyClassSimilarity() { super(); }
>   public float idf(int a1, int a2) { return 1; }
> }
> 
> So then this raises two questions. Why am I getting a
> classNotFoundException and how can I go about fixing it?
> 
> Thanks,
> 
> Brian Lamb
> 
> On Mon, May 23, 2011 at 3:41 PM, Markus Jelsma
> 
> wrote:
> > As far as i know, SweetSpotSimilarty needs be configured. I did use it
> > once but
> > wrapped a factory around it to configure the sweet spot. It worked just
> > as expected and explained in that paper about the subject.
> > 
> > If you use a custom similarity that , for example, caps tf to 1. Does it
> > then
> > work?
> > 
> > > Hi all,
> > > 
> > > I'm having trouble getting the basic similarity example to work. If you
> > > notice at the bottom of the schema.xml file, there is a line there that
> > 
> > is
> > 
> > > commented out:
> > > 
> > > 
> > > 
> > > I uncomment that line and replace it with the following:
> > > 
> > > 
> > > 
> > > Which comes natively with lucene. However, the scores before and after
> > > making this change are the same. I did a full import both times but
> > > that didn't seem to help.
> > > 
> > > I ran svn up on both my solr directory and my lucene directory.
> > > Actually, my lucene directory was not previously under svn so I
> > > removed everything in there and did svn co
> > > http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/
> > > 
> > > So why isn't my installation taking the SweetSpot Similarity change?
> > > 
> > > Thanks,
> > > 
> > > Brian Lamb


dates with unknown parts

2011-05-23 Thread Mari Masuda
Hello,

I am just getting started with Solr and have a question about date/time 
indexing.  I know from the javadoc documentation ( 
http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html ) that 
the format should be a restricted form of 
http://www.w3.org/TR/xmlschema-2/#dateTime-canonical-representation where the 
timezone is always Z.

The data I am working with is such that an exact date and time are not always 
known.  For example, a record in my database could contain a value like 
1918-00-00T00:00:00Z to indicate that the year is 1918 but the exact month and 
day (let alone the time) are unknown.

I searched the list archives and found a thread ( 
http://marc.info/?l=solr-user&m=125438370510775&w=2 ) from a couple of years 
ago that is basically the same situation as mine.  I was just wondering if 
there had been any updates since then that would allow date/time segments with 
unknown values to be represented, or if I need to do some kind of kludge to 
make this work.  Thanks!

Mari

Re: dates with unknown parts

2011-05-23 Thread Markus Jelsma
I'm not sure I understand what you're looking for, but when you write about
unknown values I think of a lack of precision. Solr allows you to reduce the
precision of dates using the flexible DateMathParser.

http://lucene.apache.org/solr/api/org/apache/solr/util/DateMathParser.html
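
One possible approach along those lines (the field name "timestamp" is only an
example): index the partially known date as the first instant of the known
period, e.g. 1918-01-01T00:00:00Z for "some time in 1918", and use date math
at query time to cover the whole period:

timestamp:[1918-01-01T00:00:00Z TO 1918-01-01T00:00:00Z+1YEAR]

Rounding expressions such as 1918-06-15T00:00:00Z/YEAR likewise reduce a full
date to the start of its year.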


> Hello,
> 
> I am just getting started with Solr and have a question about date/time
> indexing.  I know from the javadoc documentation (
> http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html )
> that the format should be a restricted form of
> http://www.w3.org/TR/xmlschema-2/#dateTime-canonical-representation where
> the timezone is always Z.
> 
> The data I am working with is such that an exact date and time are not
> always known.  For example, a record in my database could contain a value
> like 1918-00-00T00:00:00Z to indicate that the year is 1918 but the exact
> month and day (let alone the time) are unknown.
> 
> I searched the list archives and found a thread (
> http://marc.info/?l=solr-user&m=125438370510775&w=2 ) from a couple of
> years ago that is basically the same situation as mine.  I was just
> wondering if there had been any updates since then that would allow
> date/time segments with unknown values to be represented, or if I need to
> do some kind of kludge to make this work.  Thanks!
> 
> Mari


RE: Spatial Solr 3.1: filter by viewport

2011-05-23 Thread Zac Smith
It looks like someone asked this question a few months ago and didn't get an 
answer either ... 
http://lucene.472066.n3.nabble.com/Spatial-Solr-Representing-a-bounding-box-and-searching-for-it-tc2447262.html#none

I really thought this would be a pretty simple question to answer? Is there no 
way to specify the exact coordinates of the bounding box - 
http://wiki.apache.org/solr/SpatialSearch#bbox_-_Bounding-box_filter ??


Zac

-Original Message-
From: Zac Smith [mailto:z...@trinkit.com] 
Sent: Sunday, May 22, 2011 9:34 PM
To: solr-user@lucene.apache.org
Subject: Spatial Solr 3.1: filter by viewport

How would I specify a filter that covered a rectangular viewport? I have 4 
coordinate points for the corners and I want to return everything inside that 
area.
My first naive attempt was this:
q=*:*&fq=coords:[44.119141,-125.948638 TO 47.931066,-111.029205]

At first this seems to work OK, except where the viewport crosses over a point 
where the longitude goes from a positive value to a negative value.

Thanks
Zac


Improving PayloadTermQuery Performance

2011-05-23 Thread Neil Hooey
What are some ways that one can increase the performance of PayloadTermQuery's?

I'm currently getting a max of 22 QPS after 90k unique queries from a
payload-enhanced keyword field on a dataset of 18 million documents,
where a simple term search on the equivalent multivalue string field
gives a max of 700 QPS.

Here are the performance numbers for queries 89,000 - 90,000:
 Int   #Reqs   Secs    Reqs/s  Avg     Median  80th    95th    99th    Max
 89    1000    45.52   22.0    0.045   0.013   0.067   0.198   0.360   1.144

In terms of implementation, I wrote a bunch of custom classes that end
up overriding QueryParserBase.newTermQuery() to return a
PayloadTermQuery instead of a TermQuery. This implementation seems to
work fine, but it's very slow.
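
For reference, a minimal Lucene-level sketch of that kind of override (the
class name is hypothetical; it assumes Lucene 3.x, where PayloadTermQuery and
AveragePayloadFunction live in org.apache.lucene.search.payloads):

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.payloads.AveragePayloadFunction;
import org.apache.lucene.search.payloads.PayloadTermQuery;
import org.apache.lucene.util.Version;

public class PayloadQueryParser extends QueryParser {

  public PayloadQueryParser(String defaultField, Analyzer analyzer) {
    super(Version.LUCENE_31, defaultField, analyzer);
  }

  @Override
  protected Query newTermQuery(Term term) {
    // score each matching term by the average of its payloads
    return new PayloadTermQuery(term, new AveragePayloadFunction());
  }
}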

I'm using HTTPD::Bench::ApacheBench with anywhere between 1 and 40
concurrent requests, and it pegs one of four CPUs at 100% the whole
time, leaving the others idle.

Specfically, are there ways to:
1. Use more than one CPU for PayloadTermQuery processing?
2. Take advantage of caching when calculating payloads?
   (I've heard multivalue string fields take advantage of caching
where payloads do not)
3. Increase the query throughput for payloads in any other way?

Thanks,

- Neil


Re: SOLR Install

2011-05-23 Thread Yuhan Zhang
Hi Raj,

To index files using java, use solrj:
http://www.google.com/search?q=solrj&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

To index files by a post request, follow this tutorial:
http://www.xml.com/pub/a/2006/08/09/solr-indexing-xml-with-lucene-andrest.html

Yuhan

On Mon, May 23, 2011 at 7:10 AM, Roger Shah  wrote:

> Hi,
>
> I am a new user and I have installed SOLR 3.1.0 and running Tomcat 7.0.
> I was able to run the example which shows the SOLR Admin screen.  Also
> posted an XML file by this command from dos prompt:  java -jar post.jar
> solr.xml.
>
> How can I get SOLR to search web sites and also search through other types
> of files, databases, etc?
>
> Instead of running the example that comes with SOLR, How do I create my
> own?
>
> Also can you point me to a SOLR Guide or documentation?  I did not see any
> detailed documentation.
>
> Please show me where can I post messages on the SOLR web site.
>
> Thanks,
> Raj
>
>
>


Re: Too many Boolean Clause and Filter Query

2011-05-23 Thread Sujatha Arun
A filter query does not affect the scoring of the document, so isn't a filter
query like a constant-score query, and hence shouldn't it be exempt from the
maxBooleanClauses limit?

Regards
Sujatha

On Mon, May 23, 2011 at 4:56 PM, Sujatha Arun  wrote:

> I got the info from this artice
>
>
> http://www.lucidimagination.com/blog/2009/06/08/bringing-the-highlighter-back-to-wildcard-queries-in-solr-14/
>
> Yes ,I know this can be configured ,but this will not scale  to "n" ,what
> is the performance implication of this  with "n" Boolean Clauses .
>
> Regards
> Sujatha
>
>
> On Mon, May 23, 2011 at 4:45 PM, Ahmet Arslan  wrote:
>
>> > But this also gives the same TOO many Boolean Caluses
>> > Exception
>>
>> You can adjust that parameter in  solrconfig.xml
>> 3024
>>
>
>


Stats help needed on price field using different currencies

2011-05-23 Thread insigh...@gmail.com

Hi all,

I took a look at:

http://lucene.472066.n3.nabble.com/Tuning-StatsComponent-td2225809.html

which is a similar problem.

I have a "price" field, on which I'm faceting (stats.field=price). But 
the currency of those docs is different, so the returned value is 
useless without converting the prices to the same currency BEFORE the 
returned stats value.


E.g.:


doc 1: price=100.00, currency=USD
doc 2: price=200.00, currency=GBP
doc 3: price=300.00, currency=AUD


The stats component returns "200.00" for the mean, which, based on the raw
price values, is correct, yet erroneous for my purposes.
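
For instance, with hypothetical conversion rates of 1 GBP = 1.61 USD and
1 AUD = 1.05 USD, converting before averaging would give
(100*1.00 + 200*1.61 + 300*1.05) / 3 ≈ 245.67 USD rather than 200.00.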


Is there a way to pass a math function into the facet BEFORE the results?

e.g. if currency='GBP' then sum(price * 1.61) else if currency='HKD' 
then sum(price * 1.05) else price


A tedious workaround has been to do a stats.facet=currency, then do my 
stats calculations by summing/dividing each returned sub result to find 
an aggregate.


Is there an easier solution, or have devs thought of adding the ability 
to pass a calculation before the returned stats values, maybe 
s.field.pre.math / s.field.post.math in solrconfig or the query?


Thanks for your help,

Dan


Re: [Announce[ White paper describing Near Real Time Implementation with Solr and RankingAlgorithm

2011-05-23 Thread Andy
Thanks Nagendra.

2 questions:

1) How does NRT update affect the performance of facet search? My understanding 
is that facet in Solr relies on caching. With NRT update, wouldn't facet cache 
be invalidated all the time? And if that's the case, would the performance of 
facet be significantly reduced?

2) The paper stated that the performance of NRT updates could be drastically 
improved if IndexWriter.getReader() performance is improved. Is there any 
tuning on Solr that can be done to improve IndexWriter.getReader() performance?

Andy


--- On Thu, 5/19/11, Nagendra Nagarajayya  wrote:

> From: Nagendra Nagarajayya 
> Subject: [Announce[ White paper describing Near Real Time Implementation with 
> Solr and RankingAlgorithm
> To: solr-user@lucene.apache.org
> Date: Thursday, May 19, 2011, 9:18 AM
> Hi!
> 
> I would like to announce a white paper that describes the
> technical details of  Near Real Time implementation
> with Solr and the RankingAlgorithm. The paper discusses the
> modifications made to enable NRT.
> 
> You can download the white paper from here:
> http://solr-ra.tgels.com/papers/NRT_Solr_RankingAlgorithm.pdf
> 
> The modified src can also be downloaded from here:
> http://solr-ra.tgels.com
> 
> Regards,
> 
> - Nagendra Nagarajayya
> http://solr-ra.tgels.com
> 
> 
> 
>