Re: Upgrading solr from 3.3 to 3.4

2011-09-19 Thread Isan Fulia
Hi ,

Yes, we need to upgrade, but my question is whether reindexing of all cores is
required, or whether we can directly use the already-indexed data folders from
Solr 3.3 with Solr 3.4.

Thanks,
Isan Fulia.





On 19 September 2011 11:03, Wyhw Whon  wrote:

> If you are already using Apache Lucene 3.1, 3.2 or 3.3, we strongly
> recommend you upgrade to 3.4.0 because of the index corruption bug on
> OS or computer crash or power loss (LUCENE-3418), now fixed in 3.4.0.
>
> 2011/9/19 Isan Fulia 
>
> > Hi all,
> >
> > Does upgrading solr from 3.3 to 3.4 requires reindexing of all the cores
> or
> > we can directly copy the data folders to
> > the new solr ?
> >
> >
> > --
> > Thanks & Regards,
> > Isan Fulia.
> >
>



-- 
Thanks & Regards,
Isan Fulia.


Re: Is it possible to use different types of datasource in DIH?

2011-09-19 Thread O. Klein
I did some more testing and it seems that as soon as you use FileDataSource
it overrides any other dataSource.






[data-config example stripped by the mailing list archive; surviving fragments:
url="http://www.server.com/rss.xml" processor="XPathEntityProcessor"
forEach="/rss/channel/item"]
will not work, unless you remove FileDataSource. Anyone know a way to fix
this (except removing FileDataSource) ?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-possible-to-use-different-types-of-datasource-in-DIH-tp3344380p3348011.html
Sent from the Solr - User mailing list archive at Nabble.com.


java.io.CharConversionException While Indexing in Solr 3.4

2011-09-19 Thread Pranav Prakash
Hi List,

I tried Solr 3.4.0 today and while indexing I got the error
java.lang.RuntimeException: [was class java.io.CharConversionException]
Invalid UTF-8 middle byte 0x73 (at char #66611, byte #65289)

My earlier version was Solr 1.4 and this same document went into index
successfully. Looking around, I see issue
https://issues.apache.org/jira/browse/SOLR-2381 which seems to fix the
issue. I thought this patch is already applied to Solr 3.4.0. Is there
something I am missing?

Is there anything else I need to mention? Logs/ My document details etc.?

*Pranav Prakash*

"temet nosce"

Twitter  | Blog  |
Google 


Re: Lucene->SOLR transition

2011-09-19 Thread Erik Hatcher

On Sep 18, 2011, at 19:43 , Michael Sokolov wrote:

> On 9/15/2011 8:30 PM, Scott Smith wrote:
>> 
>> 2.   Assuming that the answer to 1 is "correct", then is there an easy 
>> way to take a lucene query (with nested Boolean queries, filter queries, 
>> etc.) and generate a SOLR query string with q and fq components?
>> 
>> 
> I believe that Query.toString() will probably get you back something that can 
> be parsed in turn by the traditional lucene QueryParser, thus completing the 
> circle and returning your original Query.  But why would you want to do that?

No, you can't rely on Query.toString() roundtripping (think stemming, for 
example - but many other examples that won't work that way too).

What you can do, since you know Lucene's API well, is write a QParser(Plugin) 
that takes request parameters as strings and generates the Query from that like 
you are now with your Lucene app.
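
A minimal sketch of that approach (class, field and parser names here are made up; the real parse() would assemble the same BooleanQuery/filter structure the existing Lucene code already builds):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class AppLuceneQParserPlugin extends QParserPlugin {
  @Override
  public void init(NamedList args) { /* no configuration needed */ }

  @Override
  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      @Override
      public Query parse() {
        // Build the Lucene Query exactly as the existing application does;
        // a trivial TermQuery stands in for that logic here.
        return new TermQuery(new Term("title", qstr));
      }
    };
  }
}

Register it in solrconfig.xml with <queryParser name="applucene" class="com.example.AppLuceneQParserPlugin"/> and select it per request with defType=applucene (or {!applucene} in q).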

Erik



Re: indexing data from rich documents - Tika with solr3.1

2011-09-19 Thread Erik Hatcher

On Sep 18, 2011, at 21:52 , scorpking wrote:

> Hi Erik Hatcher-4
> I tried index from your url. But i have a problem. In your case, you knew a
> files absolute path (Dir.new("/Users/erikhatcher/apache-solr-3.3.0/docs").
> So you can indexed it. In my case, i don't know a files absolute path. I
> only know http's address where have files (ex: you can see this link as
> reference: http://www.lc.unsw.edu.au/onlib/pdf/). Another ways? Thanks 

Write a little script that takes the HTTP directory listing like that, and then 
uses stream.url (rather than stream.file as my example used).
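
Something along these lines is usually enough (host, core and the grep pattern are assumptions, and stream.url requires remote streaming to be enabled for the /update/extract handler in solrconfig.xml):

#!/bin/sh
BASE="http://www.lc.unsw.edu.au/onlib/pdf/"
# scrape the directory listing for PDF links, then hand each URL to Solr Cell
for f in $(curl -s "$BASE" | grep -o 'href="[^"]*\.pdf"' | sed 's/^href="//;s/"$//'); do
  curl "http://localhost:8983/solr/update/extract?literal.id=${f}&stream.url=${BASE}${f}"
done
curl "http://localhost:8983/solr/update?commit=true"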

Erik



Re: Upgrading solr from 3.3 to 3.4

2011-09-19 Thread Erik Hatcher
Reindexing is not necessary.  Drop in 3.4 and go.  

For this sort of scenario, it's easy enough to try using a copy of your 
 directory with an instance of the newest release of Solr.  If the 
release notes don't say a reindex is necessary, then it's not, but always a 
good idea to try it and run any tests you have handy.
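
Something like the following is usually all it takes (paths are illustrative):

# copy the existing index into a fresh 3.4 install and start it up
cp -r apache-solr-3.3.0/example/solr/data apache-solr-3.4.0/example/solr/data
cd apache-solr-3.4.0/example
java -jar start.jar
# then point your usual test queries at http://localhost:8983/solr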

Erik



On Sep 19, 2011, at 00:02 , Isan Fulia wrote:

> Hi ,
> 
> Ya we need to upgrade but my question is whether  reindexing of all cores is
> required
> or
> we can directly use already indexed data folders of solr 3.3 to solr 3.4.
> 
> Thanks,
> Isan Fulia.
> 
> 
> 
> 
> 
> On 19 September 2011 11:03, Wyhw Whon  wrote:
> 
>> If you are already using Apache Lucene 3.1, 3.2 or 3.3, we strongly
>> recommend you upgrade to 3.4.0 because of the index corruption bug on
>> OS or computer crash or power loss (LUCENE-3418), now fixed in 3.4.0.
>> 
>> 2011/9/19 Isan Fulia 
>> 
>>> Hi all,
>>> 
>>> Does upgrading solr from 3.3 to 3.4 requires reindexing of all the cores
>> or
>>> we can directly copy the data folders to
>>> the new solr ?
>>> 
>>> 
>>> --
>>> Thanks & Regards,
>>> Isan Fulia.
>>> 
>> 
> 
> 
> 
> -- 
> Thanks & Regards,
> Isan Fulia.



Re: indexing data from rich documents - Tika with solr3.1

2011-09-19 Thread scorpking
Yeah, I want to use DIH, and I tried to configure my data-config file, but it is
wrong. This is my config:

*[data-config.xml stripped by the mailing list archive; the only surviving
fragment is the Tika entity URL:
http://media.gox.vn/edu/document/original/${VTCEduDocument.s_path_origin}]*

And here error: 
*SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:89)
at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:38)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
Caused by: java.net.MalformedURLException: no protocol: nullselect TOP 10 pk_document_id, s_path_origin from [VTC_Edu].[dbo].[tbl_Document]
at java.net.URL.<init>(URL.java:567)
at java.net.URL.<init>(URL.java:464)
at java.net.URL.<init>(URL.java:413)
at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:81)
... 10 more*

???
Thanks
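
(The data-config above was stripped by the archive. The MalformedURLException — "no protocol: nullselect TOP 10 ..." — shows the SQL query being handed to BinURLDataSource, which usually means the SQL entity is not pointed at a JDBC dataSource. A SQL-plus-Tika setup generally has the shape below; the driver, credentials and field names are guesses, not the poster's actual values.)

<dataConfig>
  <dataSource name="db"  type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost;databaseName=VTC_Edu"
              user="..." password="..."/>
  <dataSource name="bin" type="BinURLDataSource"/>
  <document>
    <entity name="VTCEduDocument" dataSource="db"
            query="select TOP 10 pk_document_id, s_path_origin from [VTC_Edu].[dbo].[tbl_Document]">
      <field column="pk_document_id" name="id"/>
      <!-- the Tika sub-entity reads each binary document over HTTP -->
      <entity name="tika" dataSource="bin" processor="TikaEntityProcessor" format="text"
              url="http://media.gox.vn/edu/document/original/${VTCEduDocument.s_path_origin}">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>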

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-data-from-rich-documents-Tika-with-solr3-1-tp3322555p3348149.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: java.io.CharConversionException While Indexing in Solr 3.4

2011-09-19 Thread Pranav Prakash
Just in case, someone might be intrested here is the log

SEVERE: java.lang.RuntimeException: [was class java.io.CharConversionException] Invalid UTF-8 middle byte 0x73 (at char #66641, byte #65289)
 at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
 at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
 at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
 at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
 at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:287)
 at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:146)
 at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
 at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
 at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
 at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at org.mortbay.jetty.Server.handle(Server.java:326)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
 at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
 at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
 at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.io.CharConversionException: Invalid UTF-8 middle byte 0x73 (at char #66641, byte #65289)
 at com.ctc.wstx.io.UTF8Reader.reportInvalidOther(UTF8Reader.java:313)
 at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:204)
 at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
 at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
 at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
 at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992)
 at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4628)
 at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
 at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
 at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649)
 ... 26 more


Also, is there a setting so I can change the level of backtrace? This would
be helpful in showing the complete stack instead of 26 more ...

*Pranav Prakash*

"temet nosce"

Twitter  | Blog  |
Google 


On Mon, Sep 19, 2011 at 14:16, Pranav Prakash  wrote:

>
> Hi List,
>
> I tried Solr 3.4.0 today and while indexing I got the error
> java.lang.RuntimeException: [was class java.io.CharConversionException]
> Invalid UTF-8 middle byte 0x73 (at char #66611, byte #65289)
>
> My earlier version was Solr 1.4 and this same document went into index
> successfully. Looking around, I see issue
> https://issues.apache.org/jira/browse/SOLR-2381 which seems to fix the
> issue. I thought this patch is already applied to Solr 3.4.0. Is there
> something I am missing?
>
> Is there anything else I need to mention? Logs/ My document details etc.?
>
> *Pranav Prakash*
>
> "temet nosce"
>
> Twitter  | Blog |
> Google 
>


term vector parser in solr.NET

2011-09-19 Thread jame vaalet
Hi,
I was wondering if there is any method to get back the term vector list from
Solr through Solr.NET. From the Solr.NET source code I couldn't find any term
vector parser.

-- 

-JAME


Re: Is it possible to use different types of datasource in DIH?

2011-09-19 Thread Ahmet Arslan
> I did some more testing and it seems
> that as soon as you use FileDataSource
> it overrides any other dataSource.
> 
> [quoted data-config stripped by the archive; surviving fragments:
> encoding="UTF-8" connectionTimeout="3" readTimeout="3", rootEntity="false",
> url="http://www.server.com/rss.xml" processor="XPathEntityProcessor"
> forEach="/rss/channel/item", xpath="/rss/channel/item/link"]
>
> 
> will not work, unless you remove FileDataSource. Anyone
> know a way to fix
> this (except removing FileDataSource) ?

Did you try to give a name to FileDataSource? e.g.
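
(Ahmet's example was stripped by the archive; presumably it was along these lines, with each entity then referencing its dataSource by name. Names and attribute values here are illustrative.)

<dataConfig>
  <dataSource name="web"  type="URLDataSource"  encoding="UTF-8"
              connectionTimeout="3" readTimeout="3"/>
  <dataSource name="disk" type="FileDataSource" encoding="UTF-8"/>
  <document>
    <entity name="rss" dataSource="web" processor="XPathEntityProcessor"
            url="http://www.server.com/rss.xml" forEach="/rss/channel/item"
            rootEntity="false">
      <field column="link" xpath="/rss/channel/item/link"/>
      <entity name="localFile" dataSource="disk" processor="PlainTextEntityProcessor"
              url="${rss.link}">
        <field column="plainText" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>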




Re: Upgrading solr from 3.3 to 3.4

2011-09-19 Thread Isan Fulia
Thanks, Erik.


On 19 September 2011 15:10, Erik Hatcher  wrote:

> Reindexing is not necessary.  Drop in 3.4 and go.
>
> For this sort of scenario, it's easy enough to try using a copy of your
>  directory with an instance of the newest release of Solr.  If
> the release notes don't say a reindex is necessary, then it's not, but
> always a good idea to try it and run any tests you have handy.
>
>Erik
>
>
>
> On Sep 19, 2011, at 00:02 , Isan Fulia wrote:
>
> > Hi ,
> >
> > Ya we need to upgrade but my question is whether  reindexing of all cores
> is
> > required
> > or
> > we can directly use already indexed data folders of solr 3.3 to solr 3.4.
> >
> > Thanks,
> > Isan Fulia.
> >
> >
> >
> >
> >
> > On 19 September 2011 11:03, Wyhw Whon  wrote:
> >
> >> If you are already using Apache Lucene 3.1, 3.2 or 3.3, we strongly
> >> recommend you upgrade to 3.4.0 because of the index corruption bug on
> >> OS or computer crash or power loss (LUCENE-3418), now fixed in 3.4.0.
> >>
> >> 2011/9/19 Isan Fulia 
> >>
> >>> Hi all,
> >>>
> >>> Does upgrading solr from 3.3 to 3.4 requires reindexing of all the
> cores
> >> or
> >>> we can directly copy the data folders to
> >>> the new solr ?
> >>>
> >>>
> >>> --
> >>> Thanks & Regards,
> >>> Isan Fulia.
> >>>
> >>
> >
> >
> >
> > --
> > Thanks & Regards,
> > Isan Fulia.
>
>


-- 
Thanks & Regards,
Isan Fulia.


Re: Is it possible to use different types of datasource in DIH?

2011-09-19 Thread O. Klein
Yeah, naming dataSources maybe only works when they are of the same type.

I got this to work with URLDataSource and url="file:///${crawl.fileAbsolutePath}"
(two forward slashes don't work) for the local files.
for the local files.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-possible-to-use-different-types-of-datasource-in-DIH-tp3344380p3348257.html
Sent from the Solr - User mailing list archive at Nabble.com.


dataimport.properties still updated on error

2011-09-19 Thread Barry Harding
Hi, I am currently using the DIH to connect to and import data from an MS SQL
Server, and in general full, delta, and delete imports seem to work perfectly.

The issue is that I spotted some errors being logged in the tomcat logs for 
SOLR which are :

19-Sep-2011 07:45:25 org.apache.solr.common.SolrException log
SEVERE: Exception in entity : product:org.apache.solr.handler.dataimport.DataImportHandlerException: com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 125) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
 at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
 at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:339)
 at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$600(JdbcDataSource.java:228)
 at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.hasNext(JdbcDataSource.java:262)
 at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:77)
 at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:75)
 at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
 at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
 at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:302)
 at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:178)
 at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:390)
 at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:429)
 at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 125) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
 at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:213)
 at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4713)
 at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1671)
 at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:944)
 at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:331)
 ... 11 more

19-Sep-2011 07:45:25 org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
19-Sep-2011 07:45:25 org.apache.solr.handler.dataimport.DocBuilder finish


Now this SQL error I can deal with and I will probably switch to snapshot 
Isolation as these are constantly updated tables, but my issue is not the sql 
error but the fact that the delta import still reported that it had imported 
successfully and still wrote out the last updated time to the 
dataimport.properties file, so the next time it ran it missed a bunch of 
documents that should have been indexed.

If it had failed and just rolled back the changes and not updated the 
dataimport.properties file it would (assuming no more deadlocks) have caught 
all of the missed documents on the next delta import.

My connection to MS SQL is using the "responseBuffering=adaptive" setting to
reduce memory overhead. So I guess what I am asking is: is there any way I can
cause the DIH to roll back the import if an error occurs, and to not update the
"dataimport.properties" file?

Any help or suggestions would be appreciated

Thanks

Barry H
 
DISCLAIMER: This email and its attachments may be confidential and are intended 
solely for the use of the individual to whom it is addressed. Any views or 
opinions expressed are solely those of the author and do not necessarily 
represent those of Misco UK Ltd. Any unauthorised use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your 
cooperation. Misco UK Ltd, registered in Scotland Number 114143. Registered 
Office: Caledonian Exchange, 19a Canning Street, Edinburgh EH3 8EG. Telephone 
+44 (0)1933 686000. This e-mail message has been scanned by CA Gateway Security.


Re: OutOfMemoryError coming from TermVectorsReader

2011-09-19 Thread Glen Newton
Please include information about your heap size, (and other Java
command line arguments) as well a platform OS (version, swap size,
etc), Java version, underlying hardware (RAM, etc) for us to better
help you.

From the information you have given, increasing your heap size should help.

Thanks,
Glen

http://zzzoot.blogspot.com/
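
For reference, the field definition and highlighting defaults the quoted message below refers to (stripped by the archive) generally look something like this; the field name, type and fragment size are illustrative:

<!-- schema.xml: term vectors with positions and offsets are what
     the FastVectorHighlighter requires -->
<field name="content" type="text" indexed="true" stored="true"
       omitNorms="true" termVectors="true" termPositions="true" termOffsets="true"/>

<!-- solrconfig.xml: highlighting defaults on the search handler -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="hl">on</str>
    <str name="hl.fl">content</str>
    <str name="hl.fragsize">500</str>
    <str name="hl.useFastVectorHighlighter">true</str>
  </lst>
</requestHandler>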


On Mon, Sep 19, 2011 at 1:34 AM,   wrote:
> Hi,
>
> I am new to solr. I an trying to index text documents of large size. On 
> searching from indexed documents I am getting following OutOfMemoryError. 
> Please help me in resolving this issue.
>
> The field which stores file content is configured in schema.xml as below:
> [field definition stripped by the archive; surviving attributes:
> omitNorms="true" termVectors="true" termPositions="true" termOffsets="true"]
>
> and Highlighting is configured as below:
> [highlighting configuration stripped by the archive; surviving values: on,
> ${all.fields.list}, 500, true]
>
>
> 2011-09-16 09:38:45.763 [http-thread-pool-9091(5)] ERROR - java.lang.OutOfMemoryError: Java heap space
>        at org.apache.lucene.index.TermVectorsReader.readTermVector(TermVectorsReader.java:503)
>        at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:263)
>        at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:284)
>        at org.apache.lucene.index.SegmentReader.getTermFreqVector(SegmentReader.java:759)
>        at org.apache.lucene.index.DirectoryReader.getTermFreqVector(DirectoryReader.java:510)
>        at org.apache.solr.search.SolrIndexReader.getTermFreqVector(SolrIndexReader.java:234)
>        at org.apache.lucene.search.vectorhighlight.FieldTermStack.<init>(FieldTermStack.java:83)
>        at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getFieldFragList(FastVectorHighlighter.java:175)
>        at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getBestFragments(FastVectorHighlighter.java:166)
>        at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByFastVectorHighlighter(DefaultSolrHighlighter.java:509)
>        at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:376)
>        at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:116)
>        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)
>        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:215)
>        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:279)
>        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
>        at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655)
>        at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595)
>        at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:98)
>        at com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:91)
>        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:162)
>        at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:326)
>        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:227)
>        at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:170)
>        at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:822)
>        at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:719)
>        at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1013)
>
> Thanks & Regards
> Anand Nigam
> Developer
>
>
> ***
> The Royal Bank of Scotland plc. Registered in Scotland No 90312.
> Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB.
> Authorised and regulated by the Financial Services Authority. The
> Royal Bank of Scotland N.V. is authorised and regulated by the
> De Nederlandsche Bank and has its seat at Amsterdam, the
> Netherlands, and is registered in the Commercial Register under
> number 33002587. Registered Office: Gustav Mahlerlaan 350,
> Amsterdam, The Netherlands. The Royal Bank of Scotland N.V. and
> The Royal Bank of Scotland plc are authorised to act as agent for each
> other in certain jurisdictions.
>
> This e-mail message is confidential and for use by the addressee only.
> If the message is received by anyone other than the add

Re: Lucene->SOLR transition

2011-09-19 Thread Michael Sokolov

On 9/19/2011 5:27 AM, Erik Hatcher wrote:
> On Sep 18, 2011, at 19:43 , Michael Sokolov wrote:
>
>> On 9/15/2011 8:30 PM, Scott Smith wrote:
>>> 2.   Assuming that the answer to 1 is "correct", then is there an easy way
>>> to take a lucene query (with nested Boolean queries, filter queries, etc.)
>>> and generate a SOLR query string with q and fq components?
>>
>> I believe that Query.toString() will probably get you back something that can
>> be parsed in turn by the traditional lucene QueryParser, thus completing the
>> circle and returning your original Query.  But why would you want to do that?
>
> No, you can't rely on Query.toString() roundtripping (think stemming, for
> example - but many other examples that won't work that way too).

Oops - thanks for clearing that up, Erik



Two unrelated questions

2011-09-19 Thread Olson, Ron
Hi all-

I'm not sure if I should break this out into two separate questions to the list 
for searching purposes, or if one is more acceptable (don't want to flood).

I have two (hopefully) straightforward questions:

1. Is it possible to expose the unique ID of a document to a DIH query? The 
reason I want to do this is because I use the unique ID of the row in the table 
as the unique ID of the Lucene document, but I've noticed that the counts of 
documents doesn't match the count in the table; I'd like to add these rows and 
was hoping to avoid writing a custom SolrJ app to do it.

2. Is there any limit to the number of conditions in a Boolean search? We're 
working on a new project where the user can choose either, for example, "Ford 
Vehicles", in which case I can simply search for "Ford", but if the user 
chooses specific makes and models, then I have to say something like "Crown Vic 
OR Focus OR Taurus OR F-150", etc., where they could theoretically choose every 
model of Ford ever made except one. This could lead to a *very* large query, 
and I was worried both about whether it is even possible and about the impact
on performance.
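
For reference, the number of clauses allowed in a single BooleanQuery is governed by maxBooleanClauses in solrconfig.xml, which defaults to 1024; exceeding it throws a TooManyClauses exception. It can be raised (the value below is only an example), though very large queries do cost memory and CPU:

<!-- solrconfig.xml, inside the <query> section -->
<maxBooleanClauses>4096</maxBooleanClauses>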


Thanks, and I apologize if this really should be two separate messages.

Ron

DISCLAIMER: This electronic message, including any attachments, files or 
documents, is intended only for the addressee and may contain CONFIDENTIAL, 
PROPRIETARY or LEGALLY PRIVILEGED information.  If you are not the intended 
recipient, you are hereby notified that any use, disclosure, copying or 
distribution of this message or any of the information included in or with it 
is  unauthorized and strictly prohibited.  If you have received this message in 
error, please notify the sender immediately by reply e-mail and permanently 
delete and destroy this message and its attachments, along with any copies 
thereof. This message does not create any contractual obligation on behalf of 
the sender or Law Bulletin Publishing Company.
Thank you.


Different Solr versions between Master and Slave(s)

2011-09-19 Thread Tommaso Teofili
Hi all,
while thinking about a migration plan of a Solr 1.4.1 master / slave
architecture (1 master with N slaves already in production) to Solr 3.x I
imagined to go for a graceful migration, starting with migrating only
one/two slaves, making the needed tests on those while still offering the
indexing and searching capabilities on top of the 1.4.1 instances.
I did a small test of this migration plan but I see that the 'javabin'
format used by the replication handler has changed (version 1 in 1.4.1,
version 2 in 3.x) so the slaves at 3.x seem not able to replicate from the
master (at 1.4.1).
Is it possible to use the older 'javabin' version in order to enable
replication from the master at 1.4.1 towards the slave at 3.x ?
Or is there a better migration approach that sounds better for the above
scenario?
Thanks in advance for your help.
Cheers,
Tommaso


Re: Different Solr versions between Master and Slave(s)

2011-09-19 Thread Markus Jelsma
The javabin versions are not compatible, and neither is the index format. I don't
think it will even work.

Can you not reindex the master on a 3.x version?

On Monday 19 September 2011 18:17:45 Tommaso Teofili wrote:
> Hi all,
> while thinking about a migration plan of a Solr 1.4.1 master / slave
> architecture (1 master with N slaves already in production) to Solr 3.x I
> imagined to go for a graceful migration, starting with migrating only
> one/two slaves, making the needed tests on those while still offering the
> indexing and searching capabilities on top of the 1.4.1 instances.
> I did a small test of this migration plan but I see that the 'javabin'
> format used by the replication handler has changed (version 1 in 1.4.1,
> version 2 in 3.x) so the slaves at 3.x seem not able to replicate from the
> master (at 1.4.1).
> Is it possible to use the older 'javabin' version in order to enable
> replication from the master at 1.4.1 towards the slave at 3.x ?
> Or is there a better migration approach that sounds better for the above
> scenario?
> Thanks in advance for your help.
> Cheers,
> Tommaso

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


RE: Example setting TieredMergePolicy for Solr 3.3 or 3.4?

2011-09-19 Thread Burton-West, Tom
Thanks Robert,

Removing "set" from " setMaxMergedSegmentMB" and using "maxMergedSegmentMB" 
fixed the problem.
( Sorry about the multiple posts.  Our mail server was being flaky and the 
client lied to me about whether the message had been sent.)

I'm still confused about the mergeFactor=10 setting in the example 
configuration.  Took a quick look at the code, but I'm obviously looking in the 
wrong place. Is mergeFactor=10 interpreted by TieredMergePolicy as
segmentsPerTier=10 and maxMergeAtOnce=10?   If I specify values for these is 
the mergeFactor setting ignored?

Tom



-Original Message-
From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Friday, September 16, 2011 7:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Example setting TieredMergePolicy for Solr 3.3 or 3.4?

On Fri, Sep 16, 2011 at 6:53 PM, Burton-West, Tom  wrote:
> Hello,
>
> The TieredMergePolicy has become the default with Solr 3.3, but the 
> configuration in the example uses the mergeFactor setting which applys to the 
> LogByteSizeMergePolicy.
>
> How is the mergeFactor interpreted by the TieredMergePolicy?
>
> Is there an example somewhere showing how to configure the Solr 
> TieredMergePolicy to set the parameters:
> setMaxMergeAtOnce, setSegmentsPerTier, and setMaxMergedSegmentMB?

an example is here:
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/conf/solrconfig-mergepolicy.xml

>
> I tried setting setMaxMergedSegmentMB in Solr 3.3
> [mergePolicy XML stripped by the archive; surviving values: 20, 40, 2]
>
>
> and got this error message
> "SEVERE: java.lang.RuntimeException: no setter corrresponding to 
> 'setMaxMergedSegmentMB' in org.apache.lucene.index.TieredMergePolicy"

Right, i think it should be:

[Robert's XML example was stripped by the archive; the surviving fragment is "2".]
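
(For anyone reading along, the general shape of that configuration in Solr 3.3/3.4 is below. The 20, 40 and 2 values survived from Tom's original message, but which parameter each belonged to was stripped, so the assignment and the segment size here are guesses.)

<!-- solrconfig.xml, inside <indexDefaults> (or <mainIndex>) -->
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">20</int>
  <int name="segmentsPerTier">40</int>
  <double name="maxMergedSegmentMB">2048.0</double>
</mergePolicy>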


-- 
lucidimagination.com


XPath value passed to SQL query

2011-09-19 Thread ntsrikanth
Hi, 

After a little struggle I figured out a way of joining XML files with the
database, but for some reason it is not working. After the import, only the
content from the XML is present in my index; the SQL contents are missing.

To debug, I replaced the parametrized query with a simple select statement
and it worked well. As a next step, I purposefully created a syntax error in
the sql and tried again. This time the import failed as expected printing
the values in the log file. 

What I found interesting is that all the values (e.g. brochure_id) are substituted
into the query with enclosing square brackets, for example:

SELECT * FROM accommodation_attribute_content where accommodation_code =
'[7850]' and brochure_year = [12] and brochure_id = '[55]'


I have the following in the schema.xml 

[schema.xml field definitions stripped by the archive; the only surviving
fragments are the values 613 and 614]

And my data configuration:

[dataconfig.xml stripped by the archive]
Any idea why I am getting this weird substitution ? 

Thanks, 
Srikanth 
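
(For readers following along: the configuration above was stripped by the archive, but a DIH join between an XML file and a SQL table generally has the shape below; the entity, column and connection details are illustrative, not the poster's actual values. Note that XPathEntityProcessor fields are multi-valued lists by default, which would explain the bracketed values such as '[7850]' showing up in the generated SQL.)

<dataConfig>
  <dataSource name="xml" type="FileDataSource" encoding="UTF-8"/>
  <dataSource name="db"  type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/brochures" user="..." password="..."/>
  <document>
    <entity name="accommodation" dataSource="xml" processor="XPathEntityProcessor"
            url="/data/accommodation.xml" forEach="/accommodations/accommodation">
      <field column="accommodation_code" xpath="/accommodations/accommodation/code"/>
      <field column="brochure_year"      xpath="/accommodations/accommodation/year"/>
      <field column="brochure_id"        xpath="/accommodations/accommodation/brochure"/>
      <!-- inner SQL entity joins on the values extracted above -->
      <entity name="attributes" dataSource="db" processor="SqlEntityProcessor"
              query="SELECT * FROM accommodation_attribute_content
                     WHERE accommodation_code = '${accommodation.accommodation_code}'
                       AND brochure_year = ${accommodation.brochure_year}
                       AND brochure_id = '${accommodation.brochure_id}'"/>
    </entity>
  </document>
</dataConfig>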

--
View this message in context: 
http://lucene.472066.n3.nabble.com/XPath-value-passed-to-SQL-query-tp3348658p3348658.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Lucene->SOLR transition

2011-09-19 Thread Scott Smith
OK.  Thanks for all of the suggestions.

Cheers

Scott

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com] 
Sent: Monday, September 19, 2011 3:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Lucene->SOLR transition


On Sep 18, 2011, at 19:43 , Michael Sokolov wrote:

> On 9/15/2011 8:30 PM, Scott Smith wrote:
>> 
>> 2.   Assuming that the answer to 1 is "correct", then is there an easy 
>> way to take a lucene query (with nested Boolean queries, filter queries, 
>> etc.) and generate a SOLR query string with q and fq components?
>> 
>> 
> I believe that Query.toString() will probably get you back something that can 
> be parsed in turn by the traditional lucene QueryParser, thus completing the 
> circle and returning your original Query.  But why would you want to do that?

No, you can't rely on Query.toString() roundtripping (think stemming, for 
example - but many other examples that won't work that way too).

What you can do, since you know Lucene's API well, is write a QParser(Plugin) 
that takes request parameters as strings and generates the Query from that like 
you are now with your Lucene app.

Erik



JSON indexing failing...

2011-09-19 Thread Pulkit Singhal
Hello,

I am running a simple test after reading:
http://wiki.apache.org/solr/UpdateJSON

I am only using one object from a large json file to test and see if
the indexing works:
curl 'http://localhost:8983/solr/update/json?commit=true'
--data-binary @productSample.json -H 'Content-type:application/json'

The data is from bbyopen.com, I've attached the one single object that
I'm testing with.

The indexing process fails with:
Sep 19, 2011 2:37:54 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: invalid key: url [1701]
at org.apache.solr.handler.JsonLoader.parseDoc(JsonLoader.java:355)

I thought that any json attributes that did not have a mapping in the
schema.xml file would simply not get indexed.
(a) Is this not true?

But this error made me retry after adding url to schema.xml file:

I retried after a restart but I still keep getting the same error!
(b) Can someone wise perhaps point me in the right direction for
troubleshooting this issue?

Thank You!
- Pulkit


productSample.json
Description: application/json


How does Solr deal with JSON data?

2011-09-19 Thread Pulkit Singhal
Hello Everyone,

I'm quite curious about how does the following data get understood and
indexed by Solr?
[{
"id":"Fubar",
"url": null,
"regularPrice": 3.99,
 "offers": [
{
  "url": "",
  "text": "On Sale",
  "id": "OS"
}
 ]
}]

1) The field "id" is present as part of the main object and as part of
a nested offers object, so how does Solr make sense of it?
2) Is the data under offers expected to be stored as multi-value in
Solr? Or am I supposed to create offerURL, offerText and offerId
fields in schema.xml? Even if I do that how do I tell Solr what data
to match up where?

Please be kind, I know I'm not thinking about this in the right
manner, just gently set me straight about all this :)
- Pulkit


Re: JSON indexing failing...

2011-09-19 Thread Pulkit Singhal
Ok a little bit of deleting lines from the json file led me to realize
that Solr isn't happy with the following:
  "offers": [
{
  "url": "",
  "text": "On Sale",
  "id": "OS"
}
  ],
But as to why? Or what to do to remedy this ... I have no clue :(

- Pulkit

On Mon, Sep 19, 2011 at 2:45 PM, Pulkit Singhal  wrote:
> Hello,
>
> I am running a simple test after reading:
> http://wiki.apache.org/solr/UpdateJSON
>
> I am only using one object from a large json file to test and see if
> the indexing works:
> curl 'http://localhost:8983/solr/update/json?commit=true'
> --data-binary @productSample.json -H 'Content-type:application/json'
>
> The data is from bbyopen.com, I've attached the one single object that
> I'm testing with.
>
> The indexing process fails with:
> Sep 19, 2011 2:37:54 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: invalid key: url [1701]
>        at org.apache.solr.handler.JsonLoader.parseDoc(JsonLoader.java:355)
>
> I thought that any json attributes that did not have a mapping in the
> schema.xml file would simply not get indexed.
> (a) Is this not true?
>
> But this error made me retry after adding url to schema.xml file:
> 
> I retried after a restart but I still keep getting the same error!
> (b) Can someone wise perhaps point me in the right direction for
> troubleshooting this issue?
>
> Thank You!
> - Pulkit
>


Re: JSON indexing failing...

2011-09-19 Thread Jonathan Rochkind
So I'm not an expert in the Solr JSON update message, never used it 
before myself. It's documented here:


http://wiki.apache.org/solr/UpdateJSON

But Solr is not a structured data store like mongodb or something; you 
can send it an update command in JSON as a convenience, but don't let 
that make you think it can store arbitrarily nested structured data like 
mongodb or couchdb or something.


Solr has a single flat list of indexes, as well as stored fields which 
are also a single flat list per-document. You can format your update 
message as JSON in Solr 3.x, but you still can't tell it to do something 
it's incapable of. If a field is multi-valued, according to the 
documentation, the json value can be an array of values. But if the JSON 
value is a hash... there's nothing Solr can do with this, it's not how 
solr works.



It looks from the documentation that the value can sometimes be a hash 
when you're communicating other meta-data to Solr, like field boosts:


"my_boosted_field": {/* use a map with boost/value for a 
boosted field */

  "boost": 2.3,
  "value": "test"
},

But you can't just give it arbitrary JSON, you have to give it JSON of 
the sort it expects. Which does not include arbitrarily nested data hashes.
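
A minimal sketch of what the earlier product object would have to look like once flattened, assuming offer_text and offer_id are declared as multiValued fields in schema.xml (field names are made up):

[
  {
    "id": "Fubar",
    "regularPrice": 3.99,
    "offer_text": ["On Sale"],
    "offer_id": ["OS"]
  }
]

posted the same way as before:
curl 'http://localhost:8983/solr/update/json?commit=true' --data-binary @flattened.json -H 'Content-type:application/json'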


Jonathan



How To perform SQL Like Join

2011-09-19 Thread Ahson Iqbal
Hi

As we join two or more tables in SQL, can we join two or more indexes in Solr
as well? If yes, then in which version?

Regards
Ahsan