Null pointer exception in spell checker at addchecker method

2013-12-06 Thread sweety
I'm trying to use the spell check component.
My *schema* is (I have included only the fields necessary for spell check, not the entire schema):

[schema fields stripped by the mailing-list archive]
My *solrconfig* is:


<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text</str>
  <lst name="spellchecker">
    <str name="name">direct</str>
    <str name="field">contents</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.8</float>
    <int name="maxEdits">1</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">3</int>
    <float name="maxQueryFrequency">0.01</float>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">contents</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">direct</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.onlyMorePopular">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>



I get this *error*:
java.lang.NullPointerException at
org.apache.solr.spelling.*ConjunctionSolrSpellChecker.addChecker*(ConjunctionSolrSpellChecker.java:58)
at
org.apache.solr.handler.component.SpellCheckComponent.getSpellChecker(SpellCheckComponent.java:475)
at
org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:106)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at 

I know that the error is probably in the addChecker method. I read the method, and it is written so that defaults are substituted for all null values, e.g.:

if (queryAnalyzer == null)
    queryAnalyzer = checker.getQueryAnalyzer();

So I suspect that a null checker value is being passed in when checkers.add(checker); is executed.
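
For what it's worth, here is a minimal, self-contained sketch (hypothetical names, not Solr's actual code) of that failure mode: if one of the dictionary names requested in the handler, e.g. "default", has no matching spellchecker definition, the lookup returns null and that null reaches addChecker:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the failure mode described above.
public class NullCheckerSketch {

    private final List<Object> checkers = new ArrayList<Object>();

    // Stands in for ConjunctionSolrSpellChecker.addChecker().
    void addChecker(Object checker) {
        checkers.add(checker);
        checker.getClass(); // throws NullPointerException when checker is null
    }

    public static void main(String[] args) {
        Map<String, Object> spellCheckers = new HashMap<String, Object>();
        spellCheckers.put("direct", new Object());
        spellCheckers.put("wordbreak", new Object());
        // "default" is requested via spellcheck.dictionary but never defined.
        NullCheckerSketch conjunction = new NullCheckerSketch();
        for (String name : new String[] {"direct", "default", "wordbreak"}) {
            conjunction.addChecker(spellCheckers.get(name)); // null for "default"
        }
    }
}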

If I am right, please tell me how to resolve this; otherwise, what has gone wrong?
Thanks in advance.





Re: Null pointer exception in spell checker at addchecker method

2013-12-09 Thread sweety
Yes, it worked, and I found the reason for the error.
Thanks a lot.





Java heap space:out of memory

2013-12-10 Thread sweety
I just indexed 10 docs, 15 MB in total. For some queries it works fine, but
for some queries I get this error:


java.lang.OutOfMemoryError: Java heap space

java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at
java.lang.Thread.run(Unknown Source) Caused by: java.lang.OutOfMemoryError:
Java heap space

error code: 500



I have indexed them directly into Solr.
My schema.xml is:

[schema.xml snippet stripped by the mailing-list archive]

I don't understand why I get this error for such a small number of docs.
I haven't studied much about Solr performance details.
How do I increase the heap size? I still need to index a lot more data.
Thanks in advance.






Re: Java heap space:out of memory

2013-12-10 Thread sweety
4 GB RAM.
I'm running on Windows 7, with Tomcat as the web server.





Re: Java heap space:out of memory

2013-12-10 Thread sweety
Sorry, but I don't know how to check that.





Re: Java heap space:out of memory

2013-12-10 Thread sweety
Okay, thanks. Here it is:
max heap size: 63.56 MB (it is showing 37.2% usage, though)
How do I increase that size?






Re: Java heap space:out of memory

2013-12-10 Thread sweety
I have set JAVA_OPTS to the value: -Xms1024M-Xmx1024M
But the dashboard still shows 64M, and the usage is now only 18%.
How could that be? Yesterday it was 87%.





Re: Java heap space:out of memory

2013-12-10 Thread sweety
Yes, I did put the space, as in the image.






Re: Java heap space:out of memory

2013-12-10 Thread sweety
You were right: the changes made in JAVA_OPTS didn't increase the heap size, so I made the changes in the Tomcat UI instead:
Initial memory pool: 512 MB
Maximum memory pool: 1024 MB

Now the heap size has increased.
Thank you all for your suggestions; it really saved my time.
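
For reference, a minimal sketch of the command-line equivalent of that Tomcat setting (values are examples):

rem setenv.bat (or catalina.bat), picked up when Tomcat is started from the
rem command line. A Tomcat installed as a Windows service ignores the
rem JAVA_OPTS environment variable, which matches what this thread observed;
rem for a service, the flags go into the service configuration UI instead.
set JAVA_OPTS=%JAVA_OPTS% -Xms512m -Xmx1024m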





indexing .docx using solrj

2013-12-21 Thread sweety
i am trying to index .docx file using solrj, i referred this link:
http://wiki.apache.org/solr/ContentStreamUpdateRequestExample

My code is:
import java.io.File;
import java.io.IOException;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.client.solrj.response.QueryResponse;

public class rich_index {

  public static void main(String[] args) {
    try {
      // Solr Cell can also index MS Office file types (2003 and 2007 versions).
      String fileName = "C:\\solr\\document\\src\\test1\\contract.docx";
      // This will be the unique id used by Solr to index the file contents.
      String solrId = "contract.docx";

      indexFilesSolrCell(fileName, solrId);
    } catch (Exception ex) {
      System.out.println(ex.toString());
    }
  }

  public static void indexFilesSolrCell(String fileName, String solrId)
      throws IOException, SolrServerException {

    String urlString = "http://localhost:8080/solr/document";
    SolrServer solr = new HttpSolrServer(urlString);

    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    up.addFile(new File(fileName), "text");

    up.setParam("literal.id", solrId);
    up.setParam("uprefix", "ignored_");
    up.setParam("fmap.content", "contents");

    up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    solr.request(up);

    QueryResponse rsp = solr.query(new SolrQuery("*:*"));
    System.out.println(rsp);
  }
}



These are my logs:
Dec 22, 2013 12:27:58 AM org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: [document] webapp=/solr path=/update/extract
params={fmap.content=contents&waitSearcher=true&commit=true&uprefix=ignored_&literal.id=contract.docx&wt=javabin&version=2&softCommit=false}
{} 0 0
Dec 22, 2013 12:27:58 AM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError:
*org/apache/xml/serialize/BaseMarkupSerializer*
at
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

To resolve this I added xerces.jar, which contains the
org/apache/xml/serialize/BaseMarkupSerializer class, to the build path, but
the error is not resolved.
What is the problem?
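
Note that this NoClassDefFoundError is thrown inside the Solr webapp, so the class has to be visible to Solr's classloader, not to the client project's build path. A minimal sketch of one way to do that (the path is an example); alternatively the jar can go into tomcat/webapps/solr/WEB-INF/lib:

<!-- In solrconfig.xml; dir is resolved relative to the core's instanceDir
     (example path). -->
<lib dir="../lib" regex=".*\.jar" />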


*Solrconfig:*


<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.Last-Modified">last_modified</str>
    <str name="fmap.content">contents</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
  </lst>
</requestHandler>




*schema:*

[schema fields stripped by the mailing-list archive]



Re: indexing .docx using solrj

2013-12-21 Thread sweety
I have added that jar to the build path, but I get the same error.
Why is Eclipse not recognising that jar?

The logs also show this:
Caused by: java.lang.NoClassDefFoundError:
org/apache/xml/serialize/BaseMarkupSerializer
at 
org.apache.solr.handler.extraction.ExtractingRequestHandler.newLoader(ExtractingRequestHandler.java:117)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:63)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
... 16 more
Caused by: java.lang.ClassNotFoundException:
org.apache.xml.serialize.BaseMarkupSerializer
at
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1688)
at
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1533)
... 22 more







Re: indexing .docx using solrj

2013-12-21 Thread sweety
The jar is already there in the lib folder of the Solr home.





Re: program termination in solrj

2013-12-21 Thread sweety
Before and after running the client, the stats remain the same:
class:org.apache.solr.update.DirectUpdateHandler2
version:1.0
description:Update handler that efficiently directly updates the on-disk
main lucene index
src: $URL: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java $

stats:
commits:0
autocommits:0
soft autocommits:0
optimizes:0
rollbacks:0
expungeDeletes:0
docsPending:0
adds:0
deletesById:0
deletesByQuery:0
errors:0
cumulative_adds:0
cumulative_deletesById:0
cumulative_deletesByQuery:0
cumulative_errors:0





Re: indexing .docx using solrj

2013-12-21 Thread sweety
solr: 4.2
tomcat: 7.0
jdk: 1.7.0_45

I have created the Solr home in C:\solr, set in the Java options:
-Dsolr.solr.home=C:\solr

C:\solr\lib contains:
the Tika jars; actually, I pasted all the jars from the Solr 4.2 dist and contrib folders into C:\solr\lib.

tomcat/lib contains:
all the jars from the installation.






Re: program termination in solrj

2013-12-21 Thread sweety
Also, my default search handler has no dismax:

<requestHandler name="/select" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">20</int>
    <str name="fl">*</str>
    <str name="df">contents</str>
    <str name="version">2.1</str>
  </lst>
</requestHandler>





Re: program termination in solrj

2013-12-21 Thread sweety
Okay, I made a mistake: I did not refresh the stats. The stats after running
the Java program:

commits:1
autocommits:0
soft autocommits:0
optimizes:0
rollbacks:0
expungeDeletes:0
docsPending:0
adds:0
deletesById:0
deletesByQuery:0
errors:0
cumulative_adds:1
cumulative_deletesById:0
cumulative_deletesByQuery:0
cumulative_errors:0






Re: indexing .docx using solrj

2013-12-21 Thread sweety
It is working now; I just restarted the computer.
But I still don't know the reason for the error.
Thank you for your efforts, though.





Re: indexing .docx using solrj

2013-12-21 Thread sweety
Yes, I copied all the jars from contrib/extraction to solr/lib.
Now it is not finding the poi jar; as mentioned in my post above, it shows a
new error.





to index byte array

2014-01-01 Thread sweety
I am converting .doc and .docx files to byte arrays in C#, and now I need to
index these byte arrays.
Is it possible in Solr to index a byte array of a file?
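
A minimal SolrJ sketch of that idea, assuming the byte[] has already arrived from the client (the class name, id, and placeholder byte source are examples); Tika still does the extraction on the Solr side, but no file system path is involved:

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.ContentStreamBase;

public class ByteArrayIndexer {
    public static void main(String[] args) throws Exception {
        // Assumption: the bytes come from elsewhere (e.g. the C# client);
        // an empty placeholder array is used here.
        final byte[] docBytes = new byte[0];

        SolrServer solr = new HttpSolrServer("http://localhost:8080/solr/document");
        ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");

        // Wrap the byte[] in a ContentStream so no file path is needed.
        ContentStreamBase stream = new ContentStreamBase() {
            @Override
            public InputStream getStream() {
                return new ByteArrayInputStream(docBytes);
            }
        };
        stream.setName("doc-from-bytes");
        up.addContentStream(stream);
        up.setParam("literal.id", "doc-from-bytes");
        up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        solr.request(up);
    }
}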





Re: to index byte array

2014-01-01 Thread sweety
Indexing .docx files using Tika requires a file system path, but I don't want
to give a path.

I read in the DIH FAQs that a transformer can convert the output from bytes
to a string.








Re: to index byte array

2014-01-01 Thread sweety
Consider a client-server architecture: the documents are sent to the server
in binary format, and for Solr that binary is the source to index, so I need
to index byte arrays.
Also, if I store the byte array in a DB and then index it in Solr, will the
contents of the document be searchable like normal documents (the contents
are in binary format, so will Solr match the query)?





using extract handler: data not extracted

2014-01-11 Thread sweety
I need to index rich text documents. This is the *solrconfig.xml for the
extract handler*:



<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    <str name="captureAttr">true</str>
  </lst>
</requestHandler>



My *schema.xml* is:

[schema fields stripped by the mailing-list archive]
But after *indexing using this curl*:
curl "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true" -F "myfile=Coding.pdf"
when queried as q=id:12, the *output* is:

<doc>
  <arr name="stream_source_info"><str>myfile</str></arr>
  <arr name="stream_content_type"><str>application/octet-stream</str></arr>
  <arr name="stream_size"><str>3336935</str></arr>
  <arr name="stream_name"><str>Coding.pdf</str></arr>
  <arr name="content_type"><str>application/pdf</str></arr>
  <!-- contents field: empty; *contents not shown* -->
  <long name="_version_">1456831756526157824</long>
  <str name="uuid">8eb229e0-5f25-4d26-bba4-6cb67aab7f81</str>
</doc>


Why is this so?

Also, why does the date_modified field not appear?





Re: using extract handler: data not extracted

2014-01-11 Thread sweety
Sorry that my question was not clear.
Initially, when I indexed pdf files, the data within the pdf showed up in the
contents field, as follows (this is output for the initially indexed documents):

Cloud ctured As tale in size as well as complexity. We need a cloud based
system that will solve this problem.  Provide interfaces to registeP CSS
Client Measurements Benchmarkinse times by varying Number of documents
fromnds to millions Nuervers from 1 to 5 Storage and search options as
discussed abo


But for newly indexed documents the contents field is empty.
Coding.pdf is actually 3 MB in size, but as shown in the output its contents
were not extracted; indexing extracts the metadata but not the contents of
the file, and the contents field stays empty.

What is the reason for this? Is it because some jar is missing?







Re: using extract handler: data not extracted

2014-01-11 Thread sweety
I set the logging level of the extract handler to FINEST; now the logs are:
INFO: [document] webapp=/solr path=/update/extract
params={commit=true&literal.id=12&debug=true} {add=[12
(1456944038966984704)],commit=} 0 2631
Jan 11, 2014 7:51:57 PM org.apache.solr.servlet.SolrDispatchFilter
handleAdminRequest
INFO: [admin] webapp=null path=/admin/cores params={indexInfo=false&wt=json}
status=0 QTime=0 
Jan 11, 2014 7:51:57 PM org.apache.solr.core.SolrCore execute
INFO: [contract] webapp=/solr path=/admin/system params={wt=json} status=0
QTime=1 
Jan 11, 2014 7:51:58 PM org.apache.solr.core.SolrCore execute
INFO: [document] webapp=/solr path=/admin/mbeans params={stats=true&wt=json}
status=0 QTime=3 

This shows no error.
Also, in the curl query I have set debug=true.

What is the reason?






Re: using extract handler: data not extracted

2014-01-11 Thread sweety
How do I set FINEST for the Tika package?





Re: using extract handler: data not extracted

2014-01-11 Thread sweety
The logging screen does not show the Tika package. I also searched the net:
it requires the log4j and slf4j jars, is that true? Do I need extra
configuration for package-level logging?
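
For what it's worth, a minimal sketch, assuming Solr 4.2's bundled slf4j-jdk14 binding (so java.util.logging levels apply): package-level levels can be added to Tomcat's conf/logging.properties:

# Example entries; the package names are the ones from the stack traces
# earlier in this archive.
org.apache.tika.level = FINEST
org.apache.solr.handler.extraction.level = FINEST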






Re: using extract handler: data not extracted

2014-01-11 Thread sweety
This is the output I get when indexing through *SolrJ*; I followed the link
you suggested.
I tried indexing a .doc file.


<response>
  <lst name="responseHeader">
    <int name="status">400</int>
    <int name="QTime">17</int>
  </lst>
  <lst name="error">
    <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse
      'id:C:\solr\document\src\new_index_doc\document_1.doc': Encountered " ":" ":
      "" at line 1, column 4. Was expecting one of: ... "+" ... "-" ... "(" ... "*" ... "^" ... "[" ... "{" ...</str>
    <int name="code">400</int>
  </lst>
</response>



Also, when indexing with *SolrNet*, I get this error:
Caused by: java.lang.LinkageError: loader constraint violation: loader
(instance of org/apache/catalina/loader/WebappClassLoader) previously
initiated loading for a different type with name
"org/apache/xmlbeans/XmlCursor"
Why this linkage error?

Now *curl does not work, and neither do SolrJ and SolrNet.*





Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Yes, all 3 points are right.
Let me solve point 1 first: there is some error at the Tika level of
indexing, and for that I need to debug at the Tika level, right?
But how do I do that? The Solr admin does not show package-wise logging.





Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Through the command line (> java -jar tika-app-1.4.jar -v C:\Cloud.docx),
Apache Tika is able to parse .docx files. So can I use this tika-app-1.4.jar
in Solr, and how?





Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Sorry for the mistake.
I'm using Solr 4.2, which has Tika 1.3.
So now, java -jar tika-app-1.3.jar -v C:\Coding.pdf parses the pdf document
without error or message.
Also, java -jar tika-app-1.4.jar *-t* C:\Cloud.docx shows the entire document.
That means there is no problem in Tika, right?





Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Sorry for the mistake.
I'm using Solr 4.2, which has Tika 1.3.
So now, java -jar tika-app-1.3.jar -v C:\Coding.pdf parses the pdf document
without error or message.
Also, java -jar tika-app-1.3.jar -t C:\Coding.pdf shows the entire document.
That means there is no problem in Tika, right?







Re: using extract handler: data not extracted

2014-01-12 Thread sweety
I am working on Windows 7





Solrcloud: no registered leader found and new searcher error

2014-02-17 Thread sweety
I have configured SolrCloud as follows.
 

Solr.xml:

[solr.xml contents stripped by the mailing-list archive]

I have added all the required config for SolrCloud, referring to:
http://wiki.apache.org/solr/SolrCloud#Required_Config
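
For reference, a minimal legacy-style (Solr 4.x) solr.xml sketch of that required config; the core names come from this thread, everything else is an example value:

<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="document"
         host="${host:}" hostPort="8080" hostContext="solr"
         zkClientTimeout="${zkClientTimeout:15000}">
    <core name="document" instanceDir="document" />
    <core name="contract" instanceDir="contract" />
  </cores>
</solr>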

I am adding data to the core "document".
Now, when I try to index using SolrNet (solr.Add(doc)), I get this error:
SEVERE: org.apache.solr.common.SolrException: *No registered leader was
found, collection:document* slice:shard2
at
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:481)

and also this error:
SEVERE: null:java.lang.RuntimeException: *SolrCoreState already closed*
at
org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:84)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:520)

I guess it is because the leader is from the core "contract" and I am trying
to index into the core "document"?
Is there a way to change the leader, and how?
How can I change the state of the shards from "gone" to "active"?

Also, when I try to query q=*:*, this is shown:
org.apache.solr.common.SolrException: *Error opening new searcher at*
org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1415) at 

I read that this searcher error appears when the number of commits exceeds
the number of warming searchers, but I did not issue a commit command, so how
would the commits exceed it? It also requires some warming settings, so I
added the following to solrconfig.xml, but I still get the same error:


 
  
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">solr</str>
      <str name="start">0</str>
      <str name="rows">10</str>
    </lst>
    <lst>
      <str name="q">rocks</str>
      <str name="start">0</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>

<maxWarmingSearchers>2</maxWarmingSearchers>


I have just started with SolrCloud; please tell me if I am doing anything
wrong in the SolrCloud configuration.
Also, I have not found good material on SolrCloud on Windows 7 with Apache
Tomcat; please suggest some.
Thanks a lot.





Re: Solrcloud: no registered leader found and new searcher error

2014-02-17 Thread sweety
How do I get them running?





to reduce indexing time

2014-03-05 Thread sweety
Before indexing, this was the memory layout:

System Memory : 63.2% ,2.21 gb
JVM Memory : 8.3% , 81.60mb of 981.38mb

I have indexed 700 documents of total size 12 MB.
Following are the results I get:
Qtime: 8122, System time : 00:00:12.7318648
System Memory : 65.4% ,2.29 gb
JVM Memory : 15.3% , 148.32mb of 981.38mb

After indexing 7,000 documents,
Qtime: 51817, System time : 00:01:12.6028320
System Memory : 69.4% 2.43Gb
JVM Memory : *26.5%* , 266.60mb

After indexing 70,000 documents of 1200 MB size, these are the results:
Qtime: 511447, System time : 00:11:14.0398768
System memory : 82.7% , 2.89Gb
JVM memory :* 11.8%* , 118.46mb

Here the JVM usage decreased compared to 7,000 docs; why is that?

This is *solrconfig.xml*:

<updateLog>
  <str name="dir">${solr.document.log.dir:}</str>
</updateLog>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

<autoCommit>
  <maxTime>60</maxTime>
  <openSearcher>true</openSearcher>
</autoCommit>
 
I am indexing through SolrNet, one document at a time: var res = solr.Add(doc); // Doc doc = new Doc()

How do I reduce the indexing time, given that the amount of data is quite
small? Will batch indexing reduce the indexing time, and if so, do I need to
make changes in solrconfig.xml?
Also, I want the documents to be searchable within 1 second of indexing.
Is it true that if soft commit is used, then faceting cannot be done on the
data?
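
For comparison, a minimal SolrJ sketch of the batching idea (SolrNet's AddRange(docs) is the rough equivalent); the URL, field names, and batch size are examples:

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://localhost:8080/solr/document");
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 70000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-" + i);
            batch.add(doc);
            if (batch.size() == 250) {         // one HTTP request per 250 docs
                solr.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) solr.add(batch); // flush the tail
        solr.commit();                         // one commit at the end
    }
}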





Re: to reduce indexing time

2014-03-05 Thread sweety
Now I have batch indexed, with batches of 250 documents. These were the results.
After 7,000 documents,
Qtime: 46894, System time : 00:00:55.9384892
JVM memory : 249.02mb, 24.8%
This shows quite a reduction in timing.

After 70,000 documents,
Qtime: 480435, System time : 00:09:29.5206727 
System memory : 82.8%, 2.90gb
JVM memory : 82% , 818.06mb //Here, the memory usage has increased, though
the timing has reduced.

After disabling softcommit and tlog, for 70,000 contracts.
Qtime: 461331, System time : 00:09:09.7930326
JVM Memory : 62.4% , 623.42mb. //Memory usage is less.

What causes this memory usage to change if the data being indexed is the same?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/to-reduce-indexing-time-tp4121391p4121441.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: to reduce indexing time

2014-03-05 Thread sweety
I will surely read about JVM garbage collection. Thanks a lot, all of you.

But is the time my indexing takes good enough? I don't know what the ideal
timings are; I think my indexing is taking too long.





no such field error:smaller big block size details while indexing doc files

2013-10-07 Thread sweety
I'm trying to index .doc, .docx, and pdf files.
I'm using this URL:
curl "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true" -F "myfile=@complex.doc"

This is the error I get:
Oct 07, 2013 5:02:18 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.NoSuchFieldError:
SMALLER_BIG_BLOCK_SIZE_DETAILS
at
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.NoSuchFieldError: SMALLER_BIG_BLOCK_SIZE_DETAILS
at
org.apache.poi.poifs.filesystem.NPOIFSFileSystem.(NPOIFSFileSystem.java:93)
at
org.apache.poi.poifs.filesystem.NPOIFSFileSystem.(NPOIFSFileSystem.java:190)
at
org.apache.poi.poifs.filesystem.NPOIFSFileSystem.(NPOIFSFileSystem.java:184)
at
org.apache.tika.parser.microsoft.POIFSContainerDetector.getTopLevelNames(POIFSContainerDetector.java:376)
at
org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:165)
at
org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
... 16 more

Also, using the same type of URL, txt, mp3, and pdf files are indexed
successfully:
(curl "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true" -F "myfile=@abc.txt")

Schema.xml is:

[schema fields stripped by the mailing-list archive]

<uniqueKey>id</uniqueKey>


I'm not able to understand what kind of error this is; please help me.








Re: no such field error:smaller big block size details while indexing doc files

2013-10-08 Thread sweety
This is my new schema.xml:

[schema fields stripped by the mailing-list archive]

<uniqueKey>id</uniqueKey>

I still get the same error.


From: Erick Erickson [via Lucene]
To: sweety
Sent: Tuesday, October 8, 2013 7:16 AM
Subject: Re: no such field error:smaller big block size details while indexing doc files
 


Well, one of the attributes parsed out of, probably, the meta-information
associated with one of your structured docs is SMALLER_BIG_BLOCK_SIZE_DETAILS,
and Solr Cell is faithfully sending that to your index. If you want to throw
all these in the bit bucket, try defining a true catch-all field that ignores
things, like this:

<dynamicField name="*" type="ignored" multiValued="true" />

Best,
Erick


Re: no such field error:smaller big block size details while indexing doc files

2013-10-09 Thread sweety
I will try using SolrJ. Thanks.

But when I tried to index a .docx file, I got a somewhat different error:
SEVERE: null:java.lang.RuntimeException: java.lang.VerifyError: (class: 
org/apache/poi/extractor/ExtractorFactory, method: createExtractor signature: 
(Lorg/apache/poi/poifs/filesystem/DirectoryNode;)Lorg/apache/poi/POITextExtractor;)
 Wrong return type in function
at 
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.VerifyError: (class: 
org/apache/poi/extractor/ExtractorFactory, method: createExtractor signature: 
(Lorg/apache/poi/poifs/filesystem/DirectoryNode;)Lorg/apache/poi/POITextExtractor;)
 Wrong return type in function
at 
org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:59)
at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
... 16 more
I read this solution
(http://stackoverflow.com/questions/14696371/how-to-extract-the-text-of-a-ppt-file-with-tika),
which says that removing certain jars solves such errors, but no such jars
are present in my classpath.
Could the jars be causing the issue?

Thank you.




Re: no such field error:smaller big block size details while indexing doc files

2013-10-09 Thread sweety



On Wednesday, October 9, 2013 6:05 AM, Erick Erickson [via Lucene] wrote:
 
Hmmm, that is odd; the glob dynamicField should pick this up.

Not quite sure what's going on. You can parse the file via Tika yourself and
look at what's in there; it's a relatively simple SolrJ program. Here's a
sample:
http://searchhub.org/2012/02/14/indexing-with-solrj/

Best,
Erick
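
A minimal sketch of that kind of standalone check with Tika's AutoDetectParser (the file path comes from the command line); it prints the metadata attribute names that Solr Cell would try to map to fields, plus the extracted body text:

import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class TikaPeek {
    public static void main(String[] args) throws Exception {
        InputStream in = new FileInputStream(args[0]);
        BodyContentHandler handler = new BodyContentHandler(-1); // no write limit
        Metadata metadata = new Metadata();
        new AutoDetectParser().parse(in, handler, metadata);
        for (String name : metadata.names()) {
            System.out.println(name + " = " + metadata.get(name));
        }
        System.out.println(handler.toString()); // the extracted body text
        in.close();
    }
}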
