Hello,
I have a file with the input string "91{40}9490949090", and I want this file to be
returned when I search for the query string "+91?40?9*". The problem is that the
input string is being indexed as three terms: "91", "40", "9490949090". Is there a
way to treat "{" and "}" as part of the indexed term?
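Whether the braces survive depends on the field's analyzer: a tokenizer that splits
only on whitespace keeps them inside the term. A minimal sketch, along the lines of
the text_ws type in the Solr example schema (type and field names here are invented):
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="phone_raw" type="text_ws" indexed="true" stored="true"/>
With the value indexed as the single term "91{40}9490949090", the wildcard query
91?40?9* can match it, since "?" matches any single character (including "{" and "}");
the leading "+" only marks the clause as required.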
Hello,
I have the following piece of code:
ContentStreamUpdateRequest contentUpdateRequest =
    new ContentStreamUpdateRequest("/update/extract");
contentUpdateRequest.addFile(new File(contentFileName));
contentUpdateRequest.setParam("extractOnly", "true");
NamedList result = solrServerSession.request(contentUpdateRequest);
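(A sketch of reading that result, assuming the usual extractOnly response layout in
which the extracted XHTML is keyed by the content-stream name and the metadata by
"<name>_metadata" -- worth verifying against your Solr version:)
// Hypothetical follow-up to the code above; "result" is the NamedList returned there.
String name = new File(contentFileName).getName();
String extractedXhtml = (String) result.get(name);
NamedList extractedMetadata = (NamedList) result.get(name + "_metadata");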
Andrea wrote:
> Some problem with extraction (Tika, etc.)? My suggestion is: try to
> extract the document manually... I had a lot of problems with Tika and PDF
> extraction...
>
> Cheers,
> Andrea
>
> On 13/04/2010 13:05, Sandhya Agarwal wrote:
>>
>> Hello,
Hello,
As I understand, we have to use the syntax {* TO value} or [* TO value] for
"less than" or "less than or equal to" queries, etc., where the field being
queried is numeric. There is no direct < or <= syntax supported. Is that correct?
Thanks,
Sandhya
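(For anyone searching the archives, here is how those ranges look in practice, using
a hypothetical numeric field named "price":
price:{* TO 100}   -- price < 100  (exclusive upper bound)
price:[* TO 100]   -- price <= 100 (inclusive upper bound)
price:[10 TO 100]  -- 10 <= price <= 100
So the "<" versus "<=" distinction is expressed by the curly or square bracket on that
end of the range rather than by a dedicated operator.)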
Sent: Wednesday, April 14, 2010 5:09 PM
To: solr-user@lucene.apache.org
Subject: Re: solr numeric range queries
On Apr 14, 2010, at 6:09 AM, Sandhya Agarwal wrote:
> Hello,
>
> As I understand, we have to use the syntax {* TO value} or [* TO value]
> for "less than" or "less than or equal to" queries
>
Hello,
We want to design a solution where we have one polling directory (the data source
directory) containing the XML files for all data that must be indexed. These XML
files contain a reference to the content file, so another data source must be
created for the content files. Could somebody suggest how to set this up?
You need FileListEntityProcessor + BinFileDataSource + TikaEntityProcessor (I think).
FileListEntityProcessor (FLEP) walks the directory and supplies a separate record per file.
BinFileDataSource (BFDS) pulls the file and supplies it to TikaEntityProcessor.
BinFileDataSource is not documented, but you need it for binary data
streams like PDF & Word. For text files, use FileDataSource.
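(A rough data-config.xml sketch of that wiring; the directory path, file-name pattern,
and field names are invented for illustration, so check the attribute names against
the DataImportHandler and TikaEntityProcessor docs for your Solr version:)
<dataConfig>
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/data/content" fileName=".*\.(pdf|doc)" recursive="true"
            rootEntity="false" dataSource="null">
      <entity name="tika" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" dataSource="bin" format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>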
On 4/14/10, Sandh
If you are using Solr 1.4 or later, take a look at Solr's trie range support.
http://www.lucidimagination.com/blog/2009/05/13/exploring-lucene-and-solrs-trierange-capabilities/
Ankit
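(For reference, a trie-based numeric field in schema.xml looks roughly like the sketch
below; the field name is invented, and precisionStep=8 matches the Solr 1.4 example
schema. The extra precisions it indexes are what make range queries on the field fast:)
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>
<field name="price" type="tint" indexed="true" stored="true"/>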
-----Original Message-----
From: Sandhya Agarwal [mailto:sagar...@opentext.com]
Sent: Wednesday, April 14, 2010 7:56 AM
To:
Hello,
Is it a problem if I use *copyField* for some fields and not for others? My query
includes both kinds of fields: the ones mentioned in copyField and the ones that are
not copied to a common destination. Will this cause an anomaly in my search results?
I am seeing some weird behavior.
Thanks,
Sandhya
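(To make the setup concrete, a schema.xml sketch with invented field names:
<field name="field1" type="text" indexed="true" stored="true"/>
<field name="field2" type="text" indexed="true" stored="true"/>
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="field1" dest="text"/>
Here field2 is not copied, so a search against the common "text" field never sees
field2's content; it has to be queried explicitly as field2:value. Mixing copied and
non-copied fields is legal, but the uncopied ones only match when named in the query.)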
Hello,
I am confused about the proper usage of the Boolean operators AND, OR, and NOT.
Could somebody please provide an easy-to-understand explanation?
Thanks,
Sandhya
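(In case concrete examples help later readers, this is roughly how the standard
Lucene query parser treats them, with invented field and term names and a default
operator of AND:
field:solr AND field:lucene   -> both terms required        (+field:solr +field:lucene)
field:solr OR field:lucene    -> either term is sufficient   (field:solr field:lucene)
field:solr NOT field:lucene   -> solr required, lucene prohibited (+field:solr -field:lucene)
A purely negative query such as NOT field:lucene is safest when paired with a positive
clause, e.g. *:* AND NOT field:lucene.)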
Thank you, Mitch.
I have the query mentioned below (my defaultOperator is set to "AND"):
(field1 : This is a good string AND field2 : This is a good string AND field3 :
This is a good string AND (field4 : ASCIIDocument OR field4 : BinaryDocument OR
field4 : HTMLDocument) AND field5 : doc)
This i
Also, one of the fields here, *field3*, is a dynamic field. All the other fields
except this one are copied into "text" with copyField.
Thanks,
Sandhya
-----Original Message-----
From: Sandhya Agarwal [mailto:sagar...@opentext.com]
Sent: Monday, April 19, 2010 2:55 PM
To:
Thanks, Erick. Using parentheses works.
With parentheses, the query q=field1:(this is a good string) is parsed as follows:
+field1:this +field1:good +field1:string
Is that OK to do?
Thanks,
Sandhya
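(For anyone else hitting this: the reason the parentheses matter is that a field
prefix only applies to the clause immediately after it, so
q=field1:this is a good string    -> field1:this, plus the remaining words against the default field
q=field1:(this is a good string)  -> every word scoped to field1; stop words such as "is"
and "a" are then removed by the analyzer, which is why the parsed query above shows
only +field1:this +field1:good +field1:string.)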
-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tues
Hello,
What are the advantages of using the "dismax" query handler vs. the "standard" query
handler? As I understand it, "dismax" queries are parsed differently and provide more
flexibility w.r.t. score boosting, etc. Are there any more reasons?
Thanks,
Sandhya
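(A concrete illustration, with invented field names and boosts:
q=good+string&defType=dismax&qf=field1^2.0 field2 text&pf=field1^5.0&mm=2
Here qf spreads the user's raw words across several fields with per-field boosts, pf
boosts documents where the words appear as a phrase, and mm controls how many of the
words must match. With the standard parser, the user or the application has to spell
out the field names and Boolean operators in the query string itself, and that raw
syntax is less forgiving of arbitrary user input.)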
Hello,
I see that Solr 1.4 is bundled with Tika 0.4, which does not do proper content
extraction of zip files. So, I replaced the Tika jars with the latest Tika 0.7 jars.
I still see an issue: the individual files in the zip file are not being indexed. Is
there any configuration I must do to get this working?
Hello,
I am using ContentStreamUpdateRequest to index binary documents. At the time of
indexing the content, I want to be able to index some additional metadata as well. I
believe this metadata must be provided prefixed with *literal*. For instance, I have
a field named "field1", defined in my schema.
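(A sketch of that usage; the file name and field values are invented, and it assumes
"id" is the schema's uniqueKey. The literal.<field> prefix is how /update/extract
accepts caller-supplied field values:)
ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
req.addFile(new File("report.pdf"));
req.setParam("literal.id", "doc-42");                  // uniqueKey for the document
req.setParam("literal.field1", "some metadata value"); // extra metadata field
req.setParam("commit", "true");
solrServer.request(req);                               // an existing SolrServer instance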
I observed the same issue with the Tika 0.7 jars. It now fails to extract content
from documents of any type. It works with Tika 0.5, though.
Thanks,
Sandhya
-----Original Message-----
From: pk [mailto:pkal...@gmail.com]
Sent: Friday, April 30, 2010 3:17 PM
To: solr-user@lucene.apache.org
Subject:
To: solr-user@lucene.apache.org
Subject: Re: Indexing metadata in solr using ContentStreamUpdateRequest
What does your schema look like?
On Apr 30, 2010, at 3:47 AM, Sandhya Agarwal wrote:
> Hello,
>
> I am using ContentStreamUpdateRequest, to index binary documents. At the time
> of indexing the content,
Hello,
Please let me know if anybody has figured out a way around this issue.
Thanks,
Sandhya
-----Original Message-----
From: Praveen Agrawal [mailto:pkal...@gmail.com]
Sent: Friday, April 30, 2010 11:14 PM
To: solr-user@lucene.apache.org
Subject: Re: Problem with pdf, upgrading Cell
Grant,
You
Hello,
But I see that the libraries are being loaded:
INFO: Adding specified lib dirs to ClassLoader
May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader
INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/asm-3.1.jar' to
classloader
May 4, 2010
Yes, Grant, you are right. Copying the Tika libraries into the Solr webapp solved the
issue, and content extraction works fine now.
Thanks,
Sandhya
-----Original Message-----
From: Sandhya Agarwal [mailto:sagar...@opentext.com]
Sent: Tuesday, May 04, 2010 12:58 PM
To: solr-user@lucene.apache.org
-----Original Message-----
From: Sandhya Agarwal [mailto:sagar...@opentext.com]
Sent: Tuesday, May 04, 2010 1:10 PM
To: solr-user@lucene.apache.org
Subject: RE: Problem with pdf, upgrading Cell
Yes, Grant, you are right. Copying the Tika libraries into the Solr webapp solved the
issue and the content
On Behalf Of Grant Ingersoll
Sent: Tuesday, May 04, 2010 4:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Problem with pdf, upgrading Cell
Yes, it is loading the libraries, but they end up in a different classloader, one
that the new way Tika loads things apparently doesn't have access to.
-Gra
Praveen,
Along with the Tika core and parser jars, did you run "mvn
dependency:copy-dependencies" to pull in all of the dependencies too?
Thanks,
Sandhya
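(For reference, running that from the tika-parsers module collects every transitive
jar into one directory; target/dependency is, as far as I recall, the plugin's
default output location:
cd tika-parsers
mvn dependency:copy-dependencies
# the jars are copied under target/dependency/
Those are the jars that then need to be visible to Solr Cell alongside tika-core and
tika-parsers.)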
-----Original Message-----
From: Praveen Agrawal [mailto:pkal...@gmail.com]
Sent: Tuesday, May 04, 2010 4:52 PM
To: solr-user@lucene.apache.o
To: solr-user@lucene.apache.org
Subject: Re: Problem with pdf, upgrading Cell
Yes, Sandhya,
I copied the new poi/jempbox/pdfbox/fontbox etc. jars too. I believe this is what you
were asking.
Thanks.
Thanks.
On Tue, May 4, 2010 at 5:01 PM, Sandhya Agarwal wrote:
> Praveen,
>
> Along with the tika core and parser
Thanks,
Praveen
On Tue, May 4, 2010 at 5:28 PM, Sandhya Agarwal wrote:
> Both the files work for me, Praveen.
>
> Thanks,
> Sandhya
>
> From: Praveen Agrawal [mailto:pkal...@gmail.com]
> Sent: Tuesday, May 04, 2010 5:22 PM
> To: solr-user@lucene.apache.org
.jar
metadata-extractor-2.4.0-beta-1.jar
pdfbox-1.1.0.jar
poi-3.6.jar
poi-ooxml-3.6.jar
poi-ooxml-schemas-3.6.jar
poi-scratchpad-3.6.jar
tagsoup-1.2.jar
tika-core-0.7.jar
tika-parsers-0.7.jar
xml-apis-1.0.b2.jar
xmlbeans-2.3.0.jar
Thanks,
Sandhya
-----Original Message-----
From: Sandhya Agarwal
On May 4, 2010, at 3:28 AM, Sandhya Agarwal wrote:
>
> > Hello,
> >
> > But I see that the libraries are being loaded:
> >
> > INFO: Adding specified lib dirs to ClassLoader
> >
> > May 4, 2010 12:49:59 PM org.apac
Hello,
Can somebody please point me to an example of how we can leverage *stream.file* for
streaming documents using the UpdateRequest API (SolrJ)?
Thanks,
Sandhya
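(The closest I can sketch, untested: since stream.file is just a request parameter,
send an update request that carries only parameters and let Solr read the file from
its own filesystem. This assumes remote streaming is enabled via
enableRemoteStreaming="true" on <requestParsers> in solrconfig.xml, that "id" is the
uniqueKey, and that the path is visible to the Solr server:
UpdateRequest req = new UpdateRequest("/update/extract");
req.setParam("stream.file", "/data/docs/sample.pdf"); // read by the Solr server itself
req.setParam("literal.id", "doc1");
req.setParam("commit", "true");
solrServer.request(req);                              // an existing SolrServer instance
No content stream is attached on the client side, so the file never travels over the
wire.)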
Sorry, that is what I meant, but I put it wrongly. I have not been able to find
examples of using SolrJ for this.
- Sent from iPhone
On 07-May-2010, at 1:23 AM, "Chris Hostetter"
wrote:
>
> : Subject: Example of using "stream.file" to post a binary file to
> solr
>...
> : Can somebo
Yes, I did, but I don't find a SolrJ example there. The example in the doc uses curl.
- Sent from iPhone
On 07-May-2010, at 8:12 PM, "Chris Hostetter"
wrote:
> : Sorry. That is what I meant. But, I put it wrongly. I have not been
> : able to find examples of using solrj, for this.
>
> did
Hello,
We have observed that Tika does not extract "Content-Language" for documents encoded
in UTF-8. For natively encoded documents, it works fine. Any idea how we can resolve
this?
Thanks,
Sandhya
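(One possible workaround, sketched from memory and not a fix for the metadata itself:
detect the language from the extracted text with Tika's statistical LanguageIdentifier
and pass it along as a literal field. The field name "language" and the surrounding
request are assumptions:
import org.apache.tika.language.LanguageIdentifier;

// Returns a detected ISO 639 language code, or null if Tika is not reasonably certain.
static String detectLanguage(String extractedText) {
    LanguageIdentifier identifier = new LanguageIdentifier(extractedText);
    return identifier.isReasonablyCertain() ? identifier.getLanguage() : null;
}
// ...then, if non-null, send it with the document, e.g.:
// contentUpdateRequest.setParam("literal.language", detectedLanguage);
This sidesteps the missing Content-Language header rather than explaining it.)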