Just wanted to know if anyone has used LucidWorks Solr.
- How do you compare it to the standard Apache Solr?
- The non-blocking IO of LucidWorks Solr -- is that for networking IO or disk
IO? What are its effects?
- The LucidWorks website also talked about "significantly improved faceting
performance".
Thanks for asking; I am interested in reading the responses to your
questions as well.
Paolo
Andy wrote:
> Just wanted to know if anyone has used LucidWorks Solr.
> - How do you compare it to the standard Apache Solr?
> - the non-blocking IO of LucidWorks Solr -- is that for networking IO or disk
Hi,
I need to submit thousands of online PDF/HTML files to Solr. I can submit
one file using SolrJ (StreamingUpdateSolrServer and
org.apache.solr.common.util.ContentStreamBase.URLStream), setting the
literal.id parameter to the URL. I can't do the same with a batch of
multiple files, as their 'id' should be unique.
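
For anyone searching the archives later, here is a minimal sketch of the
single-file case described above. The Solr URL, the /update/extract handler
path, and the sample document URL are assumptions; adjust them for your setup.

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.ContentStreamBase;

import java.net.URL;

public class SubmitOnePdf {
  public static void main(String[] args) throws Exception {
    // Assumed Solr location and handler path -- adjust for your setup.
    SolrServer server = new StreamingUpdateSolrServer("http://localhost:8983/solr", 20, 4);

    String url = "http://example.com/some.pdf";  // hypothetical document URL
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
    req.addContentStream(new ContentStreamBase.URLStream(new URL(url)));
    req.setParam("literal.id", url);             // the URL doubles as the unique id
    req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    server.request(req);
  }
}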
I am using text for type, which is static. For example: type is a field and
I am using type for categorization. For news I am using "news" and for
blog I am using "blog". type is a text field.
On Apr 17, 2010 8:38 PM, "Ahmet Arslan" wrote:
> I am facing problem to get facet result count. I must be
Hi,
while posting a sample PDF (one that comes with the Solr distribution) to Solr, I'm
getting a TikaException.
I'm using Solr 1.4 and SolrJ (StreamingUpdateSolrServer) for posting PDFs to Solr.
Other sample PDFs can be parsed and indexed successfully. I'm getting the same
error with some other PDFs as well (but Adobe Reader
Thanks everyone, it works! I have successfully indexed them. Thanks again!
I have a couple more questions regarding Solr, if you don't mind.
1) As I said before, the text files are quite large, between
100 KB and 10 MB, but I need to store them as well for highlighting,
including their titles
Can you extract content from this using Tika's standalone command line tool?
PDFs are notorious for extraction problems. To me, it looks like a bug in
PDFBox. I would try to isolate it down to that level and then, if possible, send
the sample document to the PDFBox project and see if they can come up w/ a
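
(If it helps, the standalone check is just running the tika-app jar against the
file directly; substitute the jar/version you actually have on hand:

  java -jar tika-app-<version>.jar --text problem.pdf
  java -jar tika-app-<version>.jar --metadata problem.pdf

If those fail with the same stack trace, the problem is in Tika/PDFBox
rather than in Solr.)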
On Apr 18, 2010, at 3:53 AM, Andy wrote:
> Just wanted to know if anyone has used LucidWorks Solr.
>
> - How do you compare it to the standard Apache Solr?
We take a release of Solr. We wrap it w/ an installer, tomcat/jetty, our
reference guide, Luke, etc. We also add in an optimized version of KStem
AFAIK, there are no columns per se. But in the past I've just used UTM
values for each lat/lon and just used
basic numeric operators (>, <) to search within a bounding geographic
region. Add them as numeric fields, though. Easy.
There is new support for spatial searching, however I'm not sure how it
compares
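
In query terms that bounding-box approach is just two range filters over the
numeric fields. The field names below are made up for illustration, and the
fields should be a sortable/trie numeric type so the ranges compare numerically:

  q=*:*&fq=utm_easting:[500000 TO 510000]&fq=utm_northing:[4640000 TO 4650000]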
The DataImportHandler has a tool for doing PDF extraction. This allows
you to create new fields, handle multiple files, and supply lists of
sources from which to fetch the multiple files.
http://wiki.apache.org/solr/TikaEntityProcessor
On Sun, Apr 18, 2010 at 9:52 AM, pk wrote:
>
> Hi,
> I need to submit thousand
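
A rough data-config.xml sketch along the lines of that wiki page (the data
source type, file path, and field names are assumptions, and TikaEntityProcessor
may require a newer build than plain 1.4 -- check the wiki page above for your
version):

<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <entity name="tika" processor="TikaEntityProcessor"
            url="/path/to/some.pdf" format="text">
      <!-- metadata fields come from Tika with meta="true" -->
      <field column="title" name="title" meta="true"/>
      <!-- the extracted body text -->
      <field column="text" name="text"/>
    </entity>
  </document>
</dataConfig>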
Highlighting is a complex topic. A field has to be stored to be
highlighted. It does not have to be indexed; but if it is not,
highlighting analyzes the stored value on the fly, just as if it were being
indexed, in order to highlight it.
http://www.lucidimagination.com/search/document/CDRG_ch07_7.9?q=highlighting
http://www.luci
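
Concretely, once the field is stored, the request side is just the hl
parameters, e.g. (field name assumed):

  q=your+terms&hl=true&hl.fl=content&hl.snippets=3&hl.fragsize=100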
Can we see the actual field definitions from your schema file?
Ahmet's question is vital and is best answered if you'll
copy/paste the relevant configuration entries. But based
on what you *have* posted, I'd guess you're trying to
facet on tokenized fields, which is not recommended.
You might try
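
For example, a common pattern is to facet on an untokenized copy of the field
rather than on the analyzed text field itself (field names here are assumed):

  <field name="type" type="text" indexed="true" stored="true"/>
  <field name="type_exact" type="string" indexed="true" stored="false"/>
  <copyField source="type" dest="type_exact"/>

and then facet with facet.field=type_exact.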
Because there is a lot of data, and for scalability reasons we want all
non-write operations to happen from a slave - we don't want to be using
the master unless necessary.
On 17/04/10 08:28, Otis Gospodnetic wrote:
> Hm, why not just go to the MySQL master then?
> Otis
> Sematext :: http://se
I don't really understand how this will help. Can you elaborate?
Do you mean that the last_index_time can be imported from somewhere
outside Solr? But I need to be able to *set* what last_index_time is
stored in dataimport.properties, not get properties from somewhere else.
On 18/04/10 10:
Hi Erick,
My schema configuration is as follows.
On Mon, Apr 19, 20
--- On Sun, 4/18/10, Grant Ingersoll wrote:
>
> Sure, but I'm biased. ;-) Hopefully, you will find it
> useful, but choose the one that best fits your needs (and
> let me know if you need help assessing that.)
>
Thanks for the explanation, Grant.
What is the advantage of KStem over the standard
Lance,
I can submit and extract PDF contents using Solr and SolrJ, as I indicated
earlier.
I've made 'id' a mandatory field and I had to submit its value while
submitting (request.addParams("literal.id", url)).
If I put multiple files/streams in the request, then I can't set 'id' this
way, as the
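
Since literal.id applies to the whole request rather than to the individual
streams, one workaround is to send one extract request per URL, each carrying
its own literal.id, and commit once at the end. A sketch under those
assumptions (handler path and field name as before):

import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.ContentStreamBase;

import java.net.URL;
import java.util.List;

public class SubmitManyPdfs {
  public static void submitAll(StreamingUpdateSolrServer server, List<String> urls)
      throws Exception {
    for (String url : urls) {
      // One request per document so each one can carry its own literal.id.
      ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
      req.addContentStream(new ContentStreamBase.URLStream(new URL(url)));
      req.setParam("literal.id", url);  // the URL doubles as the unique id
      server.request(req);
    }
    server.commit();  // one commit at the end instead of per document
  }
}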
Hello,
Is it a problem if I use *copyField* for some fields and not for others? In my
query, I have both kinds of fields: the ones mentioned in copyField and ones that
are not copied to a common destination. Will this cause an anomaly in my search
results? I am seeing some weird behavior.
Thanks,
Sandh
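
For reference, copyField only adds a copy of the source value to the destination
at index time; the source field is still indexed and searchable on its own, so
mixing copied and non-copied fields in a query is legal by itself. A minimal
example (field names assumed):

  <field name="title" type="text" indexed="true" stored="true"/>
  <field name="body" type="text" indexed="true" stored="true"/>
  <field name="all_text" type="text" indexed="true" stored="false" multiValued="true"/>
  <copyField source="title" dest="all_text"/>

Here body is deliberately not copied: queries against body still work, but
all_text will not contain it, which is one common source of surprising results.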