Hi,
I would like to know the best strategy/standards to follow for indexing
multiple document types through Solr.
In other words, let us say we have a file upload form through which a user would
upload files of different types (text, HTML, XML, Word docs, Excel
sheets, PDF, JPG, GIF, etc.).
Once we save
Hi,
I wonder if someone might be able to shed some insight into this problem:
Is it possible and/or what is the best/accepted way to achieve deduplication of
documents by field at query-time?
For example:
Let's say an index contains:
Doc1
host:Host1
time:1
Thanks, Eric.
Correctly said!!
Initially we had different settings for queryResultCache, which
served the purpose of answering repeated queries from the cache.
But we changed those settings some days back to see if there were any
issues/improvements.
I believe we need to switch back to some s
> I would like to know the best strategy/standards to follow for indexing
> multiple document types through Solr.
> In other words, let us say we have a file upload form through which a user would
> upload files of different types (text, HTML, XML, Word docs, Excel
http://lucene.apache.org/tika/
http
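For the multi-format question above, one common route (a sketch, assuming Solr 1.4 with the "Solr Cell" ExtractingRequestHandler contrib mapped at /update/extract; the file path, id and uprefix values are placeholders) is to post each uploaded file to that handler and let Tika extract the text:

    import java.io.File;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    public class ExtractUpload {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // Send the raw file to the extracting handler; Tika detects the type
            // (PDF, Word, HTML, ...) and extracts the text for indexing.
            ContentStreamUpdateRequest req =
                new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("uploads/report.pdf"));   // placeholder path
            req.setParam("literal.id", "doc-report-1");    // unique key for this document
            req.setParam("uprefix", "attr_");              // map unknown extracted fields to attr_*
            req.setParam("commit", "true");
            server.request(req);
        }
    }

Binary image formats such as JPG/GIF won't yield much body text, but Tika can still report metadata; for images it is often enough to index just those metadata fields.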
Hi,
I did not read the original mail, but for the UTF-8 issue with Tomcat
you might consult the URL http://wiki.apache.org/solr/SolrTomcat
The relevant piece of information is under "URI Charset Config":
*** quote ***
Edit Tomcat's conf/server.xml and add the following attribute to the correct <Connector> element: URIEncoding="UTF-8"
Wow...
Well, transactional or "transactional", whether it's a nice feature to
have or just a "selling point". Bottom line: for some applications
Compass can be very appealing, for others Solr will be the choice. In the
last several years I've integrated both in different applications and
gaine
This manner of detecting duplicates at query time really matches
what field collapsing does, so I suggest you look into that. As
far as I know, there isn't any function query that does what you
described in your example.
Cheers,
Martijn
On 23 January 2010 12:31, Peter S wrote:
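For reference, a query-time sketch of the idea described above. At the time of this thread field collapsing was still a patch (SOLR-236); on later Solr versions with built-in result grouping the request would look roughly like this, reusing the host/time fields from the example:

    import org.apache.solr.client.solrj.SolrQuery;

    public class DedupByHost {
        public static SolrQuery buildQuery() {
            // Group on "host" so only the top document per host comes back,
            // which effectively deduplicates results by that field at query time.
            SolrQuery q = new SolrQuery("*:*");
            q.set("group", "true");
            q.set("group.field", "host");
            q.set("group.limit", "1");          // one document per host
            q.set("group.sort", "time desc");   // e.g. keep the most recent entry
            return q;
        }
    }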
"newly added" is a bit vague. Do you mean "since last Sunday" ? "between
the last and the one before that" ? Also, do you need to
distinguish between updated and newly added documents ?
Perhaps you could be more specific about the use case.
-Simon
On Fri, Jan 22, 2010 at 4:25 AM, Erik Hatcher
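One common way to answer a "newly added" question, assuming the schema has a timestamp field populated at index time (the example schema.xml ships one with default="NOW"), is a plain date-range query:

    import org.apache.solr.client.solrj.SolrQuery;

    public class RecentDocsQuery {
        public static SolrQuery lastSevenDays() {
            // Documents indexed in the last 7 days; adjust the range as needed.
            // This matches documents *indexed* recently, which includes updates;
            // separating updates from truly new documents needs an extra
            // "created" field that is set only on first insert.
            return new SolrQuery("timestamp:[NOW-7DAYS TO NOW]");
        }
    }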
After a mass upload of docs in Solr I get a "REMOVING ALL DOCUMENTS FROM
INDEX" log message without any explanation.
I had been running indexing with Solr for several weeks now and everything was OK -
I indexed 22K+ docs using the SimplePostTool.
I was first posting a delete-all query (*:*),
then some 22K+ adds ...
with a finishing commit.
But
Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] wrote:
Hi,
I'm trying to see if I can use termVectors for a use case I have. Essentially, what I want to
know is: where in the indexed value does the query hit occur? I think either tv.positions
or tv.offsets would provide that info, but I don't really grok
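A sketch of the kind of request being described, assuming the TermVectorComponent is registered on the request handler and the field was indexed with termVectors="true" termPositions="true" termOffsets="true"; the field name and term below are made up:

    import org.apache.solr.client.solrj.SolrQuery;

    public class TermVectorQuery {
        public static SolrQuery build() {
            // Ask the TermVectorComponent for positions (token index within the
            // field) and offsets (character start/end within the original value) --
            // the information needed to locate where a hit occurs in the indexed text.
            SolrQuery q = new SolrQuery("content:lucene");   // made-up field and term
            q.set("tv", true);
            q.set("tv.positions", true);
            q.set("tv.offsets", true);
            return q;
        }
    }

tv.offsets gives character start/end within the original field value, which is usually the more convenient of the two for locating hits.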
Hi All,
I am using the SolrCache to store some external data in my search app (to be
used in a modified DisMaxHandler) and I was wondering if there is a way to
get at this data from the JSP pages? I then thought that it might be nice to
view more information about the respective caches like the cu
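On getting at the cached data programmatically: a user/generic cache declared in solrconfig.xml can be read wherever the current searcher is available, e.g. in a custom component or handler. A minimal sketch, assuming a user cache named "externalData" (the name is made up); reaching the same object from an admin JSP depends on how that page obtains the SolrCore:

    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.search.SolrCache;
    import org.apache.solr.search.SolrIndexSearcher;

    public class CacheInspector {
        // Callable from a custom SearchComponent or request handler.
        public static Object lookup(SolrQueryRequest req, Object key) {
            SolrIndexSearcher searcher = req.getSearcher();
            SolrCache cache = searcher.getCache("externalData"); // user cache from solrconfig.xml
            if (cache == null) {
                return null;
            }
            // cache.size() and similar methods can feed the "more information" display.
            return cache.get(key);
        }
    }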
Are you using the DIH? If so, did you try setting clean=false in the URL
line? That prevents wiping out the index on load.
On Jan 23, 2010 4:06 PM, "Bogdan Vatkov" wrote:
> After a mass upload of docs in Solr I get a "REMOVING ALL DOCUMENTS FROM
> INDEX" log message without any explanation.
> I was running inde
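For completeness, the clean=false suggestion corresponds to a request like the sketch below (assuming SolrJ 1.4 and the DataImportHandler registered as /dataimport; with a plain URL it is simply ...?command=full-import&clean=false):

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class FullImportNoClean {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            ModifiableSolrParams p = new ModifiableSolrParams();
            p.set("qt", "/dataimport");       // dispatch to the DIH request handler
            p.set("command", "full-import");
            p.set("clean", "false");          // keep existing docs instead of deleting them first
            server.query(p);
        }
    }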
I'd like to provide hierarchical faceting functionality.
An example would be location drill-down such as USA -> New York -> New York
City -> SoHo.
The number of levels can be arbitrary. One way to handle this could be to use a
special character as a separator and store values such as "USA|New York|
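A well-known variant of the separator idea, sketched below: index every ancestor path into a multivalued string field with a depth prefix, then use facet.prefix to fetch only the children of the node the user has drilled into. The field name location_path and the depth-prefix format are conventions invented for the example:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.common.SolrInputDocument;

    public class HierarchicalFacets {
        // Index time: store every ancestor path with a depth prefix, e.g.
        //   0/USA
        //   1/USA/New York
        //   2/USA/New York/New York City
        //   3/USA/New York/New York City/SoHo
        public static SolrInputDocument doc() {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "loc-1");
            doc.addField("location_path", "0/USA");
            doc.addField("location_path", "1/USA/New York");
            doc.addField("location_path", "2/USA/New York/New York City");
            doc.addField("location_path", "3/USA/New York/New York City/SoHo");
            return doc;
        }

        // Query time: facet on the field, constrained to the children of the
        // node the user has drilled into, e.g. prefix "1/USA/".
        public static SolrQuery drillInto(String prefix) {
            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            q.addFacetField("location_path");
            q.set("facet.prefix", prefix);
            return q;
        }
    }

Drilling further down then just means faceting again with a longer prefix, e.g. "2/USA/New York/".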