Saratv,
is there any unique ID (defined in your schema.xml) that may be duplicated?
- Mitch
saratv wrote:
>
> I am trying to use DIH (where the database has around 93k rows, from different
> tables), and when I ran a full import a few times, only 91k documents were
> indexed (not sure why and what doc
I don't know much about how Solr does its locking, so I'm guessing below:
It looks like one thread is doing a commit, by closing the writer, and
is likely holding a lock that prevents other (add/delete) ops from
running? Probably this lock is held because the writer is in the
process of being clos
Hi Solr Gurus
We are thinking about optimizing our production master slave solr setup,
just wanted to poll the group on following questions:
1. Currently we are using the autocommit feature with a setting of 50 docs and 5
mins. Now the requirement is to reduce this time. So we are analyzing the
situati
Hello,
I configured a Solr server to be able to extract data from various documents,
including pdfs. Unfortunately, the data extraction fails on several pdfs. I
have read around here that this may be due to the old Tika library being used? I
looked around and saw that the svn had a newer version
Marc, got anything in your logs?
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Marc Ghorayeb
> To: solr-user@lucene.apache.org
> Sent: Fri, April 23, 2010 8:42:53 AM
> Subject: Probl
I'm launching it with the start.jar utility, and there doesn't seem to be
anything weird inside the console when I upload a pdf. Is there a way to output
the console to a log file? The only log file that gets updated is a log file
in the logs directory, and it seems to only show the input/oupu
Seems like I'm not the only one with this "no extraction" problem:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg33609.html
Apparently he tried the same thing, building from the trunk, and indexing a pdf,
and no extraction occurred... Strange.
Marc G.
> From: dekay...@hotmail.com
> T
We want to let a user register interest in information based on a query he
has defined himself. For example, he types in a query, presses a save button,
and provides his email, and the system then emails him a daily digest.
As part of this, it would be nice to be able to t
Hi there,
Is it possible to run a search more than once, where only the filter query
changes? The response would be the three different search results.
We want a page which shows a "clustered" view of 5 of each of the three
types (images, news articles, editorial articles), ordered by their score.
One
Hello Gert,
I think you'd have to apply custom heuristics that involve looking at the top N
hits for each query and looking at the % overlap.
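
A minimal sketch of the "% overlap of top N hits" idea, assuming each query's
results are already available as a ranked list of document IDs. The function
name `overlap_percent` and the default of 10 are hypothetical, and nothing
here is a Solr API:

```python
def overlap_percent(hits_a, hits_b, n=10):
    """Percentage of shared document IDs among the top n hits of two queries.

    hits_a and hits_b are ranked lists of document IDs (best hit first).
    The result is relative to the smaller of the two top-n sets.
    """
    top_a = set(hits_a[:n])
    top_b = set(hits_b[:n])
    if not top_a or not top_b:
        # One query returned nothing, so there is nothing to overlap with.
        return 0.0
    shared = len(top_a & top_b)
    return 100.0 * shared / min(len(top_a), len(top_b))


# Hypothetical document IDs, for illustration only.
print(overlap_percent(["a", "b", "c", "d"], ["c", "d", "e", "f"], n=4))
```

Where to set the threshold for "these queries overlap" is a judgment call
that would have to be tuned against real data.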
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> Fr
Hi,
I'm working with Solr 1.4.
My schema has about 50 fields.
I'm using full text search on short strings (~ 30-100 terms) and
faceted search.
My index will have 100 000 documents.
The number of requests per second will be low. Let's say between 0 and
1000 because of auto-complete.
Is a st
Marc,
These are your request logs. You want to look at your Solr logs.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Marc Ghorayeb
> To: solr-user@lucene.apache.org
> Sent: Fri, Ap
Hi,
I have a question about merging Solr cores.
The Wiki documentation says that the "Merged" core must exist prior to calling
the merge command.
So I created the "Merged" core and pointed it to some "data dir".
However, even after merging the cores it still points to the old "data
dir".
Shouldn't th
Hi,
Yes, a custom SearchComponent will do this. We've done stuff like this before
and actually have this sort of functionality in some Sematext products. It
works well if you don't mind writing and adding another SearchComponent to your
chain.
Otis
Sematext :: http://sematext.com/ ::
Xavier,
0-1000 QPS is a pretty wide range. Plus, it depends on how good your
auto-complete is, which depends on types of queries it issues, among other
things.
100K short docs is small, so that will all fit in RAM nicely, assuming those
other processes leave enough RAM for the OS to cache the
Yes, the only log I can actually get is the one in the command console from
Windows, and there are no errors there...
Here are the last lines when I upload a pdf to the update/extract url:
Apr 23, 2010 5:47:03 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init() done
Apr 23, 2010 5
On 23/04/2010 17:08, Otis Gospodnetic wrote:
Xavier,
0-1000 QPS is a pretty wide range. Plus, it depends on how good your
auto-complete is, which depends on types of queries it issues, among other
things.
100K short docs is small, so that will all fit in RAM nicely, assuming those
other p
On Fri, Apr 23, 2010 at 5:48 PM, Marc Ghorayeb wrote:
>
> Yes, the only log i can actually get is the one in the command console from
> windows and there are no errors there ...
> Here are the last lines when i upload a pdf to the update/extract url:
I am pretty sure it is the tika itself that
Uggg I just got bit hard by this on a Tomcat project ...
https://issues.apache.org/jira/browse/SOLR-1238
Is there any way to get access to that RequestEntity w/o patching? Also, are
there security implications w/ using the repeatable payloads?
Thanks.
- Jon
Or, use facet.query to get the overlap. Here's ?
q=&facet=on&facet.query=
You'll get the hit count from query #1 in the results, and the
overlapping count to query #2 in the facet query response.
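
A rough sketch of how such a request could be built, assuming the example
server URL that appears elsewhere on this list
(http://localhost:8983/solr/select). The two query strings and the helper
name `overlap_url` are hypothetical placeholders, since the actual queries
were not preserved in the message above:

```python
from urllib.parse import urlencode


def overlap_url(query_1, query_2, base="http://localhost:8983/solr/select"):
    """Build a select URL where query_1 is the main query (q) and
    query_2 is passed as facet.query, so the facet count reports how many
    of query_1's hits also match query_2."""
    params = [
        ("q", query_1),
        ("facet", "on"),
        ("facet.query", query_2),
    ]
    return base + "?" + urlencode(params)


# Hypothetical queries, for illustration only.
print(overlap_url("title:archive", "title:crash"))
```

Swapping the two arguments gives the request for the opposite direction.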
Erik - http://www.lucidimagination.com
On Apr 23, 2010, at 11:01 AM, Otis Gospodnetic
Hi All,
I am trying to restrict facets in the Solr response by setting facet.mincount=1,
which does not work. The request and response are shown below:
REQUEST:
http://localhost:8983/solr/select/?q=*%3A*&version=2.2&rows=0&start=0&indent=on&facet=true&facet.field=Instrument&facet.field=Locati
Does anyone know of any advantages/disadvantages to running SOLR on
WebSphere versus Tomcat?
Thanks,
Ken
I've never used WebSphere, but I always got the impression that people have
more issues with it than with simpler solutions.
Personally, I would suggest Jetty. I've used it dozens of times and never had
issues with it. It's small, simple, and fast.
Otis
Sematext :: http://sematext.com/ ::
Xavier,
100-700 QPS is still high. I'm guessing your 1 box won't handle that without
sweating a lot (read: slow queries).
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Xavier Schepl
Chris,
It looks like Mike already offered several solutions though I don't know
what Solr does without looking at the code.
But I'm curious:
* how big is your index? and do you know how large the segments being merged
are?
* do you batch docs or do you make use of Streaming SolrServer?
I'm
: basically, we are running a query with field collapsing (Solr 1.4 with
: patch 236). The response tells us that there are about 2700 documents
: matching our query. However, I cannot get past the 431st document.
: From this point on, the response will not contain any documents.
isn't that ho
We have run Solr on WebLogic without problems. The only difference we see is
some spurious extra logging info which we don't see in the case of Tomcat.
Does anyone have an idea of how to control that?
Thanks
Shantanu
On 4/23/10 12:53 PM, "Ken Lane (kenlane)" wrote:
> Does anyone know of any advantages
I was thinking along the lines
1. Retrieve the top result for one query.
2. Take the resulting document and evaluate the score that it would get in
another query.
3. If the scores are similar, then the queries most likely overlap.
I guess that if I had two simple query strings "archive crash"
Gert,
In your second query example you used "qf=...". Did you mean "fq=" ? If
so, the answer is no - filter queries don't affect the score.
I haven't tried your approach, but intuitively feel that looking at % overlap
may work better.
Otis
Sematext :: http://sematext.com/ :: Solr -
Yes, your solution is much simpler, providing the result through a single
query. I didn't understand it the first time I read it.
I guess you would need to run it backwards as well to really evaluate the
relevance, i.e.
First
q=&facet=on&facet.query=
Then
q=&facet=on&facet.query=
Unfortunately you haven't answered my question, saratv.
The important question is, why did your DIH-configuration not import those
rows.
Without providing any schema-information or configuration-details of your
DIH, no one will be able to help you.
Just for the future: If something doesn't work, p
Umesh_ wrote:
Hi All,
I am trying to restrict facets in the Solr response by setting facet.mincount=1,
which does not work. The request and response are shown below:
REQUEST:
http://localhost:8983/solr/select/?q=*%3A*&version=2.2&rows=0&start=0&indent=on&facet=true&facet.field=Instrument&fa
Is it possible to use boost function across the whole index/empty search
term?
I'm guessing the next question that would be asked is "Why would you want to
do that?" Well, we have a bunch of custom business metrics included in each
document (a product). I would like to only show the best produc
Hello list, first time posting here. I am trying to find an answer to
a strange search behaviour we're finding in our VuFind application. In
order to eliminate any VuFind related variables, I have used the
vanilla Solr example schema to try our problematic search.
I posted this xml to the e