Hi
I'm using Tika 0.10 for indexing my documents, but I am not getting the
expected results when searching, even after I deleted the index and
started over.
Some of the words in, for example, a PDF document can be found, but most of
them cannot. Is it related to some language setting perhap
Hi
This topic has probably been covered before, but I haven't had the luck to
find the answer.
We are running Solr instances with several cores inside. Solr is running
out-of-the-box on top of Jetty. I believe Jetty is receiving all the
HTTP requests about indexing new documents, and forwards it
Can someone help me?
Leonardo
Hi Arkadi,
You can try to extract text from your documents using Tika's CLI (more
details http://tika.apache.org/0.7/gettingstarted.html).
If that succeeds, it means that something is going wrong during
indexing. Tika only extracts text and metadata from the documents and sends
this text to
hi,
I have an opinion mining application running solr that serves to retrieve
documents and perform some analytics using facet queries.
It works great. But I have a big issue.
The document has an attribute for opinion that is automatically detected,
but users can change it if it's not correct.
A d
> I'm using Tika 0.10 for indexing my documents but I am not
> getting the expected results when doing a search. Even after
> I delete the index and started over again.
> Some of the words in for example a PDF document can be found
> but most of them not.
It could be the maxFieldLength setting in
Hello Arian,
Please look into
http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html
It can be useful for your purpose. If you need to count facets against an
external field, you need to develop your own component. It shouldn't be a big
deal.
Solr's bolts are
http://lucidworks.lu
Prateesh:
I'm not understanding here. I believe Tamanjit is correct. Your example
works if and only if *all* the groups are returned, which happens in the
example case but not in the general case. Try your experiment with
&rows=3 and you'll find that (trunk, example)
This search:
http://localhos
Perhaps you could review:
http://wiki.apache.org/solr/UsingMailingLists
You really haven't shown us what it is that you're doing that
generates this error; about all you've said is "it doesn't work".
I'd start with trying to index a document with only the required
fields for your particular schem
Erick,
yes, you are correct. But with that example I only wanted to explain to Tamanjit
that "matches" in the Solr response contains all docs which matched the group
query.
Tamanjit, if you want the counts of docs of only the first page according to the
"rows" parameter, then that is the only way, which you mentioned...it
Unfortunately, the answer is "it depends(tm)".
First question: How are you indexing things? SolrJ? post.jar?
But some observations:
1> sure, using multiple cores will have some parallelism. So will
using a single core but using something like SolrJ and
StreamingUpdateSolrServer. Especial
hi Mikhail
external fields were one of the options, but I was not 100% sure if they would
fit.
I will study more about this option.
thank you so much for your reply
Arian
2012/2/3 Mikhail Khludnev
> Hello Arian,
>
> Pls look into
>
> http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-usi
timeAllowed can be used outside distributed search. It is used by the
TimeLimitingCollector. When the search time reaches timeAllowed, it
stops searching and returns the results it has found up to that point.
This can be a problem when using incremental indexing. Lucene starts
searching from
> external fields was one of the options, but I was not 100%
> sure if it would
> fit.
> I will study more about this option.
I was wondering if Lucene's ToChildBlockJoinQuery and/or ToParentBlockJoinQuery
can be a replacement for ExternalFileField.
http://www.searchworkings.org/blog/-/blogs/toch
Hi!
I am having a weird issue with a search string not producing a match
where it should. I can reproduce it with both 3.4 and 3.5.
"Where it should" means that I am getting a hit in the "Analyse" tool
in the admin panel, but not in a query via /select.
Now when I try
select?q=Am+Heidstamm&.
UPDATE:
I set my app server[1] system property jetty.port to be equal to the app
server's open port and was able to get two Solr shards to talk.
The overall properties I set are:
App server domain 1:
bootstrap_confdir
collection.configName
jetty.port
solr.solr.home
zkRun
App server domain 2:
What about side of the field?
On Fri, Feb 3, 2012 at 6:11 PM, Marian Steinbach wrote:
> Hi!
>
> I am having a weird issue with a search string not producing a match
> where it should. I can reproduce it with both 3.4 and 3.5.
>
> "Where it should" means that I am getting a hit in the "Analyse"
Hi Mark,
Thanks for looking into the issue.
As for specifying the bootstrap dir for each instance with ZK, it was just
a typo on my side. I went back and looked at my script on the second and
third nodes and it did not have the bootstrap dir, so I had specified it for
only the very FIRST node that re
2012/2/3 Dmitry Kan :
> What about side of the field?
>
It's identical. At least that's what I think, since I didn't specify
the type="query" or type="index" attribute for the analyzer part.
Marian
Actually, I wouldn't count on it and just specify index and query sides
explicitly. Just to play it safe.
On Fri, Feb 3, 2012 at 8:34 PM, Marian Steinbach wrote:
> 2012/2/3 Dmitry Kan :
> > What about side of the field?
> >
>
> It's identical. At least that's what I think, since I din't specify
No, don't do that. That's definitely not good advice. If the analysis chain
is the same for both index and query, just use <analyzer>.
As for Marian's issue... was there literally a + in the query or was that
urlencoded? Try debugQuery=true for both queries and see what you get for the
query parsing
Hi everyone!
I'm also having some zero match weirdness. When I execute this search:
?q=Create+a+self+contained+Part+Module&defType=edismax&qf=location^0.9+text^0.8+fileName^8.0+title^4.0
I get ZERO results.
If I remove the fileName qf parameter (an indexed but not stored field), I get
5 hits.
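
One way to narrow this down is to build both variants of the request and compare their debugQuery output side by side. The following is a minimal sketch, not a definitive diagnosis: BASE_URL and the helper name build_query are assumptions made for illustration, and the code only constructs the URLs without contacting any server.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical endpoint; adjust host, port, and core to your own setup.
BASE_URL = "http://localhost:8983/solr/select"


def build_query(q, qf_boosts, debug=True):
    """Return a /select URL for an edismax query over the given boosted fields."""
    params = {
        "q": q,
        "defType": "edismax",
        # qf is a space-separated list of field^boost entries.
        "qf": " ".join(f"{field}^{boost}" for field, boost in qf_boosts.items()),
    }
    if debug:
        # Ask Solr to include the parsed query in the response.
        params["debugQuery"] = "true"
    return BASE_URL + "?" + urlencode(params)


boosts = {"location": 0.9, "text": 0.8, "fileName": 8.0, "title": 4.0}
query_text = "Create a self contained Part Module"

# Same query with and without the fileName field in qf.
with_file_name = build_query(query_text, boosts)
without_file_name = build_query(
    query_text, {f: b for f, b in boosts.items() if f != "fileName"}
)

print(with_file_name)
print(without_file_name)
```

Requesting both URLs and comparing the parsed query in the debug section should show how each field contributes clauses to the final query.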
Is the following a reasonable approach to setting a connection timeout
with SolrJ?
queryCore.getHttpClient().getHttpConnectionManager().getParams()
.setConnectionTimeout(15000);
Right now I have all my solr server objects sharing a single HttpClient
that gets created us
2012/2/3 Erik Hatcher :
> As for Marian's issue... was there literally a + in the query or was that
> urlencoded? Try debugQuery=true for both queries and see what you get for
> the query parsing output.
>
I tested both + and %20 with and without quotes, it doesn't make a
difference whether I
I just got rid of the one field "aktenzeichen" that never matched in the
qf string. Now it works fine. Solved for now.
Thanks!
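
On the question of a literal + versus a URL-encoded space raised earlier in this thread, here is a small sketch using only Python's standard library. It shows that + and %20 both decode to a space in a query string, while a literal plus sign has to be sent as %2B. The variable names are illustrative only.

```python
from urllib.parse import quote, quote_plus, unquote_plus

query = "Am Heidstamm"

# Two equivalent encodings of the space character in a query string.
plus_form = quote_plus(query)   # space becomes +
percent_form = quote(query)     # space becomes %20

# A literal plus sign in the search text must be percent-encoded.
literal_plus = quote_plus("Am+Heidstamm")

print(plus_form, percent_form, literal_plus)

# Both encodings decode back to the same search text.
print(unquote_plus(plus_form) == unquote_plus(percent_form))
```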
:
?q=Create+a+self+contained+Part+Module&defType=edismax&qf=location^0.9+text^0.8+fileName^8.0+title^4.0
:
: I get ZERO results.
:
: If I remove the fileName qf parameter (an indexed but not stored field), I
: get 5 hits.
lemme guess: fileName doesn't use stopwords but other fields do, correct?
Ok, thanks, Erik, good to know. Sorry for the confusion.
On Fri, Feb 3, 2012 at 9:42 PM, Erik Hatcher wrote:
> No, don't do that. That's definitely not good advice. If the analysis
> chain is the same for both index and query, just use <analyzer>.
>
> As for Marian's issue... was there literally a + in
On Feb 3, 2012, at 1:04 PM, Darren Govoni wrote:
> I deployed each war app into the "/solr" context. I presume its needed by
> remote URL addressing.
Yup - but you can override this by setting the hostContext in solr.xml. It
defaults to "solr" as that fits the example jetty distribution.
- Ma
: Subject: Re: error in indexing
FWIW: it's really crucial to state which version of Solr you are using
when you have bugs with error stack traces like this -- going back through
the versions i'm *guessing* that you are using Solr 1.4.1 (or possibly
older), correct?
Based on that assumption (
I'd like to use both the ReversedWildcardFilterFactory and
PorterStemFilterFactory on a text field that I have. I'd like to avoid
stemming the reversed fields and would also like to avoid reversing
the stemmed fields. My original thought was to have the
ReversedWildcardFilterFactory higher in the
: Has anyone had experience using frange with multi-valued fields? In
: solr 3.5 doing so results in the error: "can not use FieldCache on
: multivalued field"
correct.
: Here's the use case. We have multiple years attached to each document
: and want to be able to refine by a year range.
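
For the year-range use case, one alternative that does not go through the FieldCache is an ordinary range filter query, which matches a document when any value of a multi-valued field falls inside the range. The sketch below only builds the request parameters. The field name "year" and the helper year_range_filter are hypothetical, and whether this fits depends on how the field is typed in your schema.

```python
from urllib.parse import urlencode, parse_qs


def year_range_filter(field, start, end):
    """Build an inclusive range filter such as year:[2005 TO 2010]."""
    if start > end:
        raise ValueError("start must not exceed end")
    return f"{field}:[{start} TO {end}]"


# "year" is a hypothetical field name used only for illustration.
params = urlencode({"q": "*:*", "fq": year_range_filter("year", 2005, 2010)})
print(params)
```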
Looking closer, I think I asked the wrong question. Please disregard; I
will start a new chain with that question.
On Friday, February 3, 2012, Jamie Johnson wrote:
> Is it possible to have multiple index analysis chains on a single field?
Hello,
I am experimenting with Solr distributed search and random sharding (I currently
use geo sharding), hoping to gain some performance and also scalability in
the future (the index size keeps growing and geo sharding is hard to scale).
However I'm seeing worse performance with distributed search, on a testin