I have done some further analysis on this and I am now even more confused. When
I use the Field Analysis tool with the text 'chicken stock' it highlights that
text as a match.
The dismax query looks ok to me:
+(DisjunctionMaxQuery((ingredient_synonyms:chicken^0.6)~0.01)
DisjunctionMaxQuery((ingr
On 10 February 2012 04:15, alessio crisantemi
wrote:
> hi all,
> I would like to index in Solr the PDF files contained in my directory c:\myfile\
>
> so I added to my solr/conf directory the file data-config.xml, like the
> following:
[...]
> but this is the result:
[...]
Your Solr URL for dataimport
Is there any way to keep the score from being affected by duplicated terms in the query?
When I have a record with the field
title: "The GIRL with the dragon tattoo"
and the query is "girl", it gets a lower score than "girl girl girl". It finds the word in
the same position, so why does the score grow?
I need it to know if record i f
Will testing Solr based on duplicated data in the database result in same
performance statistics as compared to testing Solr with completely unique data?
By test I mean routine performance tests like time to index, time to search
etc. Will solr perform any kind of optimization that will result i
Hello!
In terms of query performance, Solr will use caches (if they are
turned on, of course). So if you run similar queries (the same
filters, sort, and so on), the performance may differ from the
performance with unique queries.
The http://wiki.apache.org/solr/SolrCaching ha
I have problems with the full-import query:
no results.
I will search the log files and then write again.
tx
a.
2012/2/9 alessio crisantemi
> hi all,
> I would like to index in Solr the PDF files contained in my directory
> c:\myfile\
>
> so I added to my solr/conf directory the file data-config.xml, like t
Hi Zac,
Field Analysis tool (analysis.jsp) does not perform actual query parsing.
One thing to be aware of when using the keyword tokenizer at query time: the query
string (chicken stock) is pre-tokenized on whitespace before it
reaches the keyword tokenizer.
If you use quotes ("chicken st
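
To illustrate the whitespace pre-tokenization described above, here is a minimal plain-Java sketch. It is not Lucene's or Solr's actual query parser: the class and method names are hypothetical, and it only mimics the one behaviour mentioned in the reply, namely that unquoted input is split on whitespace before any field analyzer sees it, while a quoted phrase is handed over as a single chunk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Solr or Lucene.
final class QuerySplitSketch {

    // Splits a query string on whitespace, keeping double-quoted phrases together.
    static List<String> chunks(String query) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : query.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle phrase mode, drop the quote itself
            } else if (Character.isWhitespace(c) && !inQuotes) {
                if (current.length() > 0) {    // whitespace outside quotes ends a chunk
                    out.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(chunks("chicken stock"));      // two separate chunks
        System.out.println(chunks("\"chicken stock\""));  // one chunk
    }
}
```

In this sketch each chunk stands for what a query-time analyzer would receive, which is why an analyzer that keeps its input whole still never sees the full unquoted string.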
On Thu, 2012-02-09 at 23:45 +0100, alessio crisantemi wrote:
> hi all,
> I would like to index in Solr the PDF files contained in my directory c:\myfile\
>
> so I added to my solr/conf directory the file data-config.xml, like the
> following:
>
>
>
>
>
> *0*
"""
DIH hasn't even retrieved any dat
Hello,
we use Solr 3.5 and Tika to index a lot of PDFs. The content of those PDFs
is searchable via a full-text search.
Also the terms are used to make search suggestions.
Unfortunately pdfbox seems to insert a space character, when there are
soft-hyphens in the content of the PDF
Thus the extrac
hey Tommaso,
That result grouping happens during the query, but I want to sort the
SolrDocumentList after it has been queried and I have injected a few SolrDocuments into
it. So I want this SolrDocumentList to be sorted by the fields
I specify, and I cannot query Solr for result grouping
Hi Erick,
I have tried grouping with and without shards using Solr 3.3. I know Solr
3.3 does not support grouping with multiple shards. We have been waiting for
3.5.0; now that it is available, we will try with that.
The reason I am looking for grouping is posted in this link. Please advise
me ho
Hi,
Maybe the PDF creator tool is not generating "fluid" text. A PDF has
sections defined by objects, e.g. for "Medizin":
20 0 obj
(Medizin)
endobj
However, this can also happen:
20 0 obj
(Me)
endobj
21 0 obj
(di)
endobj
22 0 obj
(zin)
endobj
Note that there are 3 text objects; the extraction tool
I think you need to control the parameter "enableAutoSpace" in PDFBox. There's
a JIRA for it, but it depends on some Tika 1.1 stuff, as far as I can understand:
https://issues.apache.org/jira/browse/SOLR-2930
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - ww
Thanks so far. I will have a closer look at the PDF.
I tried the enableAutoSpace setting with PDFBox 1.6 - it did not work:
PDFParser parser = new PDFParser();
parser.setEnableAutoSpace(false);
ContentHandler handler = new BodyContentHandler();
Output:
Va ri an te Creut
On Fri, Feb 10, 2012 at 6:18 AM, Dirk Högemann
wrote:
>
> Our suggest component and parts of our search is getting hard to use by
> this. Any other ideas?
>
Looks like https://issues.apache.org/jira/browse/PDFBOX-371
The title of the issue is a bit confusing (I don't think it should go
to hyphen
I am Sumal, and I work as a software engineer. Currently I am
developing web-based e-commerce applications using Java, and I am also using the
KonaKart e-commerce shopping cart (community edition).
I am kindly requesting some information about
how to integrate Solr in my
Hi, I don't think this is the right place for this question. You should
follow the examples of Solr client API integration in Java and work out your
own approach in KonaKart.
Regards!
Dalius Sidlauskas
On 10/02/12 08:25, sumal wrote:
I am Sumal, and I work as a software engineer. Currently I am
de
Hi,
I'm using the NGramFilterFactory for indexing and querying.
So if I'm searching for "overflow" it creates a query like this:
mySearchField:"ov ve ... erflow overflo verflow overflow"
But if I misspell "overflow", e.g. as "owerflow", there are no matches
because of the quotes around the query:
> I'm using the NGramFilterFactory for indexing and querying.
>
> So if I'm searching for "overflow" it creates a query like
> this:
>
> mySearchField:"ov ve ... erflow overflo verflow overflow"
>
> But if I misspell "overflow", e.g. as "owerflow", there are no
> matches
> because of the quotes arou
Hi Ahmet,
awesome! Now it works.
2012/2/10 Ahmet Arslan :
>> I'm using the NGramFilterFactory for indexing and querying.
>>
>> So if I'm searching for "overflow" it creates a query like
>> this:
>>
>> mySearchField:"ov ve ... erflow overflo verflow overflow"
>>
>> But if I misspell "overflow",
2012/2/9 Mikhail Khludnev :
> Some time ago I tested backported patch from
> https://issues.apache.org/jira/browse/SOLR-2155
> it works.
OK, I would do that. But...
Against which version can/should I apply the patch? (I am not
restricted by other requirements so far.)
Then I tried both with the
I know that the latest Solr Cloud doesn't use standard replication but
I have a question about how it appears to be working. I currently
have the following cluster state
{"collection1":{
"slice1":{
"JamiesMac.local:8501_solr_slice1_shard1":{
"shard_id":"slice1",
"state":
Can you explain a little more how you are doing this? How are you bringing the
cores down and then back up? Shutting down a full Solr instance, unloading the
core?
On Feb 10, 2012, at 9:33 AM, Jamie Johnson wrote:
> I know that the latest Solr Cloud doesn't use standard replication but
> I have a q
Sorry, I shut down the full solr instance.
On Fri, Feb 10, 2012 at 9:42 AM, Mark Miller wrote:
> Can you explain a little more how you are doing this? How are you bringing the
> cores down and then back up? Shutting down a full Solr instance, unloading
> the core?
>
> On Feb 10, 2012, at 9:33 AM, J
Can you post the code? SUSS should essentially be a drop-in
replacement for CHSS.
It's not advisable to commit after every add, it's usually better
to use commitWithin, and perhaps commit at the very end of
the run.
Best
Erick
On Thu, Feb 9, 2012 at 4:00 PM, T Vinod Gupta wrote:
> Hi,
> I wrote
Jay,
Was the curly closing bracket "}" intentional? I'm using 3.4, which also
supports "fq=price:[10 TO 20]". The problem is the results are not working
properly.
From: Jan Høydahl
To: solr-user@lucene.apache.org; Yuhao
Sent: Thursday, February 9, 2012
Please re-read Hoss' response. There is no need to warm all queries, that will
be very slow for autowarming and you quickly reach a point of
diminishing returns.
Best
Erick
2012/2/9 Rong Kang :
> Thanks for your reply.
>
> I didn't use any other params except q(for example
> http://localhost:80
Erick,
Thanks for the suggestion. I think we're going to go that route.
Best,
--Chase
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, February 09, 2012 12:30 PM
To: solr-user@lucene.apache.org
Subject: Re: Index Start Question
Hmmm. You say:
"
I'll answer for Jan "Yes". Prior to 4.0, you cannot mix
inclusive and exclusive operators on a range query. see:
https://issues.apache.org/jira/browse/SOLR-355. If you
can't go to 4.0, you can cheat and make, say, your top
value a tiny bit less than the boundary. For an int-based
field [1 To 20] us
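
As a rough sketch of the workaround described above for integer fields (assuming the goal is to emulate an exclusive upper bound with a purely inclusive range), the upper value can simply be lowered by one. The helper name `halfOpenIntRange` and the field name `price` are only illustrative and are not part of any Solr API.

```java
// Hypothetical helper; it only builds the filter string, it does not talk to Solr.
final class RangeFilter {

    // Emulates field:[low TO high) with an inclusive range by lowering the top value by 1.
    static String halfOpenIntRange(String field, int lowInclusive, int highExclusive) {
        if (highExclusive <= lowInclusive) {
            throw new IllegalArgumentException("range would be empty");
        }
        return field + ":[" + lowInclusive + " TO " + (highExclusive - 1) + "]";
    }

    public static void main(String[] args) {
        // Matches 1..19 and leaves out the boundary value 20.
        System.out.println(halfOpenIntRange("price", 1, 20));
    }
}
```

The same idea applies to other types only if a "tiny bit less" can be expressed exactly for that type, which is why this sketch restricts itself to integers.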
Hello,
I hope someone can help me.
I have several documents with the fields content, author, ... indexed.
Now I would like to make a faceted search.
My exact problem is the following:
As the result (SolrResponse) for a query I get: facet_fields= {author =
{first name=1, surname = 1}}...
Interestingly, the only tool I found that handles my PDF correctly
was pdftotext.
2012/2/10 Robert Muir
> On Fri, Feb 10, 2012 at 6:18 AM, Dirk Högemann
> wrote:
> >
> > Our suggest component and parts of our search is getting hard to use by
> > this. Any other ideas?
> >
>
> Looks lik
Was there a fix recently to address sorting issues for dates in Solr
Cloud? On my cluster I have a date field which, when I sort across the
cluster, comes back in the wrong order. Executing the following query I get
solr/select?distrib=true&q=paul&sort=datetime_dt%20desc&fl=datetime_dt
2009-10-31T1
Double-check your default operator for a faceted search vs. a regular
search. I found such a difference in my own setup, and it explained this
discrepancy.
On Fri, 2012-02-10 at 07:45 -0800, Yuhao wrote:
> Jay,
>
> Was the curly closing bracket "}" intentional? I'm using 3.4, which also
> supports "fq=pric
Typically this is handled by defining
a second field of type string and using
copyField to copy from author to this
new field, say, author_facet.
Then do your facets on author_facet
but do searches on author.
Best
Erick
On Fri, Feb 10, 2012 at 11:19 AM, Torlaf15 wrote:
> Hello,
>
> I hope someon
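
The reason for the separate string field can be shown with a small plain-Java simulation. This is not Solr code: the class, the method names, and the sample authors are made up for illustration, and the "tokenized" case simply lowercases and splits on whitespace as a stand-in for a text field's analyzer. Faceting counts indexed terms, so a tokenized author field gives one facet value per word, while an untokenized string copy gives one value per full name.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical simulation of facet counts; it does not use any Solr classes.
final class FacetSketch {

    // Facets on a tokenized field: each distinct word in a document counts once.
    static Map<String, Integer> tokenizedFacets(List<String> authors) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String author : authors) {
            String[] words = author.trim().toLowerCase(Locale.ROOT).split("\\s+");
            Set<String> terms = new LinkedHashSet<>(Arrays.asList(words));
            for (String term : terms) {
                if (!term.isEmpty()) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Facets on a string copy of the field: the whole value is a single term.
    static Map<String, Integer> stringFacets(List<String> authors) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String author : authors) {
            counts.merge(author, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> authors = Arrays.asList("Jane Doe", "Jane Doe", "John Doe");
        System.out.println(tokenizedFacets(authors)); // counts per word
        System.out.println(stringFacets(authors));    // counts per full name
    }
}
```

In this sketch the string variant plays the role of author_facet, and the tokenized variant plays the role of the searchable author field.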
On Fri, Feb 10, 2012 at 11:44 AM, Jamie Johnson wrote:
> Was there a fix recently to address sorting issues for dates in Solr
> Cloud? On my cluster I have a date field which, when I sort across the
> cluster, comes back in the wrong order. Executing the following query I get
Yikes! There haven't been any
This is a snapshot of the solrcloud branch from somewhere between a
year and 6 months ago (can't really remember off hand) with some
custom components, I'm thinking that the custom components may be
messing something up. I'm removing them now to test this without
those to make sure that the issue
Marian,
Sorry, I completely forgot to mention.
Pls check David's instruction
https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13117350&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13117350
The patch you tried to use is just my amendment for the Dav
Hi,
that sounds very good.
Thank you
Toralf
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-define-field-type-tp3732986p3733350.html
Sent from the Solr - User mailing list archive at Nabble.com.
For anyone having this issue in the future:
I managed to narrow it down to Solr-RA 3.5. Installing Solr 3.5 solved the
issue. I don't really know how the internals of Solr-RA work, but it
appears that it was using AND operators even when I explicitly used OR
operators in the query. The other solut
In SolrJ, when using CommonsHttpSolrServer, SolrJ doesn't log anything
at or below the INFO level. When I have the logging turned on at that
level, I only see log messages that I have placed within my own code.
If I log at DEBUG, then I do see SolrJ log messages.
When I switched to Streaming
It looks like everything works fine without my custom component, which
is good for Solr, bad for me. The custom component does some
additional authorization processing to remove docs that the user does
not have access to. To do this we're iterating through
responseBuilder.getResults().docList and
hello,
>>
Or does your field in schema.xml have anything like
autoGeneratePhraseQueries="true" in it?
<<
there is no reference to this in our production schema.
this is extremely confusing.
i am not completely clear on the issue?
reviewing our previous messages - it looks like the data is bein
So looking at query component it appears to sort the entire doc list
at the end of process, my component is defined after this query so the
doclist that I get should be sorted, right? To me this should mean
that I can remove items from this list and shift everything left as
needed and it should wo
On Fri, Feb 10, 2012 at 2:48 PM, Jamie Johnson wrote:
> So looking at query component it appears to sort the entire doc list
> at the end of process, my component is defined after this query so the
> doclist that I get should be sorted, right? To me this should mean
> that I can remove items from
I'd like to look at the pseudo fields you're talking about (don't
really understand it right now), but need to get something working in
the short term. How do I go about removing these from the sort
values?
On Fri, Feb 10, 2012 at 3:06 PM, Yonik Seeley
wrote:
> On Fri, Feb 10, 2012 at 2:48 PM, J
doing some copying I came up with the following
boolean fsv =
req.getParams().getBool(ResponseBuilder.FIELD_SORT_VALUES,false);
if(fsv){
NamedList sortVals = (NamedList)
rsp.getValues().get("sort_values");
Sort sort = searcher.weightSort(r
2012/2/10 Mikhail Khludnev :
> Marian,
>
> Sorry, I completely forgot to mention.
> Pls check David's instruction
> https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13117350&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13117350
>
> The patch you trie
here is how i was playing with it..
StreamingUpdateSolrServer solrServer = new
StreamingUpdateSolrServer("http://localhost:8983/solr/", 10, 1);
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "pk_id", "id1");
doc1.addField("doc_type", "content");
with rootEntity="false" it's the same..
help!
2012/2/10 Chantal Ackermann
>
>
> On Thu, 2012-02-09 at 23:45 +0100, alessio crisantemi wrote:
> > hi all,
> > I would like to index in Solr the PDF files contained in my directory
> c:\myfile\
> >
> > so I added to my solr/conf directory the file data-con
Well, that's certainly "hello world".
But I'm kinda stumped, I have programs that look an awful lot like
this that terminate just fine.
Anything in your Solr logs? And are you just executing this once?
And what version of Solr are you using?
Best
Erick
On Fri, Feb 10, 2012 at 3:49 PM, T Vinod
Sorry for pinging this again, is more information needed on this? I
can provide more details but am not sure what to provide.
On Fri, Feb 10, 2012 at 10:26 AM, Jamie Johnson wrote:
> Sorry, I shut down the full solr instance.
>
> On Fri, Feb 10, 2012 at 9:42 AM, Mark Miller wrote:
>> Can you ex
Well, not super-new (it's in 3.4), but the spatial post-filtering is
brand new in 4.0 as of today, and I don't think cache=false and
post-filtering was really highlighted well before anyway.
http://www.lucidimagination.com/blog/2012/02/10/advanced-filter-caching-in-solr/
-Yonik
lucidimagination.c
Hello, Elisabeth
I am having the same issue with WebLogic 11 with Solr 3.5. I've tried your
solution and didn't work out, but I'm not sure if I'm doing it right.
I've tried to alter the
%SERVER_HOME%\servers\AdminServer\tmp\_WL_user\solr\t6nzak\war\WEB-INF\weblogic.xml
and restarted the server, b
I'm trying, but so far I don't see anything. It seems I'll have to try to mimic your
setup more closely.
I tried starting up 6 solr instances on different ports as 2 shards, each with
a replication factor of 3.
Then I indexed 20k documents to the cluster and verified doc counts.
Then I shut down all
Also, it will help if you can mention the exact version of solrcloud you are
talking about in each issue - I know you have one from the old branch, and I
assume a version off trunk you are playing with - so a heads up on which and if
trunk, what rev or day will help in the case that I'm trying t
Here is a stack:
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to load EntityProcessor implementation for entity:9946435225838 Processing Document # 1
at org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:576)
.
Thanks, that explains why the individual terms 'chicken' and 'stock' are still
in the query (and are required).
So I have tried a few things to get around this, but to no avail:
Changed the query analyzer to use the WhitespaceTokenizerFactory with
autoGeneratePhraseQueries=true. This creates the
I set up a Solr project to run with Tomcat for indexing contents of a database
by following a web tutorial that described how to put the project directory
anywhere you want and then put a file called .xml in the
tomcat/conf/Catalina/localhost directory that contains contents like this:
I
nothing seems that different. In regards to the states of each I'll
try to verify tonight.
This was using a version I pulled from SVN trunk yesterday morning
On Fri, Feb 10, 2012 at 6:22 PM, Mark Miller wrote:
> Also, it will help if you can mention the exact version of solrcloud you are
> tal
Thanks.
If the given ZK snapshot was the end state, then two nodes are marked as
down. Generally that happens because replication failed - if you have not,
I'd check the logs for those two nodes.
- Mark
On Fri, Feb 10, 2012 at 7:35 PM, Jamie Johnson wrote:
> nothing seems that different. In r
On Feb 10, 2012, at 9:33 AM, Jamie Johnson wrote:
> jamiesmac
Another note:
Have no idea if this is involved, but when I do tests with my linux box and mac
I run into the following:
My linux box auto finds the address of halfmetal and my macbook mbpro.local. If
I accept those defaults, my ma
hmm... perhaps I'm seeing the issue you're speaking of. I have
everything running right now and my state is as follows:
{"collection1":{
"slice1":{
"JamiesMac.local:8501_solr_slice1_shard1":{
"shard_id":"slice1",
"leader":"true",
"state":"active",
"core":
Hi,
This should be a trivial question, but I am still failing to work out the details. I
have 2 cores plus the default collection:
*collection1:*
article_id
title
content
*core0:*
cluster_id
cluster_name
cluster_count
*core1:*
article_id
article_cluster_id
score
Given an article_id, I want to return top 10 ( based
I am trying to use Solr's DataImportHandler to index a large number of database
records in a SQL Server database that is owned and managed by a group we are
collaborating with. The indexing jobs I have run so far, except for the initial
very small test runs, have failed due to database connectio