Dear reader,
I'm trying to use Solr for a hierarchical search:
metadata from the higher-level elements is copied to the lower ones,
and each element carries the complete OCR text that belongs to it.
At volume level, of course, we will have the complete ocr text in one
and we need to store it for
ch happier (speed of some
> operations). Might be something to test if all else fails.
Ok...
Thanks,
J. Barth
> Regards,
>Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr
> proficiency
>
>
> On Tue, Apr 29, 2014 at 3:28 PM, Jochen Barth
> wrote:
>> Dear reader,
>>
>> I'm trying to use solr for a hierarchical search:
>> metadata from the higher-levelled elements is copied to the lower on
Dear Shawn,
see attachment for my first "brute force" no-compression attempt.
Kind regards,
Jochen
Quoting Shawn Heisey:
On 4/29/2014 4:20 AM, Jochen Barth wrote:
BTW: stored field compression:
are all "stored fields" within a document put into one compressed chunk,
or are they compressed on a per-field basis?
Here's the issue that added the compression to Lucene:
https://issues.apache.org/
buted on disk and more I/O is necessary to
retrieve the fields (usually this is a concern when storing large
fields, like the entire contents of a document)."
But in my case (with docValues=true) there should be no reason to
access *.fdt.
Kind regards,
Jochen
Quoting Jo
I found out that "storing" documents as separate docs+id does not
help either.
You must have a completely separate collection/core to get things working fast.
Kind regards,
Jochen
Quoting Jochen Barth:
Ok, https://wiki.apache.org/solr/SolrPerformanceFactors
states th
q=ocr:abc AND (id:x1 OR id:x2 OR id:x3 OR id... ... id:x1000)
Why?
Kind regards,
Jochen Barth
--
J. Barth * IT, Universitaetsbibliothek Heidelberg * 06221 / 54-2580
pgp public key:
http://digi.ub.uni-heidelberg.de/barth%40ub.uni-heidelberg.de.asc
ecause of direct match (and not via {!graph... ) ?
The only way to do so seems to be a {!boost before the {!graph, but what I can do
there depends neither on the match nor on the {!graph, I think.
Kind regards,
Jochen
--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
"from" : "id",
"query" : "fulltext_ocr_txtlarge:troja",
"to" : "parent_ids",
"useAutn" : "true"
}
}
]
}
},
"to" : "id"
}
}
]
}
}
}
Kind regards,
Jochen
--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
Dear reader, I've found a different solution for my problem
and don't need a depth-dependent score anymore.
Kind regards, Jochen
On 19.02.19 at 14:42, Jochen Barth wrote:
Dear reader,
I have a hierarchical graph "like a book":
{ id:solr_doc1; title:book }
{ id
rcher are in unnamed module of loader
org.eclipse.jetty.webapp.WebAppClassLoader @c1fca1e)
Ooops... even commit does not work.
Did rollback. Helps.
Did delete without the -_query_:"..." part, works.
Kind regards.
Jochen
--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
Dear reader,
why does +(-x_ss:y) find 0 docs,
while -(+x_ss:y) finds many docs?
Ok... +(*:* -x_ss:y) works, too, but I'm a bit surprised.
Kind regards, J. Barth
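
The behaviour asked about above can be reproduced with a toy model. This is a minimal sketch, not Solr code: it assumes that a boolean query containing only prohibited clauses matches nothing because there is no base set to subtract from, and that the query parser adds an implicit *:* only when the top-level query is purely negative. The index contents, the helper names (term, boolean, top_level) and the use of Python sets are all illustrative assumptions; only the field name x_ss comes from the message.

```python
# Toy model of Lucene-style boolean matching over sets of document ids.
# The index contents and all helper names are illustrative assumptions.
DOCS = {1: {"y"}, 2: {"z"}, 3: {"y", "z"}, 4: set()}
ALL_DOCS = frozenset(DOCS)  # stands in for *:*


def term(value):
    """Ids of documents whose x_ss field contains the given value."""
    return frozenset(doc_id for doc_id, values in DOCS.items() if value in values)


def boolean(must=(), must_not=(), should=()):
    """Evaluate one boolean query: required, prohibited and optional clauses."""
    if must:
        hits = set(ALL_DOCS)
        for clause in must:
            hits &= clause
    elif should:
        hits = set().union(*should)
    else:
        hits = set()  # only prohibited clauses: nothing to subtract from
    for clause in must_not:
        hits -= clause
    return frozenset(hits)


def top_level(must=(), must_not=(), should=()):
    """Like boolean(), plus the assumed top-level fix-up for pure negatives."""
    if must_not and not must and not should:
        must = (ALL_DOCS,)  # implicit *:*
    return boolean(must, must_not, should)


# +(-x_ss:y): the nested query is purely negative and matches nothing
plus_of_minus = top_level(must=[boolean(must_not=[term("y")])])

# -(+x_ss:y): purely negative at the top level, so *:* is added
minus_of_plus = top_level(must_not=[boolean(must=[term("y")])])

# +(*:* -x_ss:y): the explicit *:* gives the nested query a base set
explicit_all = top_level(must=[boolean(must=[ALL_DOCS], must_not=[term("y")])])

print(sorted(plus_of_minus))  # []
print(sorted(minus_of_plus))  # [2, 4]
print(sorted(explicit_all))   # [2, 4]
```

Under these assumptions the third form, with an explicit *:* inside the parentheses, is the one that does not depend on where the negative clause sits in the query.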
"],"must_not":[{"join":{"from":"parent_ids","query":{"bool":{"must":[{"bool":{"should":[{"bool":{"should":[{"graph":{"from":"parent_ids","query"
Oops... my Thunderbird did not preserve the colors...
the keywords to look for in the query are
»facet.field=%7B%21ex%3Dtype_s%7Dtype_s«
and
»{"#type_s":"type_s:article"}«
Kind regards, Jochen
On 26.08.19 at 15:25, Jochen Barth wrote:
Dear reader,
I'm trying to do this:
}}]}},"class_s:meta"],"must_not":[{"join":{"from":"id","query":{"bool":{"must":[{"bool":{"must":["sort_shelflocator_s:cod\\ pal\\ lat\\ 00*"],"should":[{"graph":{"from":"parent_ids","query":"parent_ids:\"/digi.ub.uni-heidelberg.de/collection/sammlung51\"","to":"id"}},{"graph":{"from":"parent_ids","query":"parent_ids:\"/digi.ub.uni-heidelberg.de/collection/sammlung52\"","to":"id"}}]}},"class_s:meta"]}},"to":"parent_ids"}}]}}}
--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
ies possible?
I've got some complex queries with a "kernel" somewhat below the top
level...
Is "canonical" JSON important for matching a query cache entry?
Would it help to serialize these queries to standard syntax and then use
filter(...)?
Kind regards,
Jochen
--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
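
On the "canonical" JSON question above: whether Solr's cache lookup depends on the request text at all is exactly what is being asked, and this sketch does not answer it. It only shows one way a client could rule out key order and whitespace as a source of differences before sending a JSON query. The function name canonical_json and the two sample queries are hypothetical; the field names are borrowed from the queries in this thread.

```python
import json


def canonical_json(query):
    """Serialize a JSON query with sorted keys and no optional whitespace."""
    return json.dumps(query, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Two hypothetical requests that differ only in the order of their keys.
query_a = {"graph": {"from": "parent_ids", "to": "id", "query": "class_s:meta"}}
query_b = {"graph": {"query": "class_s:meta", "to": "id", "from": "parent_ids"}}

canonical_a = canonical_json(query_a)
canonical_b = canonical_json(query_b)

print(canonical_a == canonical_b)  # True
print(canonical_a)
```

Sorting keys does not reorder list items, so two "must" or "should" arrays with the same clauses in a different order still serialize differently; a client would have to sort those itself where clause order carries no meaning.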
mann\"],id=parent_ids] [TraversalFilter: +class_s:meta
-type_s:multivolume_work -type_s:periodical -type_s:issue
-type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]))
+class_s:meta)],id=parent_ids][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false])"
wdiff of both:
"GraphQuery([[filter(+(([[meta_title_txt:\"h"+GraphQuery([[+filter(+(([[meta_title_txt:\"h
... [TraversalFilter:class_s:meta +class_s:meta -type_s:multivolume_work ...
[TraversalFilter:class_s:meta +class_s:meta -type_s:multivolume_work
...
so the + before the »filter(« shouldn't be strictly necessary, nor should it be the
problem,
and the + before class_s:meta isn't necessary either, but it can't be the
problem either, in my opinion.
What I found out is that "+" and "-" have higher precedence than "AND"
and "OR"... but I don't see my error...
Does someone have a hint for me?
Kind regards,
Jochen
--
Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
On 04.12.19 at 12:39, Jochen Barth wrote:
Dear reader, I'm using Solr 8.1.1.
I'm trying to switch from q.op=OR to q.op=AND, because the parser with which I
generate the queries for Solr is somewhat simpler to develop with
q.op=AND.
But the new query is returning fewer hits;
I have sho
Mea culpa...
I ran the different queries against two different Solr instances.
Everything works fine.
Kind regards,
Jochen
On 04.12.19 at 13:20, Jochen Barth wrote:
Found
https://cwiki.apache.org/confluence/display/lucene/BooleanQuerySyntax
But this does not explain the problem...
Oh