Hi,
Using CursorMark we overcome the deep-paging problem, so far so good. As far
as I understand, the cursorMark is unique for each and every query, depending on
the sort values other than the unique id, and it also depends on the number of rows.
But my concern is if Solr internally creates a different set for each
an
I indexed an electronics e-commerce product catalog.
This is a typical document from my collection:
"docs": [
{
"prezzo_vendita_d": 39.9,
"codice_produttore_s": "DK00150020",
"codice_s": "5.BAT.27407",
"descrizione": "BATTERIA GO PRO HERO ",
"barcode
Hi,
Do you need to "crawl" XML files by MCF?
I would index XML files by DIH's XMLEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler
Or, I would parse XMLs and post to Solr on my own. Java(with SolrJ) or any
scripting language is convenient.
Sorry if it's not helpful.
Regards,
T
I don't have formal benchmarks, but we did get significant performance
gains by switching from a RAMDirectory to a MMapDirectory on tmpfs,
especially under parallel queries. Locking seemed to pull down the former.
On 23 Jan 2015 06:35, "deniz" wrote:
> Would it boost any performance in case the
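For reference, the directory switch is a one-line change; a minimal solrconfig.xml sketch, assuming the core's dataDir sits on a tmpfs mount:

<!-- solrconfig.xml: memory-map index files instead of holding them on the JVM heap;
     putting dataDir on tmpfs keeps the index in RAM without RAMDirectory's locking -->
<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>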
Try adding -Dauto=true and take away the url setting. The type probably isn't
needed then either.
With the new Solr 5 bin/post it sets auto=true implicitly.
Erik
> On Jan 26, 2015, at 17:29, Mark wrote:
>
> Fantastic - that explains it
>
> Adding -Durl="
> http://localhost:8983/solr/upd
Hi,
I am getting suggestions for both correct words and misspelled
words, but I am not getting stop-word suggestions. Why? I am not even using
solr.StopFilterFactory.
Schema.xml :
**
Hi
I have successfully created a really cool Lucene41x8PostingsFormat class
(a copy of the Lucene41PostingsFormat class modified to use 8 times the
default block size), registered the format as required. In the
schema.xml I have created a field type string with this postingsformat
and lastly I
Hi,
were you able to resolve the problem?
I am facing the same issue. Highlighting is shown in the XML results but not on the
velocity template.
-Nico
Hello Trym,
Can you clarify which blockSize you mean? And a second question, just to
avoid unnecessary explanation: do you know what Pulsing is?
On Tue, Jan 27, 2015 at 2:28 PM, Trym Møller wrote:
> Hi
>
> I have successfully created a really cool Lucene41x8PostingsFormat class (a
> copy of the Lu
Hi
Thanks for your clarifying questions.
In the constructor of the Lucene41PostingsFormat class, the minimum and
maximum block sizes are provided. These sizes are used when creating the
BlockTreeTermsWriter (responsible for writing the .tim and .tip files of
the Lucene index). It is the blocksiz
On Tue, Jan 27, 2015 at 3:29 AM, CKReddy Bhimavarapu
wrote:
> Hi,
> Using CursorMark we overcome the deep-paging problem, so far so good. As far
> as I understand, the cursorMark is unique for each and every query, depending on
> the sort values other than the unique id, and it also depends on the number of rows.
> B
I'm not sure the query you provided will do what you want, BUT I did find
the bug in the code that is causing the NullPointerException.
The variable context is supposed to be global, but when prepare() is
called, it is only defined in the scope of that function.
Here's the simple patch:
Index: c
I think the word break spellchecker will do what you want. But, if I were you,
I'd dial back "maxChanges" to 1 or 2. You don't want it slicing a word into 10
parts or trying to combine 10 adjacent words. You also need the
"minBreakLength" to be no more than 2, if you want it to break "go" (le
Thanks a lot! I'll try this out later this morning. If group.func and
group.field don't combine the way I think they might, I'll try to look for
a way to put it all in group.func.
On Tuesday, January 27, 2015, Jim.Musil wrote:
> I'm not sure the query you provided will do what you want, BUT I
Can you give a little more information as to how you have the spellchecker
configured in solrconfig.xml? Also, it would help if you showed a query and
the spell check response and then explain what you wanted it to return vs what
it actually returned.
My guess is that the stop words you ment
Hi Gurus,
We get the error below when using Solr with hybris. For a long time we did not
get this error, but since it began, we keep getting it.
I searched and googled but found nothing. I use Solr 4.6.1, and I must use it
because it is embedded in hybris.
I increased maxidletime to 30
: But my concern is if solr internally creates a different set for each
: and every different query upon sort values and they last forever I
: think.
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
"Cursors in Solr are a logical concept, that doesn't involve cach
A complete stab in the dark, as I don't know what the client does with the
sessions.
But it looks like the connection has been cut to the SolrJ client. If you
connect to the server, then do nothing for a while with the connection, and
then start getting errors, I would check whether you have a fir
Good, I'll try.
But imagine I have 100 documents containing "go pro" and 150 documents
containing "gopro".
Suggestions of the "other" term do not come up in any case.
2015-01-27 16:21 GMT+01:00 Dyer, James-2 [via Lucene] <
ml-node+s472066n4182254...@n3.nabble.com>:
> I think the word break spellc
Hi. I'm trying to sort on computed values that my QParserPlugin creates. I've
found out that I can make it happen by adding a fake scorer to the delegates
before collect() is called. The problem I have now is how to modify the sort
field in mid stream?
The user selects COUNT as a sort field, but o
Hello,
you can modify query params in prepare(), before QueryComponent runs. You can set
the sorting to score and have your component compute it based on params.
Just an off-topic note: a literal sorting extension might be done via
FieldComparatorSource, which is described in LIA. This proper way might
not suit
When using group.main=true, the results are not mixed as you expect:
"If true, the result of the last field grouping command is used as the
main result list in the response, using group.format=simple”
https://wiki.apache.org/solr/FieldCollapsing
Jim
On 1/27/15, 9:22 AM, "Ryan Josal" wrote:
>
Hmm... it's not blocks that I'm familiar with. Regarding the performance impact
of bigger ID blocks: it matters if you have IDs and send
updates for existing docs. And IDs are also used for some of the distributed
search stages, I suppose. Here it is.
On Tue, Jan 27, 2015 at 4:33 PM, Trym Møller wrote:
> Hi
>
Hi Erick
I tried this link but do not see a straightforward answer.
For example it says:
"You can use the pseudo-field feature to return the distance along with the
stored fields of each document by adding fl=geodist() to the request"
So I tried:
...?q={!func}dist(2, lat, lng, 0, 0)&fl=geodist()
Interestingly, you can do something like this:
group=true&
group.main=true&
group.func=rint(scale(query({!type=edismax v=$q}),0,20))& // puts into
buckets
group.limit=20& // gives you 20 from each bucket
group.sort=category asc // this will sort by category within each bucket,
but this can be a f
Hi,
What is the recommended way to import and update index records?
I've read the documentation and I've experimented with full-import and
delta-import and I am not seeing the desired results.
Basically, I have 15 RSS feeds that I am importing through
rss-data-config.xml.
The first RSS fee
Also, if I try full-import and clean=false with the same XML file, I end
up with more records each time the import runs. How can I make SOLR
just add the records that are new by id, and update the ones that have
an id that matches the one in the existing index?
On 1/27/15, 11:32 AM, Carl Rob
What do you mean by "update"? If you mean partial update, DIH does not
do it AFAIK. If you mean replace, it should.
If you are getting duplicate records, maybe your uniqueKey is not set correctly?
clean=false looks to me like the right approach for incremental updates.
Regards,
Alex.
Sig
Did you try just setting up the demo and then trying the example? I often
find that if I advance step-by-step from the example, it is much easier
to figure things out.
sfield is the name of the field containing the location information,
"store" in the examples
Both "store", and "pt" are required to be
Thanks Eric
However
java -classpath dist/solr-core-4.10.3.jar -Dauto=true
org.apache.solr.util.SimplePostTool C:/temp/samplemsg/*.msg
Fails with:
Posting files to base url http://localhost:8983/solr/update..
Entering auto mode. File endings considered are
xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xl
Hello!
SpellingQueryConverter "parses" the incoming query in sort of a quick and
dirty way with a regular expression. Is there a reason the query string
isn't parsed with the _actual_ parser, if one was configured for that type
of request? Even better, could the parsed query object be added to the
Hi,
I think I found a bug in the AdminUI.
When I create a new collection with the Collection API, the name of the
core is displayed incorrectly in the AdminUI.
This is the call:
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&collection.configName=myconfig&property.name=m
Hello,
I am thinking of combining the output of the IndexBasedSpellChecker
with the language rules from languagetool (languagetool.org).
Wondering if this has already been implemented?
Thanks,
Peter
Sorry, there is no great workaround. You might try raising the max idle time
for your container - perhaps that makes it less frequent.
- Mark
On Tue Jan 20 2015 at 1:56:54 PM Nishanth S wrote:
> Thank you Mike. Sure enough, we are running into the same issue you
> mentioned. Is there a quick fix fo
Brilliant! I didn't know what the prepare() method of a SearchComponent could
do. I can modify the SolrParams by request.setParams() and add custom data
to the request context to tell my Collector how to figure the score. Thank
you!
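A minimal sketch of that idea with hypothetical names (the component would be registered in solrconfig.xml so it runs before QueryComponent):

import java.io.IOException;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class ComputedSortComponent extends SearchComponent {
  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    ModifiableSolrParams params = new ModifiableSolrParams(rb.req.getParams());
    params.set("sort", "score desc"); // redirect the user's COUNT sort onto the score
    rb.req.setParams(params);
    rb.req.getContext().put("computeCount", Boolean.TRUE); // hypothetical flag for the collector
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    // nothing to do; QueryComponent runs with the rewritten params
  }

  @Override
  public String getDescription() { return "rewrites sort params in prepare()"; }

  @Override
  public String getSource() { return null; }
}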
Your IDs seem to be the file names, which you are probably also getting
from parsing the file. Can't you just set (or copyField) that as an ID
on the Solr side?
Alternatively, if you don't actually have good IDs, you could look into
UpdateRequestProcessor chains with UUID generator.
Regards
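For the no-good-IDs case, a hedged solrconfig.xml sketch (assumes the uniqueKey field is named id; select it per request with update.chain=uuid):

<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>  <!-- fills id with a fresh UUID when it is absent -->
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>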
Hi all,
Recently I got an interesting use case that I'm not sure how to implement. The
idea is that the client wants a fixed number of documents, let's call it N, to
appear at the top of the results. Let me explain a little: we're working with
web documents, so the idea is to promote the documen
You need to set "spellcheck.alternativeTermCount" to a value greater than zero.
Without it, spellcheck will never suggest for something in the index.
See
https://cwiki.apache.org/confluence/display/solr/Spell+Checking#SpellChecking-Thespellcheck.alternativeTermCountParameter
James Dyer
Ingram
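A hedged sketch of where that parameter can live, as a handler default in solrconfig.xml (handler name and values are assumptions):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">on</str>
    <!-- suggest up to 2 alternatives even for terms that exist in the index -->
    <str name="spellcheck.alternativeTermCount">2</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>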
Having worked with the spellchecking code for the last few years, I've often
wondered the same thing, but I never looked seriously into it. I'm sure
there are probably some serious hurdles, hence the Query Converter. The easy
thing to do here is to use "spellcheck.q", and then pass in space-deli
Hi Alex,
On an individual file basis that would work, since you could set the ID on
an individual basis.
However, recursing a folder it doesn't work, and worse still the server
complains, unless on the server side you can use the UpdateRequestProcessor
chains with the UUID generator as you suggested.
This is great, thanks Jim. Your patch worked and the sorting solution
meets the goal, although group.limit seems like it could cut various
results out of the middle of the result set. I will play around with it
and see if it proves helpful. Can you let me know the Jira so I can keep
an eye on it
On 1/27/2015 10:40 AM, Alexander Albrecht wrote:
> Hi,
> I think I found a bug in the AdminUI.
>
> When i create a new collection with the Collection API, the name of the
> core is displayed incorrectly in the AdminUI.
>
> This is the call:
>
> http://localhost:8983/solr/admin/collections?action=CREATE&
Hi Shawn,
I got it to work by using this script to start my instance of Solr:
java -Dhttp.proxyHost=http-proxy-server -Dhttp.proxyPort=80
-Dhttps.proxyHost=http-proxy-server -Dhttps.proxyPort=80
-Dlog4j.debug=true
-Dlog4j.configuration=file:///Users/carlroberts/dev/solr-4.10.3/log4j.xm
Hi Alex, thanks for clarifying this for me. I'll take a look at my
setup of the uniqueKey. Perhaps I did not set it right.
On 1/27/15, 12:09 PM, Alexandre Rafalovitch wrote:
What do you mean by "update"? If you mean partial update, DIH does not
do it AFAIK. If you mean replace, it should.
I
One last piece of advice: have a look at how Solr modifies params by chaining,
defaults, and appends. I don't know why exactly it's done in such a trendy way;
it just seems serious. Keep to the same way.
On Tue, Jan 27, 2015 at 8:58 PM, tedsolr wrote:
> Brilliant! I didn't know what the prepare() method of a Sear
Hello,
if I get you right, it's a frequently requested feature, but it requires a
really deep hack like
https://issues.apache.org/jira/browse/LUCENE-6066
On Tue, Jan 27, 2015 at 9:28 PM, Jorge Luis Betancourt González <
jlbetanco...@uci.cu> wrote:
> Hi all,
>
> Recently I got an interesting use case
Can this be done as a custom post-filter with the recent Solr improvements?
Regards,
Alex.
Sign up for my Solr resources newsletter at http://www.solr-start.com/
On 27 January 2015 at 14:22, Mikhail Khludnev
wrote:
> Hello,
> if I get you right it's frequently requested feature, but it
Yes, I’m trying to pin down exactly what conditions cause the bug to
appear. It seems as though it’s only when using the query function.
Jim
On 1/27/15, 12:44 PM, "Ryan Josal" wrote:
>This is great, thanks Jim. Your patch worked and the sorting solution
>meets the goal, although group.limit se
I am using Solr 4.2
I added
class="solr.SpatialRecursivePrefixTreeFieldType"
according to
https://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
and have spatial4j-0.3.jar in my project.
When running the indexer I started getting this error:
java.lang.NoClassDefFoundError: com/google/common/cache/CacheBuilder
at
org.apache.solr.sch
I think it can, but it's a tricky sort of thing to implement.
On Tue, Jan 27, 2015 at 10:29 PM, Alexandre Rafalovitch
wrote:
> Can this be done as a custom post-filter with the recent Solr improvements?
>
> Regards,
>Alex.
>
> Sign up for my Solr resources newsletter at http://www.solr-st
Hi,
I have tried to reindex to add a new field named product-info and no
matter what I do, I cannot get the new field to appear in the index
after import via DIH.
Here is the rss-data-config.xml configuration (field product-info is the
new field I added):
readTimeout="3"/>
In the end I didn't find a way to add a new file/mime type for recursing a
folder.
So I added msg to the static string and the MIME map.
private static final String DEFAULT_FILE_TYPES =
"xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log,msg";
mimeMap.put("msg"
Hi vit,
FYI that’s the old wiki and it contains a notice in red at the top that you
should refer to its replacement at:
https://cwiki.apache.org/confluence/display/solr/Spatial+Search That said,
there are some small details that haven’t been migrated but there is
outdated info too so beware.
The
Here’s the issue:
https://issues.apache.org/jira/browse/SOLR-7046
Jim
On 1/27/15, 12:44 PM, "Ryan Josal" wrote:
>This is great, thanks Jim. Your patch worked and the sorting solution
>meets the goal, although group.limit seems like it could cut various
>results out of the middle of the resul
I too am running into what appears to be the same thing.
Everything works and data is imported but I cannot see the new field in
the result.
Hi Shawn,
Here is an update. We found the main issue.
We configured our cluster to run under jetty, and when we tried full
indexing we did not see the original Invalid Chunk error. However, the
replicas still went into recovery.
All this time we have been trying to look into the replica logs to diagnos
Hello,
We have a SolrCloud cluster (5 shards and 2 replicas) on 10 boxes and three
zookeeper instances. We have noticed that when a leader node goes down, the
replica never takes over as leader; the cloud becomes unusable and we have to
bounce the entire cloud for the replica to assume the leader role. Is this
I have this in my solrconfig:
explicit
10
catch_all
on
default
wordbreak
false
5
Well - I got this to work. I noticed that when log4j is enabled,
product-info was in the import as product-info=[], so I then played with
the field and got this definition to work in the rss-data-config.xml file:
xpath="/nvd/entry/vulnerable-software-list/product" commonField="false"
regex=":" r
One xpath per field definition. You had two fields for the same xpath.
If they were the same value, the best bet would be to deal with it via
copyField in the schema.
No idea why the regex thing makes a difference. Are you sure the other
field is also still being indexed?
Regards,
Alex.
Sign
If you can index stuff into a new schema for test, try defining one
with dynamicField name=* stored=true indexed=true type=string. Your
schema may have one like this commented out and/or set to false.
This would show you exactly what you are indexing and solve whether
you have any spelling or form
You are right - I just checked and now the other field
(vulnerable-software) that is using the same xpath has blank values.
BTW - It looks like this also works:
commonField="false" regex=":" replaceWith=" "/>
Here are the results for one row in json:
"responseHeader":{
"status":0,
On 27 January 2015 at 17:47, Carl Roberts wrote:
> commonField="false" regex=":" replaceWith=" "/>
Yes, that works because the transformer copies it, not the
EntityProcessor. So, no conflict on xpath.
Regards,
Alex.
Sign up for my Solr resources newsletter at http://www.solr-start.com/
Apologies in advance for hijacking the thread, but somewhat related, does
anyone have experience with using cursorMark and elevations at the same
time? When I tried this, either passing elevatedIds via solrJ or specifying
them in elevate.xml, I got an AIOOBE if a cursorMark was also specified.
When
OK - I did a little testing and with full-import and clean=false, I get
more and more records when I import the same XML file. I have also
checked and I see that my uniqueKey is defined correctly.
Here are my fields in schema.xml:
multiValued="true"/>
indexed="true" stored=
On 27 January 2015 at 18:44, Carl Roberts wrote:
> OK - I did a little testing and with full-import and clean=false, I get more
> and more records when I import the same XML file. I have also checked and I
> see that my uniqueKey is defined correctly.
1) Is this a SolrCloud or a single core setu
>
Make that id field a string and reindex. text_general is not the right
type for a unique key.
Regards,
Alex.
Yep - it works with string. Thanks a lot!
On 1/27/15, 7:08 PM, Alexandre Rafalovitch wrote:
Make that id field a string and reindex. text_general is not the right
type for a unique key.
Regards,
Alex.
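For anyone hitting the same thing, a minimal schema.xml sketch of the fix (the field name is the one from this thread):

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>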
On 1/27/2015 2:52 PM, Vijay Sekhri wrote:
> Hi Shawn,
> Here is some update. We found the main issue
> We have configured our cluster to run under jetty and when we tried full
> indexing, we did not see the original Invalid Chunk error. However the
> replicas still went into recovery
> All this tim
What version of Solr? This is an area of ongoing improvement, and several
fixes are very recent.
Try searching the Solr JIRA for details.
Best,
Erick
On Tue, Jan 27, 2015 at 1:51 PM, Joshi, Shital wrote:
> Hello,
>
> We have SolrCloud cluster (5 shards and 2 replicas) on 10 boxes and three
> zoo
Hi,
I am attempting to run all these curl commands from a script so that I
can put them in a crontab job; however, it seems that only the first one
executes and the others return an error (below):
curl
"http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&en
Hi All,
I am using SolrCloud 4.6.1. If I use CloudSolrServer to add a record
to Solr, then I see the following commit update command on both the master and
the slave node:
2015-01-27 15:20:23,625 INFO org.apache.solr.update.UpdateHandler: start
commit{,optimize=false,openSearcher=true,waitSear
Hi Jorge,
We have done a similar thing with N=3. We issue two separate queries/requests
and display the 'special N' above the results.
We excluded the 'special N' with a -id:(1 2 3 ... N) type query, all done on
the client side.
Ahmet
On Tuesday, January 27, 2015 8:28 PM, Jorge Luis Betancourt González
wrote
Po
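For illustration, a hedged SolrJ sketch of that two-request approach (the ids and the query are placeholders):

import org.apache.solr.client.solrj.SolrQuery;

public class PromotedThenRest {
  public static void main(String[] args) {
    // Request 1: fetch the N promoted documents by id (N=3 here).
    SolrQuery promoted = new SolrQuery("id:(1 2 3)");
    promoted.setRows(3);

    // Request 2: the user's query, excluding the promoted ids so they don't repeat.
    SolrQuery rest = new SolrQuery("some user query");
    rest.addFilterQuery("-id:(1 2 3)");

    System.out.println(promoted);
    System.out.println(rest);
  }
}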
On Jan 28, 2015 1:54 AM, "vit" wrote:
> I am using Solr 4.2
>
> I added
> class="solr.SpatialRecursivePrefixTreeFieldType"
>
> according to
> http://";>
> https://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
> and have spatial4j-0.3.jar in my project.
>
> When running the indexer I sta
I want to reindex my data in order to change the value of some field according
to the value of another (both fields already exist).
For this purpose I run a "clue" utility in order to get a list of IDs.
Then I created an update processor, which can set the value of field A
according to the value of field
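A minimal sketch of such a processor, with hypothetical field names and a placeholder rule:

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class DeriveFieldProcessor extends UpdateRequestProcessor {
  public DeriveFieldProcessor(UpdateRequestProcessor next) { super(next); }

  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();
    Object b = doc.getFieldValue("fieldB"); // hypothetical source field
    if (b != null) {
      doc.setField("fieldA", derive(b));    // hypothetical target field
    }
    super.processAdd(cmd);                  // pass the doc down the chain
  }

  private Object derive(Object b) { return b; } // placeholder mapping rule
}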
Hi Guys,
I have multiple cores set up in my Solr server. I would like to read/import data
from one core (source) into another core (target) and index it. Is there an
easy way in Solr to do so?
I was thinking of using SolrEntityProcessor for this purpose; any other
suggestions are appreciated.
http:
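SolrEntityProcessor should work for that; a hedged data-config sketch (the source URL and field list are assumptions):

<dataConfig>
  <document>
    <entity name="source" processor="SolrEntityProcessor"
            url="http://localhost:8983/solr/source-core"
            query="*:*"
            rows="500"
            fl="id,name,description"/>
  </document>
</dataConfig>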