Hi,
I have some product details; when I search for several different products at
a time, it is not working.
I am using edismax. Configured filter query in the following way.
{!edismax v=$c}
_query_:"{!field f=product v=$p}"
For example, I am using the following query for filtering results:
http://
Hi All,
I did some research on this and found some alternatives useful to my
usecase. Please give your ideas.
Can I update all documents indexed after a /dataimport query using the
last_index_time in dataimport.properties?
If so can anyone please give me some pointers?
What I currently have in
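For reference, the standard DIH delta-import pattern looks roughly like the
sketch below (table and column names are hypothetical); DIH substitutes the
timestamp it recorded in dataimport.properties into
${dataimporter.last_index_time}:

    <entity name="item" pk="id"
            query="SELECT * FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE last_modified > '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT * FROM item WHERE id='${dih.delta.id}'"/>

Running /dataimport?command=delta-import then re-indexes only the rows
changed since the last run.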
I'm guessing that "id" in your schema.xml is also a unique key field.
If so, each document must have an id field or Solr will refuse to
index them.
DataImportHandler will map the id field in your table to Solr schema's
id field only if you have not specified a mapping.
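For illustration, the relevant pieces might look like this (entity and table
names are hypothetical; note that Oracle typically returns column names
upper-cased, which is a common reason the implicit mapping fails):

    schema.xml:
        <uniqueKey>id</uniqueKey>

    data-config.xml:
        <entity name="test" query="SELECT column1, column2, id FROM test_table">
          <field column="ID" name="id"/>
        </entity>

The explicit <field column="..." name="..."/> mapping is only needed when the
column name doesn't match the Solr field name exactly.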
Unfortunately we can't do sharding right now.
If we optimize on master and slave separately, the file names and sizes are
the same; I think it's just the version number that is different. Maybe if
there was a way to copy the master version to the slave, that would resolve
this issue?
Hi,
I have stored=true for my "content" field, but I get an error saying there is a
mismatch of settings on that field (I think) because of the "term*=true"
settings.
Thanks again,
Fatima
A legitimate question that only you can answer is
"what's the value of faceting on fields with so many unique values?"
Consider the ridiculous case of faceting on your uniqueKey field. There's
almost exactly zero value in faceting on it, since all counts will be 1.
By analogy, with millions of tag values, will there
When you're doing hard commits, is it with openSearcher = true or
false? It should probably be false...
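Something like this solrconfig.xml sketch (the intervals are illustrative,
not a recommendation for your particular load):

    <autoCommit>
      <maxTime>600000</maxTime>           <!-- hard commit every 10 min, for durability -->
      <openSearcher>false</openSearcher>  <!-- don't open a new searcher on hard commit -->
    </autoCommit>
    <autoSoftCommit>
      <maxTime>60000</maxTime>            <!-- soft commits control visibility of new docs -->
    </autoSoftCommit>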
Here's a rundown of the soft/hard commit consequences:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
I suspect (but, of course, can't prove)
I've been thinking of using Node.js as a thin layer between the client and the
Solr servers. It seems pretty handy for adding features like throttling, load
balancing, and basic authentication. -lianyi
I thought about Go, but that does not give the advantages of spanning
client and server like Dart and Node/Javascript. Which is why Dart
felt a bit more interesting, especially with tree-shaking of unused
code.
But then, neither language has enough adoption to be an answer to my
original question.
Thanks Mark. I tried updating the clusterstate manually, and things went
haywire :). So to fix it, I had to take 30 secs to 1 min of downtime where I
stopped Solr and ZK, deleted the "/zookeeper_data/version-2" directory, and
restarted everything again.
I have automated these commands via Fabric, so was easily able to re
Quoting Mikhail Khludnev:
On Wed, Jan 22, 2014 at 10:17 PM, wrote:
I know that I can't just make a query like this: {!parent
which=is_parent:true}+Term, most likely I'll get this error: child query
must only match non-parent docs, but parent docID= matched
childScorer=class org.apache
Hopefully an issue that has been fixed then. We should look into that.
You should be able to fix it by directly modifying the clusterstate.json in
ZooKeeper. Remember to back it up first!
There are a variety of tools you can use to work with ZooKeeper - I like the
eclipse plug-in that you can
Solr 4.4.0
On Wed, Jan 22, 2014 at 3:12 PM, Mark Miller wrote:
> What version of Solr are you running?
> - Mark
What version of Solr are you running?
- Mark
On Jan 22, 2014, 5:42:30 PM, Utkarsh Sengar wrote: I am
not sure what happened, I updated merchant collection and then
restarted all the solr machines.
This is what I see right now: http://i.imgur.com/4bYuhaq.png
merchant collection looks fine.
You will need to use DocValues if you want to facet on this number of terms
without blowing up the heap.
I have facets with ~39M unique terms; the response time is about 10 to 40
seconds, which in my case is not a problem.
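A minimal schema.xml sketch of a DocValues-backed facet field (the field name
is hypothetical):

    <field name="tag" type="string" indexed="true" stored="false"
           docValues="true" multiValued="true"/>

With docValues enabled, faceting reads column-oriented structures from disk
rather than un-inverting the whole field onto the heap.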
--
Yago Riveiro
Hi,
I am going to evaluate some Lucene/Solr capabilities for handling faceted
queries, in particular with a single facet field that contains a large number
(say up to 1 million) of distinct values. Does anyone have experience with
how Lucene performs in this scenario?
e.g.
Doc1 has tags A B C D
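For context, a typical facet request over such a field might look like this
(field name hypothetical):

    q=*:*&facet=true&facet.field=tag&facet.limit=100&facet.mincount=1

The concern is the memory cost of un-inverting a field with ~1 million
distinct values; see the DocValues suggestion above.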
I am not sure what happened, I updated merchant collection and then
restarted all the solr machines.
This is what I see right now: http://i.imgur.com/4bYuhaq.png
merchant collection looks fine. But the deals and prodinfo collections should
have a total of 3 shards. But somehow shard1 has converted to
Cool Mark, I'll keep an eye on this one.
L
On 22/01/2014 22:36, Mark Miller wrote:
Whoops, hit the send keyboard shortcut.
I just created a JIRA issue for the first bit I’ll be working on:
SOLR-5656: When using HDFS, the Overseer should have the ability to reassign
the cores from failed nodes to running nodes.
Hi,
I am trying to use the DataImportHandler (Solr 4.6) with an Oracle database,
but I have some issues mapping the data.
I have 3 columns in the test_table,
column1,
column2,
id
dataconfig.xml
Is
Salman,
To my knowledge, there's not a great way of doing this.
Perhaps if your dataset were based on a time series, you could shard by
date, and then only a smaller segment of your data would be updated and
therefore need to be sent each week?
Michael Della Bitta
Applications Developer
On Wed, Jan 22, 2014 at 10:17 PM, wrote:
> I know that I can't just make a query like this: {!parent
> which=is_parent:true}+Term, most likely I'll get this error: child query
> must only match non-parent docs, but parent docID= matched
> childScorer=class org.apache.lucene.search.TermScorer
Whoops, hit the send keyboard shortcut.
I just created a JIRA issue for the first bit I’ll be working on:
SOLR-5656: When using HDFS, the Overseer should have the ability to reassign
the cores from failed nodes to running nodes.
- Mark
I would love to see some proxy-like application implemented in Go (partly out
of my desire to find time to check out Go).
Hello Daniel,
I have an idea: try to use coord() here. Check
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html and
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/package-summary.html
So, if you can override similar
I am using the fuzzy search functionality with Solr 4.1 and am having
problems with the fuzzy search results when fuzzy level 2 is used.
Here is a description of the issue:
I have an index that consists of one main core that is generated by merging
many other cores together.
If I fuzzy search wi
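For reference, fuzzy level 2 means a maximum edit distance of 2, requested
with the ~ operator (field and term hypothetical):

    name:nexus~2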
Thank you very much!!
Just to recap.
My solrconfig.xml had the tvComponent, and when I removed that it works as
expected, although not as fast as I had hoped. I'll do some more reading on
best practices and probably ask a new question later...
- Svante
> A suggestion would be to hard commit much less often, i.e. every 10
> minutes, and see if there is a change.
- Will try this
> How much system RAM? JVM heap? Enough space in RAM for the system disk cache?
- We have 18GB of RAM, 12 dedicated to Solr, but as of right now the total
index size is only 5GB
Ah
Hi Daniel,
How about trying something like this (you'll have to play with the boosts to
tune it): search all the fields with all the terms using edismax and the
minimum-should-match parameter, but require all terms to match in the
allMetadata field.
https://wiki.apache.org/solr/Extend
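A sketch of what that could look like (field names and boosts are
hypothetical, apart from allMetadata; note that < in mm must be URL-encoded
in a real request):

    q={!edismax qf='title^3 body allMetadata' mm='2<75%' v=$qq}
    &fq={!edismax qf=allMetadata mm='100%' v=$qq}
    &qq=user query terms

The main query scores loosely across all fields, while the filter query
insists that every term matches somewhere in allMetadata.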
Hello again,
I'm using the Solr block-join feature to index a journal and all of
its articles.
Here is a short example:
527fcbf8-c140-4ae6-8f51-68cd2efc1343
Sozialmagazin
8
2008
0340-8469
.
Thanks Mark ... indeed, some doc updates would help.
Regarding what seems to be a popular question on sharding: it seems it would
be a Good Thing that the shards for a collection running on HDFS essentially
be pointers to the HDFS-replicated index. Is that your thinking?
I've been
Looking at the list of changes on the 21st and 22nd, I don’t see a smoking gun.
- Mark
Markus,
With some help from another user on the Nutch list I did a dump and found that
the URLs I am trying to capture are in Nutch. However, when I index them with
Solr I am not getting them. What I get in the dump is this:
http://www.example.com/pdfs/article1.pdf
Status: 2 (db_fetched)
Fetch
If you are upgrading from SolrCloud 4.x to a later version 4.y, and
basically want your end-system to seem as if it had been running 4.y (no
legacy mode or anything) all along, you might find some inspiration here
http://solrlucene.blogspot.dk/2014/01/upgrading-from-solrcloud-4x-to-4y-as-if.htm
Hi - this likely belongs to an existing open issue. We're seeing the stuff
below on a build of the 22nd. Until just now we used builds of the 20th and
didn't have the issue. Is this a bug, or did some data format in ZooKeeper
change? Until now only two cores of the same shard through the e
Yonik has brought up this feature a few times as well. I’ve always felt about
the same as Shawn. I’m fine with it being optional, default to off. A cluster
reload can be a fairly heavy operation.
- Mark
Right - solr.hdfs.home is the only setting you should use with SolrCloud.
The documentation should probably be improved.
If you set the data dir or ulog location in solrconfig.xml explicitly, it will
be the same for every collection. SolrCloud shares the solrconfig.xml across
SolrCores, an
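A minimal sketch of that setup (the namenode address and paths are
hypothetical):

    <directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
      <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str>
      <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
    </directoryFactory>

With only solr.hdfs.home set, each core gets its own directory under that
HDFS path instead of every collection colliding on one hard-coded data dir.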
On 1/22/2014 12:25 AM, Raymond Wiker wrote:
> Speaking for myself, I avoid using "client apis" like SolrNet, SolrJ and
> FAST DSAPI for the simple reason that I feel that the abstractions they
> offer are so thin that I may just as well talk directly to the HTTP
> interface. Doing that also lets me
Hi Fatima,
To enable highlighting (both standard and FastVector) you need to make the
field stored="true".
Term vectors may speed up the standard highlighter. Plus, they are mandatory
for FastVectorHighlighter.
https://cwiki.apache.org/confluence/display/solr/Field+Properties+by+Use+Case
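For example, a field declaration along these lines (the type name is
hypothetical) would satisfy both highlighters:

    <field name="content" type="text_general" indexed="true" stored="true"
           termVectors="true" termPositions="true" termOffsets="true"/>

Changing these attributes requires re-indexing the documents.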
Ahmet
Hi Viresh,
A couple of things:
1) The / character is now a special query parser character; it wasn't before.
It is used for regular expression searches.
http://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches
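So a query term containing a slash needs escaping or quoting, e.g. (field
name hypothetical):

    path:foo\/bar
    path:"foo/bar"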
What happens when you
Hi All,
I have a Solr requirement to send all the documents imported from a
/dataimport query to go through another update chain as a separate
background process.
Currently I have configured my custom update chain in the /dataimport
handler itself. But since my custom update process needs to conne
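For context, a sketch of that wiring as I understand it (chain and class
names are hypothetical):

    <updateRequestProcessorChain name="myChain">
      <processor class="com.example.MyCustomProcessorFactory"/>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

    <requestHandler name="/dataimport"
                    class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">data-config.xml</str>
        <str name="update.chain">myChain</str>
      </lst>
    </requestHandler>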
Hi all,
Reading here
http://wiki.apache.org/solr/SolrReplication#How_are_configuration_files_replicated.3F
I don't understand what the observed behaviour is when:
- confFiles contains schema.xml
- the schema doesn't change between replication cycles
I mean, I read that the file is physically repl
When I use the dismax query handler in Solr 1.4 and then the same in Solr 4.3,
they return a different numFound, even though both have the same index profile:
Solr 1.4 gives 9 records
and Solr 4.3 gives 99 records.
My query is:
start=0&rows=10&hl=true&hl.fl=content&qt=dismax
&q=syste
Uugh. I just realised I should have taken out the data dir and update log
definitions! Now it works fine.
Cheers,
L
Hi all,
I've been running Solr on HDFS, and that's fine.
But I have a Cloud installation I thought I'd try on HDFS. I uploaded
the configs for the core that runs in standalone mode already on HDFS
(on another cluster). I specify the HdfsDirectoryFactory, HDFS data dir,
solr.hdfs.home, and HDF
Apologies for the late response as this mail was lost somewhere in filters.
The issue was that CommonGramsQueryFilterFactory should be used for searching
and CommonGramsFilterFactory for indexing. We were using
CommonGramsFilterFactory for both, due to which it was not dropping single
tokens for common
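A sketch of the corrected analyzer configuration (the fieldType and words-file
names are hypothetical):

    <fieldType name="text_cg" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.CommonGramsFilterFactory" words="commongrams.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.CommonGramsQueryFilterFactory" words="commongrams.txt"/>
      </analyzer>
    </fieldType>

The query-side factory drops the standalone common tokens and keeps only the
grams, which is what makes phrase queries on common words efficient.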
We do. We have a lot of updates/deletes every day, and a weekly optimization
definitely gives a considerable improvement, so we don't see a downside to it
except the complete replication part, which is not an issue on a local
network.
Thanks Shawn. I appreciate you sharing the philosophy behind Solr's
implementation. I absolutely agree with the design principle and the fact
that it helps to debug unknown issues. Moreover it definitely gives more
control over the software.
However, there are a *small* number of applications that mi
I always go for SolrJ as the intermediate layer, usually in a Spring app.
I have sometimes proxied directly to Solr itself, but since we use a lot
of Ajax, I'm not comfortable with exposing the Solr URIs directly, even
if controlled via a proxy.
Having it go through a webapp gives me a layer
Also my highlighting defaults...
on
content documentname
html
0
documentname
3
200
content
750
Hello,
I'm trying to highlight content that is returned from a Solr query, but I can't
seem to get it working.
I would like to highlight the "documentname" and the "pagetext" or "content"
results, but when I run the search I don't get anything returned. I thought
that the "content" field is su
The one node having more load should be the leader (because of the extra work
of receiving and distributing updates), but my experience shows only a
bit more CPU usage, and no difference in disk IO.
A suggestion would be to hard commit much less often, ie every 10
minutes, and see if there is a change.