I have already indexed 180 documents in a Solr index. All the files were in JSON format.
So the data was like this:
[
  {
    "id": 1,
    "first_name": "anurag",
    "last_name": "jain",
    ...
  },
  {
    "id": 2,
    "first_name": "abhishek",
    "last_name": "jain",
    ...
  }, ...
]
Now I have to add a field to the data, like:
[
{
"id":1,
"first_name":
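One way to add a field to documents already in the index, without re-sending whole documents, is Solr 4's atomic update syntax. This is only a sketch: the URL is the stock example's, and "city" is a hypothetical new field; the "set" modifier creates or replaces that field on the document matching the id.

```shell
# Atomic-update sketch (assumed URL; "city" is a hypothetical new field).
# "set" creates or replaces the field on the document with the given id.
BODY='[{"id":"1","city":{"set":"jaipur"}},{"id":"2","city":{"set":"delhi"}}]'
# Dry run: print the command; drop the leading "echo" to actually send it.
echo curl 'http://localhost:8983/solr/update?commit=true' \
  -H 'Content-type:application/json' -d "$BODY"
```

Note that atomic updates need the updateLog enabled in solrconfig.xml and all other fields stored, or the non-updated fields are lost.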
Easier than:
solrpost.sh a*.xml > a.log &
solrpost.sh b*.xml > b.log &
solrpost.sh c*.xml > c.log &
and so on?
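For reference, the script approach sketched in those three lines could look something like this — a dry-run sketch under assumptions (stock example update URL, Solr XML files in the current directory); each glob gets its own background job, and the leading "echo" prints the commands instead of running them:

```shell
#!/bin/sh
# Dry-run sketch: post each batch of Solr XML files in its own background
# job, then commit once. Drop the "echo" to really upload.
SOLR_URL='http://localhost:8983/solr/update'

post_batch() {  # $1 = filename glob, $2 = log file
  for f in $1; do
    echo curl -s "$SOLR_URL" -H 'Content-type:application/xml' --data-binary "@$f"
  done > "$2" 2>&1
}

post_batch 'a*.xml' a.log &
post_batch 'b*.xml' b.log &
post_batch 'c*.xml' c.log &
wait    # block until all three jobs finish
echo curl -s "$SOLR_URL?commit=true"
```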
We have a fair selection of Solr servers where I work (Chegg), loaded several
different ways, and one of our production cores is loaded with curl sending in
a CSV file and checking fo
Heh, I've considered all sorts of things :-) Including precisely what
you are referring to :-) In the end, I need something that will require
the minimum of effort for a new user, so updating post.jar is going to
be the most straight-forward, as otherwise I'd need to find a cross
platform multithre
Have you considered writing a script to upload them with curl and running
multiple copies of the script in the background?
wunder
On Feb 4, 2013, at 8:22 PM, Upayavira wrote:
> Thx Jan,
>
> All I know is I've got a data set of 500k documents, Solr formatted, and
> I want it to be as easy as po
Thx Jan,
All I know is I've got a data set of 500k documents, Solr formatted, and
I want it to be as easy as possible to get them into Solr. I also want
to be able to show the benefit of multithreading. The outcome would
really be "make sure your code uses multiple threads to push to Solr"
rather
Thanks man
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-use-SolrCloud-in-multi-threaded-indexing-tp4037641p4038482.html
Sent from the Solr - User mailing list archive at Nabble.com.
First, I'd be sure that it's worth the effort. Are you doing a massive
number of updates? HTTP isn't all that time-consuming, so how many
documents are you talking about here?
Doing this inside Solr will stress Solr as well, even in the atomic update
world. Your poor Solr disk has to go out and co
Me too, it fails randomly with test classes. We use Solr 4.0 for testing; no
Maven, only Ant.
--roman
On 4 Feb 2013 20:48, "Mike Schultz" wrote:
> Yes. Just today actually. I had some unit test based on
> AbstractSolrTestCase which worked in 4.0 but in 4.1 they would fail
> intermittently with t
Hi,
What are your documents and queries like? What does your performance
monitoring tool say about various server, jvm, and solr metrics?
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Feb 4, 2013 3:36 PM, "sausarkar" wrote:
> When the query time jump to more than 10 seconds the linu
Yes. Just today actually. I had some unit test based on
AbstractSolrTestCase which worked in 4.0 but in 4.1 they would fail
intermittently with that error message. The key to this behavior is found
by looking at the code in the lucene class: TestRuleSetupAndRestoreClassEnv.
I don't understand i
I have a 4.x cluster that is 10 wide and 4 deep. One of the nodes of a shard
went down. I provisioned a replacement node and introduced into the
cluster, but it ended up on a random shard, not the shard with the downed
node.
Is there a maintenance step I need to perform before introducing a node
I believe, for the example directory (as in relative to start.jar), the
contexts directory has the URL mapping to Solr (/solr), etc has some global
Jetty properties, and solr-webapp/webapp/WEB-INF contains some of Solr's
specific Jetty configuration.
Beware that the last one however is a decompressed vers
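If the goal is just limiting which addresses can reach the example Jetty, one low-effort option is binding the connector to a single interface. This is a sketch under an assumption: the stock Solr 4 example's jetty.xml reads the jetty.host system property — verify that on your version.

```shell
# Sketch: bind the example Jetty to localhost only, so only local clients
# can connect (assumes the example jetty.xml honors jetty.host).
CMD='java -Djetty.host=127.0.0.1 -Djetty.port=8983 -jar start.jar'
echo "$CMD"   # dry run; run the command from the example/ directory for real
```

A firewall rule on the host is the more robust way to restrict by IP.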
I just tried it with Solr 4.0 and it worked fine, for me.
I used the "cat" field of the Solr example, which is a multivalued "string"
field.
My data:
curl http://localhost:8983/solr/update?commit=true -H
'Content-type:application/csv' -d '
id,cat,cat,cat
d-1,aardvark,abacus,accord
d-2,aboar
Hello all,
How do I change the configuration for the Jetty that is shipped with Apache
Solr? Where are the configuration files located? I want to restrict the IP
addresses that can connect to that instance of Solr.
Thanks,
Saqib
Unfortunately we need data by the minute; we cannot go hourly. Is there an
option for 3 minutes or 5 minutes? Something like NOW/3MIN?
I am also noticing that when I generate around 110 queries per second (date
range ones), after some time Solr does not respond and just freezes. Is there
a way to cure this?
The core concept behind filter queries is that they cache their results
(the list of documents) so that they can filter the main query more efficiently.
But if your query keeps changing, like every minute, the cached results need
to be thrown away and recalculated, every minute.
Can you try hourl
Ah, OK, sorry to be terse!
1. Create a class that implements SolrServer from the SolrJ project:
http://lucene.apache.org/solr/4_1_0/solr-solrj/org/apache/solr/client/solrj/SolrServer.html
2. Make the constructor of that class take as arguments the config you
need to make an HttpSolrServer object a
On 2/4/2013 2:38 PM, Michael Della Bitta wrote:
Hi Shawn,
Why don't you write a delegating SolrServer class that lazily
instantiates an HttpSolrServer and catches and logs exceptions when
something's down?
I only about half understood that, and I'm not sure how to do it. I'm
willing to learn
Hi Shawn,
Why don't you write a delegating SolrServer class that lazily
instantiates an HttpSolrServer and catches and logs exceptions when
something's down?
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
ww
I have a SolrJ app that talks to four Solr servers. For each server, it
creates 14 HttpSolrServer objects, each for a different core.
Once it's running, the app knows how to deal with errors, especially
those caused when the server goes down.
The problem is at program startup -- when 'new Ht
Seems to work for me, at least at the Solr HTTP level. I just did a quick
test using the stock Solr 4.0 example.
I added a couple of mini-docs that had spaces in their ids, both leading and
trailing:
curl http://localhost:8983/solr/update?commit=true -H
'Content-type:application/csv' -d '
i
Hi,
Hmm, the tool is getting bloated for a one-class no-deps tool already :)
Guess it would be useful too with real-life code examples using SolrJ and other
libs as well (such as robots.txt lib, commons-cli etc), but whether that should
be an extension of SimplePostTool or a totally new tool fro
Ahhh. On this system, I am not using SolrCloud. On a separate
system that I'm building with SolrCloud, I'm only using it for high
availability, not for distributed search.
Is there already an issue for a configurable unique field? If not, I
can make one.
Thanks,
Shawn
On 2/4/2013 12
You could use LocalSolrQueryRequest to create the request, but it is not
necessary; if all you need is the Lucene query parser, just do:
import org.apache.lucene.queryparser.classic.QueryParser
qp = new QueryParser(Version.LUCENE_40, defaultField, new SimpleAnalyzer());
Query q = qp.p
When the query time jumps to more than 10 seconds, the Linux load average
spikes to more than 100 on a 16-CPU machine. Anyone have any suggestions?
Hi everyone,
I am new to Solr and I have experienced some strange behavior:
I have defined a field "name" of type "string". This field contains values
with trailing spaces, such as " John".
This is intended, and the space needs to "survive".
The space is in the index, as I can see when using Luke.
Bu
We are experiencing performance issues with date range queries. We have
configured the date fields as follows:
Our queries are rounded to every minute:
qt=ads&debugQuery=false&fl=id,StartDt_t110,...&fq=Status_i110:2&fq=StartDt_t110:{
NOW/MINUTE-150DAYS TO NOW/MINUTE-90DAYS }&start=0&rows=20
I don't have the source handy. I believe that SolrCloud hard-codes 'id'
as the field name for defining shards.
On 02/04/2013 10:19 AM, Shawn Heisey wrote:
On 2/4/2013 10:58 AM, Lance Norskog wrote:
A side problem here is text analyzers: the analyzers have changed how
they split apart text for
Of course I did not mean multiple cores of the same shard...
A normal SolrCloud configuration, let's say 4 shards on 4 servers, using
replicationFactor=3.
Of course, no matter which core was requested, the request will be forwarded
to one core of each shard.
My question is whether this *first*
On 2/4/2013 12:06 PM, Isaac Hebsh wrote:
LBHttpSolrServer is a SolrJ-only feature, isn't it?
I think that Solr does not balance queries among cores in the same server.
You can claim that it's a non-issue, if a single core can completely serve
multiple queries at the same time, and passing requ
LBHttpSolrServer is a SolrJ-only feature, isn't it?
I think that Solr does not balance queries among cores in the same server.
You can claim that it's a non-issue, if a single core can completely serve
multiple queries at the same time, and passing requests through different
cores does nothing.
On 2/4/2013 11:42 AM, Mikhail Khludnev wrote:
I don't know how post.jar is built, but FWIW,
http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
sounds like what you need.
Just an FYI - unless you override the handleError method in that
I don't know how post.jar is built, but FWIW,
http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
sounds like what you need.
On Mon, Feb 4, 2013 at 10:19 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Hello,
>
> Threads were
Hello,
Threads were removed from the DIH in 4.0.
My proposal for server-side concurrency is
https://issues.apache.org/jira/browse/SOLR-3585
Regards
On Sun, Feb 3, 2013 at 9:42 PM, Upayavira wrote:
> I haven't tried DIH, although if it does support multithreading, I might
> be inclined to.
>
>
On 2/4/2013 10:58 AM, Lance Norskog wrote:
A side problem here is text analyzers: the analyzers have changed how
they split apart text for searching, and are matched pairs. That is, the
analyzer queries are created matching what the analyzer did when
indexing. If you do this binary upgrade sequen
Mickael,
feel free to look at a performance comparison between query-time and index-time joins:
http://blog.griddynamics.com/2012/08/block-join-query-performs.html
On Mon, Feb 4, 2013 at 8:20 PM, Mickael Magniez wrote:
> The problem is that i have to have right count on my resultpage, so i have
> to
> r
A side problem here is text analyzers: the analyzers have changed how
they split apart text for searching, and are matched pairs. That is, the
analyzer queries are created matching what the analyzer did when
indexing. If you do this binary upgrade sequence, the indexed data will
not match what
Just to add a little to the good stuff Shawn has shared here: Solr 4.1
does not support 1.4.1 indexes. If you cannot re-index (by far the
recommended option), then first upgrade to 3.6, then optimize your index,
which will convert it to the 3.6 format. Then you will be able to use that
index in 4.1. The simple l
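The middle step of that 1.4.1 → 3.6 → 4.1 path can be driven over HTTP once the index is running under the 3.6 binaries. A sketch, with the stock update URL assumed:

```shell
# Sketch: after upgrading the binaries to 3.6, force a full optimize so
# every segment gets rewritten in the 3.6 format, then move on to 4.1.
CMD="curl 'http://localhost:8983/solr/update?optimize=true'"
echo "$CMD"   # dry run; remove the echo wrapper to actually optimize
```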
Hi,
I want the same thing as the following, but with JBoss:
http://knackforge.com/blog/sivaji/how-protect-apache-solr-admin-console
How can I do that? Any hint or tutorial that could be helpful?
regards,
hassan
Hi,
It does not work for distributed search:
org.apache.solr.handler.component.ShardFieldSortedHitQueue.getCachedComparator(ShardDoc.java:193)
...
case DOC:
// TODO: we can support this!
throw new RuntimeException("Doc sort not supported");
...
Try to sort by unique ID.
Regards.
Hey Tim, long time no talk to!
The UUID type isn't a native UUID like you're expecting. It's a String
that gets validated as a UUID at input time. You're probably being
saved by the .toString() method at index time, but that's not going to
help you during retrieval.
You're probably going to want
On Mon, Feb 4, 2013 at 10:34 AM, Mickael Magniez
wrote:
> group.ngroups=true
This is currently very inefficient - if you can live without
retrieving the total number of groups, performance should be much
better.
-Yonik
http://lucidworks.com
On 2/4/2013 7:20 AM, Artem OXSEED wrote:
I need to upgrade our Solr installation from 1.4.1 to the latest 4.1.0
version. The question is how to deal with indexes. AFAIU there are two
things to be aware of: file format and index format (excuse me for
possible term mismatch, I'm new to Solr) - and
Hi,
Field collapsing is built-in and is also called Result Grouping:
http://wiki.apache.org/solr/FieldCollapsing
You simply enable it with ...&group=true&group.field=myfield
If that does not work for you, please respond with detailed error messages so
we can help you further.
--
Jan Høydahl, s
Hi!
I need to do some periodic updates in my Solr app (v4.1).
There is some logic to these update operations, and all the data I need for
them I already have in the index. Basically I have a small collection which
is synchronized with external outside storage, and a few very big collec
Hi,
I need to upgrade our Solr installation from 1.4.1 to the latest 4.1.0
version. The question is how to deal with indexes. AFAIU there are two
things to be aware of: file format and index format (excuse me for
possible term mismatch, I'm new to Solr) - and while file format can
(and will a
see an example at
http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/solr/contrib/uima/src/test-files/uima/uima-tokenizers-schema.xml?view=diff&r1=1442116&r2=1442117&pathrev=1442117
where the 'ngramsize' parameter is set; that's defined in the
AggregateSentenceAE.xml descriptor and is then set w
Thanks,
I have installed Cygwin and it's running fine now. Thanks.
On Mon, Feb 4, 2013 at 6:14 PM, Gora Mohanty wrote:
> On 4 February 2013 17:50, Rohan Thakur wrote:
> > yup I am downloading cygwin now...will be working through there let see
> it
> > should work though...
> [...]
>
> We are gett
Regarding configuration parameters have a look at
https://issues.apache.org/jira/browse/LUCENE-4749
Regards,
Tommaso
2013/2/4 Tommaso Teofili
> Thanks Kai for your feedback, I'll look into it and let you know.
> Regards,
> Tommaso
>
>
> 2013/2/1 Kai Gülzau
>
>> I now use the "stupid" way to use
On 4 February 2013 17:50, Rohan Thakur wrote:
> Yup, I am downloading Cygwin now... will be working through there; let's
> see, it should work though...
[...]
We are getting highly off-topic now, but if you have RAM
available on the machine, you should seriously consider
running Linux in a VM.
Regards,
Yup, I am downloading Cygwin now... will be working through there; let's see,
it should work though...
On Mon, Feb 4, 2013 at 5:14 PM, Gora Mohanty wrote:
> On 4 February 2013 16:58, Rohan Thakur wrote:
> >
> > hi
> >
> > I think I have found the problem: it's Windows, which is actually not able
> to
>
On 4 February 2013 16:58, Rohan Thakur wrote:
>
> hi
>
> I think I have found the problem: it's Windows, which is actually not able to
> distinguish between double and single quotes, and thus curl is trying to
> resolve the host under the double quotes individually after -d, thus causing
> the error, but
hi
I think I have found the problem: it's Windows, which is actually not able to
distinguish between double and single quotes, and thus curl is trying to
resolve the host under the double quotes individually after -d, thus causing
the error. But how do I rectify this on Windows? That is what I am looking
f
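A hedged sketch of the Windows-friendly form: cmd.exe does not treat single quotes as quoting characters, so wrap each argument in double quotes and escape the JSON's inner quotes as \". The body below also uses an "id" key, since an atomic update must include the document's uniqueKey field (assumed here to be id; "name" is likewise an assumed field) — the duplicate "value" key in the original command looks like a second bug.

```shell
# Windows cmd sketch: double quotes outside, \" inside. Field names are
# assumptions; the update must include the uniqueKey (assumed to be "id").
BODY="[{\"id\":\"samsung-wave-s5253-silver\",\"name\":{\"set\":\"samsung-111\"}}]"
# Dry run: print the command; drop the leading "echo" to actually send it.
echo curl "http://localhost:8983/solr/update?commit=true" \
  -H "Content-type:application/json" -d "$BODY"
```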
hi
Were you able to do an atomic update for a specific field using curl? I'm
using curl from the Windows cmd, but I'm getting an error like "host can not
be resolved". I'm using this command:
C:\Users\rohan>curl www.localhost.com:8983/solr/update?commit=true -H "Content-type:text/json" -d '[{"value":"samsung-wave-s5
hi Gora
I have tried what you told me, but now it's giving an error like:
C:\Users\rohan>curl 127.0.0.1:8983/solr/update?commit=true -H "Content-type:application/json" -d '[{"value":"samsung-wave-s5253-silver", "value":{"set":"samsung-111"}}]'
{"responseHeader":{"status":500,"QTime":1},"error":{"msg":"
Thanks Kai for your feedback, I'll look into it and let you know.
Regards,
Tommaso
2013/2/1 Kai Gülzau
> I now use the "stupid" way to use the german corpus for UIMA: copy + paste
> :-)
>
> I modified the Tagger-2.3.1.jar/HmmTagger.xml to use the german corpus
> ...
>
> file:german/TuebaMode
Hi,
Thanks for the reply. In my application, I am using some servlets to receive
the requests from users, since I need to authenticate the user and add
conditions like userid= before sending the request to the Solr server, using
one of these two approaches:
1) Using SolrServer
SolrServer server = ne
Hi
Is it possible to get the distinct count of a given facet field in Solr?
A query like q=*:*&facet=true&facet.field=cat displays the counts of all
the unique categories present, like
electronics: 100
appliances: 200
etc.
But if the list is big, I don't want to get the entire list and
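One low-tech possibility, stated as an assumption worth verifying on your version: I believe the Luke request handler reports per-field term statistics, including a "distinct" term count, without your having to page through the terms themselves.

```shell
# Sketch (assumed URL and field name): ask the Luke handler for stats on
# the "cat" field; look for a "distinct" count in the response.
CMD="curl -s 'http://localhost:8983/solr/admin/luke?fl=cat'"
echo "$CMD"   # dry run; remove the echo wrapper to actually query
```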
On 4 February 2013 13:29, Rohan Thakur wrote:
> hi arcadius
>
> can you also help me with partial document update... I have followed what is
> written in this blog, but it's giving me an error:
> http://solr.pl/en/2012/07/09/solr-4-0-partial-documents-update/
>
> error im getting after this command :
hi arcadius
I also tried going by this blog, but with it too I am not able to use curl
for the update; now it gives "can not resolve host", even though I can open
the host using the browser. Please, can you help me with this? I want to do a
partial document update for a specific field...
thanks
regards
Rohan
On M
Hi
I am using Solr 4.0 and I am getting the socket exception below very frequently.
Does anyone have any idea why this error is generated?
Tomcat is running but it is not accepting any connections. I need to kill the
process and restart Tomcat.
INFO: Retrying request
Feb 3, 2013 3:58:20 PM org
Hi everybody
Please, I need to know if anybody has done something similar.
I have to develop a notification when a commit event happens on the Solr
server, but I need to know the updated records to create the notification
correctly. I developed a Java class extending AbstractSolrEventListener, but I
don't