Re: Solrcloud and Node.js

Luis Cappa Banda Sat, 15 Dec 2012 03:17:06 -0800

Hello, Per.

Thanks for your answer! I have worked a lot with SolrJ, and in the last two
months also with the new SolrJ 4.0, specifically with ZooKeeper and the
CloudSolrServer implementation. I've developed a search engine wrapper that
dispatches queries to SolrCloud using a CloudSolrServer pool. The whole
WebApp is built in Java and works fine, but some months ago I met Node.js
and asked myself: "Hey, Node.js is awesome at dispatching queries, with up
to 250K requests on a single machine. Why not try to make it work with Solr?"


And then I started to build a Node.js Solr client that executes queries
against just one Solr server instance. You can't imagine how damn good that
combination is. Just a simple example: a Node.js with Express.js back end
dispatching queries to just one Solr server instance with a simple and very
basic Solr.js client that I've developed. The server host has just *1 core*
and *1GB RAM*. I indexed *3 million* documents and tested the whole system
with an ab test. *Result*: ab -c 1000 -n 100000 + CPU: 24% RAM: 768M. So
good.
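
For illustration only, here is a minimal sketch of what such a thin
single-instance client could look like. It is not the actual Solr.js client
mentioned above. The function names buildSelectUrl and query are
hypothetical, the base URL and core name are placeholders, and the global
fetch is assumed to be available (Node.js 18 or later). The /select handler
and the q and wt parameters are standard Solr.

```javascript
// Build the URL for a query against one Solr core's /select handler.
// baseUrl and core are placeholders supplied by the caller.
function buildSelectUrl(baseUrl, core, params) {
  const url = new URL(`${baseUrl.replace(/\/+$/, '')}/${core}/select`);
  // Default to "match everything" and a JSON response; caller params win.
  const merged = Object.assign({ q: '*:*', wt: 'json' }, params);
  for (const [key, value] of Object.entries(merged)) {
    url.searchParams.append(key, String(value));
  }
  return url.toString();
}

// Send the query and resolve with the parsed JSON response.
// Assumes the global fetch of Node.js 18+.
async function query(baseUrl, core, params) {
  const res = await fetch(buildSelectUrl(baseUrl, core, params));
  if (!res.ok) {
    throw new Error(`Solr returned HTTP ${res.status}`);
  }
  return res.json();
}
```

An Express.js route handler could then call
query('http://localhost:8983/solr', 'collection1', { q: 'title:node' })
and pass the resulting JSON straight back to the browser.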

However, the power of Solr 4.0 resides in SolrCloud, and I would like to
build a smarter Node.js client that runs distributed queries over the
collection's shards. I don't think I have enough time to build a complete
Node.js version of CloudSolrServer, and compiling Java code into JavaScript
doesn't sound good in terms of performance and best practices.

Maybe if I can somehow read the ZooKeeper cluster state frequently, I can
update the Solr.js client's state so that it sends queries only to those
collections that have live shards, balancing between leader and replica
shards. It won't be as smart as CloudSolrServer.java, but not as dumb as
executing distributed queries without any knowledge of the ZooKeeper
cluster state.
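
As a rough sketch of that idea, assuming the cluster state has already been
fetched from ZooKeeper and parsed into a plain object: the shape used below
(collection -> shard -> replicas, each with state, node_name, base_url and
core) is a simplified, hypothetical one, and the real layout depends on the
Solr version. The function names are also made up for illustration. The
output uses Solr's "shards" request parameter, where shards are separated
by "," and alternative replicas of the same shard by "|".

```javascript
// Keep only replicas that are marked active AND whose node is currently live.
// clusterState uses a hypothetical, simplified shape (see lead-in).
function liveReplicas(clusterState, collection, liveNodes) {
  const shards = clusterState[collection] || {};
  const result = {};
  for (const [shardName, replicas] of Object.entries(shards)) {
    result[shardName] = Object.values(replicas).filter(
      (r) => r.state === 'active' && liveNodes.includes(r.node_name)
    );
  }
  return result;
}

// Build the value of the "shards" parameter from the live replicas.
// Shards with no live replica are skipped, so results may be partial.
function buildShardsParam(replicasByShard) {
  return Object.values(replicasByShard)
    .filter((list) => list.length > 0)
    .map((list) =>
      list
        // The shards parameter takes host:port/path without the scheme.
        .map((r) => r.base_url.replace(/^https?:\/\//, '') + '/' + r.core)
        .join('|')
    )
    .join(',');
}
```

A client could rebuild this value each time it refreshes its view of the
cluster state and send it as the shards parameter of an ordinary /select
request.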

Do you know if SolrCloud replica shards have 100% the same data as the
leader ones at all times? There is probably a delay while they synchronize
with the leaders, in which case sending queries to replicas wouldn't be a
good idea.

Thank you very much in advance.

Best regards,


2012/12/15 Per Steffensen <st...@designware.dk>

> As Mark mentioned, Solr(Cloud) can be accessed through HTTP and return e.g.
> JSON, which should be easy to handle in JavaScript. But the client-part
> (SolrJ) of Solr is not just a dumb client interface - it provides a lot of
> client-side functionality, e.g. some intelligent decision making based on
> ZK state. I would probably try to see if I could make SolrJ and in
> particular CloudSolrServer (yes, it's a client, even though the name does
> not indicate it) work. Maybe you will be successful using one of:
> * https://github.com/nearinfinity/node-java to embed CloudSolrServer in
>   node.js
> * use GWT to compile CloudSolrServer to javascript (I would imagine it
> will be hard to make it work though)
>
> Regards, Per Steffensen
>
> Luis Cappa Banda skrev:
>
>  Hello!
>>
>> I've always used Java as the backend language to program search modules,
>> and I know that CloudSolrServer implementation is the way to interact with
>> SolrCloud. However, I'm starting to love Node.js and I was wondering if
>> it is possible to launch queries against SolrCloud with the "old
>> fashioned" sharding syntax.
>>
>> Thank you in advance!
>>
>> Best regards.
>>
>>
>>
>
>


-- 

- Luis Cappa
