Multi-core support for indexing multiple servers

2013-11-06 Thread Rob Veliz
Trying to find specific information to support the following scenario:

- I have one site running on one server with marketing content, a blog, etc.
that I want to index.
- I have another site running on Magento on a different server with
ecommerce content (products).
- Both servers live in completely different environments.
- I would like to create one single search index between both sites and
make that index searchable from both sites.

I think I can/should use the multi-core approach and spin off a new server
to host Solr but can anyone verify this is the best/most appropriate
approach?  Are there any other details I need to consider?  Can anyone
provide a step by step for making this happen to validate my own technical
plan?  Any help appreciated...was initially thinking I needed SolrCloud but
that seems like overkill for my primary use case.


Re: Multi-core support for indexing multiple servers

2013-11-06 Thread Rob Veliz
Great feedback, thanks.  So the multi-core structure I have then is a
single Solr server set up, essentially hosted by one domain owner (but to
be used by both).  My question is how does that Solr server connect to the
2 Web applications to create the 1 master index (to be used when searching
on either Web app)?  It feels like I just reference the Solr server from
within the Web app search templates (e.g. PHP files).  That is logical in
terms of pulling the data into the Web apps, but it's still not clear to me
how the data from those 2 Web apps actually gets into the Solr server if
Solr server doesn't live on the same server as the Web app(s).  Any
thoughts?


On Wed, Nov 6, 2013 at 10:57 PM, Shawn Heisey  wrote:

> SolrCloud makes for *easy* redundancy.  There is a three-server minimum
> if you want it to be fault-tolerant for both Solr and Zookeeper.  The
> third server would only run Zookeeper and could be an extremely
> inexpensive machine.  The other two servers would run both Solr and
> Zookeeper.  Redundancy without SolrCloud is possible and can be done with
> two servers; it's just not as automated.
>
> It is highly recommended that redundant servers are not separated
> geographically.  This is especially important with SolrCloud, as
> Zookeeper redundancy requires that a majority of the servers be
> operational.  That can be extremely difficult to guarantee in a
> multi-datacenter model, if one assumes that an entire datacenter can
> disappear from the network.
>
> If you don't care about redundancy, then you'd just run a single server,
> and SolrCloud wouldn't provide much benefit.
>
> Multiple cores is a good way to go -- the two indexes would be logically
> separate, but you'd be able to use either one.  With SolrCloud, it would
> be multiple collections.
>
> Thanks,
> Shawn
>
>
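
The three-server minimum mentioned above follows from the majority rule: a Zookeeper ensemble stays available only while more than half of its servers can reach each other. A minimal sketch of that arithmetic (the function names are illustrative, not part of any Solr or Zookeeper API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Print the quorum and failure tolerance for a few ensemble sizes.
for n in (1, 2, 3, 5):
    print(f"{n} server(s): quorum={quorum_size(n)}, "
          f"can lose {tolerated_failures(n)}")
```

With two servers the quorum is still two, so losing either one stops the ensemble; three is the smallest size that survives a single failure. The same arithmetic shows why splitting an ensemble across two datacenters is fragile: whichever site holds the minority cannot continue on its own.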


-- 
*Rob Veliz*, Founder | *Mavenbridge* | rob...@mavenbridge.com | M: +1 (206)
909 - 3490

Follow us at: http://twitter.com/mavenbridge


Re: Multi-core support for indexing multiple servers

2013-11-06 Thread Rob Veliz
I've been reading about Solarium--definitely useful.  Could you elaborate
here:

> If you are planning a single master index, that's not multicore.  Having
> more than one document type in a single index is possible, they just
> have to overlap on at least one field - whatever field is the uniqueKey
> for the index.

What I'm trying to do is index marketing pages from one server AND index
product pages from a different ecommerce server and then combine those
results into a single index, so when I search for "foo" from either site, I
get the exact same results for "foo".  If that's not multi-core, what's the
right approach to accomplish this?


On Wed, Nov 6, 2013 at 11:29 PM, Shawn Heisey  wrote:

> Solr uses HTTP calls.  Its interface is REST-like, though there has been
> some recent work to make parts of it truly RESTful; that paradigm might
> later be extended to the entire interface.
>
> There are a number of Solr API packages for PHP that give you an
> object-oriented interface to Solr, so you don't have to learn Solr's HTTP
> interface - you write PHP code to access Solr.  These are two that I have
> heard about.  I've not actually used them, as I have little personal
> experience with writing PHP:
>
> http://pecl.php.net/package/solr
> http://www.solarium-project.org/
>
> If you are planning a single master index, that's not multicore.  Having
> more than one document type in a single index is possible, they just
> have to overlap on at least one field - whatever field is the uniqueKey
> for the index.
>
> Thanks,
> Shawn
>
>
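
To make the HTTP point above concrete, here is a rough sketch of how two separate web applications could push documents into one shared index and search it. Everything specific here is an assumption for illustration: the host `solr.example.com`, the core name `combined`, the field names `id`, `doc_type` and `title`, and the `cms-`/`magento-` id prefixes used to keep uniqueKey values from colliding. It also assumes a Solr version whose `/update` handler accepts JSON. The sketch only builds the requests; it does not send them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical Solr location and core name; substitute your own.
SOLR_BASE = "http://solr.example.com:8983/solr/combined"


def build_update_request(docs):
    """Build (but do not send) an HTTP POST that adds docs to the index."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        SOLR_BASE + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_url(term):
    """Build the URL either site would request to search the shared index."""
    return SOLR_BASE + "/select?" + urlencode({"q": term, "wt": "json"})


# Two document types in one index, overlapping on the uniqueKey field.
# "id" as the uniqueKey and the other field names are assumptions.
marketing_doc = {"id": "cms-42", "doc_type": "page", "title": "About foo"}
product_doc = {"id": "magento-1001", "doc_type": "product", "title": "Foo widget"}

update_request = build_update_request([marketing_doc, product_doc])
search_url = build_search_url("foo")

print(update_request.get_method(), update_request.full_url)
print(search_url)
# To actually send the update: urllib.request.urlopen(update_request)
```

Each web application would send its own documents to the same update URL, so the Solr server does not need to live on either application's host; it only has to be reachable over HTTP from both.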




Querying for results

2013-12-04 Thread Rob Veliz
Hello,

I am running Solr from Magento and using DIH to import/index data from 1
other source (external).  I am trying to query for results...three questions:

1. The query I'm using runs against "fulltext_1_en" which is a specific
shard created by the Magento deployment in solrconfig.xml.  Should I be
using/querying from another field/store (e.g. not fulltext_1*) to get
results from both Magento and the other data source?  How would I add the
data from my DIH indexing to that specific shard so it was all in the same
place?

2. OR do I need to add another shard to correspond to the DIH data elements?

3. OR is there something else I'm missing in trying to query for data from
2 sources?

Thanks!


Re: Querying for results

2013-12-04 Thread Rob Veliz
Follow-up: Would anyone very familiar with DIH be willing to jump on a side
thread with me and my developer to help troubleshoot some issues we're
having?  Please reply to me directly at: robert [at] mavenbridge.com.  Thanks!





