Multi-core support for indexing multiple servers
Trying to find specific information to support the following scenario:

- I have one site running on one server with marketing content, blog, etc. I want to index.
- I have another site running on Magento on a different server with ecommerce content (products).
- Both servers live in completely different environments.
- I would like to create one single search index between both sites and make that index searchable from both sites.

I think I can/should use the multi-core approach and spin off a new server to host Solr but can anyone verify this is the best/most appropriate approach? Are there any other details I need to consider? Can anyone provide a step by step for making this happen to validate my own technical plan?

Any help appreciated...was initially thinking I needed SolrCloud but that seems like overkill for my primary use case.
Re: Multi-core support for indexing multiple servers
Great feedback, thanks. So the multi-core structure I have then is a single Solr server set up, essentially hosted by one domain owner (but to be used by both). My question is how does that Solr server connect to the 2 Web applications to create the 1 master index (to be used when searching on either Web app)?

It feels like I just reference the Solr server from within the Web app search templates (e.g. PHP files). That is logical in terms of pulling the data into the Web apps, but it's still not clear to me how the data from those 2 Web apps actually gets into the Solr server if Solr server doesn't live on the same server as the Web app(s). Any thoughts?

On Wed, Nov 6, 2013 at 10:57 PM, Shawn Heisey wrote:
> On 11/6/2013 11:38 PM, Rob Veliz wrote:
> > Trying to find specific information to support the following scenario:
> >
> > - I have one site running on one server with marketing content, blog, etc. I want to index.
> > - I have another site running on Magento on a different server with ecommerce content (products).
> > - Both servers live in completely different environments.
> > - I would like to create one single search index between both sites and make that index searchable from both sites.
> >
> > I think I can/should use the multi-core approach and spin off a new server to host Solr but can anyone verify this is the best/most appropriate approach? Are there any other details I need to consider? Can anyone provide a step by step for making this happen to validate my own technical plan? Any help appreciated...was initially thinking I needed SolrCloud but that seems like overkill for my primary use case.
>
> SolrCloud makes for *easy* redundancy. There is a three-server minimum if you want it to be fault-tolerant for both Solr and Zookeeper. The third server would only run zookeeper and could be an extremely inexpensive machine. The other two servers would run both Solr and Zookeeper. Redundancy without cloud is possible, it's just not as automated, and can be done with two servers.
>
> It is highly recommended that redundant servers are not separated geographically. This is especially important with SolrCloud, as Zookeeper redundancy requires that a majority of the servers be operational. That can be extremely difficult to guarantee in a multi-datacenter model, if one assumes that an entire datacenter can disappear from the network.
>
> If you don't care about redundancy, then you'd just run a single server, and SolrCloud wouldn't provide much benefit.
>
> Multiple cores is a good way to go -- the two indexes would be logically separate, but you'd be able to use either one. With SolrCloud, it would be multiple collections.
>
> Thanks,
> Shawn

--
*Rob Veliz*, Founder | *Mavenbridge* | rob...@mavenbridge.com | M: +1 (206) 909 - 3490
Follow us at: http://twitter.com/mavenbridge
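The majority requirement mentioned in the quoted reply comes down to simple arithmetic. The sketch below is only an illustration of that arithmetic; the function names are invented for this example and are not part of ZooKeeper or Solr.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


def survives_datacenter_loss(servers_per_datacenter: list[int], lost: int) -> bool:
    """True if a majority remains after the datacenter at index `lost` disappears."""
    total = sum(servers_per_datacenter)
    remaining = total - servers_per_datacenter[lost]
    return remaining >= quorum_size(total)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures for a few sizes.
    for n in (1, 2, 3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

With two servers the quorum is also two, so no failure can be tolerated, which is why three is the minimum for fault tolerance. Likewise, three servers split 2 + 1 across two datacenters lose their majority if the datacenter holding two of them drops off the network.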
Re: Multi-core support for indexing multiple servers
I've been reading about Solarium, and it's definitely useful. Could you elaborate here:

> If you are planning a single master index, that's not multicore. Having more than one document type in a single index is possible, they just have to overlap on at least one field - whatever field is the uniqueKey for the index.

What I'm trying to do is index marketing pages from one server AND index product pages from a different ecommerce server and then combine those results into a single index, so when I search for "foo" from either site, I get the exact same results for "foo". If that's not multi-core, what's the right approach to accomplish this?

On Wed, Nov 6, 2013 at 11:29 PM, Shawn Heisey wrote:
> On 11/7/2013 12:07 AM, Rob Veliz wrote:
> > Great feedback, thanks. So the multi-core structure I have then is a single Solr server set up, essentially hosted by one domain owner (but to be used by both). My question is how does that Solr server connect to the 2 Web applications to create the 1 master index (to be used when searching on either Web app)? It feels like I just reference the Solr server from within the Web app search templates (e.g. PHP files). That is logical in terms of pulling the data into the Web apps, but it's still not clear to me how the data from those 2 Web apps actually gets into the Solr server if Solr server doesn't live on the same server as the Web app(s). Any thoughts?
>
> Solr uses HTTP calls. It is REST-like, though there has been some recent work to make parts of it actually use true REST; that paradigm might later be extended to the entire interface.
>
> There are a number of Solr API packages for PHP that give you an object-oriented interface to Solr that won't require learning Solr's HTTP interface - you write PHP code to access Solr. These are two of them that I have heard about. I've not actually used these, as I have little personal experience with writing PHP:
>
> http://pecl.php.net/package/solr
> http://www.solarium-project.org/
>
> If you are planning a single master index, that's not multicore. Having more than one document type in a single index is possible, they just have to overlap on at least one field - whatever field is the uniqueKey for the index.
>
> Thanks,
> Shawn
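Since Solr is driven over HTTP, each web application can push its own documents to the shared Solr server regardless of where that server lives. The following is a minimal sketch of that idea, not a definitive implementation. The host `solr.example.com`, the core name `sitesearch`, and the fields `doc_type`, `title`, and `url` are all assumptions made for illustration and would have to exist in the real schema; `id` stands in for whatever field is the uniqueKey. It also assumes a Solr version whose `/update` handler accepts a JSON array of documents (on some versions the path is `/update/json`).

```python
import json
import urllib.request


def make_doc(source: str, local_id: str, doc_type: str, title: str, url: str) -> dict:
    """Build one document; prefixing the id with its source keeps the uniqueKey
    from colliding when two sites feed the same index."""
    return {
        "id": f"{source}-{local_id}",  # assumed uniqueKey field
        "doc_type": doc_type,          # hypothetical field to tell the types apart
        "title": title,                # hypothetical field
        "url": url,                    # hypothetical field
    }


def build_update_request(solr_base: str, core: str, docs: list) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP POST that adds docs to one core."""
    endpoint = f"{solr_base.rstrip('/')}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    docs = [
        make_doc("marketing", "42", "page", "About us", "http://www.example.com/about"),
        make_doc("magento", "1001", "product", "Blue widget", "http://shop.example.com/blue-widget"),
    ]
    req = build_update_request("http://solr.example.com:8983/solr", "sitesearch", docs)
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Both sites would run the same kind of code against the same core, each with its own `source` prefix, so marketing pages and products end up as two document types in one index.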
Querying for results
Hello,

I am running Solr from Magento and using DIH to import/index data from 1 other source (external). I am trying to query for results and have three questions:

1. The query I'm using runs against "fulltext_1_en", which is a specific shard created by the Magento deployment in solrconfig.xml. Should I be using/querying from another field/store (e.g. not fulltext_1*) to get results from both Magento and the other data source? How would I add the data from my DIH indexing to that specific shard so it is all in the same place?

2. OR do I need to add another shard to correspond to the DIH data elements?

3. OR is there something else I'm missing in trying to query for data from 2 sources?

Thanks!
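One point relevant to question 1: a query that targets a single field only matches documents that have that field populated, so documents imported through DIH will not show up unless they fill the same field or the query searches their field as well. Below is a minimal sketch of the second option using the edismax query parser and its `qf` parameter. It assumes that `fulltext_1_en` is a field in the Magento-generated schema, that the DIH documents are in the same core, and that their text goes into a hypothetical `content` field; the host, core name, and `doc_type` field are likewise placeholders.

```python
from urllib.parse import urlencode


def build_select_url(solr_base, core, text, fields, doc_type=None, rows=10):
    """Build a /select URL that searches `text` across several fields at once."""
    params = [
        ("q", text),
        ("defType", "edismax"),    # query parser that accepts a list of fields
        ("qf", " ".join(fields)),  # fields to search, space separated
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    if doc_type is not None:
        # Optional filter to restrict results to one source (hypothetical field).
        params.append(("fq", f"doc_type:{doc_type}"))
    return f"{solr_base.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url(
        "http://solr.example.com:8983/solr",
        "sitesearch",
        "foo",
        ["fulltext_1_en", "content"],
    ))
```

Leaving `doc_type` unset returns matches from both sources in one result list; passing a value narrows the same query to a single source.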
Re: Querying for results
Follow-up: Would anyone very familiar with DIH be willing to jump on a side thread with me and my developer to help troubleshoot some issues we're having? Please reply to me directly at: robert [at] mavenbridge.com. Thanks!

On Wed, Dec 4, 2013 at 1:14 PM, Rob Veliz wrote:
> Hello,
>
> I am running Solr from Magento and using DIH to import/index data from 1 other source (external). I am trying to query for results and have three questions:
>
> 1. The query I'm using runs against "fulltext_1_en", which is a specific shard created by the Magento deployment in solrconfig.xml. Should I be using/querying from another field/store (e.g. not fulltext_1*) to get results from both Magento and the other data source? How would I add the data from my DIH indexing to that specific shard so it is all in the same place?
>
> 2. OR do I need to add another shard to correspond to the DIH data elements?
>
> 3. OR is there something else I'm missing in trying to query for data from 2 sources?
>
> Thanks!