So there are a couple of other possibilities:

1> Go ahead and index things as you are now, then just move the index
itself (<solr_home>/data) up to AWS. This is just a straight file copy.

2> Could you set up your public server as a slave, with the master
(running DIH if you prefer) behind the appropriate firewalls? The
firewall then only has to let your public instance do the replication.
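For option 2, a minimal sketch of how a pull could be triggered by hand on the slave, assuming the replication handler is registered at /replication in solrconfig.xml on both sides. The host names and the helper function are made up for illustration; the snippet only builds the URL and leaves the actual HTTP request to you.

```python
from urllib.parse import urlencode


def replication_url(slave_base_url, command, master_url=None):
    """Build a URL for the replication handler on the slave.

    slave_base_url and master_url are hypothetical values; the
    /replication path assumes the handler is registered under that name.
    """
    params = {"command": command}
    if master_url is not None:
        # Optional override of the master URL configured on the slave.
        params["masterUrl"] = master_url
    return slave_base_url.rstrip("/") + "/replication?" + urlencode(params)


if __name__ == "__main__":
    # Ask the public (slave) instance to pull the index from the master.
    print(replication_url(
        "http://public.example.com:8983/solr",
        "fetchindex",
        master_url="http://master.internal:8983/solr/replication",
    ))
```

Since the slave is the side that opens the connection, the firewall rule only has to allow the public instance to reach the master's replication URL.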
Best
Erick

On Sun, Sep 9, 2012 at 2:13 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> Hmm, that might actually work.
>
> My current prototype is using DIH and TIKA for stop-the-world index
> re-population, so I assumed it would have to be done by a local SOLR
> instance.
>
> But I guess for production, I can run TIKA on the client and not use
> DIH at all. This might be enough.
>
> Thank you,
> Alex.
>
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
> book)
>
>
> On Sat, Sep 8, 2012 at 4:22 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>> These are really unrelated. Presumably you have some
>> program that accesses your system of record, that you
>> want to keep private. No problem, that program (SolrJ?)
>> is accessing your private data and sending the SolrInputDocuments
>> to the cloud-based Solr program for searching.
>>
>> Or I don't understand the problem at all <G>..
>>
>> Best
>> Erick
>>
>> On Fri, Sep 7, 2012 at 11:43 AM, Alexandre Rafalovitch
>> <arafa...@gmail.com> wrote:
>>> Hello,
>>>
>>> I have a bunch of documents that I would like to index on a local
>>> server behind the firewall. But then, the actual search will happen on
>>> a public infrastructure (Amazon, etc). The documents themselves are
>>> not quite public, so I want just the index content (indexed, not
>>> stored) being available outside the firewall.
>>>
>>> Is that something that is doable with Solr Cloud or index copying, etc?
>>>
>>> Regards,
>>> Alex.
>>>
>>> Personal blog: http://blog.outerthoughts.com/
>>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>>> - Time is the quality of nature that keeps events from happening all
>>> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
>>> book)