Yes, it won't work if you are using OpenOffice. However, it works fine with Microsoft Word.
Hope it helps.

Albert

On 8 April 2011 14:55, Andy <angelf...@yahoo.com> wrote:
> I can't view the document either -- it showed up empty.
>
> Has anyone succeeded in viewing it?
>
> Andy
>
> --- On Fri, 4/8/11, Albert Vila <a...@imente.com> wrote:
>
>> From: Albert Vila <a...@imente.com>
>> Subject: Re: Very very large scale Solr Deployment = how to do (Expert Question)?
>> To: solr-user@lucene.apache.org
>> Date: Friday, April 8, 2011, 3:43 AM
>>
>> Ephraim, I still can't view the document.
>>
>> Don't know if I'm doing something wrong, but I downloaded it and it
>> appears to be empty.
>>
>> Albert
>>
>> On 7 April 2011 09:32, Ephraim Ofir <ephra...@icq.com> wrote:
>> > You can't view it online, but you should be able to download it from:
>> > https://docs.google.com/leaf?id=0BwOEbnJ7oeOrNmU5ZThjODUtYzM5MS00YjRlLWI2OTktZTEzNDk1YmVmOWU4&hl=en&authkey=COGel4gP
>> >
>> > Enjoy,
>> > Ephraim Ofir
>> >
>> >
>> > -----Original Message-----
>> > From: Jens Mueller [mailto:supidupi...@googlemail.com]
>> > Sent: Thursday, April 07, 2011 8:30 AM
>> > To: solr-user@lucene.apache.org
>> > Subject: Re: Very very large scale Solr Deployment = how to do (Expert Question)?
>> >
>> > Hello Ephraim, hello Lance, hello Walter,
>> >
>> > thanks for your replies:
>> >
>> > Ephraim, thanks very much for the further detailed explanation. I will
>> > try to set up a demo system in the next few days and use your advice.
>> > Load balancers are an important aspect of your design. Can you
>> > recommend one LB specifically? (I would be using haproxy.1wt.eu.) I
>> > think the idea of uploading your document is very good. However,
>> > Google Docs seemed not to be working (at least for me with the docx
>> > format?), but maybe you can simply output the document as PDF, and then
>> > I think Google Docs will work, so all the others can also have a look
>> > at your concept. The best approach would be if you could upload your
>> > advice directly somewhere to the Solr wiki, as it is really helpful. I
>> > found some other documents meanwhile, but yours is much clearer and
>> > more complete, with the LBs and the Aggregators
>> > (http://lucene-eurocon.org/slides/Solr-In-The-Cloud_Mark-Miller.pdf)
>> >
>> > Lance, thanks, I will have a look at what LinkedIn is doing.
>> >
>> > Walter, thanks for the advice: Well, you are right, mentioning Google.
>> > My question was also to understand how such large systems like
>> > Google/Facebook are actually working. So my numbers are just
>> > theoretical and made up. My system will be smaller, but I would be very
>> > happy to understand how such large systems are built, and I think the
>> > approach Ephraim showed should be working quite well at large scale. If
>> > you know good documents (besides the Bigtable research paper that I
>> > already know) that technically describe how Google is working in
>> > detail, that would be of great interest. You seem to be working for a
>> > company that handles large datasets. Does Google use this approach,
>> > sharing the index into N writers, and the produced index is then
>> > replicated to N "read only searchers"?
>> >
>> > thank you all.
>> > best regards
>> > jens
>> >
>> >
>> > 2011/4/7 Walter Underwood <wun...@wunderwood.org>
>> >
>> >> The bigger answer is that you cannot get to this size by just
>> >> configuring Solr. You may have to invent a lot of stuff. Like all of
>> >> Google.
>> >>
>> >> Where did you get these numbers? The proposed query rate is twice as
>> >> big as Google (Feb 2010 estimate, 34K qps).
>> >>
>> >> I work at MarkLogic, and we scale to 100's of terabytes, with fast
>> >> update and query rates. If you want a real system that handles that,
>> >> you might want to look at our product.
>> >>
>> >> wunder
>> >>
>> >> On Apr 6, 2011, at 8:06 PM, Lance Norskog wrote:
>> >>
>> >> > I would not use replication. LinkedIn consumer search is a flat
>> >> > system where one process indexes new entries and does queries
>> >> > simultaneously. It's a custom Lucene app called Zoie. Their stuff
>> >> > is on GitHub.
>> >> >
>> >> > I would get documents to indexers via a multicast IP-based queueing
>> >> > system. This scales very well and there's a lot of hardware support.
>> >> >
>> >> > The problem with distributed search is that it is a) inherently
>> >> > slower and b) has inherently more and longer jitter. The "airplane
>> >> > wing" distribution of query times becomes longer and flatter.
>> >> >
>> >> > This is going to have to be a "federated" system, where the
>> >> > front-end app aggregates results rather than Solr.
>> >> >
>> >> > On Mon, Apr 4, 2011 at 6:25 PM, Jens Mueller <supidupi...@googlemail.com> wrote:
>> >> >> Hello Experts,
>> >> >>
>> >> >> I am a Solr newbie but read quite a lot of docs. I still do not
>> >> >> understand what would be the best way to set up very large scale
>> >> >> deployments:
>> >> >>
>> >> >> Goal (theoretical):
>> >> >>
>> >> >> A.) Index-Size: 1 Petabyte (1 Document is about 5 KB in Size)
>> >> >>
>> >> >> B) Queries: 100000 Queries / per Second
>> >> >>
>> >> >> C) Updates: 100000 Updates / per Second
>> >> >>
>> >> >> Solr offers:
>> >> >>
>> >> >> 1.) Replication => Scales well for B) BUT A) and C) are not
>> >> >> satisfied
>> >> >>
>> >> >> 2.) Sharding => Scales well for A) BUT B) and C) are not satisfied
>> >> >> (=> As I understand the Sharding approach, all goes through a
>> >> >> central server that dispatches the updates and assembles the
>> >> >> queries retrieved from the different shards. But this central
>> >> >> server also has some capacity limits...)
>> >> >>
>> >> >> What is the right approach to handle such large deployments? I
>> >> >> would be thankful for just a rough sketch of the concepts so I can
>> >> >> experiment/search further...
>> >> >>
>> >> >> Maybe I am missing something very trivial, as I think some of the
>> >> >> "Solr Users/Use Cases" on the homepage are that kind of large
>> >> >> deployments. How are they implemented?
>> >> >>
>> >> >> Thank you very much!!!
>> >> >>
>> >> >> Jens
>> >> >>
>> >> >
>> >>
>> >
>>
>> --
>> Albert Vila Puig
>> <a...@imente.com>
>> iMente.com <http://www.imente.com>
>

--
Albert Vila Puig <a...@imente.com>
iMente.com <http://www.imente.com>