Jure Pečar wrote:
> In my experience the "brick wall" you describe is what happens when disks
> reach a certain point of random IO that they cannot keep up with.
>
The problem with a technical audience is that everyone thinks they have a workaround or probable fix you haven't already thought of. No offense. I am guilty of it myself, but sometimes it's very hard to say "I DON'T KNOW" and dig through telemetry and instrument the software until you know all the answers. With something as complex as Cyrus, this is harder than you think. Unfortunately, when it comes to something like a production mail service these days, it's nearly impossible to get the funding, man-hours, and approvals to run experiments on live guinea pigs and really get to the bottom of problems. We throw systems at the problem and move on.

But in answer to your point, our iostat numbers for busy or service time didn't indicate any I/O issue. That was the first thing we looked at, of course. Even by eyeball, our array drives are more idle than busy.

----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html