Fwd: [cfarm-admins] Extremely Slow Disk Access On GCC119
Hello list, has anyone experienced problems on the AIX POWER8 system?

-- Forwarded message --
From: R0b0t1
Date: Sun, Sep 10, 2017 at 9:58 AM
Subject: Re: [cfarm-admins] Extremely Slow Disk Access On GCC119
To: David Edelsohn

Do you care to explain? I'm not trying to tell you how to do your job;
I'm trying to make you aware of a problem.

I do not understand how this could only be a problem with the disk
controller, because I have never encountered such poor performance. If
it is only a problem with the disk, has the disk been failing this
whole time?

I hope you do not mind, but I will be forwarding this to the GCC
mailing list. I am concerned by your replies.

On Sun, Sep 10, 2017 at 3:47 AM, David Edelsohn wrote:
> Your analysis is completely wrong.
>
> David
>
> On Sep 10, 2017 10:08 AM, "R0b0t1" wrote:
>>
>> Thank you for the response!
>>
>> I don't necessarily mean to request funding for the AIX system in
>> this ticket, but I feel I should point out that the system is more
>> or less unusable if something requires any I/O. Assuming the HD is
>> anything modern, it seems to me this is a scheduling or caching
>> issue in the AIX kernel.
>>
>> To reiterate: something that should take a few minutes took a day
>> due to slow I/O. Something that should have taken a fraction of a
>> second was taking minutes.
>>
>> That it could possibly be the AIX kernel makes me want to ask
>> whether it would be possible to migrate the system to Linux.
>> However, I assume someone needs it for testing on AIX, and I can
>> just as well use GCC112.
>>
>> I am doing my best not to monopolize GCC119, but it is very hard to
>> design around certain I/O operations being very slow. I am not sure
>> if this is impacting any users; the AIX system does not seem to
>> receive regular use.
>>
>> Cheers,
>> R0b0t1
>>
>> On Sun, Sep 10, 2017 at 1:45 AM, David Edelsohn wrote:
>> > The backing disk array for the virtual I/O disks was
>> > under-designed for the configuration of the system. There is a
>> > request to upgrade the disk controller, but it is unclear whether
>> > that will be funded. The I/O system has already been tuned with
>> > bigger buffers, so performance is much better than it was when
>> > originally installed.
>> >
>> > Please remember that all of the systems are shared systems, and if
>> > there are limitations to the system, they affect all users, so
>> > please try not to monopolize or overload the systems.
>> >
>> > Thanks, David
>> >
>> > On Sun, Sep 10, 2017 at 8:18 AM, R0b0t1 via cfarm-admins wrote:
>> >> Hello,
>> >>
>> >> I apologize for creating another ticket. I am in the process of
>> >> running rm on a directory structure which is removed very quickly
>> >> on the Linux CF machines where I have used the command.
>> >>
>> >> In addition, the setup of a few software packages took about a
>> >> day in total due to disk access when running configure scripts.
>> >> On Linux, this completes in a matter of minutes.
>> >>
>> >> Is this a problem with AIX/jfs?
>> >>
>> >> Respectfully,
>> >> R0b0t1
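The thread above mixes two different I/O patterns: bulk data movement
(decompression, compiler output) and metadata-heavy work (rm -r,
configure scripts). A minimal POSIX shell sketch for telling the two
apart on a farm machine; the file names, sizes, and counts here are
arbitrary choices for illustration, not part of the original reports:

    # Bulk sequential write: measures raw throughput to the filesystem
    time dd if=/dev/zero of=./ddtest bs=1048576 count=256
    sync
    rm ./ddtest

    # Metadata-heavy workload: create and remove many small files,
    # roughly what 'rm -r' on a Portage tree or a configure run does
    mkdir ./metatest
    time sh -c 'i=0; while [ "$i" -lt 5000 ]; do
        echo x > "./metatest/f$i"
        i=$((i + 1))
    done'
    time rm -r ./metatest

If the dd run performs acceptably while the small-file loop crawls, the
bottleneck is in metadata and synchronous-write handling rather than in
raw disk bandwidth, which matches the rm and configure symptoms
described in the ticket.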
Fwd: [cfarm-admins] Extremely Slow Disk Access On GCC119
Yes, the disks are very slow. You just have to live with it.

Obviously migrating gcc119 to Linux would be silly when we have other
Linux machines. The whole point of this one is to have an AIX host for
testing in AIX.

On Sunday, 10 September 2017, R0b0t1 wrote:
>
> Hello list, has anyone experienced problems on the AIX POWER8 system?
>
> [...]
Re: [cfarm-admins] Extremely Slow Disk Access On GCC119
On Sun, Sep 10, 2017 at 1:06 PM, Jonathan Wakely wrote:
> Yes, the disks are very slow. You just have to live with it.

The commands I was having problems with are rm and tar (to decompress
an xz archive, and to delete a failed compilation environment setup).
The software project in question is Gentoo's Portage, which is known
for "stressing" filesystems due to the high file/inode count of its
resource base (automated build scripts for various software projects).
Unxz ran for the better part of a day with no end in sight; it should
take two minutes or less. The compilation of the necessary core
packages (Gentoo's @system) likewise ran for over a day with no end in
sight, but finished in a few minutes on the Linux machines.

(I am actually very thankful that there seem to be no hard limits on
CF user accounts. The space required for Portage isn't gigantic, but
it might be larger than what some people are willing to let users drop
into their $HOME.)

It is this that makes me think the issue is not solely hardware. Bad
management of writes seems like a more likely cause, especially
because, while some I/O-limited operations are slower than on other
disks I have used, they are not extremely slow.

If the issue has been provably linked to the disk controller, I would
appreciate an explanation, as I am interested in how that was done. I
apologize for bothering anyone, but I would stress that I am very
willing to listen. I am simply not very smart, sirs. If my lack of
intelligence is insulting, please say so and I will leave.

> Obviously migrating gcc119 to Linux would be silly when we have
> other Linux machines. The whole point of this one is to have an AIX
> host for testing in AIX.

True, which is why I pointed it out. However, the system is very hard
to use. I suppose this is useful information about AIX.

I also have an outstanding request for PowerKVM support (on GCC112)
and access to the hypervisor, but in that case I do not expect a
prompt response at all. However, I am slightly worried that it may
never be addressed (even if the answer is "no," which would be very
sad).

As for the value of what I am doing: most of the CF systems are so
outdated as to be useless for modern development work. I can use a
Gentoo Prefix/libc (https://wiki.gentoo.org/wiki/Prefix/libc)
installation to run modern software on an outdated system. I'm
currently fixing a lot of ppc64(le) issues, but I did get an
environment working on GCC10.

Respectfully,
R0b0t1
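To make the I/O-bound claim above concrete, one rough check is to
compare wall-clock time against CPU time. A sketch, using a
hypothetical Portage snapshot filename; 'real' far exceeding 'user'
plus 'sys' suggests the process spent most of its life blocked,
typically waiting on I/O:

    # Decompress and unpack; reportedly minutes on the Linux farm
    # machines, the better part of a day on gcc119
    time sh -c 'unxz -c portage-latest.tar.xz | tar -xf -'

    # The deletion case from the original ticket
    time rm -r portage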
Re: [cfarm-admins] Extremely Slow Disk Access On GCC119
On Sun, Sep 10, 2017 at 9:08 PM, R0b0t1 wrote:
> [...]
>
> If the issue has been provably linked to the disk controller, I would
> appreciate an explanation, as I am interested in how that was done.

I and a number of AIX VIOS experts performed an assessment of the
system with AIX performance tools.

The system was configured to maximize diskspace and flexibility. It
now supports six separate VMs. The disk array was configured as a
single physical volume, mapped to a single logical volume, which is
then partitioned into virtual I/O devices mapped to the VMs, which are
then formatted for AIX filesystems. It's a lot of virtualization
layers. I have already increased the disk queues in the AIX VMs, which
improved performance relative to the initial installation. Also, the
VIOS was slightly under-sized for the current amount of usage, but I
have avoided rebooting the entire system to adjust that.

Ideally it would be better to directly partition the disk array and
map the partitions to the AIX VMs, but that is difficult and
disruptive to implement now. There is a proposal to replace the disk
array device adapter with a write-caching adapter, which may or may
not happen.

> I also have an outstanding request for PowerKVM support (on GCC112)
> and access to the hypervisor, but in that case I do not expect a
> prompt response at all. However, I am slightly worried that it may
> never be addressed (even if the answer is "no," which would be very
> sad).

GCC112 will not provide user access to the hypervisor. You can ask the
OSUOSL Powerdev cloud if they will provide such access.

Thanks, David
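For readers unfamiliar with the disk-queue tuning David describes, the
queue depth on an AIX VM is a per-disk device attribute. A sketch of
how it can be inspected and raised; the hdisk name and the value 32
are illustrative only, and chdev requires root:

    # Show the current queue depth of a disk
    lsattr -El hdisk0 -a queue_depth

    # Extended per-disk statistics (service times, queue fill) at
    # 5-second intervals, useful for spotting a saturated queue
    iostat -D 5

    # Stage a larger queue depth; -P defers the change until the
    # device is next reconfigured or the VM reboots
    chdev -l hdisk0 -a queue_depth=32 -P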
Bounce GCC119 (was: [cfarm-admins] Extremely Slow Disk Access On GCC119)
Hi David,

> The system was configured to maximize diskspace and flexibility. It
> now supports six separate VMs. The disk array was configured as a
> single physical volume, mapped to a single logical volume, which is
> then partitioned into virtual I/O devices mapped to the VMs, which
> are then formatted for AIX filesystems. It's a lot of virtualization
> layers. I have already increased the disk queues in the AIX VMs,
> which improved performance relative to the initial installation.
> Also, the VIOS was slightly under-sized for the current amount of
> usage, but I have avoided rebooting the entire system to adjust that.

I can't speak for others, but if GCC119 needs a reboot then do it. I'm
working on it now. It won't bother me one bit if you bounce it and
then I have to log back in. It won't bother me if my home directory
gets blown away and I have to re-clone.

Thanks for all the hard work.

Jeff
Re: Bounce GCC119 (was: [cfarm-admins] Extremely Slow Disk Access On GCC119)
On Sun, Sep 10, 2017 at 10:42 PM, Jeffrey Walton wrote:
> Hi David,
>
> [...]
>
> I can't speak for others, but if GCC119 needs a reboot then do it.

The issue is not rebooting the gcc119 AIX VM or wiping out the AIX
/home filesystem. To expand the VIOS partition I need to reboot the
hypervisor host and all of the AIX partitions, which is more difficult
to schedule. Also, there does not appear to be a mechanism to squeeze
down the physical volume on the disk array and carve out new devices.

Thanks, David
gcc-8-20170910 is now available
Snapshot gcc-8-20170910 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20170910/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for
details.

This snapshot has been generated from the GCC 8 SVN branch with the
following options: svn://gcc.gnu.org/svn/gcc/trunk revision 251952

You'll find:

 gcc-8-20170910.tar.xz    Complete GCC

  SHA256=dc120706ce1a6d208e3f514547be81ba711f32d73a6092bbdec373a2bc0fee48
  SHA1=8954b329a625a5e2d56e304768a3fdd34c12f563

Diffs from 8-20170903 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list. Please do not
use a snapshot before it has been announced that way.
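A quick way to check a downloaded snapshot against the checksums in
the announcement, assuming GNU coreutils' sha256sum is available (the
two spaces between hash and filename are required by its check mode):

    wget ftp://gcc.gnu.org/pub/gcc/snapshots/8-20170910/gcc-8-20170910.tar.xz
    echo "dc120706ce1a6d208e3f514547be81ba711f32d73a6092bbdec373a2bc0fee48  gcc-8-20170910.tar.xz" \
        | sha256sum -c -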
Re: [cfarm-admins] Extremely Slow Disk Access On GCC119
Thank you for the reply!

On Sun, Sep 10, 2017 at 2:49 PM, David Edelsohn wrote:
> [...]
>
> I and a number of AIX VIOS experts performed an assessment of the
> system with AIX performance tools.

Would you mind describing what was done in detail? I understand if you
do not have the time, but I am genuinely interested.

I ask because what I am doing can be pathological on Linux, and using
IBM-provided tools doesn't mean you're going to discover why something
is happening. A tool shows you what it thinks might be happening and
suggests fixes that are (hopefully) within your power to actualize.
Most importantly, an IBM-provided tool is unlikely to criticize
anything made by IBM.

Consequently, I see no reason that performing an assessment (of what,
and how?) disproves any claim that there is something wrong with the
AIX kernel. I can find no better way to express the sentiment, so
please understand I mean no disrespect: you may as well have told me
you were an expert in propeller beanies. No explanation is owed to me,
but I can't in good conscience take the explanation given as
comprehensive.

> The system was configured to maximize diskspace and flexibility.
> [...] It's a lot of virtualization layers. I have already increased
> the disk queues in the AIX VMs, which improved performance relative
> to the initial installation.

If I understand the setup properly, this should produce little
noticeable slowdown. The layers hand off data with very little
processing. Linux host systems tend to have similar setups with LVM2.
It might even be a good idea to keep the PV/LV setup despite the
overhead, because it is so much more flexible.

> Ideally it would be better to directly partition the disk array and
> map the partitions to the AIX VMs, but that is difficult and
> disruptive to implement now. There is a proposal to replace the disk
> array device adapter with a write-caching adapter, which may or may
> not happen.

I'm not entirely sure. In an absolute sense there would be less
overhead, but relative to other slowdowns on the system I am not sure
the inefficiency of the abstraction layers matters. It might just be
that the other guests are nearly saturating the disk I/O. It's not my
place to ask what they're doing, but if they are, then I suppose
there's nothing to be done about it. In my personal experience,
however, Linux seems to fare better in this situation. Blaming the AIX
kernel might be useful if there are tuning parameters exposed that
could be changed. I/O queues are very hard to get right.
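For comparison, the LVM2 layering R0b0t1 alludes to builds the same
kind of stack on a Linux host. A minimal sketch; the device name,
volume group, and sizes are purely illustrative:

    # Physical volume -> volume group -> logical volume, mirroring the
    # PV -> LV -> virtual-device chain described for the AIX VIOS setup
    pvcreate /dev/sdb
    vgcreate vg_guests /dev/sdb
    lvcreate -L 100G -n lv_vm1 vg_guests

    # The logical volume is then handed to a guest, which formats it
    mkfs.ext4 /dev/vg_guests/lv_vm1

As noted above, each layer adds mostly bookkeeping rather than
per-request processing, so the layering alone rarely explains
order-of-magnitude slowdowns; saturation or queue misconfiguration
usually does.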
Re: Bounce GCC119 (was: [cfarm-admins] Extremely Slow Disk Access On GCC119)
On Sun, Sep 10, 2017 at 4:21 PM, David Edelsohn wrote:
> On Sun, Sep 10, 2017 at 10:42 PM, Jeffrey Walton wrote:
>> [...]
>>
>> I can't speak for others, but if GCC119 needs a reboot then do it.
>
> The issue is not rebooting the gcc119 AIX VM or wiping out the AIX
> /home filesystem. To expand the VIOS partition I need to reboot the
> hypervisor host and all of the AIX partitions, which is more
> difficult to schedule. Also, there does not appear to be a mechanism
> to squeeze down the physical volume on the disk array and carve out
> new devices.
>
> Thanks, David

Hello,

I hope I'm not adding too much noise to the list. Thank you both for
your consideration. I think I understand the problem a bit better now.
Close to the full capacity of the machine seems to be exposed to the
VM, which made me overlook that it might be a VM. (To host AIX for
testing, the current setup makes sense.)

I hope it doesn't seem like I do not appreciate the services offered;
I realize (at least I hope) that I am extremely privileged to use the
machines that make up the GCC CF. Hopefully my surprise at the results
of running the commands I did seems reasonable.

In any case, my computations on that machine are proceeding now, at
least as well as they can.

Respectfully,
R0b0t1