Re: [Beowulf] Intel combines Xeon and FPGA in a single socket

2014-06-19 Thread Mark Hahn
but not more than that. Maybe this has changed significantly in the last decade, but I doubt it. There is only a limited surface area per die and Xeons are not small. if not area, then power. but maybe these are going to be somewhat exotic chips, to which commodity constraints apply more l

Re: [Beowulf] Intel combines Xeon and FPGA in a single socket

2014-06-19 Thread Joe Landman
On 06/19/2014 10:52 PM, Adam DeConinck wrote: This is interesting... http://www.theregister.co.uk/2014/06/18/intel_fpga_custom_chip/ This is what Tensilica did previously though. The issue that we had found playing with it (about a decade ag

[Beowulf] Intel combines Xeon and FPGA in a single socket

2014-06-19 Thread Adam DeConinck
This is interesting... http://www.theregister.co.uk/2014/06/18/intel_fpga_custom_chip/ From the article: "The chip company announced on Wednesday at GigaOm Structure in San Francisco that it is preparing to sell a Xeon E5-FPGA hybrid chip to som

Re: [Beowulf] Monitoring and reporting Infiniband errors

2014-06-19 Thread John Hearns
pps. I guess I could clear the errors every time this runs, but have decided to just do an initial clear of the errors and look at the cumulative rate. ppps. there is a better list for this chatter, isn't there... On 19 June 2014 15:10, John Hearns wrote: > If anyone is interested, here is my

Re: [Beowulf] Monitoring and reporting Infiniband errors

2014-06-19 Thread John Hearns
If anyone is interested, here is my solution, which seems good enough. Someone will no doubt say there is a neater way! A shell script which runs ibqueryerrors and returns 1 if anything is found: #!/bin/bash # check for errors on the Infiniband fabric 0 # another script runs for port 1 errors=`/
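The script in the message above is cut off in this archive, so what follows is not John's script. It is only a minimal sketch of the same idea: run a query command and return 1 if it reports anything. The names `report_has_errors`, `check_fabric` and `IB_QUERY_CMD` are hypothetical. It assumes the command prints nothing when the fabric is clean; if the tool prints a summary line even on success, that line would have to be filtered out first.

```shell
#!/bin/bash
# Sketch only: return non-zero when an error report contains any output.
# IB_QUERY_CMD is a hypothetical variable; the command path in the original
# script is truncated in the archive and is not reproduced here.

# Return 1 if stdin has at least one non-blank line, 0 otherwise.
report_has_errors() {
    if grep -q '[^[:space:]]'; then
        return 1
    fi
    return 0
}

# Run the query command (default: ibqueryerrors found on PATH) and check
# its output. Assumption: a clean fabric produces no output at all.
# Note: a missing command also produces no output and so looks "clean".
check_fabric() {
    ${IB_QUERY_CMD:-ibqueryerrors} 2>/dev/null | report_has_errors
}
```

A monitoring tool that acts on exit status could then call `check_fabric` from a small wrapper and raise an alert on a non-zero return.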

[Beowulf] Monitoring and reporting Infiniband errors

2014-06-19 Thread John Hearns
Does anyone have good tips on monitoring a cluster for Infiniband errors? Specifically Mellanox/OpenFabrics on an SGI cluster. I am thinking of running ibcheckerrors or ibqueryerrors and parsing the output. I have Monit set up on the cluster head node http://mmonit.com/monit/ which I find quite