[Bug 705562] Re: ami-6836dc01 8.04 32 bit AMI kernel lock bug

Stefan Bader Wed, 29 Jun 2011 02:27:55 -0700

** Description changed:

+ SRU Justification:
+ 
+ Impact: For i386 PGDs are stored in a linked list. For this two elements of
+ struct page are (mis-)used. To have a backwards pointer, the private field is
+ assigned a pointer to the index field of the previous struct page. The main
+ problem there was that list_add and list_del operations accidentally were done
+ twice. Which leads to accesses to (after first list operation) innocent struct
+ pages.
+ 
+ Fix: This is a bit more than needed to fix the bug itself, but it will bring our
+ code more into a shape that resembles upstream (factually there is only a 2.6.18
+ upstream but that code did not do the double list access).
+ 
+ Testcase: Running a 32bit domU (64bit Hardy dom0, though that should not matter)
+ with the xen kernel and doing a lot of process starts (like the aslr qa
+ regression test does) would quite soon crash because the destructor of a PTE
+ (which incidentally is stored in index) was suddenly overwritten.
+ 
+ ---
+ 
  For months we have been working around a bug in  ami-6836dc01, but
  this seems not to be reported any place.  Is this a known issue?
  
  When we use ruby/puppet (from the Canonical repo) on an instance with
  this AMI (e.g. a c1.medium) or in some cases when using java
  applications the instance gets locked up.
  
  Our work-around is using kernel 2.6.27-22-xen instead  - the person
  who created the fixed AMI used this method:
  
  - launch instance of ami-7e28ca17 (instance #1)
  - modprobe loop on instance #1
  - copy up creds, jdk and ec2-ami-tools to /dev/shm on instance #1
  - launch instance of ami-69d73000
  (canonical-beta-us/ubuntu-intrepid-beta2-20090226-i386.manifest.xml)
  to grab kernel modules from (instance #2)
  - tar.gz /lib/modules/2.6.27-22-xen on instance #2
-        - scp to instance #1 and untar in /lib/modules
+        - scp to instance #1 and untar in /lib/modules
  - rm -rf the old /lib/modules/2.6.24-10-xen dir on instance #1
  - edit quick-bundle script on instance #1 to hard-code AKI to
  aki-20c12649, ARI to ari-21c12648 (the AKI and ARI from instance #2).
-        - hard-coded manifest name, bucket to whatever.
+        - hard-coded manifest name, bucket to whatever.
  - run pre-clean script on instance #1
  - run quick-bundle script on instance #1
  
- 
  The console output from a locked instance is attached


** Description changed:

  SRU Justification:
  
- Impact: For i386 PGDs are stored in a linked list. For this two elements of
- struct page are (mis-)used. To have a backwards pointer, the private field is
- assigned a pointer to the index field of the previous struct page. The main
- problem there was that list_add and list_del operations accidentally were done
- twice. Which leads to accesses to (after first list operation) innocent struct
- pages.
+ Impact: On i386, PGDs are stored in a linked list, and two fields of
+ struct page are (mis-)used for this. To provide a backwards pointer, the
+ private field is assigned a pointer to the index field of the previous
+ struct page. The main problem was that the list_add and list_del
+ operations were accidentally done twice, which leads to accesses to
+ struct pages that are innocent after the first list operation.
  
- Fix: This is a bit more than needed to fix the bug itself, but it will bring our
- code more into a shape that resembles upstream (factually there is only a 2.6.18
- upstream but that code did not do the double list access).
+ Fix: The change is a bit more than is needed to fix the bug itself, but
+ it brings our code closer to upstream (in fact there is only a 2.6.18
+ upstream, but that code did not do the double list access).
  
- Testcase: Running a 32bit domU (64bit Hardy dom0, though that should not matter)
- with the xen kernel and doing a lot of process starts (like the aslr qa
- regression test does) would quite soon crash because the destructor of a PTE
- (which incidentally is stored in index) was suddenly overwritten.
+ Testcase: Running a 32bit domU (64bit Hardy dom0, though that should not
+ matter) with the xen kernel and doing a lot of process starts (like the
+ aslr qa regression test does) would quite soon crash because the
+ destructor of a PTE (which incidentally is stored in index) was suddenly
+ overwritten.
  
  ---
  
  For months we have been working around a bug in  ami-6836dc01, but
  this seems not to be reported any place.  Is this a known issue?
  
  When we use ruby/puppet (from the Canonical repo) on an instance with
  this AMI (e.g. a c1.medium) or in some cases when using java
  applications the instance gets locked up.
  
  Our work-around is using kernel 2.6.27-22-xen instead  - the person
  who created the fixed AMI used this method:
  
  - launch instance of ami-7e28ca17 (instance #1)
  - modprobe loop on instance #1
  - copy up creds, jdk and ec2-ami-tools to /dev/shm on instance #1
  - launch instance of ami-69d73000
  (canonical-beta-us/ubuntu-intrepid-beta2-20090226-i386.manifest.xml)
  to grab kernel modules from (instance #2)
  - tar.gz /lib/modules/2.6.27-22-xen on instance #2
         - scp to instance #1 and untar in /lib/modules
  - rm -rf the old /lib/modules/2.6.24-10-xen dir on instance #1
  - edit quick-bundle script on instance #1 to hard-code AKI to
  aki-20c12649, ARI to ari-21c12648 (the AKI and ARI from instance #2).
         - hard-coded manifest name, bucket to whatever.
  - run pre-clean script on instance #1
  - run quick-bundle script on instance #1
  
  The console output from a locked instance is attached

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/705562

Title:
  ami-6836dc01 8.04 32 bit AMI kernel lock bug

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/705562/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
