Hello again,
I think that is a serious regression. If you manage to make it work
correctly, please do get in touch and we can prepare a stable update for
the memory leak.
I think I've done it, looking at the source of chef 11.12.8-2 (from
Jessie). I'm not a programmer and I know Ruby only a little, so it would
probably be possible to do it better, but it's been working fine on
about 150 nodes for over 5 months. The stacktrace is still visible in
the output (and also in logs), but the chef-stacktrace.out file is also
produced properly, including the forked client stacktrace. I attach the
patch and the chef-stacktrace.out files for comparison: one from the modified
Wheezy package and one from Jessie.
Is the result satisfactory?
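
For readers who do not want to dig through the attached diff, the forked-run idea can be sketched in plain Ruby roughly as follows. This is only an illustration under my own assumptions, not the code from the patch: the helper names run_forked and describe_exit and the trace path are made up for the example, and the child uses exit! (rather than exit, as the patch does) so that no at_exit handlers inherited from the parent run in the child.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Chef): run the given block in a forked
# child. If the block raises, the child writes the exception and its
# backtrace to trace_path and exits with status 1; otherwise it exits 0.
# Returns the Process::Status of the reaped child.
def run_forked(trace_path)
  pid = fork do
    begin
      yield
    rescue Exception => e
      File.open(trace_path, 'w') do |f|
        f.puts "#{e.class}: #{e.message}"
        f.puts e.backtrace.join("\n")
      end
      exit! 1 # exit! skips at_exit handlers inherited from the parent
    else
      exit! 0
    end
  end
  _pid, status = Process.waitpid2(pid)
  status
end

# Hypothetical helper: turn a Process::Status into a short message, in the
# same spirit as handle_child_exit in the patch.
def describe_exit(status)
  return 'ok' if status.success?
  if status.signaled?
    "terminated by signal #{status.termsig} (#{Signal.list.invert[status.termsig]})"
  else
    "exit code #{status.exitstatus}"
  end
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    trace = File.join(dir, 'chef-stacktrace.out')
    status = run_forked(trace) { raise 'deliberately fail the run' }
    puts describe_exit(status)
    puts File.read(trace)
  end
end
```

The parent only learns the exit status, so the stacktrace file has to be written inside the child. That is why the patch calls Chef::Application.debug_stacktrace from do_run when the process is a forked child.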
Adding new features in a stable update is generally not a good idea,
no matter what.
OK, I understand. However, I found another bug in the package, related
to FileEdit. Both the issue and the solution are described here:
https://github.com/chef/chef/pull/754
Please let me know whether it can be applied together with the
forked runs (technically it can), or whether I should open another issue.

--
Greetings,
Piotr Pańczyk


--- chef-10.12.0.orig/lib/chef/config.rb
+++ chef-10.12.0/lib/chef/config.rb
@@ -184,6 +184,7 @@ class Chef
     run_command_stdout_timeout 120
     solo  false
     splay nil
+    client_fork true
 
     # Set these to enable SSL authentication / mutual-authentication
     # with the server
--- chef-10.12.0.orig/lib/chef/daemon.rb
+++ chef-10.12.0/lib/chef/daemon.rb
@@ -24,6 +24,7 @@ class Chef
   class Daemon
     class << self
       attr_accessor :name
+      attr_accessor :forked_child
 
       # Daemonize the current process, managing pidfiles and process uid/gid
       #
@@ -46,7 +47,7 @@ class Chef
             $stdout.reopen("/dev/null", "a")
             $stderr.reopen($stdout)
             save_pid_file
-            at_exit { remove_pid_file }
+            at_exit { remove_pid_file unless forked_child }
           rescue NotImplementedError => e
             Chef::Application.fatal!("There is no fork: #{e.message}")
           end
--- chef-10.12.0.orig/lib/chef/client.rb
+++ chef-10.12.0/lib/chef/client.rb
@@ -36,6 +36,7 @@ require 'chef/cookbook/file_vendor'
 require 'chef/cookbook/file_system_file_vendor'
 require 'chef/cookbook/remote_file_vendor'
 require 'chef/version'
+require 'chef/daemon'
 require 'ohai'
 require 'rbconfig'
 
@@ -135,50 +136,52 @@ class Chef
     end
 
     # Do a full run for this Chef::Client.  Calls:
+    # * do_run
     #
-    #  * run_ohai - Collect information about the system
-    #  * build_node - Get the last known state, merge with local changes
-    #  * register - If not in solo mode, make sure the server knows about this client
-    #  * sync_cookbooks - If not in solo mode, populate the local cache with the node's cookbooks
-    #  * converge - Bring this system up to date
-    #
+    # This provides a wrapper around #do_run allowing the 
+    # run to be optionally forked.
     # === Returns
-    # true:: Always returns true.
+    # boolean:: Return value from #do_run. Should always return true.
     def run
-      run_context = nil
-
-      Chef::Log.info("*** Chef #{Chef::VERSION} ***")
-      enforce_path_sanity
-      run_ohai
-      register unless Chef::Config[:solo]
-      build_node
-
-      begin
-
-        run_status.start_clock
-        Chef::Log.info("Starting Chef Run for #{node.name}")
-        run_started
-
-        run_context = setup_run_context
-        converge(run_context)
-        save_updated_node
-
-        run_status.stop_clock
-        Chef::Log.info("Chef Run complete in #{run_status.elapsed_time} seconds")
-        run_completed_successfully
+      if(Chef::Config[:client_fork] && Process.respond_to?(:fork))
+        Chef::Log.info "Forking chef instance to converge..."
+        pid = fork do
+          begin
+            Chef::Daemon.forked_child = true
+            Chef::Log.debug "Forked instance now converging"
+            do_run
+          rescue Exception => e
+            Chef::Log.error(e.to_s)
+            exit 1
+          else
+            exit 0
+          end
+        end
+        Chef::Log.debug "Fork successful. Waiting for new chef pid: #{pid}"
+        result = Process.waitpid2(pid)
+        if handle_child_exit(result)
+          Chef::Log.debug "Forked child successfully reaped (pid: #{pid})"
+        else
+          Chef::Log.error("Sleeping for #{Chef::Config[:interval]} seconds before trying again")
+        end
         true
-      rescue Exception => e
-        run_status.stop_clock
-        run_status.exception = e
-        run_failed
-        Chef::Log.debug("Re-raising exception: #{e.class} - #{e.message}\n#{e.backtrace.join("\n  ")}")
-        raise
-      ensure
-        run_status = nil
+      else
+        do_run
       end
-      true
     end
 
+    def handle_child_exit(pid_and_status)
+      status = pid_and_status[1]
+      return true if status.success?
+      message = if status.signaled?
+        "Chef run process terminated by signal #{status.termsig} (#{Signal.list.invert[status.termsig]})"
+      else
+        "Chef run process exited unsuccessfully (exit code #{status.exitstatus})"
+      end
+      Chef::Log.fatal message
+      exit 1 unless Chef::Config[:daemonize]
+      false
+    end
 
     # Configures the Chef::Cookbook::FileVendor class to fetch file from the
     # server or disk as appropriate, creates the run context for this run, and
@@ -333,6 +336,54 @@ class Chef
 
     private
 
+    # Do a full run for this Chef::Client.  Calls:
+    #
+    #  * run_ohai - Collect information about the system
+    #  * build_node - Get the last known state, merge with local changes
+    #  * register - If not in solo mode, make sure the server knows about this client
+    #  * sync_cookbooks - If not in solo mode, populate the local cache with the node's cookbooks
+    #  * converge - Bring this system up to date
+    #
+    # === Returns
+    # true:: Always returns true.
+    def do_run
+      run_context = nil
+
+      begin
+        Chef::Log.info("*** Chef #{Chef::VERSION} ***")
+        enforce_path_sanity
+        run_ohai
+        register unless Chef::Config[:solo]
+        build_node
+
+        run_status.start_clock
+        Chef::Log.info("Starting Chef Run for #{node.name}")
+        run_started
+
+        run_context = setup_run_context
+        converge(run_context)
+        save_updated_node
+
+        run_status.stop_clock
+        Chef::Log.info("Chef Run complete in #{run_status.elapsed_time} seconds")
+        run_completed_successfully
+        true
+      rescue Exception => e
+        # If we failed really early, we may not have a run_status yet. Too early for these to be of much use.
+        if run_status
+          run_status.stop_clock
+          run_status.exception = e
+          run_failed
+        end
+        Chef::Log.debug("Re-raising exception: #{e.class} - #{e.message}\n#{e.backtrace.join("\n  ")}")
+        Chef::Application.debug_stacktrace(e) if Chef::Daemon.forked_child
+        raise
+      ensure
+        run_status = nil
+      end
+      true
+    end
+
     # Ensures runlist override contains RunListItem instances
     def runlist_override_sanity_check!
       @override_runlist = @override_runlist.split(',') if @override_runlist.is_a?(String)
--- chef-10.12.0.orig/lib/chef/application/client.rb
+++ chef-10.12.0/lib/chef/application/client.rb
@@ -153,6 +153,12 @@ class Chef::Application::Client < Chef::
       }
     }
 
+  option :client_fork,
+    :short        => "-f",
+    :long         => "--[no-]fork",
+    :description  => "Fork client",
+    :boolean      => true
+
   attr_reader :chef_client_json
 
   def initialize
--- chef-10.12.0.orig/lib/chef/application/solo.rb
+++ chef-10.12.0/lib/chef/application/solo.rb
@@ -122,6 +122,12 @@ class Chef::Application::Solo < Chef::Ap
       }
     }
 
+  option :client_fork,
+    :short        => "-f",
+    :long         => "--[no-]fork",
+    :description  => "Fork client",
+    :boolean      => true
+
   attr_reader :chef_solo_json
 
   def initialize
Generated at 2016-01-08 15:18:41 +0100
RuntimeError: ruby_block[fail the run] (test::default line 22) had an error: RuntimeError: deliberately fail the run
/var/chef/cache/cookbooks/test/recipes/default.rb:24:in `block (2 levels) in from_file'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:33:in `call'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:33:in `block in action_run'
/usr/lib/ruby/vendor_ruby/chef/mixin/why_run.rb:52:in `call'
/usr/lib/ruby/vendor_ruby/chef/mixin/why_run.rb:52:in `add_action'
/usr/lib/ruby/vendor_ruby/chef/provider.rb:155:in `converge_by'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:32:in `action_run'
/usr/lib/ruby/vendor_ruby/chef/provider.rb:120:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/resource.rb:637:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:49:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:81:in `block (2 levels) in converge'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:81:in `each'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:81:in `block in converge'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:98:in `block in execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:85:in `step'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:96:in `execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:80:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:345:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:431:in `do_run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:213:in `block in run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:207:in `fork'
/usr/lib/ruby/vendor_ruby/chef/client.rb:207:in `run'
/usr/lib/ruby/vendor_ruby/chef/application.rb:217:in `run_chef_client'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:328:in `block in run_application'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:317:in `loop'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:317:in `run_application'
/usr/lib/ruby/vendor_ruby/chef/application.rb:67:in `run'
/usr/bin/chef-client:25:in `<main>'

Generated at 2016-01-08 15:18:47 +0100
RuntimeError: ruby_block[fail the run] (test::default line 22) had an error: RuntimeError: deliberately fail the run
/var/chef/cache/cookbooks/test/recipes/default.rb:24:in `block (2 levels) in from_file'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:28:in `call'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:28:in `action_create'
/usr/lib/ruby/vendor_ruby/chef/resource.rb:472:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:49:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:85:in `block (2 levels) in converge'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:85:in `each'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:85:in `block in converge'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:94:in `block in execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:85:in `step'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:92:in `execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:80:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:333:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:364:in `do_run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:152:in `block in run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:148:in `fork'
/usr/lib/ruby/vendor_ruby/chef/client.rb:148:in `run'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:260:in `block in run_application'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:247:in `loop'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:247:in `run_application'
/usr/lib/ruby/vendor_ruby/chef/application.rb:70:in `run'
/usr/bin/chef-client:25:in `<main>'
