Hello again,
> I think that is a serious regression. If you manage to make it work correctly, please do get in touch and we can prepare a stable update for the memory leak.
I think I've done it, based on the source of chef 11.12.8-2 (from Jessie). I'm not a programmer and I know only a little Ruby, so it could probably be done better, but it has been working fine on about 150 nodes for over 5 months. The stacktrace is still visible in the output (and also in the logs), but the chef-stacktrace.out file is also produced properly, including the stacktrace of the forked client. I attach the patch and two chef-stacktrace.out files for comparison: one from the modified Wheezy package and one from Jessie. Is the result satisfactory?
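As a rough illustration of what the patch does, here is a minimal standalone sketch (not the Chef code itself): the work runs in a forked child, and the parent waits for it and inspects its exit status. The names run_in_child and describe_exit are made up for this example. The sketch uses exit! so that the child skips at_exit handlers, whereas the patch uses plain exit and guards the pid-file cleanup with forked_child.

```ruby
# Hypothetical helper (not part of Chef): run the given block in a forked
# child process and return the child's Process::Status to the parent.
def run_in_child
  pid = fork do
    begin
      yield
    rescue Exception => e
      # Report the failure from inside the child, then signal it via exit code.
      warn "child failed: #{e.class}: #{e.message}"
      exit! 1
    else
      exit! 0
    end
  end
  # Block until the child finishes; waitpid2 returns [pid, status].
  _pid, status = Process.waitpid2(pid)
  status
end

# Hypothetical helper: turn a Process::Status into a short description,
# distinguishing a signal from an ordinary non-zero exit.
def describe_exit(status)
  return "ok" if status.success?
  if status.signaled?
    "terminated by signal #{status.termsig} (#{Signal.list.invert[status.termsig]})"
  else
    "exit code #{status.exitstatus}"
  end
end

puts describe_exit(run_in_child { :converged })                        # prints "ok"
puts describe_exit(run_in_child { raise "deliberately fail the run" }) # prints "exit code 1"
```

Because the run happens in a short-lived child, any memory it allocates is returned to the system when the child exits, which is the point of forking in a long-running daemon.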
> adding new features in a stable update is generally not a good idea, no matter what.
OK, I understand. However, I found another bug in the package, related to FileEdit. Both the issue and the solution are described here: https://github.com/chef/chef/pull/754 Please let me know whether it can be applied together with the forked runs (technically it can), or whether I should open another issue.
--
Greetings,
Piotr Pańczyk
Asseco Business Solutions S.A.
--- chef-10.12.0.orig/lib/chef/config.rb
+++ chef-10.12.0/lib/chef/config.rb
@@ -184,6 +184,7 @@ class Chef
     run_command_stdout_timeout 120
     solo false
     splay nil
+    client_fork true
 
     # Set these to enable SSL authentication / mutual-authentication
     # with the server
--- chef-10.12.0.orig/lib/chef/daemon.rb
+++ chef-10.12.0/lib/chef/daemon.rb
@@ -24,6 +24,7 @@ class Chef
   class Daemon
     class << self
       attr_accessor :name
+      attr_accessor :forked_child
 
       # Daemonize the current process, managing pidfiles and process uid/gid
       #
@@ -46,7 +47,7 @@ class Chef
             $stdout.reopen("/dev/null", "a")
             $stderr.reopen($stdout)
             save_pid_file
-            at_exit { remove_pid_file }
+            at_exit { remove_pid_file unless forked_child }
           rescue NotImplementedError => e
             Chef::Application.fatal!("There is no fork: #{e.message}")
           end
--- chef-10.12.0.orig/lib/chef/client.rb
+++ chef-10.12.0/lib/chef/client.rb
@@ -36,6 +36,7 @@ require 'chef/cookbook/file_vendor'
 require 'chef/cookbook/file_system_file_vendor'
 require 'chef/cookbook/remote_file_vendor'
 require 'chef/version'
+require 'chef/daemon'
 require 'ohai'
 require 'rbconfig'
 
@@ -135,50 +136,52 @@ class Chef
     end
 
     # Do a full run for this Chef::Client. Calls:
+    # * do_run
     #
-    # * run_ohai - Collect information about the system
-    # * build_node - Get the last known state, merge with local changes
-    # * register - If not in solo mode, make sure the server knows about this client
-    # * sync_cookbooks - If not in solo mode, populate the local cache with the node's cookbooks
-    # * converge - Bring this system up to date
-    #
+    # This provides a wrapper around #do_run allowing the
+    # run to be optionally forked.
     # === Returns
-    # true:: Always returns true.
+    # boolean:: Return value from #do_run. Should always returns true.
     def run
-      run_context = nil
-
-      Chef::Log.info("*** Chef #{Chef::VERSION} ***")
-      enforce_path_sanity
-      run_ohai
-      register unless Chef::Config[:solo]
-      build_node
-
-      begin
-
-        run_status.start_clock
-        Chef::Log.info("Starting Chef Run for #{node.name}")
-        run_started
-
-        run_context = setup_run_context
-        converge(run_context)
-        save_updated_node
-
-        run_status.stop_clock
-        Chef::Log.info("Chef Run complete in #{run_status.elapsed_time} seconds")
-        run_completed_successfully
+      if(Chef::Config[:client_fork] && Process.respond_to?(:fork))
+        Chef::Log.info "Forking chef instance to converge..."
+        pid = fork do
+          begin
+            Chef::Daemon.forked_child = true
+            Chef::Log.debug "Forked instance now converging"
+            do_run
+          rescue Exception => e
+            Chef::Log.error(e.to_s)
+            exit 1
+          else
+            exit 0
+          end
+        end
+        Chef::Log.debug "Fork successful. Waiting for new chef pid: #{pid}"
+        result = Process.waitpid2(pid)
+        if handle_child_exit(result)
+          Chef::Log.debug "Forked child successfully reaped (pid: #{pid})"
+        else
+          Chef::Log.error("Sleeping for #{Chef::Config[:interval]} seconds before trying again")
+        end
         true
-      rescue Exception => e
-        run_status.stop_clock
-        run_status.exception = e
-        run_failed
-        Chef::Log.debug("Re-raising exception: #{e.class} - #{e.message}\n#{e.backtrace.join("\n ")}")
-        raise
-      ensure
-        run_status = nil
+      else
+        do_run
       end
-      true
     end
 
+    def handle_child_exit(pid_and_status)
+      status = pid_and_status[1]
+      return true if status.success?
+      message = if status.signaled?
+        "Chef run process terminated by signal #{status.termsig} (#{Signal.list.invert[status.termsig]})"
+      else
+        "Chef run process exited unsuccessfully (exit code #{status.exitstatus})"
+      end
+      Chef::Log.fatal message
+      exit 1 unless Chef::Config[:daemonize]
+      false
+    end
 
     # Configures the Chef::Cookbook::FileVendor class to fetch file from the
     # server or disk as appropriate, creates the run context for this run, and
@@ -333,6 +336,54 @@ class Chef
 
     private
 
+    # Do a full run for this Chef::Client. Calls:
+    #
+    # * run_ohai - Collect information about the system
+    # * build_node - Get the last known state, merge with local changes
+    # * register - If not in solo mode, make sure the server knows about this client
+    # * sync_cookbooks - If not in solo mode, populate the local cache with the node's cookbooks
+    # * converge - Bring this system up to date
+    #
+    # === Returns
+    # true:: Always returns true.
+    def do_run
+      run_context = nil
+
+      begin
+        Chef::Log.info("*** Chef #{Chef::VERSION} ***")
+        enforce_path_sanity
+        run_ohai
+        register unless Chef::Config[:solo]
+        build_node
+
+        run_status.start_clock
+        Chef::Log.info("Starting Chef Run for #{node.name}")
+        run_started
+
+        run_context = setup_run_context
+        converge(run_context)
+        save_updated_node
+
+        run_status.stop_clock
+        Chef::Log.info("Chef Run complete in #{run_status.elapsed_time} seconds")
+        run_completed_successfully
+        true
+      rescue Exception => e
+        # If we failed really early, we may not have a run_status yet. Too early for these to be of much use.
+        if run_status
+          run_status.stop_clock
+          run_status.exception = e
+          run_failed
+        end
+        Chef::Log.debug("Re-raising exception: #{e.class} - #{e.message}\n#{e.backtrace.join("\n ")}")
+        Chef::Application.debug_stacktrace(e) if Chef::Daemon.forked_child
+        raise
+      ensure
+        run_status = nil
+      end
+      true
+    end
+
     # Ensures runlist override contains RunListItem instances
     def runlist_override_sanity_check!
       @override_runlist = @override_runlist.split(',') if @override_runlist.is_a?(String)
--- chef-10.12.0.orig/lib/chef/application/client.rb
+++ chef-10.12.0/lib/chef/application/client.rb
@@ -153,6 +153,12 @@ class Chef::Application::Client < Chef::
         }
       }
 
+    option :client_fork,
+      :short => "-f",
+      :long => "--[no-]fork",
+      :description => "Fork client",
+      :boolean => true
+
     attr_reader :chef_client_json
 
     def initialize
--- chef-10.12.0.orig/lib/chef/application/solo.rb
+++ chef-10.12.0/lib/chef/application/solo.rb
@@ -122,6 +122,12 @@ class Chef::Application::Solo < Chef::Ap
         }
       }
 
+    option :client_fork,
+      :short => "-f",
+      :long => "--[no-]fork",
+      :description => "Fork client",
+      :boolean => true
+
     attr_reader :chef_solo_json
 
     def initialize
Generated at 2016-01-08 15:18:41 +0100
RuntimeError: ruby_block[fail the run] (test::default line 22) had an error: RuntimeError: deliberately fail the run
/var/chef/cache/cookbooks/test/recipes/default.rb:24:in `block (2 levels) in from_file'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:33:in `call'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:33:in `block in action_run'
/usr/lib/ruby/vendor_ruby/chef/mixin/why_run.rb:52:in `call'
/usr/lib/ruby/vendor_ruby/chef/mixin/why_run.rb:52:in `add_action'
/usr/lib/ruby/vendor_ruby/chef/provider.rb:155:in `converge_by'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:32:in `action_run'
/usr/lib/ruby/vendor_ruby/chef/provider.rb:120:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/resource.rb:637:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:49:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:81:in `block (2 levels) in converge'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:81:in `each'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:81:in `block in converge'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:98:in `block in execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:85:in `step'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:96:in `execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:80:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:345:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:431:in `do_run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:213:in `block in run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:207:in `fork'
/usr/lib/ruby/vendor_ruby/chef/client.rb:207:in `run'
/usr/lib/ruby/vendor_ruby/chef/application.rb:217:in `run_chef_client'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:328:in `block in run_application'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:317:in `loop'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:317:in `run_application'
/usr/lib/ruby/vendor_ruby/chef/application.rb:67:in `run'
/usr/bin/chef-client:25:in `<main>'
Generated at 2016-01-08 15:18:47 +0100
RuntimeError: ruby_block[fail the run] (test::default line 22) had an error: RuntimeError: deliberately fail the run
/var/chef/cache/cookbooks/test/recipes/default.rb:24:in `block (2 levels) in from_file'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:28:in `call'
/usr/lib/ruby/vendor_ruby/chef/provider/ruby_block.rb:28:in `action_create'
/usr/lib/ruby/vendor_ruby/chef/resource.rb:472:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:49:in `run_action'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:85:in `block (2 levels) in converge'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:85:in `each'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:85:in `block in converge'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:94:in `block in execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:85:in `step'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/usr/lib/ruby/vendor_ruby/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/usr/lib/ruby/vendor_ruby/chef/resource_collection.rb:92:in `execute_each_resource'
/usr/lib/ruby/vendor_ruby/chef/runner.rb:80:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:333:in `converge'
/usr/lib/ruby/vendor_ruby/chef/client.rb:364:in `do_run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:152:in `block in run'
/usr/lib/ruby/vendor_ruby/chef/client.rb:148:in `fork'
/usr/lib/ruby/vendor_ruby/chef/client.rb:148:in `run'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:260:in `block in run_application'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:247:in `loop'
/usr/lib/ruby/vendor_ruby/chef/application/client.rb:247:in `run_application'
/usr/lib/ruby/vendor_ruby/chef/application.rb:70:in `run'
/usr/bin/chef-client:25:in `<main>'