Yeah, it's getting better :) Have a look at the diff in my fork to see what could work: https://github.com/sameergithub5/prometheusrole/pull/1
It is still pretty raw, but it contains the idea.

On Sunday, November 19, 2023 at 8:54:59 UTC+1, Sameer Modak wrote:
> Thanks a lot, Zdenek.
>
> I got it now; I have taken your comments on board and converted this into something closer:
>
> https://github.com/sameergithub5/prometheusrole/tree/main/node_exporter_and_prometheus_jmx_exporter
>
> Can you please spot whether there is room for improvement?
>
> On Thursday, November 16, 2023 at 8:10:02 PM UTC+5:30, Zdenek Pyszko wrote:
>
>> Hello Sameer,
>> my two cents here, as I had a quick look at your repo.
>> I would suggest refactoring your repo to use roles.
>> You have three different playbooks referenced in main.yml, which are doing more or less the same job.
>> Create a role 'enable prometheus' that is dynamic enough to make decisions based on input variables (zookeeper, Kafka, ...),
>> and one tiny role to restart the services (if needed).
>> Outcome: a single playbook, one prometheus role, one service-management (restart) role, no repeated code (DRY: don't repeat yourself), re-usable.
>>
>> On Thursday, November 9, 2023 at 5:29:28 PM UTC+1, Sameer Modak wrote:
>>
>>> Hello Todd,
>>>
>>> I tried serial and it works, but my problem is that serial works at the playbook level, so when I write import_playbook inside include_task: zookeeper.yaml, it fails, saying you can't import a playbook inside a task.
>>> Now, how do I do it then?
>>>
>>> OK, so let me show you how I am running this. Basically, I have created a role "prometheus", which you can find in my personal public repo below. The role has its usual main.yml, which includes tasks, and I have created Restartandcheck.yml, which I am unable to use because of the import_playbook error if I put it in the zookeeper.yml file.
>>>
>>> https://github.com/sameergithub5/prometheusrole/tree/main/prometheus
>>>
>>> On Friday, November 3, 2023 at 9:00:13 PM UTC+5:30, Todd Lewis wrote:
>>>
>>>> That's correct; serial is not a task or block keyword. It's a playbook keyword.
>>>>
>>>>     - name: One host at a time
>>>>       hosts: ducks_in_a_row
>>>>       serial: 1
>>>>       max_fail_percentage: 0
>>>>       tasks:
>>>>         - task1
>>>>         - task2
>>>>         - task3
>>>>
>>>> Read up on serial
>>>> <https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_strategies.html#setting-the-batch-size-with-serial>
>>>> and max_fail_percentage
>>>> <https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_error_handling.html#setting-a-maximum-failure-percentage>.
>>>> Blocks don't come into it.
>>>>
>>>> On 11/3/23 9:22 AM, Sameer Modak wrote:
>>>>
>>>> Hello Will,
>>>>
>>>> I tried to do it with block and serial; no, it does not work. It says a block can't have serial:
>>>>
>>>>     tasks:
>>>>       - name: block check
>>>>         block:
>>>>           - name: run this shell
>>>>             shell: 'systemctl restart "{{ zookeeper_service_name }}"'
>>>>
>>>>           - name: debug
>>>>             debug:
>>>>               msg: "running my task"
>>>>
>>>>           - name: now run this task
>>>>             shell: timeout -k 3 1m sh -c 'until nc -zv localhost {{ hostvars[inventory_hostname].zk_port }}; do sleep 1; done'
>>>>         when:
>>>>           - not zkmode is search('leader')
>>>>         serial: 1
>>>>
>>>> On Wednesday, November 1, 2023 at 3:39:54 PM UTC+5:30, Sameer Modak wrote:
>>>>
>>>>> Let me try with block and serial and get back to you.
>>>>>
>>>>> On Wednesday, November 1, 2023 at 5:33:14 AM UTC+5:30, Will McDonald wrote:
>>>>>
>>>>>> Edit: s/along with a failed_when/along with wait_for/
>>>>>>
>>>>>> On Tue, 31 Oct 2023 at 23:58, Will McDonald <[email protected]> wrote:
>>>>>>
>>>>>>> I don't entirely understand your approach, constraints or end-to-end requirements here, but trying to read between the lines...
>>>>>>>
>>>>>>> 1. You have a cluster of zookeeper nodes (presumably 2n+1, so 3, 5 or more nodes)
>>>>>>> 2.
>>>>>>> You want to do a rolling restart of these nodes one at a time, wait for each node to come back up, check it is functioning, and if that doesn't work, fail the run
>>>>>>> 3. With your existing approach you can limit the restart of a service using throttle at the task level, but then you don't know how to handle failure in a subsequent task
>>>>>>> 4. You don't think wait_for will work because you only throttle the restart task
>>>>>>>
>>>>>>> (Essentially you want your condition "has the service restarted successfully" to be in the task itself.)
>>>>>>>
>>>>>>> Again, some thoughts that might help you work through this...
>>>>>>>
>>>>>>> 1. Is there any reason you couldn't just use serial at the playbook level? If so, what is it?
>>>>>>> 2. If you must throttle rather than use serial, consider using it in a block along with a failed_when
>>>>>>> 3. Try to avoid using shell; use built-in modules like service instead. It'll save you pain in the long term.
>>>>>>>
>>>>>>> Read through the links I posted earlier and explain what might stop you from using the documented approach.
>>>>>>>
>>>>>>> This post from Vladimir on Super User might be useful too:
>>>>>>> https://superuser.com/questions/1664197/ansible-keyword-throttle
>>>>>>> (There are loads of other 2n+1 rolling update/restart examples out there too:
>>>>>>> https://stackoverflow.com/questions/62378317/ansible-rolling-restart-multi-cluster-environment)
>>>>>>>
>>>>>>> On Tue, 31 Oct 2023 at 17:54, Sameer Modak <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello Will,
>>>>>>>>
>>>>>>>> I have used throttle, so that part is sorted. But I don't think wait_for works here. For example:
>>>>>>>> task 1: restart <-- in this task it has already restarted all the hosts, one by one
>>>>>>>> task 2: wait_for <-- this will fail if the port does not come up, but that is no use, because the restart has already been triggered.
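[Editorial sketch, not part of the original thread] Will's option 2, written out under the assumptions in this thread: zookeeper_service_name, zk_port, and zkmode come from Sameer's snippets, while the 60-second timeout is invented. The caveat in the comment is essentially the problem Sameer describes in his reply:

```yaml
# Sketch of "throttle in a block with a per-host check". Caveat (and
# the reason serial at the play level is the cleaner fix): throttle
# limits concurrency per task, so under the default linear strategy
# the restart task still finishes on every host before wait_for
# starts -- the block does not interleave the two tasks per host.
- name: Restart zookeeper and verify the port
  when: not zkmode is search('leader')
  any_errors_fatal: true
  block:
    - name: Restart the zookeeper service
      ansible.builtin.service:
        name: "{{ zookeeper_service_name }}"
        state: restarted
      throttle: 1

    - name: Fail this host if the port never comes back
      ansible.builtin.wait_for:
        host: localhost
        port: "{{ hostvars[inventory_hostname].zk_port }}"
        timeout: 60          # assumed; tune to your startup time
      throttle: 1
```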
>>>>>>>>
>>>>>>>> We just want to know if, in one task, it can restart and check, and abort the play if that fails. That's it. We got the result we wanted, but we used the shell module.
>>>>>>>>
>>>>>>>> On Tuesday, October 31, 2023 at 7:53:31 PM UTC+5:30, Will McDonald wrote:
>>>>>>>>
>>>>>>>>> I'd suggest reading up on rolling updates using serial:
>>>>>>>>>
>>>>>>>>> https://docs.ansible.com/ansible/latest/playbook_guide/guide_rolling_upgrade.html#the-rolling-upgrade
>>>>>>>>> https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_strategies.html#setting-the-batch-size-with-serial
>>>>>>>>>
>>>>>>>>> You can use wait_for or wait_for_connection to ensure service availability before continuing:
>>>>>>>>>
>>>>>>>>> https://docs.ansible.com/ansible/latest/collections/ansible/builtin/wait_for_module.html
>>>>>>>>> https://docs.ansible.com/ansible/latest/collections/ansible/builtin/wait_for_connection_module.html
>>>>>>>>>
>>>>>>>>> On Tue, 31 Oct 2023 at 14:08, Sameer Modak <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Restart the service, then check whether the service is ready to accept connections, because it takes time to come up. Only once we are sure it is listening on the port do we move to the next host; otherwise, don't move on, because we can only afford to have one service down at a time.
>>>>>>>>>>
>>>>>>>>>> Is there any shorthand or Ansible-native way to handle this using an Ansible module?
>>>>>>>>>>
>>>>>>>>>> Code:
>>>>>>>>>>
>>>>>>>>>>     - name: Restart zookeeper followers
>>>>>>>>>>       throttle: 1
>>>>>>>>>>       any_errors_fatal: true
>>>>>>>>>>       shell: |
>>>>>>>>>>         systemctl restart {{ zookeeper_service_name }}
>>>>>>>>>>         timeout 22 sh -c 'until nc localhost {{ zookeeper_server_port }}; do sleep 1; done'
>>>>>>>>>>       when: not zkmode.stdout_lines is search('leader')
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the Google Groups "Ansible Project" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>>>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/ansible-project/67ca5f13-855d-4d40-a47a-c0fbe11ea3b5n%40googlegroups.com.
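[Editorial sketch, not part of the original thread] An Ansible-native equivalent of the shell task above, combining the advice from the thread: serial at the play level with the service managed by ansible.builtin.systemd and the port check done by wait_for. The variable names and the 22-second timeout come from the snippet above; the play name and the host group name are assumptions:

```yaml
# Rolling restart, one follower at a time. The play moves to the next
# host only after this host is listening again, and max_fail_percentage
# aborts the run as soon as any single host fails.
- name: Rolling restart of zookeeper followers
  hosts: zookeeper          # assumed inventory group name
  serial: 1
  max_fail_percentage: 0
  tasks:
    - name: Restart zookeeper
      ansible.builtin.systemd:
        name: "{{ zookeeper_service_name }}"
        state: restarted
      when: not zkmode.stdout_lines is search('leader')

    - name: Wait until zookeeper is accepting connections again
      ansible.builtin.wait_for:
        host: localhost
        port: "{{ zookeeper_server_port }}"
        timeout: 22          # matches the original shell timeout
      when: not zkmode.stdout_lines is search('leader')
```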
>>>>
>>>> --
>>>> Todd
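[Editorial sketch, not part of the original thread] The layout Zdenek recommends at the top of the thread (a single playbook, one dynamic prometheus role driven by an input variable, and one tiny service-restart role) could look roughly like this; every file, role, and variable name here is hypothetical:

```yaml
# site.yml -- single entry-point playbook (all names hypothetical)
- name: Enable prometheus exporters and restart services
  hosts: all
  vars:
    exporter_target: kafka    # or zookeeper, ... drives the role's decisions
  roles:
    - prometheus              # one dynamic role instead of three playbooks
    - service_restart         # tiny role that restarts services if needed
```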
