Thank you, Brian. #11305 <https://github.com/prometheus/prometheus/issues/11305> looks exactly like what we are encountering.
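
For reference, the relabeling rule proposed in the original message (quoted below) would sit in an endpoints-role scrape job roughly as follows. This is only a sketch: the job name is a placeholder that does not come from the thread, and the rest of the job (TLS, authorization, other relabeling) is omitted.

```yaml
scrape_configs:
  - job_name: kubernetes-endpoints   # placeholder name, not from the thread
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Drop targets whose backing pod is reported in a non-running phase.
      # Targets with no backing pod have an empty phase label, which does
      # not match the regex, so they are kept.
      # Note: "Completed" is not a Kubernetes pod phase (the phases are
      # Pending, Running, Succeeded, Failed, Unknown); it is kept here only
      # because it appears in the original rule and is harmless.
      - source_labels: [__meta_kubernetes_pod_phase]
        regex: Pending|Succeeded|Failed|Completed
        action: drop
```

One caveat with this workaround: while discovery is stale, a pod that is actually running can still be reported with phase "Pending" (symptom 1 in the quoted message), and the rule would then drop a healthy target until discovery refreshes. The rule hides symptom 2 but does not address the stale state itself, so upgrading to 2.51.0 or later still looks like the real fix.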
On Mon, May 27, 2024 at 2:28 PM 'Brian Candler' via Prometheus Users <[email protected]> wrote:

> Have you looked in the changelog
> <https://github.com/prometheus/prometheus/blob/main/CHANGELOG.md> for
> Prometheus? I found:
>
> ## 2.51.0 / 2024-03-18
>
> * [BUGFIX] Kubernetes SD: Pod status changes were not discovered by
>   Endpoints service discovery #13337
>   <https://github.com/prometheus/prometheus/pull/13337>
>
> *=> fixes #11305 <https://github.com/prometheus/prometheus/issues/11305>,
> which looks similar to your problem*
>
> ## 2.50.0 / 2024-02-22
>
> * [ENHANCEMENT] Kubernetes SD: Check preconditions earlier and avoid
>   unnecessary checks or iterations in kube_sd. #13408
>   <https://github.com/prometheus/prometheus/pull/13408>
>
> I'd say it's worth trying the latest release, 2.51.2.
>
> On Monday 27 May 2024 at 12:21:01 UTC+1 Vu Nguyen wrote:
>
>> Hi,
>>
>> Do you have a response to this thread? Has anyone ever encountered the
>> issue?
>>
>> Regards,
>> Vu
>>
>> On Mon, May 20, 2024 at 2:56 PM Vu Nguyen <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> With the endpoints scraping role, the job should scrape only POD
>>> endpoints that are up and running. That is what we expect.
>>>
>>> I think that, by design, K8S does not create an endpoint if the Pod is
>>> in another phase such as Pending, Failed, etc.
>>>
>>> In our environments, Prometheus 2.46.0 on K8S v1.28.2, we currently have
>>> these issues:
>>> 1) The POD is up and running according to `kubectl get pod`, but the
>>> Prometheus discovery page shows:
>>>     __meta_kubernetes_pod_phase="Pending"
>>>     __meta_kubernetes_pod_ready="false"
>>>
>>> 2) The endpoints job discovers POD targets with pod phase=`Pending`.
>>>
>>> Those issues disappear after we restart the Prometheus pod.
>>>
>>> I am not sure whether 1) K8S does not trigger an event after the POD
>>> phase changes, so Prometheus is not able to refresh its endpoints
>>> discovery, or 2) it is a known problem in Prometheus.
>>>
>>> And do you think it is worth adding the following relabeling rule to
>>> the endpoints job role?
>>>
>>> - source_labels: [ __meta_kubernetes_pod_phase ]
>>>   regex: Pending|Succeeded|Failed|Completed
>>>   action: drop
>>>
>>> Thanks, Vu
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/c0f97ed7-1421-4c7c-a57d-2d301bb12418n%40googlegroups.com

