On Wed, Mar 25, 2026 at 9:35 PM Ashutosh Bapat
<[email protected]> wrote:
>
> On Tue, Mar 24, 2026 at 9:02 PM Ashutosh Bapat
> <[email protected]> wrote:
> >
> >
> > I will continue from 0008 tomorrow.
> >
>
> I reviewed the documentation part of 0008. I have a few edits attached.
>
> I have just one comment that's not covered in the edits
>
> @@ -4254,8 +4254,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts 
> ppx
> <para>
> Anonymous allocations are allocations that have been made
> with <literal>ShmemAlloc()</literal> directly, rather than via
> - <literal>ShmemInitStruct()</literal> or
> - <literal>ShmemInitHash()</literal>.
> + <literal>ShmemRequestStruct()</literal> or
> + <literal>ShmemRequestHash()</literal>.
> </para>
>
> ShmemInitStruct() and ShmemInitHash() are still the functions to
> allocate named structures. If we are going to keep ShmemInitStruct()
> and ShmemInitHash() around for a while, I think it is more accurate to
> mention them in this sentence along with the new functions.
>
> Will continue reviewing the patch tomorrow.
>

Here's a complete review of 0008 (from version 8). I see you have
already posted v9 which I have not looked at. Please feel free to
ignore comments which aren't applicable anymore. Attached patch has
minor edits and some code arrangement to make it more readable as
mentioned in the comments below. Please incorporate those changes as
applicable if you find them useful.

+ The <function>ShmemRequestStruct()</function> can also be called after
+ system startup, which is useful to allow small allocations in add-in
+ libraries that are not specified in
+ <xref 
linkend="guc-shared-preload-libraries"/><indexterm><primary>shared_preload_libraries</primary></indexterm>.
+ However, after startup the allocation can fail if there is not enough
+ shared memory available. The system reserves a somes memory for
+ allocations after startup, but that reservation is small.

Do we check the request against the reserved amount or overall memory
available? If later, a large request after startup can cause memory
allocated for hash tables to be taken away. This was a problem in the
previous implementation as well, since ShmemAlloc() could be called
after startup. But giving a formal API to do it might encourage more
usage, so it would be good to have some checks in place.

/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ ShmemCallRequestCallbacks();

It's not clear how we keep the list of registered callbacks across the
backends and also after restart in-sync. How do we make sure that the
callbacks registered at this time are the same callbacks registered
before creating the shared memory? How do we make sure that the
callbacks registered after the startup are also registered after
restart?

- /* re-create shared memory and semaphores */
+ /*
+ * Re-initialize shared memory and semaphores. Note: We don't call
+ * RegisterShmemStructs() here, we keep the old registrations. In

There is no RegisterShmemStructs(). Probably this comment is not reuquired.

+ * This module provides facilities to allocate fixed-size structures in shared
+ * memory, for things like variables shared between all backend processes.
+ * Each such structure has a string name to identify it, specified in the
+ * descriptor when it is requested. shmem_hash.c provides a shared hash table
+ * implementation on top of that.

This wording works well for resizable structures. Thanks.

+ * Shared memory managed by shmem.c can never be freed, once allocated. Each
+ * hash table has its own free list, so hash buckets can be reused when an
+ * item is deleted. However, if one hash table grows very large and then
+ * shrinks, its space cannot be redistributed to other tables. We could build
+ * a simple hash bucket garbage collector if need be. Right now, it seems
+ * unnecessary.

The second sentence onwards belong to shmem_hash.c. Don't they?

+} shmem_startup_state;

This isn't just startup state since the backend can toggle between
DONE and LATE_ATTACH_OR_INIT states after the startup. Probably
"shmem_state" would be a better name.

Also, it might be better to separate the enum and the variable
declaration. I was confused for a moment.

What does B stand for in the enum values?

+static bool AttachOrInit(ShmemStructDesc *desc, bool init_allowed,
bool attach_allowed);

Init in the name can easily lead into thinking that the function is
going to invoke the init callback. I think a better name would be
AttachOrAllocate() or something which can not be confused with Init.

+/*
+ * ShmemRequestStruct() --- request a named shared memory area
+ *
+ * Subsystems call this to register their shared memory needs. This is
+ * usually done early in postmaster startup, before the shared memory segment
+ * has been created, so that the size can be included in the estimate for
+ * total amount of shared memory needed. We set aside a small amount of
+ * memory for allocations that happen later, for the benefit of non-preloaded
+ * extensions, but that should not be relied upon.

I don't think we need to reiterate the last sentence here, since it's
already mentioned in the "Usage" section of the documentation and this
API is unrelated to that.

+ * Attach to all the requested memory areas.
+ */
+ LWLockAcquire(ShmemIndexLock, LW_SHARED);
+ while (!dclist_is_empty(&requested_shmem_areas))
+ {
+ requested_shmem_area *area = dlist_container(requested_shmem_area, node,
+ dclist_pop_head_node(&requested_shmem_areas));

Isn't requested_shmem_areas a List*? Why do we need to pop nodes from it?

+ ShmemStructDesc *desc = area->desc;
+
+ AttachOrInit(desc, false, true);
+ }
+ list_free(requested_shmem_areas);
+ requested_shmem_areas = NIL;

If we pop all the nodes from the list, then the list should be NIL
right? Why do we need to free it?

+ else if (!init_allowed)
+ {

For the sake of documentation and sanity, I would add
Assert(!index_entry) here, possibly with a comment. Otherwise it feels
like we might be leaving a half-initialized entry in the hash table.

What if attach_allowed is false and the entry is not found? Should we
throw an error in that case too? It would be foolish to call
AttachOrInit with both init_allowed and attach_allowed set to false,
but the API allows it and we should check for that.

It feels like we should do something about the arguments. The function
is hard to read. init_allowed is actually the action the caller wants
to take if the entry is not found, and attach_allowed is the action
the caller wants to take if the entry is found.

Also explain in the comment what does attach mean here especially in
case of fixed sized structures.

Restructuring the code as attached reads better to me.

+/*
+ * Reset state on postmaster crash restart.
+ */
+void
+ResetShmemAllocator(void)
+{

I still think this requires a different name since it's not undoing
what InitShmemAllocator() did. Maybe ResetShmemState()?

+void
+RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
... snip ...
+ foreach(lc, requested_shmem_areas)

Doesn't this list contain all the areas, not just registered in this
instance of the call. Does that mean that we need to have all the
attach functions idempotent? Why can't we deal with the newly
registered areas only?

+ * FIXME: What to do if multiple shmem areas were requested, and some
+ * of them are already initialized but not all?
*/

I doubt if we want to allow attaching to areas which are already
attached since the attach_fn may not be idempotent.
/* Initialize the hash header, plus a copy of the table name */
+ Assert(tabname != NULL);
+ Assert(CurrentDynaHashCxt != NULL);

This looks like a separate patch and separate commit.

+
+ /*
+ * Extra space to reserve in the shared memory segment, but it's not part
+ * of the struct itself. This is used for shared memory hash tables that
+ * can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;

When we introduce resizable structures (where even the hash table
directly itself could be resizable), we will introduce a new field
max_size which is easy to get confused with extra_size. Maybe we can
rename extra_size to something like "auxilliary_size" to mean size of
the auxiliary parts of the structure which are not part of the main
struct itself.

+ /*
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ */
+ size_t max_size;
+
+ /*
+ * init_size is the number of hashtable entries to preallocate. For a
+ * table whose maximum size is certain, this should be equal to max_size;
+ * that ensures that no run-time out-of-shared-memory failures can occur.
+ */
+ size_t init_size;

Everytime I look at these two fields, I question whether those are the
number of entries (i.e. size of the hash table) or number of bytes
(size of the memory). I know it's the former, but it indicates that
something needs to be changed here, like changing the names to have
_entries instead of _size, or changing the type to int64 or some such.
Renaming to _entries would conflict with dynahash APIs since they use
_size, so maybe the latter?

+
+/*
+ * Shared memory is reserved and allocated in stages at postmaster startup,
+ * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them

The comma after EXEC_BACKEND mode is a bit confusing. It makes me
think that the clause after the comma is detached from the
EXEC_BACKEND mode. Maybe revise as
"Shared memory is reserved and allocated to various shared memory
structures in stages at postmaster startup. In EXEC_BACKEND mode,
there's some extra work done to "attach" to them at backend startup.
ShmemCallbacks holds callback functions that are called at different
stages."

+ * at backend startup. ShmemCallbacks holds callback functions that are
+ * called at different stages.
+ */
+typedef struct ShmemCallbacks
+{
+ /* SHMEM_* flags */
+ int flags;

I think we should define the flags before this. Also SHMEM_ looks too
generic prefix, maybe SHMEM_CALLBACKS_ or SHMEM_CB_. With those
changes it will look something like the attached patch.

@@ -50,7 +50,6 @@ static InjIoErrorState *inj_io_error_state;
static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
-

It's good to get rid of an extra line, but maybe a separate commit.

-- 
Best Wishes,
Ashutosh Bapat

Attachment: 0008_edits.patch.nocibot
Description: Binary data

Reply via email to