From fd18641adfe19ecf12a4f6a17a07bd5ffb272d48 Mon Sep 17 00:00:00 2001
From: Kommi <haribabuk@fast.au.fujitsu.com>
Date: Mon, 11 Mar 2019 15:44:44 +1100
Subject: [PATCH 3/3] Table access method API explanation

Explain all the table access method APIs and their details.
---
 doc/src/sgml/am.sgml | 794 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 789 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/am.sgml b/doc/src/sgml/am.sgml
index b2a97f20aa..455e7a10c6 100644
--- a/doc/src/sgml/am.sgml
+++ b/doc/src/sgml/am.sgml
@@ -18,14 +18,798 @@
   <para>
    All Tables in <productname>PostgreSQL</productname> are the primary
    data store. Each table is stored as its own physical <firstterm>relation</firstterm>
-   and so is described by an entry in the <structname>pg_class</structname>
-   catalog. The table contents are entirely under the control of its
-   access method. (All the access methods furthermore use the standard page
-   layout described in <xref linkend="storage-page-layout"/>.)
+   and is described by an entry in the <structname>pg_class</structname>
+   catalog. A table's content is entirely controlled by its access method, although
+   all access methods use the same standard page layout described in <xref linkend="storage-page-layout"/>.
   </para>
 
- </sect1>
+  <sect2 id="table-access-methods-api">
+   <title>Table access method API</title>
+
+   <para>
+    Each table access method is described by a row in the
+    <link linkend="catalog-pg-am"><structname>pg_am</structname></link> system
+    catalog. The <structname>pg_am</structname> entry specifies a <firstterm>type</firstterm>
+    of the access method and a <firstterm>handler function</firstterm> for the
+    access method. These entries can be created and deleted using the <xref linkend="sql-create-access-method"/>
+    and <xref linkend="sql-drop-access-method"/> SQL commands.
+   </para>
+
+   <para>
+    A table access method handler function must be declared to accept a
+    single argument of type <type>internal</type> and to return the
+    pseudo-type <type>table_am_handler</type>.  The argument is a dummy value that
+    simply serves to prevent handler functions from being called directly from
+    SQL commands.  The result of the function must be a palloc'd struct of
+    type <structname>TableAmRoutine</structname>, which contains everything
+    that the core code needs to know to make use of the table access method.
+    The <structname>TableAmRoutine</structname> struct, also called the access
+    method's <firstterm>API struct</firstterm>, includes fields specifying assorted
+    fixed properties of the access method, such as whether it can support
+    bitmap scans.  More importantly, it contains pointers to support
+    functions for the access method, which do all of the real work to access
+    tables.  These support functions are plain C functions and are not
+    visible or callable at the SQL level.  The support functions are declared
+    in the <structname>TableAmRoutine</structname> structure. For more details,
+    refer to the file <filename>src/include/access/tableam.h</filename>.
+   </para>
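+
+   <para>
+    As a rough illustration only, a handler function might look like the
+    sketch below. The names <function>my_tableam_handler</function>,
+    <function>my_slot_callbacks</function> and <literal>my_methods</literal>
+    are hypothetical. The sketch also assumes that the API struct is kept in
+    static storage, as the existing heap handler does, instead of being
+    palloc'd, and it omits every callback except one; a real access method
+    has to provide the remaining callbacks as well.
+
+<programlisting>
+#include "postgres.h"
+#include "access/tableam.h"
+#include "executor/tuptable.h"
+#include "fmgr.h"
+
+PG_MODULE_MAGIC;
+PG_FUNCTION_INFO_V1(my_tableam_handler);
+
+/* Hypothetical callback: reuse one of the predefined slot implementations. */
+static const TupleTableSlotOps *
+my_slot_callbacks(Relation rel)
+{
+    return &amp;TTSOpsVirtual;
+}
+
+/* Hypothetical API struct; all other callbacks are omitted for brevity. */
+static const TableAmRoutine my_methods = {
+    .type = T_TableAmRoutine,
+    .slot_callbacks = my_slot_callbacks,
+};
+
+/* The single "internal" argument is a dummy and is never inspected. */
+Datum
+my_tableam_handler(PG_FUNCTION_ARGS)
+{
+    PG_RETURN_POINTER(&amp;my_methods);
+}
+</programlisting>
+   </para>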
+
+   <para>
+    Developers of a new table access method can refer to the existing <literal>heap</literal>
+    implementation in <filename>src/backend/access/heap/heapam_handler.c</filename> for details
+    of how these callbacks are implemented for the heap access method.
+   </para>
+
+   <para>
+    The callbacks are grouped by purpose and described in the sections below.
+   </para>
+
+   <sect3 id="slot-implementation-function">
+    <title>Slot implementation functions</title>
+
+   <para>
+<programlisting>
+const TupleTableSlotOps *(*slot_callbacks) (Relation rel);
+</programlisting>
+
+    This callback must return the slot implementation that is specific to the AM.
+    The following predefined slot implementations are available:
+    <literal>TTSOpsVirtual</literal>, <literal>TTSOpsHeapTuple</literal>,
+    <literal>TTSOpsMinimalTuple</literal> and <literal>TTSOpsBufferHeapTuple</literal>.
+    An AM implementation can use any one of them. For more details on these slot
+    implementations, refer to <filename>src/include/executor/tuptable.h</filename>.
+   </para>
+   </sect3>
+
+   <sect3 id="table-scan-functions">
+    <title>Table scan functions</title>
+
+    <para>
+     The following callbacks are used to scan a table.
+    </para>
+
+    <para>
+<programlisting>
+TableScanDesc (*scan_begin) (Relation rel,
+                             Snapshot snapshot,
+                             int nkeys, struct ScanKeyData *key,
+                             ParallelTableScanDesc pscan,
+                             bool allow_strat,
+                             bool allow_sync,
+                             bool allow_pagemode,
+                             bool is_bitmapscan,
+                             bool is_samplescan,
+                             bool temp_snap);
+</programlisting>
+
+     This callback starts a scan of the relation <literal>rel</literal> and returns a
+     <structname>TableScanDesc</structname>, which is typically embedded in a larger
+     AM-specific struct.
+
+     <literal>nkeys</literal> is the number of scan keys in <literal>key</literal> used to filter the results.
+     <literal>pscan</literal> can be used by the AM if it supports parallel scans.
+     The parameters <literal>allow_strat</literal>, <literal>allow_sync</literal> and <literal>allow_pagemode</literal>
+     specify whether the scan may use a buffer access strategy, synchronized scans and
+     page-mode scans, respectively (although an AM is not required to support these).
+
+     The parameters <literal>is_bitmapscan</literal> and <literal>is_samplescan</literal>
+     specify whether the scan is a bitmap scan or a sample scan.
+
+     <literal>temp_snap</literal> indicates that the provided snapshot was temporarily allocated and
+     must be freed at the end of the scan.
+    </para>
+
+    <para>
+<programlisting>
+void        (*scan_end) (TableScanDesc scan);
+</programlisting>
+
+     This callback ends a scan started by <literal>scan_begin</literal>
+     and releases its resources. <structfield>TableScanDesc.rs_snapshot</structfield>
+     needs to be unregistered, and it can be deallocated depending on <structfield>TableScanDesc.temp_snap</structfield>.
+    </para>
+
+    <para>
+<programlisting>
+void        (*scan_rescan) (TableScanDesc scan, struct ScanKeyData *key, bool set_params,
+                            bool allow_strat, bool allow_sync, bool allow_pagemode);
+</programlisting>
+
+     This callback restarts a relation scan that was already started by
+     <literal>scan_begin</literal>. If <literal>set_params</literal> is
+     true, the provided options are applied to the scan.
+    </para>
+
+    <para>
+<programlisting>
+TupleTableSlot *(*scan_getnextslot) (TableScanDesc scan,
+                                     ScanDirection direction, TupleTableSlot *slot);
+</programlisting>
+
+     This callback returns the next tuple that satisfies the scan started by
+     <literal>scan_begin</literal> and stores it in <literal>slot</literal>.
+    </para>
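+
+    <para>
+     Core code normally drives these callbacks through wrapper functions rather
+     than calling them directly. The following caller-side sketch shows the
+     begin / get-next / end sequence under some assumptions:
+     <function>count_visible_tuples</function> is a hypothetical helper, and
+     the wrappers <function>table_slot_create()</function>,
+     <function>table_beginscan()</function>,
+     <function>table_scan_getnextslot()</function> and
+     <function>table_endscan()</function> are assumed to be provided by
+     <filename>src/include/access/tableam.h</filename> with the signatures used
+     here.
+
+<programlisting>
+#include "postgres.h"
+#include "access/sdir.h"
+#include "access/tableam.h"
+#include "executor/tuptable.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+/* Hypothetical helper: count the tuples visible to the active snapshot. */
+static uint64
+count_visible_tuples(Relation rel)
+{
+    /* The slot type comes from the AM's slot_callbacks. */
+    TupleTableSlot *slot = table_slot_create(rel, NULL);
+    /* No scan keys: every visible tuple is returned (scan_begin). */
+    TableScanDesc scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
+    uint64      ntuples = 0;
+
+    /* Each iteration stores the next tuple in slot (scan_getnextslot). */
+    while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
+        ntuples++;
+
+    /* Release the scan's resources (scan_end), then the slot. */
+    table_endscan(scan);
+    ExecDropSingleTupleTableSlot(slot);
+
+    return ntuples;
+}
+</programlisting>
+    </para>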
+
+   </sect3>
+
+   <sect3 id="parallel-table-scan-function">
+    <title>Parallel table scan functions</title>
+
+    <para>
+     The following callbacks are used to perform a parallel table scan.
+    </para>
+
+    <para>
+<programlisting>
+Size        (*parallelscan_estimate) (Relation rel);
+</programlisting>
+
+     This callback returns the total size that the AM requires to perform
+     a parallel table scan. The required size must include the <structname>ParallelTableScanDesc</structname>,
+     which is typically embedded in an AM-specific struct.
+    </para>
+
+    <para>
+<programlisting>
+Size        (*parallelscan_initialize) (Relation rel, ParallelTableScanDesc pscan);
+</programlisting>
+
+     This callback initializes <literal>pscan</literal>
+     as required for the AM to perform a parallel scan, and returns
+     the same size that <literal>parallelscan_estimate</literal> estimated.
+    </para>
+
+    <para>
+<programlisting>
+void        (*parallelscan_reinitialize) (Relation rel, ParallelTableScanDesc pscan);
+</programlisting>
+
+     This callback reinitializes the parallel scan structure pointed to by <literal>pscan</literal>
+     for the same relation.
+    </para>
+
+   </sect3>
+
+   <sect3 id="index-scan-functions">
+    <title>Index scan functions</title>
+
+    <para>
+<programlisting>
+struct IndexFetchTableData *(*index_fetch_begin) (Relation rel);
+</programlisting>
+
+     This callback prepares to fetch tuples from the relation, as needed when fetching
+     tuples for an index scan. It must return an allocated and initialized <structname>IndexFetchTableData</structname>
+     structure, which is typically embedded in an AM-specific struct.
+    </para>
+
+    <para>
+<programlisting>
+void        (*index_fetch_reset) (struct IndexFetchTableData *data);
+</programlisting>
+
+     This callback resets the index fetch. Typically it releases the AM-specific resources
+     held by the <structname>IndexFetchTableData</structname> of an index scan.
+    </para>
+
+    <para>
+<programlisting>
+void        (*index_fetch_end) (struct IndexFetchTableData *data);
+</programlisting>
+
+     This callback releases the AM-specific resources held by the <structname>IndexFetchTableData</structname>
+     and frees the memory of the <structname>IndexFetchTableData</structname> itself.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*index_fetch_tuple) (struct IndexFetchTableData *scan,
+                                  ItemPointer tid,
+                                  Snapshot snapshot,
+                                  TupleTableSlot *slot,
+                                  bool *call_again, bool *all_dead);
+</programlisting>
+
+     This callback fetches the tuple pointed to by <literal>tid</literal> in the relation and stores it in
+     <literal>slot</literal> after performing a visibility check according to the provided <literal>snapshot</literal>.
+     It returns true if a tuple was found, false otherwise.
+
+     <literal>*call_again</literal> is false the first time the callback is invoked for a given <literal>tid</literal>.
+     If another tuple might match the same <literal>tid</literal>, the callback must set <literal>*call_again</literal>
+     to true to tell the caller to invoke the callback again to fetch that tuple.
+
+     If <literal>all_dead</literal> is not NULL, the callback should set <literal>*all_dead</literal> to true iff it
+     is guaranteed that no backend needs to see that tuple. Index AMs can use that to avoid returning
+     that tid in future searches.
+    </para>
+
+   </sect3>
+
+   <sect3 id="non-modifying-tuple-functions">
+    <title>Non-modifying tuple functions</title>
+
+    <para>
+<programlisting>
+bool        (*tuple_fetch_row_version) (Relation rel,
+                                        ItemPointer tid,
+                                        Snapshot snapshot,
+                                        TupleTableSlot *slot);
+</programlisting>
+
+     This callback fetches the latest tuple specified by the ItemPointer <literal>tid</literal>
+     and stores it in <literal>slot</literal> after performing a visibility test according to
+     <literal>snapshot</literal>. It returns true if a tuple was found and passed the visibility test,
+     false otherwise.
+
+     For example, in the heap AM an update chain is created whenever
+     a tuple is updated, so the function should fetch the latest tuple.
+    </para>
+
+    <para>
+<programlisting>
+void        (*tuple_get_latest_tid) (Relation rel,
+                                     Snapshot snapshot,
+                                     ItemPointer tid);
+</programlisting>
+
+     This callback locates the latest version of the tuple identified by the ItemPointer <literal>tid</literal>.
+     For example, in the heap AM an update chain is created whenever a tuple is updated,
+     and this callback is used to find the latest ItemPointer in that chain.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*tuple_satisfies_snapshot) (Relation rel,
+                                         TupleTableSlot *slot,
+                                         Snapshot snapshot);
+</programlisting>
+
+     This callback tests the visibility of the tuple in <literal>slot</literal>
+     against the provided <literal>snapshot</literal> and returns true if the tuple is visible,
+     false otherwise. Some AMs might modify the data underlying the tuple as a side effect.
+     If so, they ought to mark the relevant buffer dirty.
+    </para>
+    
+    <para>
+<programlisting>
+bool        (*tuple_fetch_follow) (struct IndexFetchTableData *scan,
+                                   ItemPointer tid,
+                                   Snapshot snapshot,
+                                   TupleTableSlot *slot,
+                                   bool *call_again, bool *all_dead);
+</programlisting>
+
+     This callback fetches the tuple pointed to by the ItemPointer, using the
+     IndexFetchTableData, stores it in the specified slot and updates the flags.
+     It is called during index scans.
+    </para>
+
+    <para>
+<programlisting>
+TransactionId (*compute_xid_horizon_for_tuples) (Relation rel,
+                                                 ItemPointerData *items,
+                                                 int nitems);
+</programlisting>
+
+     This callback returns the newest xid among the tuples identified by <literal>items</literal>. This is used
+     to compute which snapshots conflict with <literal>items</literal> when replaying WAL records
+     for page-level index vacuums.
+    </para>
+    
+   </sect3>
+
+   <sect3 id="manipulation-of-physical-tuples-functions">
+    <title>Manipulation of physical tuples functions</title>
+
+    <para>
+<programlisting>
+void        (*tuple_insert) (Relation rel, TupleTableSlot *slot, CommandId cid,
+                             int options, struct BulkInsertStateData *bistate);
+</programlisting>
+
+     This callback inserts the tuple in <literal>slot</literal> into the relation.
+
+     <literal>cid</literal> is the command identifier, that is, the number
+     of the command that is being processed.
+
+     The <literal>options</literal> bitmask specifies options that
+     change the behaviour of the AM. Several options might be ignored by AMs
+     not supporting them.
  
+     If the <literal>TABLE_INSERT_SKIP_WAL</literal> option is specified, the new
+     tuple is not necessarily logged to WAL, even for a non-temp relation. It is
+     the AM's choice whether this optimization is supported.
+     If the <literal>TABLE_INSERT_SKIP_FSM</literal> option is specified, AMs are
+     free to not reuse free space in the relation. This can save some cycles when
+     we know the relation is new and doesn't contain useful amounts of free space.
+     It is commonly passed directly to RelationGetBufferForTuple; see that function for more information.
+     The <literal>TABLE_INSERT_FROZEN</literal> option can only be specified for
+     inserts into relfilenodes created during the current subtransaction and when
+     there are no prior snapshots or pre-existing portals open. This causes rows to
+     be frozen, which is an MVCC violation and requires explicit options chosen by the user.
+     The <literal>TABLE_INSERT_NO_LOGICAL</literal> option
+     tells the AM to force-disable the emitting of logical decoding information
+     for the tuple. This should solely be used during table rewrites where
+     RelationIsLogicallyLogged(relation) is not yet accurate for the new relation.
+
+     The <literal>BulkInsertState</literal> object (if any; bistate can be NULL for default
+     behavior) is also just passed through to RelationGetBufferForTuple.
+
+     On return, the slot's tts_tid and tts_tableOid are updated to reflect the
+     insertion.
+    </para>
+
+    <para>
+<programlisting>
+void        (*tuple_insert_speculative) (Relation rel,
+                                         TupleTableSlot *slot,
+                                         CommandId cid,
+                                         int options,
+                                         struct BulkInsertStateData *bistate,
+                                         uint32 specToken);
+</programlisting>
+
+     This callback is similar to <literal>tuple_insert</literal>, but it performs a
+     "speculative insertion", which can be backed out afterwards without aborting
+     the whole transaction.
+
+     Other sessions can wait for the speculative insertion to be confirmed, turning it
+     into a regular tuple, or aborted, as if it never existed.  Speculatively inserted
+     tuples behave as "value locks" of short duration, used to implement
+     <command>INSERT ... ON CONFLICT</command>.
+
+     A transaction having performed a speculative insertion has to either abort, or finish
+     the speculative insertion with <function>table_complete_speculative()</function>.
+    </para>
+
+    <para>
+<programlisting>
+void        (*tuple_complete_speculative) (Relation rel,
+                                           TupleTableSlot *slot,
+                                           uint32 specToken,
+                                           bool succeeded);
+</programlisting>
+
+     This callback completes the speculative insertion of a tuple started in the same transaction
+     by <literal>tuple_insert_speculative</literal>. It is invoked after finishing the index insert.
+     If <literal>succeeded</literal> is true, the tuple is fully inserted; if false, it should be
+     removed.
+    </para>
+
+    <para>
+<programlisting>
+TM_Result (*tuple_delete) (Relation rel,
+                           ItemPointer tid,
+                           CommandId cid,
+                           Snapshot snapshot,
+                           Snapshot crosscheck,
+                           bool wait,
+                           TM_FailureData *tmfd,
+                           bool changingPart);
+</programlisting>
+
+     This callback deletes the tuple of the relation pointed to by the ItemPointer <literal>tid</literal>
+     and returns the result of the operation.
+
+     <literal>cid</literal> is a command identifier, used for the visibility test that identifies
+     the tuple according to the snapshot <literal>snapshot</literal>.
+     If <literal>crosscheck</literal> is not null, the tuple is additionally verified against
+     that snapshot in the visibility test.
+     If <literal>wait</literal> is true, the callback waits for any conflicting transactions
+     to either commit or roll back.
+
+     <literal>tmfd</literal> is an output parameter that must be filled in with the failure
+     details when the delete operation fails; see <structname>TM_FailureData</structname>
+     for the data that needs to be provided.
+     <literal>changingPart</literal> is passed by value and is therefore an input: it is
+     true iff the tuple is being moved to another partition
+     due to an update of the partition key, and false otherwise.
+    </para>
+
+    <para>
+<programlisting>
+TM_Result (*tuple_update) (Relation rel,
+                           ItemPointer otid,
+                           TupleTableSlot *slot,
+                           CommandId cid,
+                           Snapshot snapshot,
+                           Snapshot crosscheck,
+                           bool wait,
+                           TM_FailureData *tmfd,
+                           LockTupleMode *lockmode,
+                           bool *update_indexes);
+</programlisting>
+
+     This callback updates the tuple of the relation pointed to by the ItemPointer <literal>otid</literal>
+     with the new tuple from <literal>slot</literal> and returns the result of the operation.
+
+     <literal>cid</literal> is a command identifier, used for the visibility test that identifies
+     the tuple according to the snapshot <literal>snapshot</literal>.
+     If <literal>crosscheck</literal> is not null, the tuple is additionally verified against
+     that snapshot in the visibility test.
+     If <literal>wait</literal> is true, the callback waits for any conflicting transactions
+     to either commit or roll back.
+
+     The following three parameters are outputs and must be set by the callback.
+
+     <literal>tmfd</literal> must be filled in with the failure details when the update operation fails;
+     see <structname>TM_FailureData</structname> for the data that needs to be provided.
+     <literal>lockmode</literal> is set to the lock mode acquired on the tuple.
+     <literal>update_indexes</literal> is set to true if new index entries are required for this tuple,
+     and to false otherwise.
+
+     On return, the slot's tts_tid and tts_tableOid are updated to reflect the update. In particular,
+     slot->tts_tid is set to the TID where the new tuple was inserted, and its HEAP_ONLY_TUPLE flag
+     is set iff a HOT update was done.
+    </para>
+
+    <para>
+<programlisting>
+void        (*multi_insert) (Relation rel, TupleTableSlot **slots, int nslots,
+                             CommandId cid, int options, struct BulkInsertStateData *bistate);
+</programlisting>
+
+     This callback inserts multiple tuples from <literal>slots</literal> into the relation
+     for faster data insertion. Refer to <function>tuple_insert</function> for more details about
+     the parameters.
+    </para>
+
+    <para>
+<programlisting>
+TM_Result (*tuple_lock) (Relation rel,
+                         ItemPointer tid,
+                         Snapshot snapshot,
+                         TupleTableSlot *slot,
+                         CommandId cid,
+                         LockTupleMode mode,
+                         LockWaitPolicy wait_policy,
+                         uint8 flags,
+                         TM_FailureData *tmfd);
+</programlisting>
+
+     This callback locks the tuple pointed to by the ItemPointer <literal>tid</literal> in the specified mode
+     and returns the result of the operation.
+
+     <literal>cid</literal> is a command identifier, used for the visibility test that identifies
+     the tuple according to the snapshot <literal>snapshot</literal>.
+     <literal>mode</literal> is the desired lock mode.
+     <literal>wait_policy</literal> specifies what to do if the tuple lock is not available.
+     <literal>flags</literal> specifies options such as <literal>TUPLE_LOCK_FLAG_LOCK_UPDATE_IN_PROGRESS</literal>,
+     which follows the update chain to also lock descendant tuples if the lock modes don't conflict, and
+     <literal>TUPLE_LOCK_FLAG_FIND_LAST_VERSION</literal>, which follows the update chain and locks the latest version.
+
+     The following two parameters are outputs and must be set by the callback.
+
+     <literal>tmfd</literal> must be filled in with the failure details when the lock operation fails.
+     <literal>slot</literal> contains the locked target tuple.
+    </para>
+
+    <para>
+<programlisting>
+void        (*finish_bulk_insert) (Relation rel, int options);
+</programlisting>
+
+     This callback performs the operations necessary to complete insertions made
+     via <literal>tuple_insert</literal> and <literal>multi_insert</literal> with a
+     BulkInsertState specified. For example, it may be used to flush the relation when
+     inserting while skipping WAL, or it may be a no-op.
+    </para>
+
+   </sect3>
+
+   <sect3 id="ddl-related-functions">
+    <title>DDL-related functions</title>
+
+    <para>
+<programlisting>
+void        (*relation_set_new_filenode) (Relation rel,
+                                          char persistence,
+                                          TransactionId *freezeXid,
+                                          MultiXactId *minmulti);
+</programlisting>
+
+     This callback creates the storage for the relation <literal>rel</literal>, with persistence set to
+     <literal>persistence</literal>, that is necessary to store the tuples of the relation.
+     <literal>freezeXid</literal> and <literal>minmulti</literal> are set to the xid / multixact
+     horizon for the table that pg_class.{relfrozenxid, relminmxid} have to be set to. For example, the heap AM
+     should create the relfilenode that is necessary to store the heap tuples.
+    </para>
+
+    <para>
+<programlisting>
+void        (*relation_nontransactional_truncate) (Relation rel);
+</programlisting>
+
+     This callback removes all table contents from <literal>rel</literal> in a non-transactional manner.
+     Non-transactional means that there is no need to support rollbacks. This is commonly
+     used only to perform truncations for relfilenodes created in the current transaction.
+     This operation is not reversible.
+    </para>
+
+    <para>
+<programlisting>
+void        (*relation_copy_data) (Relation rel, RelFileNode newrnode);
+</programlisting>
+
+     This callback copies the relation <literal>rel</literal> from its existing filenode
+     to the new filenode <literal>newrnode</literal> and removes the existing filenode.
+    </para>
+
+    <para>
+<programlisting>
+void        (*relation_copy_for_cluster) (Relation NewHeap, Relation OldHeap, Relation OldIndex,
+                                          bool use_sort,
+                                          TransactionId OldestXmin, TransactionId FreezeXid, MultiXactId MultiXactCutoff,
+                                          double *num_tuples, double *tups_vacuumed, double *tups_recently_dead);
+</programlisting>
+
+     This callback copies data from <literal>OldHeap</literal> into <literal>NewHeap</literal>,
+     as part of a <command>CLUSTER</command> or <command>VACUUM FULL</command>.
+
+     If <literal>use_sort</literal> is true, the table contents are sorted appropriately for
+     <literal>OldIndex</literal>; if <literal>use_sort</literal> is false and <literal>OldIndex</literal>
+     is not InvalidOid, the data is copied in that index's order; if <literal>use_sort</literal>
+     is false and <literal>OldIndex</literal> is InvalidOid, no sorting is performed.
+
+     <literal>OldestXmin</literal>, <literal>FreezeXid</literal> and <literal>MultiXactCutoff</literal>
+     can be used to clean up the dead tuples of the table.
+
+     On success, <literal>num_tuples</literal>, <literal>tups_vacuumed</literal> and <literal>tups_recently_dead</literal>
+     contain statistics computed while copying the relation. Not all might make sense for every AM.
+    </para>
+    
+    <para>
+<programlisting>
+void        (*relation_vacuum) (Relation onerel, int options,
+                                struct VacuumParams *params, BufferAccessStrategy bstrategy);
+</programlisting>
+
+     This callback performs <command>VACUUM</command> on the relation according to the
+     specified <literal>params</literal>. The <command>VACUUM</command> can be triggered
+     by a user or by <literal>autovacuum</literal>. The specific actions performed
+     depend heavily on the individual AM. For example, the heap AM gathers all the
+     dead tuples of the relation and cleans them up, including in the indexes.
+
+     Note that neither <command>VACUUM FULL</command> (and <command>CLUSTER</command>), nor
+     <command>ANALYZE</command> go through this routine, even if (in the latter case) it is part of
+     the same <command>VACUUM</command> command.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*scan_analyze_next_block) (TableScanDesc scan, BlockNumber blockno,
+                                        BufferAccessStrategy bstrategy);
+</programlisting>
+
+     This callback prepares to analyze block <literal>blockno</literal> of the table scan <literal>scan</literal>.
+     The scan must have been started with <function>table_beginscan_analyze()</function>.
+     Note that this routine might acquire resources like locks that are held until
+     <function>table_scan_analyze_next_tuple()</function> returns false.
+
+     Returns false if the block is unsuitable for sampling, true otherwise.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*scan_analyze_next_tuple) (TableScanDesc scan, TransactionId OldestXmin,
+                                        double *liverows, double *deadrows, TupleTableSlot *slot);
+</programlisting>
+
+     This callback iterates over the tuples in the block selected with <function>table_scan_analyze_next_block()</function>.
+     If a tuple that is suitable for sampling is found, it returns true and stores the tuple in <literal>slot</literal>.
+     Otherwise it returns false.
+
+     <literal>liverows</literal> and <literal>deadrows</literal> are incremented according to the encountered
+     tuples.
+    </para>
+    
+    <para>
+<programlisting>
+double      (*index_build_range_scan) (Relation heap_rel,
+                                       Relation index_rel,
+                                       IndexInfo *index_info,
+                                       bool allow_sync,
+                                       bool anyvisible,
+                                       BlockNumber start_blockno,
+                                       BlockNumber end_blockno,
+                                       IndexBuildCallback callback,
+                                       void *callback_state,
+                                       TableScanDesc scan);
+</programlisting>
+
+     This API to scan the table to find the tuples to be indexed from the specified
+     blocks of a given relation and insert them into the specified index using the
+     provided the callback function.
+     
+     This is called back from an access-method-specific index build procedure
+     after the AM has done whatever setup it needs.  The parent heap relation
+     <literal>heap_rel</literal> is scanned to find tuples that should be entered
+     into the index <literal>index_rel</literal>.  Each such tuple is passed to
+     the AM's callback routine <literal>callback</literal>, which does the right
+     things to add it to the new index.  After we return, the AM's index build
+     procedure does whatever cleanup it needs.
+
+     The <literal>callback_state</literal> is member needs to be passed to the
+     <literal>callback</literal> when it is invoked from the AM specific function.
+     
+     The <literal>allow_sync</literal> specifies the scan on the relation should follow
+     <literal>synchronize_seqscans</literal> configuration parameter.
+     
+     The <literal>index_info</literal> can be used to pass back some infromation
+     related to the AM. For eg- in heapAM, <structfield>indexInfo->ii_BrokenHotChain</structfield>
+     to true if we detect any potentially broken HOT chains.  Currently, we set
+     this if there are any RECENTLY_DEAD or DELETE_IN_PROGRESS entries in a HOT chain,
+     without trying very hard to detect whether they're really incompatible with
+     the chain tip. This need to be generalized for other AMs later.
+       
+     When <literal>anyvisible</literal> mode is requested, all tuples visible to
+     any transaction are indexed and counted as live, including those inserted
+     or deleted by transactions that are still in progress.
+ 
+     The <literal>start_blockno</literal> and <literal>end_blockno</literal> are
+     used to specify the range of the blocks that needs to be scanned on the relation.
+     
+     Upon successful execution, the total count of live tuples is returned. This
+     is used for updating <structname>pg_class</structname> statistics.
+    </para>
+
+    <para>
+<programlisting>
+void        (*index_validate_scan) (Relation heap_rel,
+                                    Relation index_rel,
+                                    IndexInfo *index_info,
+                                    Snapshot snapshot,
+                                    struct ValidateIndexState *state);
+</programlisting>
+
+     This API performs the second table scan in a concurrent index build.
+     
+     The table <literal>heap_rel</literal> is scanned according to the given snapshot
+     <literal>snapshot</literal>, and each tuple found is checked by its ItemPointerData
+     against the provided <structname>ValidateIndexState</structname> struct; tuples
+     missing from the index are inserted into <literal>index_rel</literal>.
+     This API is used as the last phase of a concurrent index build.
+    </para>
+    
+   </sect3>
+
+   <sect3 id="planner-functions">
+    <title>planner functions</title>
+
+    <para>
+<programlisting>
+void        (*relation_estimate_size) (Relation rel, int32 *attr_widths,
+                                       BlockNumber *pages, double *tuples, double *allvisfrac);
+</programlisting>
+
+     This API estimates the current size of the relation <literal>rel</literal> and
+     returns the number of pages in <literal>pages</literal> and the number of tuples in <literal>tuples</literal>.
+
+     If <literal>attr_widths</literal> isn't NULL, it may contain previously cached
+     attribute widths for the relation; the AM must fill it in if it needs to compute
+     the attribute widths for estimation purposes.
+     
+     It also returns in <literal>allvisfrac</literal> the fraction of the pages that
+     are marked all-visible in the visibility map.
+    </para>
+
+   </sect3>
+
+   <sect3 id="executor-functions">
+    <title>executor functions</title>
+
+    <para>
+<programlisting>
+bool        (*scan_bitmap_next_block) (TableScanDesc scan,
+                                       TBMIterateResult *tbmres);
+</programlisting>
+
+     This API prepares to fetch / check / return tuples from the block
+     <structfield>tbmres->blockno</structfield> as part of a bitmap table scan.
+     <literal>scan</literal> was started via <function>table_beginscan_bm()</function>.
+     It returns false if there are no tuples to be found on the page, true otherwise.
+     
+     If <structfield>tbmres->ntuples</structfield> is -1, this is a lossy scan
+     and all visible tuples on the page have to be returned; otherwise the tuples at
+     offsets in <structfield>tbmres->offsets</structfield> need to be returned.
+     
+     This is an optional callback, but either both <function>scan_bitmap_next_block</function>
+     and <function>scan_bitmap_next_tuple</function> need to exist, or neither.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*scan_bitmap_next_tuple) (TableScanDesc scan,
+                                       TupleTableSlot *slot);
+</programlisting>
+
+     This API fetches the next tuple of a bitmap table scan, taken from the set of
+     tuples of the page specified in the scan descriptor, into <literal>slot</literal>.
+     It returns true if a visible tuple was found, and false if there are no more tuples.
+     
+     For some AMs it will make more sense to do all the work referencing <literal>tbmres</literal>
+     contents in <function>scan_bitmap_next_block</function>; for others it might be
+     better to defer more work to this callback.
+     
+     This is an optional callback, but either both <function>scan_bitmap_next_block</function>
+     and <function>scan_bitmap_next_tuple</function> need to exist, or neither.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*scan_sample_next_block) (TableScanDesc scan,
+                                       struct SampleScanState *scanstate);
+</programlisting>
+
+     This API prepares to fetch tuples from the next block in a sample scan.
+     It returns false if the sample scan is finished, true otherwise. <literal>scan</literal>
+     was started via <function>table_beginscan_sampling()</function>.
+     
+     Typically this will first determine the target block by calling
+     <structfield>scanstate->tsmroutine->NextSampleBlock</structfield> if it is not NULL,
+     or alternatively perform a sequential scan over all blocks.  The determined
+     block is then typically read and pinned.
+     
+     As the TsmRoutine interface is block based, blocks need to be passed
+     to <function>NextSampleBlock()</function> for it to return the sampled block.
+     
+     Note that it's not acceptable to hold deadlock-prone resources such as lwlocks
+     until <function>scan_sample_next_tuple()</function> has exhausted the tuples on the
+     block: the tuple is likely to be returned to an upper query node, and the
+     next call could be a long while off. Holding buffer pins and the like is OK.
+     
+     Currently it is required to implement this interface, as there's no
+     alternative way (unlike, e.g., bitmap scans) to implement sample
+     scans. If it is infeasible to implement, the AM may raise an error.
+    </para>
+
+    <para>
+<programlisting>
+bool        (*scan_sample_next_tuple) (TableScanDesc scan,
+                                       struct SampleScanState *scanstate,
+                                       TupleTableSlot *slot);
+</programlisting>
+
+     This API determines the next tuple from the selected block using the
+     TsmRoutine's <function>NextSampleTuple()</function> callback and stores it
+     in <literal>slot</literal>.
+     
+     It needs to perform visibility checks according to the scan's snapshot and return
+     only visible tuples.
+     
+     The <literal>TsmRoutine</literal> interface assumes that there's a maximum offset
+     on a given page, so if that doesn't apply to an AM, it needs to emulate that
+     assumption somehow.
+    </para>
+
+   </sect3>
+  </sect2>
+ </sect1>
+
  <sect1 id="index-access-methods">
   <title>Overview of Index access methods</title>
 
-- 
2.20.1.windows.1