This section covers IsCheckpointOnSchedule, the function in the checkpoint code that controls how quickly a checkpoint flushes dirty pages to disk.
Macro definitions
checkpoints request flag bits
Definitions of the flag bits used in checkpoint requests.
/*
 * OR-able request flag bits for checkpoints. The "cause" bits are used only
 * for logging purposes. Note: the flags must be defined so that it's
 * sensible to OR together request flags arising from different requestors.
 */

/* These directly affect the behavior of CreateCheckPoint and subsidiaries */
#define CHECKPOINT_IS_SHUTDOWN      0x0001  /* Checkpoint is for shutdown */
#define CHECKPOINT_END_OF_RECOVERY  0x0002  /* Like shutdown checkpoint, but
                                             * issued at end of WAL recovery */
#define CHECKPOINT_IMMEDIATE        0x0004  /* Do it without delays */
#define CHECKPOINT_FORCE            0x0008  /* Force even if no activity */
#define CHECKPOINT_FLUSH_ALL        0x0010  /* Flush all pages, including those
                                             * belonging to unlogged tables */
/* These are important to RequestCheckpoint */
#define CHECKPOINT_WAIT             0x0020  /* Wait for completion */
#define CHECKPOINT_REQUESTED        0x0040  /* Checkpoint request has been made */
/* These indicate the cause of a checkpoint request */
#define CHECKPOINT_CAUSE_XLOG       0x0080  /* XLOG consumption */
#define CHECKPOINT_CAUSE_TIME       0x0100  /* Elapsed time */
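Because every flag occupies its own bit, requests coming from different callers can be merged with a bitwise OR and inspected with a bitwise AND. The following is only a minimal sketch of that idea: the macro values are copied from the listing above, while `merge_requests` and `is_immediate` are hypothetical helpers written for illustration and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the listing above. */
#define CHECKPOINT_IMMEDIATE   0x0004
#define CHECKPOINT_FORCE       0x0008
#define CHECKPOINT_WAIT        0x0020
#define CHECKPOINT_CAUSE_XLOG  0x0080
#define CHECKPOINT_CAUSE_TIME  0x0100

/* Hypothetical helper: combine the flags of two pending requests. */
static int
merge_requests(int a, int b)
{
    /* Flag words are expected to be non-negative bit masks. */
    assert(a >= 0 && b >= 0);
    return a | b;
}

/* Hypothetical helper: does the merged request ask to skip throttling? */
static bool
is_immediate(int flags)
{
    return (flags & CHECKPOINT_IMMEDIATE) != 0;
}
```

For instance, merging a timed request (`CHECKPOINT_CAUSE_TIME`) with a manual one (`CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT`) gives 0x012C, and the merged word still reports that an immediate checkpoint was asked for.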
WRITES_PER_ABSORB
/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
// Number of writes between calls to AbsorbSyncRequests; defined as 1000
#define WRITES_PER_ABSORB 1000
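A constant like this is typically consumed by a countdown counter: each write decrements it, and when it reaches zero the pending sync requests are absorbed and the counter is reset. The sketch below illustrates only that pattern, under stated assumptions: `after_buffer_write` and `absorb_sync_requests_stub` are hypothetical names, the stub merely counts calls, and the real logic in CheckpointWriteDelay has additional conditions that are left out here.

```c
#include <assert.h>

#define WRITES_PER_ABSORB 1000   /* copied from the listing above */

/* Hypothetical stand-in for AbsorbSyncRequests: only counts its calls. */
static int absorb_calls = 0;

static void
absorb_sync_requests_stub(void)
{
    absorb_calls++;
}

/* Countdown until the next absorb; starts at one full interval. */
static int absorb_counter = WRITES_PER_ABSORB;

/* Hypothetical hook, called once after every buffer write. */
static void
after_buffer_write(void)
{
    assert(absorb_counter > 0);

    if (--absorb_counter <= 0)
    {
        absorb_sync_requests_stub();
        absorb_counter = WRITES_PER_ABSORB;
    }
}
```

With this pattern, 2500 writes trigger two absorb calls (after the 1000th and 2000th write) and leave 500 writes until the next one.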
IsCheckpointOnSchedule
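This function reports whether the checkpoint is on schedule to finish in time. If it returns true, the checkpointer can take a break; if it returns false, it has to keep writing. The listing starts with CalculateCheckpointSegments, which computes the CheckPointSegments value that the schedule check relies on.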
/*
 * Calculate CheckPointSegments based on max_wal_size_mb and
 * checkpoint_completion_target.
 */
static void
CalculateCheckpointSegments(void)
{
    double      target;

    /*-------
     * Calculate the distance at which to trigger a checkpoint, to avoid
     * exceeding max_wal_size_mb. This is based on two assumptions:
     *
     * a) we keep WAL for only one checkpoint cycle (prior to PG11 we kept
     *    WAL for two checkpoint cycles to allow us to recover from the
     *    secondary checkpoint if the first checkpoint failed, though we
     *    only did this on the master anyway, not on standby. Keeping just
     *    one checkpoint simplifies processing and reduces disk space in
     *    many smaller databases.)
     * b) during checkpoint, we consume checkpoint_completion_target *
     *    number of segments consumed between checkpoints.
     *-------
     */
    //#define ConvertToXSegs(x,segsize) (x / ((segsize) / (1024 * 1024)))
    target = (double) ConvertToXSegs(max_wal_size_mb, wal_segment_size) /
        (1.0 + CheckPointCompletionTarget);

    /* round down */
    CheckPointSegments = (int) target;

    if (CheckPointSegments < 1)
        CheckPointSegments = 1;
}
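To make the arithmetic concrete, here is a small sketch that restates the calculation as a pure function. The function name `checkpoint_segments` and its parameters are hypothetical; they stand in for the globals max_wal_size_mb, wal_segment_size, and CheckPointCompletionTarget used in the listing above.

```c
#include <assert.h>

/* Hypothetical pure-function version of the calculation above. */
static int
checkpoint_segments(int max_wal_size_mb, int wal_segment_size_bytes,
                    double completion_target)
{
    /* Guard the division below: a segment is at least 1 MB. */
    assert(wal_segment_size_bytes >= 1024 * 1024);

    /* Same arithmetic as ConvertToXSegs: megabytes to segments. */
    int     max_segments = max_wal_size_mb /
                           (wal_segment_size_bytes / (1024 * 1024));
    double  target = (double) max_segments / (1.0 + completion_target);
    int     segments = (int) target;    /* round down */

    /* Never go below one segment. */
    return segments < 1 ? 1 : segments;
}
```

With a max WAL size of 1024 MB, 16 MB segments, and a completion target of 0.5, this gives 64 / 1.5 = 42.67, which rounds down to 42 segments. Raising the completion target to 0.9 lowers the result to 33, because more WAL is expected to accumulate while the checkpoint itself is running.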
/*
 * IsCheckpointOnSchedule -- are we on schedule to finish this checkpoint
 *       (or restartpoint) in time?
 *
 * Compares the current progress against the time/segments elapsed since last
 * checkpoint, and returns true if the progress we've made this far is greater
 * than the elapsed time/segments.
 * (When it returns true, the caller may sleep.)
 */
static bool
IsCheckpointOnSchedule(double progress)
{
    XLogRecPtr  recptr;
    struct timeval now;
    double      elapsed_xlogs,
                elapsed_time;

    Assert(ckpt_active);

    /* Scale progress according to checkpoint_completion_target. */
    // effective progress becomes progress * checkpoint_completion_target
    progress *= CheckPointCompletionTarget;

    /*
     * Check against the cached value first. Only do the more expensive
     * calculations once we reach the target previously calculated. Since
     * neither time or WAL insert pointer moves backwards, a freshly
     * calculated value can only be greater than or equal to the cached value.
     * If progress is below the cached value, return false: the checkpoint
     * needs to speed up.
     */
    if (progress < ckpt_cached_elapsed)
        return false;

    /*
     * Check progress against WAL segments written and CheckPointSegments.
     * (progress vs. WAL)
     *
     * We compare the current WAL insert location against the location
     * computed before calling CreateCheckPoint. The code in XLogInsert that
     * actually triggers a checkpoint when CheckPointSegments is exceeded
     * compares against RedoRecptr, so this is not completely accurate.
     * However, it's good enough for our purposes, we're only calculating an
     * estimate anyway.
     *
     * During recovery, we compare last replayed WAL record's location with
     * the location computed before calling CreateRestartPoint. That maintains
     * the same pacing as we have during checkpoints in normal operation, but
     * we might exceed max_wal_size by a fair amount. That's because there can
     * be a large gap between a checkpoint's redo-pointer and the checkpoint
     * record itself, and we only start the restartpoint after we've seen the
     * checkpoint record. (The gap is typically up to CheckPointSegments *
     * checkpoint_completion_target where checkpoint_completion_target is the
     * value that was in effect when the WAL was generated).
     */
    if (RecoveryInProgress())
        recptr = GetXLogReplayRecPtr(NULL);
    else
        recptr = GetInsertRecPtr();
    elapsed_xlogs = (((double) (recptr - ckpt_start_recptr)) /
                     wal_segment_size) / CheckPointSegments;

    if (progress < elapsed_xlogs)
    {
        // progress lags behind WAL generation: keep working
        ckpt_cached_elapsed = elapsed_xlogs;
        return false;
    }

    /*
     * Check progress against time elapsed and checkpoint_timeout.
     * (progress vs. time)
     */
    gettimeofday(&now, NULL);
    elapsed_time = ((double) ((pg_time_t) now.tv_sec - ckpt_start_time) +
                    now.tv_usec / 1000000.0) / CheckPointTimeout;

    if (progress < elapsed_time)
    {
        // progress lags behind the elapsed time: keep working
        ckpt_cached_elapsed = elapsed_time;
        return false;
    }

    /* It looks like we're on schedule. */
    // on schedule, so the caller may sleep
    return true;
}
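The decision above boils down to comparing three fractions. The sketch below is a simplified model of it, not the PostgreSQL implementation: `on_schedule` is a hypothetical function, the elapsed WAL and elapsed time are passed in as precomputed fractions (of CheckPointSegments and checkpoint_timeout respectively), and the cached-value shortcut is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the schedule check.
 *   progress          fraction of the checkpoint's writes done (0..1)
 *   completion_target checkpoint_completion_target
 *   elapsed_xlogs     WAL consumed so far / CheckPointSegments
 *   elapsed_time      time elapsed so far / checkpoint_timeout
 */
static bool
on_schedule(double progress, double completion_target,
            double elapsed_xlogs, double elapsed_time)
{
    assert(progress >= 0.0 && progress <= 1.0);

    /* Scale progress so that 100% written maps to the target fraction. */
    progress *= completion_target;

    if (progress < elapsed_xlogs)
        return false;   /* behind WAL consumption: keep writing */
    if (progress < elapsed_time)
        return false;   /* behind the clock: keep writing */
    return true;        /* ahead of both: the caller may sleep */
}
```

As an example, with a completion target of 0.5 and half of the buffers written, the scaled progress is 0.25. The checkpoint is on schedule while both the WAL fraction and the time fraction stay at or below 0.25; as soon as either exceeds it (say 30% of checkpoint_timeout has elapsed), the function returns false and writing continues without a pause.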
PG Source Code
PgSQL · 特性分析 · 談談checkpoint的調度 (PgSQL · Feature Analysis · On checkpoint scheduling)