How to Analyze a PostgreSQL Crash Dump File

1. Introduction

In this blog post, I will discuss how to enable the generation of a crash dump file (also known as a core dump) and some common GDB commands that help a developer troubleshoot crash-related issues in PostgreSQL and in other applications. Proper analysis of an issue normally takes time and a certain level of knowledge about the application source code. From experience, it is sometimes better to look at the larger environment instead of the point of crash.

2. What is a Crash Dump File?

A crash dump file is a file that contains the recorded state of the working memory of an application when it crashes. This state is represented by stacks of memory addresses and CPU registers, and it is normally very difficult to debug with only memory addresses and CPU registers because they tell you nothing about the application logic. Consider the core dump contents below, which show the back trace of memory addresses leading to the point of crash.

#1  0x00687a3d in ?? ()
#2  0x00d37f06 in ?? ()
#3  0x00bf0ba4 in ?? ()
#4  0x00d3333b in ?? ()
#5  0x00d3f682 in ?? ()
#6  0x00d3407b in ?? ()
#7  0x00d3f2f7 in ?? ()

Not very helpful, is it? When we see a crash dump file that looks like this, it means the application was not built with debugging symbols, which makes the crash dump file useless. If this is the case, you will need to install the debug version of the application or rebuild the application with debugging enabled.

3. How to Generate a Useful Crash Dump File

Before generating a crash dump file, we need to make sure the application is built with debugging symbols. This can be done by running the ./configure script like this:

./configure --enable-debug

This adds the -g argument to CFLAGS in src/Makefile.global, with the optimization level set to 2 (-O2). My preference is to also change the optimization level to 0 (-O0) so that when we navigate the stack using GDB, the navigation makes much more sense instead of jumping around, and we are able to print most variable values in memory instead of getting an "optimized out" error in GDB.

CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -g -O0
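
As an alternative to editing src/Makefile.global by hand, the optimization level can usually be chosen when configure is run, because PostgreSQL's configure script accepts variables such as CFLAGS on its command line. The sketch below assumes that approach. The helper name configure_debug_build and its source directory argument are made up for illustration, and the resulting CFLAGS line in src/Makefile.global should still be checked afterwards.

```shell
# Hypothetical helper (not part of PostgreSQL): configure a source tree for debugging.
# $1 is the path to the PostgreSQL source directory.
configure_debug_build() {
    src_dir=$1
    # --enable-debug adds -g; passing CFLAGS="-O0" requests no optimization
    # in place of the default -O2. Run in a subshell so the caller's
    # working directory is left unchanged.
    ( cd "$src_dir" && ./configure --enable-debug CFLAGS="-O0" )
}
```

The helper would be called with the path to the source tree, followed by the usual make and make install.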

Now we can enable crash dump generation. This can be done with the user limit (ulimit) command.

ulimit -c unlimited

To disable:

ulimit -c 0

Make sure there is enough disk space, because a crash dump file is normally very large: it records the memory state of the process at the time of the crash. Also make sure the ulimit is set in the shell before starting PostgreSQL. When PostgreSQL crashes, a core dump file named core will be generated in $PGDATA.
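
Whether a core file actually appears, and where, depends on the shell limit and, on Linux, on the kernel's core_pattern setting (some distributions hand cores to a crash handler instead of writing a file named core). Here is a minimal sketch for checking both before starting PostgreSQL. The function name core_dump_status is hypothetical, and the /proc path applies to Linux only.

```shell
# Hypothetical helper: report whether this shell allows core files
# and, on Linux, how the kernel names or redirects them.
core_dump_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "core dumps disabled (ulimit -c is 0)"
    else
        echo "core dumps enabled (ulimit -c is $limit)"
    fi
    # Linux only: a pattern starting with '|' means cores are piped to a program.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    fi
}

core_dump_status
```

Run it in the same shell that will start PostgreSQL, because the limit is inherited by child processes.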

4. Analyzing the Dump File Using GDB

GDB (GNU Debugger) is a portable debugger that runs on many Unix-like systems and works with many programming languages. It is my favorite tool for analyzing a crash dump file. To demonstrate this, I will intentionally add a line to the PostgreSQL source code that causes a segmentation fault when a CREATE TABLE command is run.

Assume PostgreSQL has already crashed and generated a core dump file named core at ~/highgo/git/postgres/postgresdb/core. I would first use the file utility to learn more about the core file, such as the kernel information and the program that generated it.

caryh@HGPC01:~$ file /home/caryh/highgo/git/postgres/postgresdb/core
postgresdb/core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'postgres: cary cary [local] CREATE TABLE', real uid: 1000, effective uid: 1000, real gid: 1000, effective gid: 1000, execfn: '/home/caryh/highgo/git/postgres/highgo/bin/postgres', platform: 'x86_64'
caryh@HGPC01:~$

The file utility tells me that the core file was generated by the application /home/caryh/highgo/git/postgres/highgo/bin/postgres, so I would run gdb like this:

gdb /home/caryh/highgo/git/postgres/highgo/bin/postgres -c  /home/caryh/highgo/git/postgres/postgresdb/core

GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/caryh/highgo/git/postgres/highgo/bin/postgres...done.
[New LWP 27417]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: cary cary [local] CREATE TABLE                                 '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  heap_insert (relation=relation@entry=0x7f872f532228, tup=tup@entry=0x55ba8290f778, cid=0, options=options@entry=0,
    bistate=bistate@entry=0x0) at heapam.c:1840
1840            ereport(LOG,(errmsg("heap tuple len = %d", heaptup->t_len)));
(gdb)

Immediately after gdb is run on the core file, it shows the location of the crash at heapam.c:1840, which is exactly the line I intentionally added to cause a crash.

5. Useful GDB Commands

With gdb, it is very easy to identify the location of a crash, because gdb reports it as soon as it loads the core file. Unfortunately, 95% of the time, the location of the crash is not the real cause of the problem. This is why I mentioned earlier that it is sometimes better to look at the larger environment instead of the point of crash. The crash is likely caused by a mistake in the application logic somewhere else in the application, before it reaches the point of crash. Even if you fix the crash, the mistake in the application logic still exists, and most likely the application will crash somewhere else later or yield unsatisfactory results. Therefore, it is worthwhile to learn some of the powerful GDB commands that help us understand the call stack better and identify the real root cause.

5.1 The bt (Back Trace) command

The bt command shows the series of call stack frames from the start of the application all the way to the point of crash. With full debugging enabled, you will be able to see the function arguments and the values passed in to each function call, as well as the source file and line number where each call was made. This allows the developer to travel backwards and check for any potential application logic mistake in the earlier processing.

(gdb) bt
#0  heap_insert (relation=relation@entry=0x7f872f532228, tup=tup@entry=0x55ba8290f778, cid=0, options=options@entry=0,
    bistate=bistate@entry=0x0) at heapam.c:1840
#1  0x000055ba81ccde3e in simple_heap_insert (relation=relation@entry=0x7f872f532228, tup=tup@entry=0x55ba8290f778)
    at heapam.c:2356
#2  0x000055ba81d7826d in CatalogTupleInsert (heapRel=0x7f872f532228, tup=0x55ba8290f778) at indexing.c:228
#3  0x000055ba81d946ea in TypeCreate (newTypeOid=newTypeOid@entry=0, typeName=typeName@entry=0x7ffcf56ef820 "test",
    typeNamespace=typeNamespace@entry=2200, relationOid=relationOid@entry=16392, relationKind=relationKind@entry=114 'r',
    ownerId=ownerId@entry=16385, internalSize=-1, typeType=99 'c', typeCategory=67 'C', typePreferred=false,
    typDelim=44 ',', inputProcedure=2290, outputProcedure=2291, receiveProcedure=2402, sendProcedure=2403,
    typmodinProcedure=0, typmodoutProcedure=0, analyzeProcedure=0, elementType=0, isImplicitArray=false, arrayType=16393,
    baseType=0, defaultTypeValue=0x0, defaultTypeBin=0x0, passedByValue=false, alignment=100 'd', storage=120 'x',
    typeMod=-1, typNDims=0, typeNotNull=false, typeCollation=0) at pg_type.c:484
#4  0x000055ba81d710bc in AddNewRelationType (new_array_type=16393, new_row_type=<optimized out>, ownerid=<optimized out>,
    new_rel_kind=<optimized out>, new_rel_oid=<optimized out>, typeNamespace=2200, typeName=0x7ffcf56ef820 "test")
    at heap.c:1033
#5  heap_create_with_catalog (relname=relname@entry=0x7ffcf56ef820 "test", relnamespace=relnamespace@entry=2200,
    reltablespace=reltablespace@entry=0, relid=16392, relid@entry=0, reltypeid=reltypeid@entry=0,
    reloftypeid=reloftypeid@entry=0, ownerid=16385, accessmtd=2, tupdesc=0x55ba8287c620, cooked_constraints=0x0,
    relkind=114 'r', relpersistence=112 'p', shared_relation=false, mapped_relation=false, oncommit=ONCOMMIT_NOOP,
    reloptions=0, use_user_acl=true, allow_system_table_mods=false, is_internal=false, relrewrite=0, typaddress=0x0)
    at heap.c:1294
#6  0x000055ba81e3782a in DefineRelation (stmt=stmt@entry=0x55ba82876658, relkind=relkind@entry=114 'r', ownerId=16385,
    ownerId@entry=0, typaddress=typaddress@entry=0x0,
    queryString=queryString@entry=0x55ba82855648 "create table test (a int, b char(10)) using heap;") at tablecmds.c:885
#7  0x000055ba81fd5b2f in ProcessUtilitySlow (pstate=pstate@entry=0x55ba82876548, pstmt=pstmt@entry=0x55ba828565a0,
    queryString=queryString@entry=0x55ba82855648 "create table test (a int, b char(10)) using heap;",
    context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, qc=0x7ffcf56efe50,
    dest=0x55ba82856860) at utility.c:1161
#8  0x000055ba81fd4120 in standard_ProcessUtility (pstmt=0x55ba828565a0,
    queryString=0x55ba82855648 "create table test (a int, b char(10)) using heap;", context=PROCESS_UTILITY_TOPLEVEL,
    params=0x0, queryEnv=0x0, dest=0x55ba82856860, qc=0x7ffcf56efe50) at utility.c:1069
#9  0x000055ba81fd1962 in PortalRunUtility (portal=0x55ba828b7dd8, pstmt=0x55ba828565a0, isTopLevel=<optimized out>,
    setHoldSnapshot=<optimized out>, dest=<optimized out>, qc=0x7ffcf56efe50) at pquery.c:1157
#10 0x000055ba81fd23e3 in PortalRunMulti (portal=portal@entry=0x55ba828b7dd8, isTopLevel=isTopLevel@entry=true,
    setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x55ba82856860, altdest=altdest@entry=0x55ba82856860,
    qc=qc@entry=0x7ffcf56efe50) at pquery.c:1310
#11 0x000055ba81fd2f51 in PortalRun (portal=portal@entry=0x55ba828b7dd8, count=count@entry=9223372036854775807,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x55ba82856860,
    altdest=altdest@entry=0x55ba82856860, qc=0x7ffcf56efe50) at pquery.c:779
#12 0x000055ba81fce967 in exec_simple_query (query_string=0x55ba82855648 "create table test (a int, b char(10)) using heap;")
    at postgres.c:1239
#13 0x000055ba81fd0d7e in PostgresMain (argc=<optimized out>, argv=argv@entry=0x55ba8287fdb0, dbname=<optimized out>,
    username=<optimized out>) at postgres.c:4315
#14 0x000055ba81f4f52a in BackendRun (port=0x55ba82877110, port=0x55ba82877110) at postmaster.c:4536
#15 BackendStartup (port=0x55ba82877110) at postmaster.c:4220
#16 ServerLoop () at postmaster.c:1739
#17 0x000055ba81f5063f in PostmasterMain (argc=3, argv=0x55ba8284fee0) at postmaster.c:1412
#18 0x000055ba81c91c04 in main (argc=3, argv=0x55ba8284fee0) at main.c:210
(gdb)
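
When a back trace needs to be attached to a bug report or collected from a server without an interactive session, gdb can run the same bt command in batch mode. The following is a minimal sketch using gdb's -batch and -ex options. The helper name core_backtrace is hypothetical; it takes the executable and the core file as arguments.

```shell
# Hypothetical helper: print the back trace of a core file non-interactively.
# $1 = path to the executable that crashed, $2 = path to the core file.
core_backtrace() {
    binary=$1
    core=$2
    # -batch makes gdb exit after running its commands;
    # -ex "bt" runs the bt command once the core file is loaded.
    gdb -batch -ex "bt" "$binary" -c "$core"
}
```

With the paths from this post, the call would be core_backtrace /home/caryh/highgo/git/postgres/highgo/bin/postgres /home/caryh/highgo/git/postgres/postgresdb/core, and its output can be redirected to a file.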

5.2 The f (Frame) command

The f command followed by a stack number lets gdb jump to a particular stack frame listed by the bt command and lets you print other variables in that particular frame. For example:

(gdb) f 3
#3  0x000055ba81d946ea in TypeCreate (newTypeOid=newTypeOid@entry=0, typeName=typeName@entry=0x7ffcf56ef820 "test",
    typeNamespace=typeNamespace@entry=2200, relationOid=relationOid@entry=16392, relationKind=relationKind@entry=114 'r',
    ownerId=ownerId@entry=16385, internalSize=-1, typeType=99 'c', typeCategory=67 'C', typePreferred=false,
    typDelim=44 ',', inputProcedure=2290, outputProcedure=2291, receiveProcedure=2402, sendProcedure=2403,
    typmodinProcedure=0, typmodoutProcedure=0, analyzeProcedure=0, elementType=0, isImplicitArray=false, arrayType=16393,
    baseType=0, defaultTypeValue=0x0, defaultTypeBin=0x0, passedByValue=false, alignment=100 'd', storage=120 'x',
    typeMod=-1, typNDims=0, typeNotNull=false, typeCollation=0) at pg_type.c:484
484                     CatalogTupleInsert(pg_type_desc, tup);
(gdb)

This makes gdb jump to stack frame number 3, which is at pg_type.c:484. From here, you can examine all the other variables in this frame (in function TypeCreate).

5.3 The p (Print) command

The most popular command in gdb, which can be used to print variable addresses and values.

(gdb) p tup
$1 = (HeapTuple) 0x55ba8290f778
(gdb) p pg_type_desc
$2 = (Relation) 0x7f872f532228

(gdb)  p *tup
$3 = {t_len = 176, t_self = {ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, t_tableOid = 0,
  t_data = 0x55ba8290f790}

(gdb) p *pg_type_desc
$4 = {rd_node = {spcNode = 1663, dbNode = 16384, relNode = 1247}, rd_smgr = 0x55ba828e2a38, rd_refcnt = 2, rd_backend = -1,
  rd_islocaltemp = false, rd_isnailed = true, rd_isvalid = true, rd_indexvalid = true, rd_statvalid = false,
  rd_createSubid = 0, rd_newRelfilenodeSubid = 0, rd_firstRelfilenodeSubid = 0, rd_droppedSubid = 0,
  rd_rel = 0x7f872f532438, rd_att = 0x7f872f532548, rd_id = 1247, rd_lockInfo = {lockRelId = {relId = 1247, dbId = 16384}},
  rd_rules = 0x0, rd_rulescxt = 0x0, trigdesc = 0x0, rd_rsdesc = 0x0, rd_fkeylist = 0x0, rd_fkeyvalid = false,
  rd_partkey = 0x0, rd_partkeycxt = 0x0, rd_partdesc = 0x0, rd_pdcxt = 0x0, rd_partcheck = 0x0, rd_partcheckvalid = false,
  rd_partcheckcxt = 0x0, rd_indexlist = 0x7f872f477d00, rd_pkindex = 0, rd_replidindex = 0, rd_statlist = 0x0,
  rd_indexattr = 0x0, rd_keyattr = 0x0, rd_pkattr = 0x0, rd_idattr = 0x0, rd_pubactions = 0x0, rd_options = 0x0,
  rd_amhandler = 0, rd_tableam = 0x55ba82562c20 , rd_index = 0x0, rd_indextuple = 0x0, rd_indexcxt = 0x0,
  rd_indam = 0x0, rd_opfamily = 0x0, rd_opcintype = 0x0, rd_support = 0x0, rd_supportinfo = 0x0, rd_indoption = 0x0,
  rd_indexprs = 0x0, rd_indpred = 0x0, rd_exclops = 0x0, rd_exclprocs = 0x0, rd_exclstrats = 0x0, rd_indcollation = 0x0,
  rd_opcoptions = 0x0, rd_amcache = 0x0, rd_fdwroutine = 0x0, rd_toastoid = 0, pgstat_info = 0x55ba828d5cb0}
(gdb)

By omitting or adding the asterisk, you can tell the p command to print either the address held in a pointer or the values the pointer points to.
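
The frame and print commands can also be collected in a gdb command file and replayed against the core file with gdb's -batch and -x options, which is handy when the same inspection has to be repeated. The sketch below only writes such a file. The helper name write_gdb_commands is hypothetical, and the frame number and variable names are the ones from the session above, so they would differ for another crash.

```shell
# Hypothetical helper: write a small gdb command file to the path given in $1.
write_gdb_commands() {
    # The quoted 'EOF' keeps the shell from expanding anything in the text.
    cat > "$1" <<'EOF'
frame 3
print tup
print *tup
print *pg_type_desc
EOF
}
```

After calling write_gdb_commands /tmp/inspect.gdb, the file can be replayed with gdb -batch -x /tmp/inspect.gdb followed by the executable path, -c, and the core file path.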

5.4 The x (Examine) command

The x command is used to examine the contents of a memory block with a specified size and format. The following example examines the t_data values inside a HeapTuple structure. Note that we first print *tup to learn that the length of the tuple data (t_len) is 176, then we use the x command to examine the first 176 bytes pointed to by t_data.

(gdb)  p *tup
$6 = {t_len = 176, t_self = {ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, t_tableOid = 0,
  t_data = 0x55ba8290f790}

(gdb)  p tup->t_data
$7 = (HeapTupleHeader) 0x55ba8290f790
(gdb) x/176bx  tup->t_data
0x55ba8290f790: 0xc0    0x02    0x00    0x00    0xff    0xff    0xff    0xff
0x55ba8290f798: 0x47    0x00    0x00    0x00    0xff    0xff    0xff    0xff
0x55ba8290f7a0: 0x00    0x00    0x1f    0x00    0x01    0x00    0x20    0xff
0x55ba8290f7a8: 0xff    0xff    0x0f    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7b0: 0x0a    0x40    0x00    0x00    0x74    0x65    0x73    0x74
0x55ba8290f7b8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7c0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7c8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7d0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7d8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7e0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7e8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f7f0: 0x00    0x00    0x00    0x00    0x98    0x08    0x00    0x00
0x55ba8290f7f8: 0x01    0x40    0x00    0x00    0xff    0xff    0x00    0x63
0x55ba8290f800: 0x43    0x00    0x01    0x2c    0x08    0x40    0x00    0x00
0x55ba8290f808: 0x00    0x00    0x00    0x00    0x09    0x40    0x00    0x00
0x55ba8290f810: 0xf2    0x08    0x00    0x00    0xf3    0x08    0x00    0x00
0x55ba8290f818: 0x62    0x09    0x00    0x00    0x63    0x09    0x00    0x00
0x55ba8290f820: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x55ba8290f828: 0x00    0x00    0x00    0x00    0x64    0x78    0x00    0x00
0x55ba8290f830: 0x00    0x00    0x00    0x00    0xff    0xff    0xff    0xff
0x55ba8290f838: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb)
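
The raw bytes become easier to read once a few of them are decoded by hand. The core file was produced on x86-64, which is little-endian, so a 4-byte integer is stored least significant byte first. The sketch below uses two hypothetical shell helpers, le32 and bytes_to_ascii, on bytes copied from the dump above. The four bytes at 0x55ba8290f7b4 spell the type name, and the four-byte groups at 0x55ba8290f7f4 and 0x55ba8290f7f8 decode to 2200 and 16385, the same values shown for typeNamespace and ownerId in frame #3 of the back trace.

```shell
# Combine four bytes (least significant first) into one unsigned integer.
le32() {
    echo $(( $1 | ($2 << 8) | ($3 << 16) | ($4 << 24) ))
}

# Print the given bytes as ASCII characters, followed by a newline.
bytes_to_ascii() {
    for b in "$@"; do
        # Convert each byte to a 3-digit octal escape that printf understands.
        printf "\\$(printf '%03o' "$b")"
    done
    echo
}

bytes_to_ascii 0x74 0x65 0x73 0x74   # bytes at 0x55ba8290f7b4, prints: test
le32 0x98 0x08 0x00 0x00             # bytes at 0x55ba8290f7f4, prints: 2200
le32 0x01 0x40 0x00 0x00             # bytes at 0x55ba8290f7f8, prints: 16385
```

Matching decoded values against the arguments in the back trace is a quick way to confirm that the tuple in memory holds what the calling code intended to insert.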

6. Conclusion

In this blog, we have discussed how to generate a useful crash dump file with sufficient debug symbols to help developers troubleshoot a crash issue in PostgreSQL and in other applications. We have also discussed gdb, a very powerful and useful debugger, and shared some of the most common commands that can be used to troubleshoot a crash issue from a core file. I hope the information here helps some developers out there troubleshoot issues better.
