In my previous blog post, I explained how to resolve kernel-space symbols using perf. However, you might still encounter unresolved raw pointers from user-space processes, like this example from perf report:

     0.33%     0.33%  fio      [kernel.kallsyms]   [k] reweight_entity
     0.32%     0.14%  fio      [kernel.kallsyms]   [k] ext4_write_checks
     0.31%     0.28%  fio      [kernel.kallsyms]   [k] memcpy_erms
     0.31%     0.31%  fio      fio                 [.] add_lat_sample
     0.31%     0.00%  fio      [unknown]           [.] 0x0000556f2fcf2740
     0.31%     0.00%  fio      [unknown]           [.] 0x0000556f2fcf2790
     0.31%     0.00%  fio      [unknown]           [.] 0x0000556f2fcf2820
     0.31%     0.00%  fio      [unknown]           [.] 0x0000556f2fcf2860
     0.31%     0.00%  fio      [unknown]           [.] 0x0000556f2fcf28e0

This data is visualized in the following flamegraph:

At this point, you might want to verify whether the binary is compiled with debug information:

file $(which fio)

Example output:

/usr/local/bin/fio: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=3ede099d632a24f9af9bc2cfd2bd872b83eb17ea, for GNU/Linux 3.2.0, with debug_info, not stripped

Next, check for debug sections using readelf:

readelf -S $(which fio) | grep -i "debug"

Example output:

  [28] .debug_aranges    PROGBITS         0000000000000000  001c5656
  [29] .debug_info       PROGBITS         0000000000000000  001c6de8
  [30] .debug_abbrev     PROGBITS         0000000000000000  0043d7a6
  [31] .debug_line       PROGBITS         0000000000000000  0045ed3b
  [32] .debug_str        PROGBITS         0000000000000000  004ed938
  [33] .debug_loc        PROGBITS         0000000000000000  00502780
  [34] .debug_ranges     PROGBITS         0000000000000000  0067b3e5

This confirms that the binary is not stripped and includes debug info, which is exactly what perf needs to resolve symbols. So what is still going wrong?

Let’s consult the perf-record man page:

[...]
       --call-graph
           Setup and enable call-graph (stack chain/backtrace) recording, implies -g. Default is "fp".

               Allows specifying "fp" (frame pointer) or "dwarf"
               (DWARF's CFI - Call Frame Information) or "lbr"
               (Hardware Last Branch Record facility) as the method to collect
               the information used to show the call graphs.

               In some systems, where binaries are build with gcc
               --fomit-frame-pointer, using the "fp" method will produce bogus
               call graphs, using "dwarf", if available (perf tools linked to
               the libunwind or libdw library) should be used instead.
               Using the "lbr" method doesn't require any compiler options. It
               will produce call graphs from the hardware LBR registers. The
               main limitation is that it is only available on new Intel
               platforms, such as Haswell. It can only get user call chain. It
               doesn't work with branch stack sampling at the same time.

               When "dwarf" recording is used, perf also records (user) stack dump
               when sampled.  Default size of the stack dump is 8192 (bytes).
               User can change the size by passing the size after comma like
               "--call-graph dwarf,4096".
[...]

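From this, we learn that if the program is compiled with -fomit-frame-pointer (which GCC typically enables at -O1, -O2, and -O3), the "fp" call-graph mode won't work properly. The "fp" method unwinds the stack by following the chain of saved frame pointers; when the compiler frees that register for general use, the unwinder follows garbage values, and those show up as the [unknown] raw addresses in the report above. That’s a critical insight.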
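The excerpt also points to a route that needs no rebuild: DWARF-based unwinding. As a minimal sketch, the helper below only assembles the --call-graph argument in the form the man page describes. The function name callgraph_arg is made up for this post; the mode names and the "dwarf,<size>" syntax come from the excerpt above.

```shell
# Hypothetical helper (not part of perf): build the --call-graph argument
# for perf record.
# Usage: callgraph_arg MODE [STACK_DUMP_BYTES]
callgraph_arg() {
    mode="${1:-}"
    size="${2:-}"
    case "$mode" in
        fp|lbr)
            printf '%s\n' "--call-graph $mode"
            ;;
        dwarf)
            # An optional stack dump size follows the mode after a comma.
            if [ -n "$size" ]; then
                printf '%s\n' "--call-graph dwarf,$size"
            else
                printf '%s\n' "--call-graph dwarf"
            fi
            ;;
        *)
            echo "unknown call-graph mode: $mode" >&2
            return 1
            ;;
    esac
}
```

For example, `perf record $(callgraph_arg dwarf 16384) -- <your fio command>` would record with a 16 KiB stack dump per sample. Because DWARF mode copies a chunk of the user stack into every sample, perf.data grows quickly, which is one reason to fix the frame pointers instead.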

Now let’s look at how fio is built. Here’s an excerpt from the Makefile:

[...]

all: fio

[...]

DEBUGFLAGS = -DFIO_INC_DEBUG
CPPFLAGS+= -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DFIO_INTERNAL $(DEBUGFLAGS)
OPTFLAGS= -g -ffast-math
FIO_CFLAGS= -std=gnu99 -Wwrite-strings -Wall -Wdeclaration-after-statement $(OPTFLAGS) $(EXTFLAGS) $(BUILD_CFLAGS) -I. -I$(SRCDIR)
LIBS    += -lm $(EXTLIBS)
PROGS   = fio
SCRIPTS = $(addprefix $(SRCDIR)/,tools/fio_generate_plots tools/plot/fio2gnuplot tools/genfio tools/fiologparser.py tools/hist/fiologparser_hist.py tools/hist/fio-histo-log-pctiles.py tools/fio_jsonplus_clat2csv)

ifndef CONFIG_FIO_NO_OPT
  FIO_CFLAGS += -O3
endif
ifdef CONFIG_BUILD_NATIVE
  FIO_CFLAGS += -march=native
endif

[...]

override CFLAGS := -DFIO_VERSION='"$(FIO_VERSION)"' $(FIO_CFLAGS) $(CFLAGS)

[...]

%.o : %.c
        @mkdir -p $(dir $@)
        $(QUIET_CC)$(CC) -o $@ $(CFLAGS) $(CPPFLAGS) -c $<
        @$(CC) -MM $(CFLAGS) $(CPPFLAGS) $(SRCDIR)/$*.c > $*.d
        @mv -f $*.d $*.d.tmp
        @sed -e 's|.*:|$*.o:|' < $*.d.tmp > $*.d
        @if type -p fmt >/dev/null 2>&1; then                           \
                sed -e 's/.*://' -e 's/\\$$//' < $*.d.tmp | fmt -w 1 |  \
                sed -e 's/^ *//' -e 's/$$/:/' >> $*.d;                  \
        else                                                            \
                sed -e 's/.*://' -e 's/\\$$//' < $*.d.tmp |             \
                tr -cs "[:graph:]" "\n" |                               \
                sed -e 's/^ *//' -e '/^$$/ d' -e 's/$$/:/' >> $*.d;     \
        fi
        @rm -f $*.d.tmp

[...]

fio: $(FIO_OBJS)
        $(QUIET_LINK)$(CC) $(LDFLAGS) -o $@ $(FIO_OBJS) $(LIBS) $(HDFSLIB)

[...]

Unless CONFIG_FIO_NO_OPT is defined, -O3 is appended to FIO_CFLAGS, which implicitly enables -fomit-frame-pointer.

To confirm, check how this config variable is handled:

grep -A10 -B10 "CONFIG_FIO_NO_OPT" configure

And verify that it's not defined in the config:

grep "CONFIG_FIO_NO_OPT" config-host.*

As expected, it's not there.

You could disable optimizations entirely at configure time (see ./configure --help), but that would distort benchmark results. Instead, we want to retain the optimizations while keeping the frame pointer. Fortunately, GCC lets us override individual flags. Change the OPTFLAGS line in the Makefile to:

OPTFLAGS= -g -ffast-math -fno-omit-frame-pointer

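An explicit -fno-omit-frame-pointer overrides the default that -O3 selects. Note that in this Makefile OPTFLAGS is expanded into FIO_CFLAGS before -O3 is appended, so the flag actually ends up before -O3 on the compiler command line. In practice GCC applies explicit -f flags on top of whatever the -O level selects, regardless of their relative order. (This isn't clearly documented in the GCC docs, but it works reliably. See this StackOverflow answer for reference.)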
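If you would rather check the override than take it on trust, GCC can report the state of its optimizer flags. The sketch below assumes a GNU gcc on the PATH and relies on its -Q --help=optimizers listing; the helper name omit_fp_status is made up for this post.

```shell
# Hypothetical helper: ask GCC whether -fomit-frame-pointer is in effect
# for a given set of flags. Prints "[enabled]" or "[disabled]", or nothing
# if gcc is missing or does not support this listing.
omit_fp_status() {
    # -Q and the flags under test must precede --help=optimizers so that
    # GCC reports option states instead of option descriptions.
    gcc -Q "$@" --help=optimizers 2>/dev/null |
        awk '$1 == "-fomit-frame-pointer" { print $2 }'
}

# Compare both orderings, but only when gcc is installed.
if command -v gcc >/dev/null 2>&1; then
    echo "flag after -O3:  $(omit_fp_status -O3 -fno-omit-frame-pointer)"
    echo "flag before -O3: $(omit_fp_status -fno-omit-frame-pointer -O3)"
fi
```

If both lines report the same state, the position of the flag relative to -O3 makes no difference for your compiler version.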

Then, we recompile and reinstall fio:

make distclean
./configure
make -j$(nproc)
sudo make install

Now rerun your workload with perf record and generate a new flamegraph. Most of the frames that previously showed up as [unknown] raw addresses should now resolve to function names.
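To put a number on the improvement, you can count the [unknown] lines in the text report before and after the rebuild. This is a rough sketch: count_unknown is a made-up helper that counts matching lines on stdin and knows nothing about perf itself.

```shell
# Hypothetical helper: count report lines that contain "[unknown]".
# Reads text such as the output of `perf report --stdio` on stdin.
count_unknown() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # hence the "|| true" to keep the helper safe under `set -e`.
    grep -c -F '[unknown]' || true
}
```

A typical invocation would be `perf report --stdio -i perf.data 2>/dev/null | count_unknown`, run once against the old recording and once against the new one.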