Rambles around computer science

Diverting trains of thought, wasting precious time

Tue, 14 Feb 2017

Custom ELF program headers—what, why and how

A couple of times recently I've found myself wanting to create ELF files with custom program headers. This post explains what that means, why you might want it, why there's a practical difficulty in doing so using a standard linker, and my slightly hacky way of solving it.

Program headers are metadata within ELF files. They describe a contiguous chunk of the file's address space. There are several different kinds of program header, of which the most critical is LOAD, signifying that the relevant part of the file should be loaded (or mapped) into memory, forming a segment within the program's memory image. Besides LOAD, other program headers are used to attach particular meanings to specific ranges within the binary. For example, the DYNAMIC program header tells the loader where the dynamic-linking metadata resides. The INTERP program header identifies to the (in-kernel) loader a string naming the program interpreter (the dynamic linker, which will actually do the loading proper). Some vendor extensions exist for adding features within the loading process, such as GNU_RELRO (which records a range that can be made read-only after relocation, bringing security benefits) or GNU_STACK (which does not identify an associated range, but exists as a flag to tell the loader whether the stack needs to be made executable). If you run readelf -l on some ELF binary, you'll see a dump of its program headers. On my machine I see something like this.

$ readelf -Wl /bin/true
Elf file type is EXEC (Executable file)
Entry point 0x401432
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
  INTERP         0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0058b4 0x0058b4 R E 0x200000
  LOAD           0x005e10 0x0000000000605e10 0x0000000000605e10 0x000404 0x0005f0 RW  0x200000
  DYNAMIC        0x005e28 0x0000000000605e28 0x0000000000605e28 0x0001d0 0x0001d0 RW  0x8
  NOTE           0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x004b4c 0x0000000000404b4c 0x0000000000404b4c 0x00023c 0x00023c R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x005e10 0x0000000000605e10 0x0000000000605e10 0x0001f0 0x0001f0 R   0x1

Two reasons for wanting to add custom program headers are to add custom features to the loading process, or to make new uses of features it already provides. My first use case was an instance of the latter: to create a heap area at load time, as an additional segment, rather than the usual way of making explicit syscalls during execution. I was building ELF files for the ppcmem emulated execution environment. This doesn't implement any system calls, but does implement an ELF loader. By adding an extra LOAD program header providing an allocation arena, I could write a malloc() that does not require system calls—its heap arena is mapped at start-up by the ELF loader inside ppcmem. (Although I could have allocated space using a big char array, or by hacking the linker script to leave a big gap, a separate segment is logically cleaner, and a disjoint address range for heap allocations is helpful.)

The second use case for custom program headers comes from liballocs. When passed through the liballocs toolchain, a binary also generates extra metadata binaries which can optionally be loaded at run time to supply extra metadata similar to debugging information. Currently these files live outside the main binary, in a separate /usr/lib/allocsites hierarchy, which is a pain for deployment: if you move or copy an executable, you must move or copy its metadata as an extra step. I'd like to bundle the metadata into the binary somehow, and record its presence using a custom program header.

The way to specify the program headers is in the linker script. The problem with creating custom program headers is that a GNU-style linker's script language requires you to specify the entire set of program headers if you specify any program header at all. You can't just add one extra. Worse, the default set of headers is not even made explicit anywhere. The PHDRS command is what specifies the program headers, but the default linker script (try ld --verbose to see it) doesn't use it. Rather, it falls back on the linker's hard-coded default behaviour, which is to create all the familiar headers that we see above.

If we write our own PHDRS command we can create all the program headers we want—but we have to create them all ourselves, meaning manually creating all the ones that the linker would otherwise create for us. The set of headers created by hard-coded logic in the linker has grown over the years, to include things such as the GNU_STACK and GNU_RELRO headers I mentioned earlier. Re-creating this “standard” set of program headers is therefore quite a detailed platform-specific task, and is subject to change as new features are added; a burden we don't want to take on.

So, I came up with a way of getting the effect we want: “please create this extra program header, in addition to all the normal ones”. It is not pretty and not perfect, but it works as follows. Run the linker once, without any custom program header directives. Remember what program headers the linker generated, and what sections it assigned to them. Dump the linker script used during that link, and annotate it so that each output section is labelled with its program header assignment. Then, apply a diff to the linker script adding only the program header we care about. Finally, re-link with the diffed script.

The obvious downside is that we link twice. We're also a bit fragile to the organisation of the default linker script. Still, I'm prepared to pay these prices. I've implemented this approach with a pleasingly nasty (but small) pile of shell scripts, make rules and m4 macros. It now works pretty well for me. Here's a brief tour.

First, we have some makerules to dump the default linker script.

default-dynexe.lds: $(shell which ld)
    $(LD) --verbose 2>&1 |\
      sed -e '/^=========/,/^=========/!d;/^=========/d'\
          -e 's/\. = .* + SIZEOF_HEADERS;/& _begin = . - SIZEOF_HEADERS;/'\
      > "$@"

(Thanks to Dan Williams for the sed script, which I took from his page about creating a custom dynamic loader, which is well worth a read.)

Next, we have some makerules to turn such a script into an m4'ised macro-script. Essentially, this macroisation adds two “hooks” into the script: one to include a file describing the “normal” program headers and insert any new ones; and one to wrap each output section in a macro invocation that will decorate it with its corresponding program header names. This macroisation rewrite is easily done with regular expressions.

%-dynexe.lds.phdrify.m4: %-dynexe.lds
    cat "$<" | sed 's#SECTIONS#PHDRS\n{\n\tinclude(phdrs_dynexe.inc)\n\tinclude($(srcdir)/phdrs_extra.inc)\n}\nSECTIONS#' | \
    tr '\n' '\f' | sed -r \
    's@(\.[-a-zA-Z0-9\._]+)[[:space:]\f]+(([^\{]*[[:space:]\f]*)?\{[^\}]*\})@expand_phdr([\1], [\2])@g' | \
    tr '\f' '\n' > "$@"

Next, our phdrified linker script contains output section blocks that used to look like this

  .rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }

now wrapped in macro invocations, like this.

  expand_phdr([.rodata], [: { *(.rodata .rodata.* .gnu.linkonce.r.*) }])

Once we have a macroised linker script, we need to m4-process it with some macro definitions that turn it back into the script we want. We need two kinds of definition for this. The first is a list of the program headers in the file as it was originally linked. The easiest way is to use readelf to dump the existing ones. As a wart: since (on my machine at least) the linker doesn't understand the full range of symbolic (named) values for the program header's type field, we use the C preprocessor to macro-expand the symbolic names down to hex literals (as defined in elf.h), which it does understand. The second kind of definition is going to expand each output section to explicitly mention which program headers it belongs to. Note that a section can be in more than one program header. I used a bash script with associative arrays, again operating over readelf output, which handily outputs the mapping from segments to section lists. The script inverts this mapping, into a section-to-segment-list mapping. This generates a single m4 macro whose body is a giant if-else chain testing the argument section name against all the sections in the file, and yielding the list of program headers that the section is in. The easiest way is to use readelf to dump the existing ones. The result is a chunk of m4-ified text that describes the program headers in the original link, looking something like this.

phdr0 6;
phdr1 3;
phdr2 1;
phdr3 1;
phdr4 2;
phdr5 4;
phdr6 0x6474e550;
phdr7 0x6474e551;
phdr8 0x6474e552;

After all that. we then simply m4-include a second file describing the custom ones, and use m4 to expand out the program header assignments. Our output section before has now become

  .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } :phdr2

i.e. recording that the read-only data is in program header named phdr2 (the first LOAD in the dump we saw early on; this is the text segment, which also holds read-only data).

When we re-link with this script, we can add whatever extra linker inputs we want to go into our custom segment, and also apply a diff to the linker script. A simple diff looks like this, where the stuff in sections whose names begin  .insert will end up in a new segment.

--- default-dynexe.lds  2016-09-14 23:12:31.000000000 +0100
+++ insert-dynexe.lds   2016-09-14 23:13:46.000000000 +0100
@@ -187,6 +187,9 @@
   . = ALIGN(64 / 8);
   _end = .; PROVIDE (end = .);
   . = DATA_SEGMENT_END (.);
+  /* srk: new segment! */
+  .insert  : ALIGN(0x1000) { *(SORT_BY_NAME(.insert*)) } :insert
   /* Stabs debugging sections.  */
   .stab          0 : { *(.stab) }
   .stabstr       0 : { *(.stabstr) }

The .insert output section will also be aligned to a 4kB boundary, and be placed in the program header called “insert” by the script. If the input section is allocatable, it will also go into a LOAD header. By varying this patch, you can put whatever section you like in your program header, aligned however you like.

There are some warts in all this. One is that I found increasing the number of program headers would confuse the GNU linker into placing some of its implicitly generated sections, .gnu.build-id and .note.ABI-tag, at a silly offset in the file, and that would mess up the address assignment, by causing the file header no longer to be mapped at virtual address 0. To hack around this, I inserted a one-line sed command which explicitly places the .note.ABI-tag section. Another wart was that the PHDR header was no longer generated properly: it was coming out with an address of 0, not of the executable's start address (0x400000). This turned out to be because no LOAD header was including the region of the file containing the file and program headers themselves. To hack around this, we add a little more rewriting magic to our shell scripts, to put the PHDR and FILEHDR linker attributes to any program header entry with type PT_PHDR, and also to the text segment's LOAD program header (usually the first), then updating the latter's file offset and memory addresses so that the segment actually spans that part of the file.

I mentioned that I wanted to use custom program headers to embed one ELF file (a metadata object) inside another. How can we do this? Firstly, we embed the metadata ELF file into the enclosing ELF file and mark it with a custom program header. Secondly, at run time, when we want to load the metadata, we locate the program header and get a file descriptor pointing at the embdded content. Then we need a dynamic linker that can dlopen() from a file descriptor; we can hack the existing GNU dynamic linker for this fairly easily (not shown, but I've done it; ask me). Finally though, we need to handle the fact that file offsets in the embedded content will not match file offsets on this file descriptor! So we need to translate all the ELF file offsets in the embedded file by a fixed amount. File offsets live in the ELF header, program headers and section headers. (They might also live in other sections, but I'm not aware of any.) I wrote a simple C program to add a fixed offset to these. There's also some guesswork needed to figure out what that offset should be, since the linker will be in control of the in-file placement of the extra output section. This could be made reliable using one more link pass (bringing the total to three).

There's a GitHub repository containing these recipes and examples, here. It's all a bit GNU-specific and underdocumented for now.

All this effort to achieve a fairly simple task: package one ELF object inside another so that the “inner” object is loadable when running the “outer” object. What's another way of looking at this problem? We can think of ELF files as describing an initial state of the program (or, in the case of a library, a fragment of state). What we'd like them to be able to do is specify that this state includes a given file of a particular name and contents. In other words, we'd like them to describe not only aspects of the memory state, but also aspects of the filesystem state. Linkers already do something a bit this, namely static linking: objects that might not be on the destination filesystem can be bundled into the file, albeit losing their identity as files. Thinking of binaries as “partial initial program states” in this way, there's a clear relationship between ELF executables, Docker images and other collections of memory(-mappable) objects.

Our “embedded ELF file” requirement would be trivial if segments were nameable as files in the first place. This is of course a design feature of Multics, as was dynamic linking. Although the latter has now been picked up by Unix, files and segments have still not been fully re-unified. Maybe it's time? Dynamic linking is one of two key mechanisms, along with filesystems in userspace (whether this or this) that allow us to prototype extensions to Unix without pushing our changes into the kernel. There are usually performance, security or correctness reasons why such extensions are better off ultimately residing at least partly in the kernel. But making a playground entirely in user space is valuable for research and experimentation, and is something I'm working up to. The “embed part of a filesystem inside your binary” idea is one example of the sort of feature that we want to be easily prototyped in such a system. The core features we need—serving the content, and then binding it into the process's namespace”are already provided by the OS, so it's a matter of joining them together.

[/devel] permanent link contact

Powered by blosxom

validate this page