Backdooring Linux with Linker Envs the right way

Apr 19, 2024 | 6 minutes read

The xz backdoor

Most readers will have heard of the recent xz library backdoor. In short, malicious code was silently introduced into the official repository of this compression library. Once loaded, it uses rtld-audit to install an audit hook and listen to dynamic linking events. In particular, OpenSSH on some distributions ends up loading xz's liblzma (indirectly, through distribution patches that link sshd against libsystemd), which brings the backdoor into the sshd process. Please refer to ¹ for more information about the backdoor.

Linking

Dynamic linking, as opposed to static linking, defers much of the linking process to runtime. Instead of embedding library code directly into an executable, the executable contains references to the shared library functions it uses. When the program runs, the dynamic linker/loader resolves these references to the actual memory locations of the functions within shared libraries loaded into memory. ²
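To make runtime resolution a bit more concrete, here is a minimal sketch that asks the dynamic loader for the address of a symbol while the program is already running. The helper name lookup_symbol is made up for this illustration; it relies only on the POSIX dlopen/dlsym calls.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look up `name` in the running program and the
 * shared libraries that were loaded together with it.
 * Returns the resolved address, or NULL if the symbol is unknown. */
void *lookup_symbol(const char *name) {
    /* A NULL path yields a handle for the main program itself. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (!self) {
        return NULL;
    }
    void *addr = dlsym(self, name);
    dlclose(self);
    return addr;
}
```

Calling lookup_symbol("puts") from a normally linked program should return the address of puts inside the loaded C library. On older glibc versions the program may additionally need to be linked with -ldl.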

Can we manipulate the linking process?

The LD_PRELOAD environment variable can be used to load a shared library before any other library (including the C standard library) when a program is executed. If this preloaded library contains symbols that are also defined in other libraries, the dynamic linker will use the symbols from the LD_PRELOAD library, effectively overriding them.
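Since the dynamic linker reads LD_PRELOAD when a program starts, the variable only has to be present in the environment of the process being launched. The following is a rough sketch of a launcher that sets it for a single child process; run_with_preload is a hypothetical helper name, and the library path is whatever the caller passes in.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with LD_PRELOAD set to lib_path.
 * Returns the child's exit status, or -1 on failure. */
int run_with_preload(const char *lib_path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: change only our own environment, then replace ourselves. */
        if (setenv("LD_PRELOAD", lib_path, 1) != 0) {
            _exit(127);
        }
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent's own environment stays untouched; only the launched program (and anything it starts in turn) sees the variable.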

Backdooring using LD_PRELOAD

This enables us to backdoor an application in an interesting way. Let's say we have a simple "victim" application written in C; let's call it "test.c":

#include <stdio.h>

int main() {
    puts("Hello, World!");
    return 0;
}

We can build it with:

gcc test.c -o test

Obviously, running this application prints "Hello, World!" to stdout. How can we now manipulate this application without interfering with the compilation process, if all we have is access to the parent environment?

We can define a replacement for puts and build it into a shared library. Write a "preload_intercept.c" with the following content:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
// dynamic linking
#include <dlfcn.h>

// Declaration of the original puts function pointer
static int (*original_puts)(const char *str) = NULL;

// Replacement for puts
int puts(const char *str) {
    // Ensure the original function is loaded
    if (!original_puts) {
        original_puts = (int (*)(const char *))dlsym(RTLD_NEXT, "puts");
        if (!original_puts) {
            fprintf(stderr, "Error in `dlsym`: %s\n", dlerror());
            return -1;
        }
    }

    // Now call the original puts with a prefix
    original_puts("[Intercepted]: ");
    return original_puts(str);
}

Then proceed to build it: gcc -fPIC -shared -o libintercept.so preload_intercept.c

If we set the environment variable LD_PRELOAD to the absolute path of our libintercept.so by running LD_PRELOAD=/to/path/libintercept.so ./test, we receive:

[Intercepted]: 
Hello, World!

Let's see how that happens by also enabling the dynamic linker's debug output:

LD_DEBUG=libs LD_PRELOAD=/to/path/libintercept.so ./test

And the output shows:

      ...
      8212:     calling init: /to/path/libintercept.so
      8212:
      8212:
      8212:     initialize program: ./test
      8212:
      8212:
      8212:     transferring control: ./test
      8212:
[Intercepted]: 
Hello, World!
      8212:
      8212:     calling fini:  [0]
      8212:
      8212:
      8212:     calling fini: /to/path/libintercept.so [0]
      ...

Cool! We successfully intercepted the call to puts!
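The "calling init" line in the trace is where the dynamic linker runs the initializers of the preloaded library, before control is transferred to ./test. As a small sketch of what that allows, a library can mark a function as a constructor so that it runs at that point. The names on_load and constructor_has_run are invented for this example, and the constructor attribute is a GCC/Clang extension.

```c
#include <stdio.h>

/* Set by the constructor so later code can tell that it ran. */
static int init_ran = 0;

/* Runs when the object is initialized, i.e. before main() is entered. */
__attribute__((constructor))
static void on_load(void) {
    init_ran = 1;
    fprintf(stderr, "[preload] constructor ran before main\n");
}

/* Hypothetical accessor, only here to make the effect observable. */
int constructor_has_run(void) {
    return init_ran;
}
```

Built as a shared library and passed via LD_PRELOAD, code like this would not even need the victim to call an intercepted function.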

Intercepting calls via LD_PRELOAD is not really what the xz backdoor did, though. The xz backdoor relies on the rtld audit hooks ³.

Backdooring using LD_AUDIT

Of course, the xz backdoor did not have to modify any environment variable, since the actor knew that the xz library would be loaded by OpenSSH. Here, however, we want to discuss how to write our own backdoor and plant it in a specific target.

Note that there are also other, static techniques, such as modifying the PLT or GOT.

Suppose we want to add an audit hook to our victim application to modify any call to puts. LD_AUDIT is exploited less frequently than LD_PRELOAD and gives direct access to the dynamic linker's auditing API, which lets a library monitor linking events.

How do we create a malicious library using the LD_AUDIT approach? Save the following as audit_intercept.c.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dlfcn.h>
#include <link.h>
#include <elf.h>

// Pointer to store the original address of puts
static int (*original_puts)(const char *str) = NULL;

// Custom function to replace puts
int custom_puts(const char *str) {
    // Call the original puts with a custom message
    return original_puts("[Custom]: Hello, World!");
}

// Necessary for compatibility checks with the dynamic linker
unsigned int la_version(unsigned int version) {
    return version;
}

void la_activity (uintptr_t *cookie, unsigned int flag) {
    // This function can be left empty if no specific activity handling is needed
}

unsigned int la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie) {
    // Audit symbol bindings both to (LA_FLG_BINDTO) and from (LA_FLG_BINDFROM) this object, so that la_symbind64 gets called for them
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

unsigned int la_objclose (uintptr_t *cookie) {
    return 0; // Return 0 to indicate success
}

// This function is used by the dynamic linker to modify or confirm the search path for a library.
char *la_objsearch(const char *name, uintptr_t *cookie, unsigned int flag) {
    // The name could be rewritten here to redirect the lookup. For now, return it unchanged.
    return (char*)name;  // Return the unmodified name
}


void la_preinit (uintptr_t *cookie) {
    // This function can be left empty if no pre-initialization actions are needed
}

// Intercept and redirect puts
uintptr_t la_symbind64(Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
                       uintptr_t *defcook, unsigned int *flags, const char *symname) {
    if (strcmp(symname, "puts") == 0) {
        original_puts = (int (*)(const char *))sym->st_value;  // Capture the original symbol value
        return (uintptr_t)custom_puts;  // Redirect to custom_puts
    }
    return sym->st_value;
}

Compile it with

gcc -fPIC -shared -o libintercept.so audit_intercept.c -ldl

And try it out with:

LD_AUDIT=/to/path/libintercept.so  ./test

And voilà, we receive the hijacked puts output:

[Custom]: Hello, World!
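Most of the callbacks in audit_intercept.c are optional; according to the rtld-audit documentation, la_version is the only function an audit library has to define. As a stripped-down sketch, a purely passive audit library that just reports every object the dynamic linker opens could look like this. The log prefix is arbitrary, and returning 0 from la_objopen means that no symbol bindings are audited for that object.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Handshake with the dynamic linker: report the audit interface
 * version this library was built against. */
unsigned int la_version(unsigned int version) {
    (void)version;
    return LAV_CURRENT;
}

/* Called whenever a new object is loaded. The main program shows up
 * with an empty name. */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie) {
    (void)lmid;
    (void)cookie;
    fprintf(stderr, "[audit] loaded: %s\n",
            map->l_name[0] ? map->l_name : "(main program)");
    return 0; /* do not audit symbol bindings for this object */
}
```

It can be built and activated the same way as the library above, just without any effect on what puts prints.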

Conclusion

We quickly built two ways to intercept any function call to a dynamically linked function. Here, we specifically targeted puts from the standard library. But what can we do with that?

Function Interception: The malicious library can intercept calls to library functions by providing its own implementations of these functions. We can use this to alter data, or simply to log sensitive information.
Altering Execution Flow: We can manipulate function pointers, alter data structures, or change program state.
Data Exfiltration and Spying: We can access all data within the process space of the benign application.
Persistence: We can keep the malware persistent, for example by exporting the environment variables in .bashrc.
Debugging: The same hooks also serve legitimate purposes, such as tracing library calls while debugging.
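To illustrate the logging case from the list above, here is a hedged sketch of a preload library that leaves behavior untouched and only records which files a program opens through fopen. It follows the same dlsym(RTLD_NEXT, ...) pattern as the puts example; the log format is made up, and a real program may of course open files through other functions as well.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Replacement for fopen: log the request, then forward it unchanged. */
FILE *fopen(const char *path, const char *mode) {
    static FILE *(*real_fopen)(const char *, const char *) = NULL;

    /* Resolve the next fopen in lookup order (normally libc's) once. */
    if (!real_fopen) {
        real_fopen = (FILE *(*)(const char *, const char *))
            dlsym(RTLD_NEXT, "fopen");
        if (!real_fopen) {
            return NULL;
        }
    }

    fprintf(stderr, "[log] fopen(\"%s\", \"%s\")\n", path, mode);
    return real_fopen(path, mode);
}
```

Compiled with the same gcc -fPIC -shared invocation as before and loaded via LD_PRELOAD, this behaves like a very small tracing tool, which is also why the technique is useful for debugging.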