It has been a long time since I learned how to write libraries. It goes back to libdacav, which I wrote back in my bachelor days.
I used to build my library with Automake/Autoconf/Libtool. In the following years I never got to apply that knowledge, and I slowly forgot most of the details. At work I use C and C++ every day, but everything is statically linked for the sake of simplicity, so I have no occasion to improve. It is a pity, since I'm definitely not fond of blobs, and since dynamic libraries enable much smoother packaging, among other things.
It is time to refresh my memory. I'll take notes in the form of a tutorial, since it might be helpful for me, and possibly for other people. It will be a badly written one, though, because I'm not following a specific script. It's more like the journal of an exploration: I give priority to learning, not to consistent, readable journaling.
- ABI and compatibility
- Soname and versioning
- Distribution in packaging (e.g. RPM)
From little bits
This old document (in TLDP) is still there. Good.
I've put together some dummy code and a dumb makefile accordingly:
dacav@lolcalhost:libtotem$ more totem.h totem.c Makefile
::::::::::::::
totem.h
::::::::::::::
#pragma once
typedef struct totem* totem_t;
totem_t totem_new(int);
int totem_get(totem_t);
void totem_del(totem_t);
::::::::::::::
totem.c
::::::::::::::
#include "totem.h"
#include <stdlib.h>
struct totem
{
    int value;
};
totem_t totem_new(int value)
{
    totem_t out = malloc(sizeof(struct totem));
    if (out == NULL)
        return NULL;
    out->value = value;
    return out;
}
int totem_get(totem_t totem)
{
    return totem->value;
}
void totem_del(totem_t totem)
{
    free(totem);
}
::::::::::::::
Makefile
::::::::::::::
soname := libtotem.so.1
minor := 0
patch := 0
realname := $(soname).$(minor).$(patch)
$(realname): totem.o
	gcc -shared -Wl,-soname,$(soname) -o $@ $^
totem.o: totem.c totem.h
	gcc -fPIC -g -c -Wall $<
.PHONY: clean
clean:
	rm $(realname) *.o
Some symbols hacking
Once compiled (with make), dynamic symbols are exported. They can be listed with objdump.
dacav@lolcalhost:libtotem$ objdump -T libtotem.so.1.0.0
libtotem.so.1.0.0: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 free
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 malloc
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 GLIBC_2.2.5 __cxa_finalize
00000000000006f0 g DF .text 0000000000000010 Base totem_get
0000000000201028 g D .got.plt 0000000000000000 Base _edata
00000000000006ba g DF .text 0000000000000036 Base totem_new
0000000000201030 g D .bss 0000000000000000 Base _end
0000000000201028 g D .bss 0000000000000000 Base __bss_start
0000000000000580 g DF .init 0000000000000000 Base _init
0000000000000700 g DF .text 000000000000001b Base totem_del
000000000000071c g DF .fini 0000000000000000 Base _fini
The manpage of objdump mentions the meaning of the flags.
- g: global symbol
- D: dynamic symbol
- F: names a function
- w: weak symbol
While functions and (unfortunately) globals are quite common in the daily routine of a software developer, weak symbols are, in my opinion, little known. Probably this is because they're supported by the ELF format, but not mentioned in the C/C++ standards (source: Wikipedia). The idea is that they can be overridden at link time by regular (strong) symbols, while no two regular symbols with the same name can live in the same linked executable.
This raises the question: what if there are two competing weak symbols? Who takes the cake?
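To make the weak/strong distinction a bit more concrete, here is a minimal sketch. It assumes GCC or Clang, since __attribute__((weak)) is a compiler extension and not standard C. The names totem_default_value and totem_describe_default are made up for the example and are not part of libtotem.

```c
#include <assert.h>
#include <stdio.h>

/* Weak definition (GCC/Clang extension, not standard C).
 * If another object file in the same link provides a regular (strong)
 * definition of totem_default_value, the linker picks that one instead. */
__attribute__((weak)) int totem_default_value(void)
{
    return 42;
}

/* Formats the value returned by whichever definition won at link time.
 * Returns the number of characters written, as snprintf does. */
int totem_describe_default(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "default totem %d", totem_default_value());
}
```

Built on its own, totem_default_value returns 42. Linking in a second file with a non-weak int totem_default_value(void) would make the strong definition win, without any duplicate symbol error.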
Compile & Link into an executable
At code level I'll add a totem_print function:
diff --git a/libtotem/totem.c b/libtotem/totem.c
index 40033f4..e82b83c 100644
--- a/libtotem/totem.c
+++ b/libtotem/totem.c
@@ -21,6 +21,11 @@ int totem_get(totem_t totem)
return totem->value;
}
+void totem_print(totem_t totem, FILE* stream)
+{
+ fprintf(stream, "totem %d\n", totem_get(totem));
+}
+
void totem_del(totem_t totem)
{
free(totem);
diff --git a/libtotem/totem.h b/libtotem/totem.h
index 66eac5c..a0d6338 100644
--- a/libtotem/totem.h
+++ b/libtotem/totem.h
@@ -1,7 +1,10 @@
#pragma once
+#include <stdio.h>
+
typedef struct totem* totem_t;
totem_t totem_new(int);
int totem_get(totem_t);
+void totem_print(totem_t totem, FILE* stream);
void totem_del(totem_t);
Now I can write a simple main.c which uses it and prints a totem, together with its makefile.
dacav@lolcalhost:libtutorial$ more main.c Makefile
::::::::::::::
main.c
::::::::::::::
#include <totem.h>
#include <stdlib.h>
int main()
{
    totem_t totem = totem_new(1024);
    totem_print(totem, stdout);
    totem_del(totem);
    return EXIT_SUCCESS;
}
::::::::::::::
Makefile
::::::::::::::
main: main.c
	gcc -I./libtotem -L./libtotem -ltotem $< -o $@
.PHONY : clean
clean:
	rm -f main *.o
So let's make, but we don't get much of a result:
dacav@lolcalhost:libtutorial$ make
gcc -I./libtotem -L./libtotem -ltotem main.c -o main
/usr/bin/ld: cannot find -ltotem
collect2: error: ld returned 1 exit status
make: *** [Makefile:2: main] Error 1
…as the library is there, but we are missing the ldconfig part of it. Now, the library is not installed, so ldconfig won't do a thing. But it all boils down to manually creating some links.
diff --git a/libtotem/Makefile b/libtotem/Makefile
index 5bcbf16..af62929 100644
--- a/libtotem/Makefile
+++ b/libtotem/Makefile
@@ -1,8 +1,15 @@
-soname := libtotem.so.1
+libname := libtotem.so
+soname := $(libname).1
minor := 0
patch := 0
realname := $(soname).$(minor).$(patch)
+$(libname): $(soname)
+ ln -s $(soname) $(libname)
+
+$(soname): $(realname)
+ ln -s $(realname) $(soname)
+
$(realname): totem.o
gcc -shared -Wl,-soname,$(soname) -o $@ $^
.PHONY: clean
clean:
- rm $(realname) *.o
+ rm -f $(libname) $(realname) $(soname) *.o
So this will create a symlink libtotem.so.1 → libtotem.so.1.0.0 (from SONAME to the actual file) as ldconfig would do. But there's also another link libtotem.so → libtotem.so.1 which ldconfig would not have created. We'll see why.
Now we can compile.
dacav@lolcalhost:libtutorial$ cd libtotem/
dacav@lolcalhost:libtotem$ make
gcc -fPIC -g -c -Wall totem.c
gcc -shared -Wl,-soname,libtotem.so.1 -o libtotem.so.1.0.0 totem.o
ln -s libtotem.so.1.0.0 libtotem.so.1
ln -s libtotem.so.1 libtotem.so
dacav@lolcalhost:libtotem$ cd ..
dacav@lolcalhost:libtutorial$ make
gcc -I./libtotem -L./libtotem -ltotem main.c -o main
dacav@lolcalhost:libtutorial$ ls -lR
.:
total 24
drwxr-xr-x. 2 dacav docker 4096 Jan 26 19:37 libtotem
-rwxr-xr-x. 1 dacav docker 8288 Jan 26 19:38 main
-rw-r--r--. 1 dacav docker 172 Jan 26 19:21 main.c
-rw-r--r--. 1 dacav docker 100 Jan 26 19:21 Makefile
./libtotem:
total 32
lrwxrwxrwx. 1 dacav docker 13 Jan 26 19:37 libtotem.so -> libtotem.so.1
lrwxrwxrwx. 1 dacav docker 17 Jan 26 19:37 libtotem.so.1 -> libtotem.so.1.0.0
-rwxr-xr-x. 1 dacav docker 11016 Jan 26 19:37 libtotem.so.1.0.0
-rw-r--r--. 1 dacav docker 396 Jan 26 19:24 Makefile
-rw-r--r--. 1 dacav docker 453 Jan 26 19:06 totem.c
-rw-r--r--. 1 dacav docker 186 Jan 26 19:06 totem.h
-rw-r--r--. 1 dacav docker 7248 Jan 26 19:37 totem.o
The execution of main won't work straight away, as the library is not installed in the system:
dacav@lolcalhost:libtutorial$ ./main
./main: error while loading shared libraries: libtotem.so.1: cannot open shared object file: No such file or directory
But we can of course use the LD_LIBRARY_PATH environment variable to add ./libtotem as a search directory for shared objects…
dacav@lolcalhost:libtutorial$ LD_LIBRARY_PATH=./libtotem ./main
totem 1024
Back to the symlinks. The libtotem.so → libtotem.so.1 link is mentioned in the TLDP document as the linker name. The linker name is needed only when compiling, as it will match the argument of the -l option (-ltotem → libtotem.so). But once the binary is compiled, the link is no longer useful.
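As a recap of the three names met so far (linker name, soname, real name), here is a small sketch that derives them from a short library name and a version triple. It only encodes the naming convention used in my Makefile. The struct lib_names and the function lib_names_build are hypothetical helpers, and the 64-byte buffers are an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The three names of a shared library, by convention. */
struct lib_names {
    char linker_name[64]; /* libNAME.so: what -lNAME looks for        */
    char soname[64];      /* libNAME.so.MAJOR: recorded in binaries   */
    char real_name[64];   /* libNAME.so.MAJOR.MINOR.PATCH: the file   */
};

/* Fills `out` for the given short name and version numbers.
 * Returns 0 on success, -1 on an empty name or if a name does not fit. */
int lib_names_build(struct lib_names *out, const char *name,
                    unsigned major, unsigned minor, unsigned patch)
{
    assert(out != NULL && name != NULL);
    if (strlen(name) == 0)
        return -1;

    int n = snprintf(out->linker_name, sizeof out->linker_name,
                     "lib%s.so", name);
    if (n < 0 || (size_t)n >= sizeof out->linker_name)
        return -1;

    n = snprintf(out->soname, sizeof out->soname,
                 "%s.%u", out->linker_name, major);
    if (n < 0 || (size_t)n >= sizeof out->soname)
        return -1;

    n = snprintf(out->real_name, sizeof out->real_name,
                 "%s.%u.%u", out->soname, minor, patch);
    if (n < 0 || (size_t)n >= sizeof out->real_name)
        return -1;

    return 0;
}
```

Called with ("totem", 1, 0, 0) it yields libtotem.so, libtotem.so.1 and libtotem.so.1.0.0, the same three names the Makefile produces. Keep in mind that the soname actually recorded in a library is whatever was passed to -Wl,-soname at link time; the pattern above is just the usual convention.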
The executable will indeed specify a dependency on the soname, and not on the linker name. This can be easily seen by running ldd on the binary:
dacav@lolcalhost:libtutorial$ ldd ./main
linux-vdso.so.1 (0x00007ffcc5daa000)
libtotem.so.1 => not found
libc.so.6 => /lib64/libc.so.6 (0x00007f383434d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3834730000)
And it is interesting that ldd does not find the shared object. Of course LD_LIBRARY_PATH can help!
dacav@lolcalhost:libtutorial$ LD_LIBRARY_PATH=./libtotem ldd ./main
linux-vdso.so.1 (0x00007fff52d53000)
libtotem.so.1 => ./libtotem/libtotem.so.1 (0x00007fbda2aba000)
libc.so.6 => /lib64/libc.so.6 (0x00007fbda26d7000)
/lib64/ld-linux-x86-64.so.2 (0x00007fbda2cbc000)
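What ldd reports from the outside can also be observed from inside a running process. The following is a minimal sketch, assuming Linux with glibc (dl_iterate_phdr is not standard C and needs _GNU_SOURCE). The names print_object and list_loaded_objects are mine, for illustration only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback invoked by dl_iterate_phdr once per loaded object.
 * `data` points to an int counter (a convention of this example). */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    int *count = data;
    assert(count != NULL);

    /* The main program is reported with an empty name. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(main program)";
    printf("%s (base 0x%lx)\n", name, (unsigned long)info->dlpi_addr);
    (*count)++;
    return 0; /* zero means: keep iterating */
}

/* Prints every object mapped into this process and returns how many. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

If main called list_loaded_objects while running with LD_LIBRARY_PATH=./libtotem, I would expect libtotem.so.1 to show up among the listed objects, next to libc.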
Distribution in packaging
And now something about packaging, to get the standpoint of the library distribution. I'm not packaging this thing as an RPM, but we can see it in action with a real library. Let's consider libcurl and libcurl-devel under a CentOS 7 environment.
Let's start with libcurl, which will provide only the dynamic library used at runtime:
[vmuser@localhost ~]$ rpm -qlv libcurl
lrwxrwxrwx 1 root root 16 Nov 27 16:43 /usr/lib64/libcurl.so.4 -> libcurl.so.4.3.0
-rwxr-xr-x 1 root root 435128 Nov 27 16:43 /usr/lib64/libcurl.so.4.3.0
The RPM will install both the file and the link. This is curious because the link would be generated by ldconfig. But I can guess the reason for explicitly providing the link: ownership. If the SONAME link were generated by ldconfig, the RPM database would not mark it as owned by the libcurl package (disclaimer: this is just a wild guess, but it sounds like a good reason!).
Then we have the scriptlets: the RPM will run ldconfig after installation and after removal.
[vmuser@localhost ~]$ rpm -q --scripts libcurl
postinstall program: /sbin/ldconfig
postuninstall program: /sbin/ldconfig
This is because, besides the symlinks, ldconfig also maintains the /etc/ld.so.cache file, which is meant to speed up the dynamic loading operation when the program is started.
Next we can look at libcurl-devel, which contains development files for creating applications based on libcurl: headers, manpages, documentation… quite a few files:
[vmuser@localhost ~]$ rpm -qlv libcurl-devel | wc -l
135
Among these is the linker name, which is indeed useful only for the compilation phase.
[vmuser@localhost ~]$ rpm -qlv libcurl-devel | grep /usr/lib64/libcurl
lrwxrwxrwx 1 root root 16 Nov 27 16:43 /usr/lib64/libcurl.so -> libcurl.so.4.3.0
And finally let's see how /usr/bin/curl does the linking:
[vmuser@localhost ~]$ ldd /usr/bin/curl | grep libcurl
libcurl.so.4 => /lib64/libcurl.so.4 (0x00007f5ff0f57000)
It's worth noting that in CentOS 7 we can currently find libcurl-7.29.0-42.el7_4.1.x86_64. The library version is not the same as the SONAME. The SONAME is a matter of binary (ABI) compatibility.
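Since the SONAME is about compatibility, it may help to see what an incompatible change looks like at the binary level. This is a minimal sketch under a loud assumption: that the struct layout were exposed in the public header, which is not the case in libtotem thanks to the opaque totem_t handle. The names totem_v1, totem_v2, totem_layout_changed and read_value_as_v1 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout shipped with libtotem.so.1, if it were public. */
struct totem_v1 {
    int value;
};

/* Hypothetical later layout: a field is inserted before `value`. */
struct totem_v2 {
    int flags;
    int value;
};

/* Returns 1 if code compiled against v1 would misread a v2 object:
 * either the size or the offset of `value` differs. */
int totem_layout_changed(void)
{
    return sizeof(struct totem_v1) != sizeof(struct totem_v2)
        || offsetof(struct totem_v1, value) != offsetof(struct totem_v2, value);
}

/* Simulates an old client: reads `value` at the offset it had in v1,
 * whatever the real layout of the object is. */
int read_value_as_v1(const void *object)
{
    assert(object != NULL);
    int value;
    memcpy(&value, (const char *)object + offsetof(struct totem_v1, value),
           sizeof value);
    return value;
}
```

Given a struct totem_v2 with flags 7 and value 1024, read_value_as_v1 returns 7: the old client silently reads the wrong field. A change of this kind is what would call for a new SONAME (say libtotem.so.2), while keeping the struct private to totem.c avoids the problem altogether.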