I've created an image resizing server that creates a few different thumbnails of an image that you upload to it. I'm using the package https://github.com/h2non/bimg for resizing, which uses libvips through C bindings.
Before going to production I started stress testing my app with JMeter, uploading 100 images to it concurrently several times in a row, and noticed that the memory is not being released back to the OS.
To illustrate the problem I've written a few lines of code that read 100 images and resize them (without saving them anywhere) and then wait for 10 minutes. This repeats 5 times.
My code and memory/CPU graph can be found here: https://github.com/hamochi/bimg-memory-issue
It's clear that the memory is being reused on every cycle, otherwise it would have doubled (I think). But it's never released back to the OS.
Is this general behaviour for cgo? Or is bimg doing something weird? Or is it just my code that is faulty?
Thank you very much for any help you can give!
There's a libvips thing to track and debug reference counts -- you could try enabling that and see if you have any leaks.
https://libvips.github.io/libvips/API/current/libvips-vips.html#vips-leak-set
Though from your comment above about bimg memory stats, it sounds like it's probably all OK.
It's easy to test libvips memory from Python. I made this small program:
```python
#!/usr/bin/python3

import pyvips
import sys

# disable libvips operation caching ... without this, it'll cache all the
# thumbnail operations and we'll just be testing the jpg write
pyvips.cache_set_max(0)

for i in range(0, 10000):
    print("loop {} ...".format(i))
    for filename in sys.argv[1:]:
        # thumbnail to fit 128x128 box
        image = pyvips.Image.thumbnail(filename, 128)
        thumb = image.write_to_buffer(".jpg")
```
That is, it repeatedly thumbnails a set of source images. I ran it like this:
$ for i in {1..100}; do cp ~/pics/k2.jpg $i.jpg; done
$ ../fing.py *
And watched RES in top. I saw:
loop | RES (kb)
-- | --
100 | 39220
250 | 39324
300 | 39276
400 | 39316
500 | 39396
600 | 39464
700 | 39404
1000 | 39420
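If watching top by hand gets tedious, the same RES figure can be sampled from inside the process. Here is a minimal sketch, assuming Linux, where `/proc/self/status` exposes a `VmRSS` line in kB. The helper names `parse_vmrss_kb` and `current_rss_kb` are made up for this example and are not part of pyvips or libvips.

```python
import re


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def current_rss_kb():
    """Read this process's resident set size in kB (Linux only)."""
    try:
        with open("/proc/self/status") as f:
            return parse_vmrss_kb(f.read())
    except OSError:
        # No /proc filesystem (e.g. not Linux): nothing to report.
        return None


if __name__ == "__main__":
    print("RES (kB):", current_rss_kb())
```

Calling `current_rss_kb()` once per loop iteration and printing it next to the loop counter should give a table like the one above without needing a second terminal.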
As long as you have no refcount leaks, I think what you are seeing is expected behaviour. With a classic brk-managed heap, a Linux process can only release pages at the end of the heap back to the OS (have a look at the brk and sbrk system calls):
https://en.wikipedia.org/wiki/Sbrk
Now imagine if 1) libvips allocates 6 GB, 2) the Go runtime allocates 100 KB, 3) libvips releases 6 GB. Your libc (the thing in your process that will call sbrk and brk on your behalf) can't hand the 6 GB back to the OS because of the 100 KB alloc at the end of the heap. Some malloc implementations have better memory fragmentation behaviour than others, but the default Linux one is pretty good.
In practice, it doesn't matter. malloc will reuse holes in your memory space, and even if it doesn't, they will get paged out anyway under memory pressure and won't end up eating RAM. Try running your process for a few hours, and watch RES. You should see it creep up, but then stabilize.
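To make the argument concrete, here is a toy model in Python. It is only a sketch of the idea and not how glibc malloc actually works: the `ToyHeap` class, its first-fit reuse without block splitting, and the sizes are all assumptions made for illustration.

```python
class ToyHeap:
    """Toy brk-style heap: blocks sit in address order and only free
    space at the very top of the heap can be handed back to the OS."""

    def __init__(self):
        self.blocks = []  # each block is [size_kb, in_use], lowest address first

    def malloc(self, size_kb):
        # Reuse the first free hole that is big enough (no splitting).
        for block in self.blocks:
            if not block[1] and block[0] >= size_kb:
                block[1] = True
                return block
        # No suitable hole: grow the heap, like sbrk with a positive increment.
        block = [size_kb, True]
        self.blocks.append(block)
        return block

    def free(self, block):
        block[1] = False
        # Only trailing free blocks can be released back to the OS.
        while self.blocks and not self.blocks[-1][1]:
            self.blocks.pop()

    def heap_size_kb(self):
        return sum(size for size, _ in self.blocks)


if __name__ == "__main__":
    BIG_KB = 6 * 1024 * 1024   # the 6 GB libvips allocation
    SMALL_KB = 100             # the 100 KB Go runtime allocation

    heap = ToyHeap()
    big = heap.malloc(BIG_KB)
    small = heap.malloc(SMALL_KB)
    heap.free(big)
    print("after freeing the big block:", heap.heap_size_kb(), "KB")

    big_again = heap.malloc(BIG_KB)  # second cycle reuses the hole
    print("after the second cycle:", heap.heap_size_kb(), "KB")

    heap.free(big_again)
    heap.free(small)
    print("after freeing everything:", heap.heap_size_kb(), "KB")
```

In this model the heap stays at its peak size while the small block is alive, a second cycle reuses the hole instead of doubling the heap, and the memory only goes back once the block at the top is freed too.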
(I'm not at all a kernel person, the above is just my understanding, corrections very welcome of course)
The problem is in the resize code:
_, err = bimg.NewImage(buffer).Resize(width, height)
The image is a GObject and needs to be unreffed explicitly to release its memory. Try:
image, err = bimg.NewImage(buffer).Resize(width, height)
defer C.g_object_unref(C.gpointer(image))