Memory not released back to the OS

I've created an image resizing server that creates a few different thumbnails of an image that you upload to it. I'm using the package https://github.com/h2non/bimg for resizing, which uses libvips through C bindings.

Before going to production I started stress testing my app with JMeter, uploading 100 images to it concurrently several times in a row, and noticed that the memory is not being released back to the OS.

To illustrate the problem I've written a few lines of code that read 100 images and resize them (without saving them anywhere) and then wait for 10 minutes. This cycle repeats 5 times.

My code and memory/CPU graph can be found here: https://github.com/hamochi/bimg-memory-issue

It's clear that the memory is being reused on every cycle, otherwise it would have doubled (I think). But it's never released back to the OS.

Is this general behaviour for cgo? Or is bimg doing something weird? Or is it just my code that is faulty?

Thank you very much for any help you can give!

There's a libvips thing to track and debug reference counts -- you could try enabling that and see if you have any leaks.

https://libvips.github.io/libvips/API/current/libvips-vips.html#vips-leak-set

Though from your comment above about bimg memory stats, it sounds like it's probably all OK.

It's easy to test libvips memory from Python. I made this small program:

#!/usr/bin/python3

import pyvips
import sys

# disable libvips operation caching ... without this, it'll cache all the
# thumbnail operations and we'll just be testing the jpg write
pyvips.cache_set_max(0)

for i in range(0, 10000):
    print("loop {} ...".format(i))
    for filename in sys.argv[1:]:
        # thumbnail to fit 128x128 box
        image = pyvips.Image.thumbnail(filename, 128)
        thumb = image.write_to_buffer(".jpg")

That is, it repeatedly thumbnails a set of source images. I ran it like this:

$ for i in {1..100}; do cp ~/pics/k2.jpg $i.jpg; done
$ ../fing.py *

And watched RES in top. I saw:

loop | RES (kb)
  -- | --
 100 | 39220
 250 | 39324
 300 | 39276
 400 | 39316
 500 | 39396
 600 | 39464
 700 | 39404
1000 | 39420

As long as you have no refcount leaks, I think what you are seeing is expected behaviour. Linux processes can only release pages at the end of the heap back to the OS (have a look at the brk and sbrk sys calls):

https://en.wikipedia.org/wiki/Sbrk

Now imagine if 1) libvips allocates 6 GB, 2) the Go runtime allocates 100 KB, 3) libvips releases 6 GB. Your libc (the thing in your process that will call sbrk and brk on your behalf) can't hand the 6 GB back to the OS because of the 100 KB alloc at the end of the heap. Some malloc implementations have better memory fragmentation behaviour than others, but the default Linux one is pretty good.
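Here's a toy model of that scenario, as a rough sketch only. `ToyHeap` and its methods are made up for illustration and are not part of libvips, Go, or libc. It assumes every allocation is placed at the top of the heap and that only free blocks at the very top can be handed back. Real allocators are more complicated (glibc, for example, serves very large single requests with mmap, which it can return directly), so take it as a picture of the idea rather than of what malloc actually does.

```python
# Toy model of a brk-style heap (hypothetical, for illustration only):
# blocks are laid out in allocation order, and only free space at the
# very top of the heap can be returned to the OS.

class ToyHeap:
    def __init__(self):
        # Each block is [owner, size_kb, in_use], lowest address first.
        self.blocks = []

    def alloc(self, owner, size_kb):
        # Always grow at the top; reuse of holes is ignored in this model.
        self.blocks.append([owner, size_kb, True])

    def free(self, owner):
        # Mark the owner's blocks as free, then try to shrink the heap.
        for block in self.blocks:
            if block[0] == owner:
                block[2] = False
        self._trim()

    def _trim(self):
        # Mimic lowering the program break: drop free blocks from the top only.
        while self.blocks and not self.blocks[-1][2]:
            self.blocks.pop()

    def mapped_kb(self):
        # Everything below the break still belongs to the process.
        return sum(size for _, size, _ in self.blocks)

    def in_use_kb(self):
        return sum(size for _, size, used in self.blocks if used)


heap = ToyHeap()
heap.alloc("libvips", 6 * 1024 * 1024)  # 6 GB, expressed in KB
heap.alloc("go-runtime", 100)           # 100 KB, lands above the libvips block
heap.free("libvips")
print(heap.in_use_kb(), heap.mapped_kb())

heap.free("go-runtime")
print(heap.in_use_kb(), heap.mapped_kb())
```

With these numbers it prints `100 6291556` after libvips frees its block (100 KB in use, about 6 GB still mapped), and `0 0` once the small block at the top is freed too.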

In practice, it doesn't matter. malloc will reuse holes in your memory space, and even if it doesn't, they will get paged out anyway under memory pressure and won't end up eating RAM. Try running your process for a few hours, and watch RES. You should see it creep up, but then stabilize.
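If you'd rather log RES from inside the process than watch top, something like this might do. It's a minimal sketch that assumes Linux with /proc mounted; `rss_kb` is a name I made up for this example, and it returns None where /proc isn't available.

```python
def rss_kb(pid="self"):
    """Return the resident set size in kB read from /proc, or None if unavailable.

    Assumes a Linux-style /proc/<pid>/status file with a "VmRSS:" line.
    """
    try:
        with open("/proc/{}/status".format(pid)) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    # The line looks like "VmRSS:     39220 kB".
                    return int(line.split()[1])
    except OSError:
        return None
    return None


print("RES (kB):", rss_kb())
```

Calling `rss_kb()` once per loop in a long-running test and printing the result should show the same pattern: some creep at first, then a plateau.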

(I'm not at all a kernel person, the above is just my understanding, corrections very welcome of course)

The problem is in the resize code:

_, err = bimg.NewImage(buffer).Resize(width, height)

The image is a GObject and needs to be unreffed explicitly to release the memory. Try:

image, err = bimg.NewImage(buffer).Resize(width, height)
defer C.g_object_unref(C.gpointer(image))