Wednesday, October 17, 2007

De bugs...oh why?

So, I believe I found an issue with the pthread_create routine. Why do I believe this?

While writing my own userspace heap, I noticed that one block was being leaked during multi-threaded test runs. The block was larger than 64 bytes but smaller than 128 bytes. I wrote a backtrace routine (described in earlier posts) to quickly figure out where the leak was taking place. Here is what I found:

This is where the leak occurs:
0x4000dcc4 : call 0x4000076c
0x4000dcc9 : test %eax,%eax

This is the backtrace:
#0 0x4000dc9b in allocate_dtv () from /lib/
#1 0x4000df3c in _dl_allocate_tls () from /lib/
#2 0x441a78e5 in pthread_create@@GLIBC_2.1 () from /lib/tls/
#3 0x0804885a in main () at test.c:59

Valgrind reports a similar issue, so I can't be wrong here, I think.
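For anyone who wants to try this at home, here is a minimal sketch of the kind of test that exercises that code path. It is not my original test.c; the names worker and spawn_and_join are made up for illustration. The thread body allocates nothing, so any block still outstanding after the join was allocated inside the library. Whether a given glibc version actually reports a block here is something you would have to check on your own system.

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body: it allocates nothing of its own. */
static void *worker(void *arg)
{
    int *ran = arg;
    *ran = 1;            /* record that the thread really executed */
    return NULL;
}

/* Create one thread and join it.
 * Returns 0 on success, otherwise the pthread error code. */
int spawn_and_join(int *ran)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, worker, ran);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

Call spawn_and_join() once from main() and run the program under a leak checker or your own heap overlay to compare the results.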

I wrote up a bug detailing this. The response was something along the lines of "nptl has no memory leaks, and valgrind isn't always right. Also, you're using an old version. Please upgrade and rerun test."

BULL! It's not just valgrind reporting this. I see the same issue in my own heap implementation. It is an overlay of the malloc() and free() routines (malloc, free, calloc and realloc to be precise). Open source community, want to know why you have such a bad rep? You believe that because you wrote an ethernet driver, or a RAID array controller, somehow you're untouchable. WRONG. I've written drivers, heaps, userspace apps, cell tower apps, and a whole plethora of things. I don't spout off nonsense to try and get a big e-rection. I actually try and solve problems.
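To make the "overlay" idea concrete, here is a rough sketch of how a wrapper around malloc() and free() can count outstanding blocks. This is not my actual heap. The names tracked_malloc, tracked_free and tracked_outstanding are hypothetical, it uses C11 atomics as an assumption, and a real overlay would also cover calloc and realloc and record the size and caller of each block.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Blocks handed out and not yet returned (zero-initialized). */
static atomic_long g_outstanding;

/* Allocate through the real malloc() and count the block on success. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        atomic_fetch_add(&g_outstanding, 1);
    return p;
}

/* Release a block and drop the count; free(NULL) stays a no-op. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    long before = atomic_fetch_sub(&g_outstanding, 1);
    assert(before > 0);   /* catches more frees than allocations */
    free(p);
}

/* Number of blocks currently outstanding; nonzero at exit means a leak. */
long tracked_outstanding(void)
{
    return atomic_load(&g_outstanding);
}
```

If tracked_outstanding() is still nonzero after every thread has been joined and every block the test allocated has been freed, something else is holding a block.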


1 comment:

John Carter said...

There are a couple of cases like this around the pthread code...

From what I can see it is "one time allocation" of stuff.

Stuff that could rightfully be declared static (in fact usually the pointers to them are), but since not every program uses that pthread feature they have decided to allocate and initialize on the first use of that feature.

E.g. timers. Not every pthread program uses timers from a timer thread, so the library creates the timer thread the first time a thread-based timer is created... AND THEN NEVER DEALLOCATES IT!

Is it a leak? Sure.

Is it a Bad Leak? Nah. Just leaks one thread.

Is it a bug? Nah. There isn't an "I no longer need thread-based timers, so stop and deallocate the timer thread" call in POSIX.

I have stumbled across several such "one time allocations" in pthreads.

Think of it as "functional programming style Lazy Allocation and Initialization", but done in C.
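The pattern looks roughly like the sketch below. This is not the actual glibc code. The names feature_state, feature_init and feature_get are invented for illustration, and pthread_once is just one standard way to get "run this exactly once".

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical state for a feature that not every program uses. */
struct feature_state {
    int initialized;
};

static struct feature_state *g_state;              /* static pointer, heap block */
static pthread_once_t g_once = PTHREAD_ONCE_INIT;
static int g_init_calls;                           /* how often init really ran */

/* Runs once, on first use. The block is deliberately never freed. */
static void feature_init(void)
{
    assert(g_state == NULL);                       /* init must not run twice */
    g_state = calloc(1, sizeof *g_state);
    if (g_state != NULL)
        g_state->initialized = 1;
    g_init_calls++;
}

/* Return the shared state, allocating it on the first call only. */
struct feature_state *feature_get(void)
{
    pthread_once(&g_once, feature_init);
    return g_state;
}
```

Every caller gets the same pointer and nobody ever frees it, so a leak checker sees exactly one block outstanding at exit, no matter how many times feature_get() was called.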